* [PATCH v1 01/29] cxl: Remove else after return
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:10 ` Gregory Price
2025-01-07 16:37 ` Dave Jiang
2025-01-07 14:09 ` [PATCH v1 02/29] cxl/pci: Moving code in cxl_hdm_decode_init() Robert Richter
` (28 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Remove unnecessary 'else' after return. Improves readability of code.
It is easier to place comments. Check and fix all occurrences under
drivers/cxl/.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/cdat.c | 2 +-
drivers/cxl/core/pci.c | 3 ++-
drivers/cxl/core/region.c | 4 +++-
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 8153f8d83a16..2e69d9f27028 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -28,7 +28,7 @@ static u32 cdat_normalize(u16 entry, u64 base, u8 type)
*/
if (entry == 0xffff || !entry)
return 0;
- else if (base > (UINT_MAX / (entry)))
+ if (base > (UINT_MAX / (entry)))
return 0;
/*
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index b3aac9964e0d..3e8d20f8955c 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -415,7 +415,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
*/
if (global_ctrl & CXL_HDM_DECODER_ENABLE || (!hdm && info->mem_enabled))
return devm_cxl_enable_mem(&port->dev, cxlds);
- else if (!hdm)
+
+ if (!hdm)
return -ENODEV;
root = to_cxl_port(port->dev.parent);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index b98b1ccffd1c..9b3efa841c8f 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1915,7 +1915,9 @@ static int cxl_region_attach(struct cxl_region *cxlr,
if (p->state > CXL_CONFIG_INTERLEAVE_ACTIVE) {
dev_dbg(&cxlr->dev, "region already active\n");
return -EBUSY;
- } else if (p->state < CXL_CONFIG_INTERLEAVE_ACTIVE) {
+ }
+
+ if (p->state < CXL_CONFIG_INTERLEAVE_ACTIVE) {
dev_dbg(&cxlr->dev, "interleave config missing\n");
return -ENXIO;
}
--
2.39.5
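For readers skimming the thread, the pattern this patch removes can be sketched outside the kernel. This is a minimal illustration, not the real cdat_normalize(): the function name normalize_sketch() and the final multiplication are assumptions, and only the two guard checks are taken from the cdat.c hunk above.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical stand-in modelled on the cdat.c hunk. Only the two guard
 * checks come from the patch; the final multiplication is an assumption.
 */
static uint32_t normalize_sketch(uint16_t entry, uint64_t base)
{
	/* Invalid or missing entry: nothing to normalize. */
	if (entry == 0xffff || !entry)
		return 0;

	/*
	 * The branch above always returns, so a plain 'if' behaves exactly
	 * like the former 'else if' and leaves room for a comment like this.
	 */
	if (base > (UINT_MAX / entry))
		return 0;

	/* Cannot overflow 32 bits: base <= UINT_MAX / entry was checked. */
	return (uint32_t)(entry * base);
}
```

Because every guarded branch ends in a return, dropping the 'else' does not change control flow. It only flattens the chain so that each check can carry its own comment.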
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 01/29] cxl: Remove else after return
2025-01-07 14:09 ` [PATCH v1 01/29] cxl: Remove else after return Robert Richter
@ 2025-01-07 16:10 ` Gregory Price
2025-01-07 16:37 ` Dave Jiang
1 sibling, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:10 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:47PM +0100, Robert Richter wrote:
> Remove unnecessary 'else' after return. Improves readability of code.
> It is easier to place comments. Check and fix all occurrences under
> drivers/cxl/.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/cdat.c | 2 +-
> drivers/cxl/core/pci.c | 3 ++-
> drivers/cxl/core/region.c | 4 +++-
> 3 files changed, 6 insertions(+), 3 deletions(-)
>
Reviewed-by: Gregory Price <gourry@gourry.net>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 01/29] cxl: Remove else after return
2025-01-07 14:09 ` [PATCH v1 01/29] cxl: Remove else after return Robert Richter
2025-01-07 16:10 ` Gregory Price
@ 2025-01-07 16:37 ` Dave Jiang
2025-01-09 12:00 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Dave Jiang @ 2025-01-07 16:37 UTC (permalink / raw)
To: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman
On 1/7/25 7:09 AM, Robert Richter wrote:
> Remove unnecessary 'else' after return. Improves readability of code.
> It is easier to place comments. Check and fix all occurrences under
> drivers/cxl/.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Just send this ahead of the series to reduce this series size. It's trivial enough I can probably take it for 6.14 merge window.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/core/cdat.c | 2 +-
> drivers/cxl/core/pci.c | 3 ++-
> drivers/cxl/core/region.c | 4 +++-
> 3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index 8153f8d83a16..2e69d9f27028 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -28,7 +28,7 @@ static u32 cdat_normalize(u16 entry, u64 base, u8 type)
> */
> if (entry == 0xffff || !entry)
> return 0;
> - else if (base > (UINT_MAX / (entry)))
> + if (base > (UINT_MAX / (entry)))
> return 0;
>
> /*
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index b3aac9964e0d..3e8d20f8955c 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -415,7 +415,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> */
> if (global_ctrl & CXL_HDM_DECODER_ENABLE || (!hdm && info->mem_enabled))
> return devm_cxl_enable_mem(&port->dev, cxlds);
> - else if (!hdm)
> +
> + if (!hdm)
> return -ENODEV;
>
> root = to_cxl_port(port->dev.parent);
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index b98b1ccffd1c..9b3efa841c8f 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1915,7 +1915,9 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> if (p->state > CXL_CONFIG_INTERLEAVE_ACTIVE) {
> dev_dbg(&cxlr->dev, "region already active\n");
> return -EBUSY;
> - } else if (p->state < CXL_CONFIG_INTERLEAVE_ACTIVE) {
> + }
> +
> + if (p->state < CXL_CONFIG_INTERLEAVE_ACTIVE) {
> dev_dbg(&cxlr->dev, "interleave config missing\n");
> return -ENXIO;
> }
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 01/29] cxl: Remove else after return
2025-01-07 16:37 ` Dave Jiang
@ 2025-01-09 12:00 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-09 12:00 UTC (permalink / raw)
To: Dave Jiang
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 07.01.25 09:37:43, Dave Jiang wrote:
>
>
> On 1/7/25 7:09 AM, Robert Richter wrote:
> > Remove unnecessary 'else' after return. Improves readability of code.
> > It is easier to place comments. Check and fix all occurrences under
> > drivers/cxl/.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
>
> Just send this ahead of the series to reduce this series size. It's trivial enough I can probably take it for 6.14 merge window.
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
I realized patches #1 to #4 are independent of the rest of the series,
so I am removing them from the series and sending them separately.
Thanks for review,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 02/29] cxl/pci: Moving code in cxl_hdm_decode_init()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
2025-01-07 14:09 ` [PATCH v1 01/29] cxl: Remove else after return Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:18 ` Gregory Price
2025-01-07 14:09 ` [PATCH v1 03/29] cxl/pci: cxl_hdm_decode_init: Move comment Robert Richter
` (27 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Commit 3f9e07531778 ("cxl/pci: simplify the check of mem_enabled in
cxl_hdm_decode_init()") changed the code flow in this function. The
root port is determined before a check to leave the function. Since
the root port is not used by the check it can be moved to run the
check first. This improves code readability and avoids unnecessary
code execution.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/pci.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 3e8d20f8955c..d206378c4cbc 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -419,14 +419,6 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
if (!hdm)
return -ENODEV;
- root = to_cxl_port(port->dev.parent);
- while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
- root = to_cxl_port(root->dev.parent);
- if (!is_cxl_root(root)) {
- dev_err(dev, "Failed to acquire root port for HDM enable\n");
- return -ENODEV;
- }
-
if (!info->mem_enabled) {
rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
if (rc)
@@ -435,6 +427,14 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
return devm_cxl_enable_mem(&port->dev, cxlds);
}
+ root = to_cxl_port(port->dev.parent);
+ while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
+ root = to_cxl_port(root->dev.parent);
+ if (!is_cxl_root(root)) {
+ dev_err(dev, "Failed to acquire root port for HDM enable\n");
+ return -ENODEV;
+ }
+
for (i = 0, allowed = 0; i < info->ranges; i++) {
struct device *cxld_dev;
--
2.39.5
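The reordering in this patch can be sketched with a toy port hierarchy. Everything here is hypothetical (struct port_sketch, find_root_sketch(), decode_init_sketch() and the return codes are illustrative assumptions, not kernel API). The sketch only mirrors the control flow: the early-exit check runs first, and the parent walk happens only on the path that needs the root.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a port hierarchy; not the kernel's struct cxl_port. */
struct port_sketch {
	struct port_sketch *parent;
	bool is_root;
};

/* Walk parent links until a root is found; NULL if the chain ends first. */
static struct port_sketch *find_root_sketch(struct port_sketch *port)
{
	struct port_sketch *p = port->parent;

	while (p && !p->is_root)
		p = p->parent;
	return p;
}

/*
 * Returns 0 on the early-exit path, 1 after the full path, and -1 if the
 * root is missing on the full path.
 */
static int decode_init_sketch(struct port_sketch *port, bool mem_enabled)
{
	struct port_sketch *root;

	/* Early exit: does not need the root, so no walk is done. */
	if (!mem_enabled)
		return 0;

	/* Only this path needs the root. */
	root = find_root_sketch(port);
	if (!root)
		return -1;

	return 1;
}
```

One consequence of this ordering is that a port with no reachable root still succeeds on the early-exit path and fails only on the full path. Whether that case can occur in the real driver is a separate question that the sketch does not answer.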
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 02/29] cxl/pci: Moving code in cxl_hdm_decode_init()
2025-01-07 14:09 ` [PATCH v1 02/29] cxl/pci: Moving code in cxl_hdm_decode_init() Robert Richter
@ 2025-01-07 16:18 ` Gregory Price
2025-01-29 12:47 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:18 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:48PM +0100, Robert Richter wrote:
> Commit 3f9e07531778 ("cxl/pci: simplify the check of mem_enabled in
> cxl_hdm_decode_init()") changed the code flow in this function. The
> root port is determined before a check to leave the function. Since
> the root port is not used by the check it can be moved to run the
> check first. This improves code readability and avoids unnecessary
> code execution.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/pci.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 3e8d20f8955c..d206378c4cbc 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -419,14 +419,6 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> if (!hdm)
> return -ENODEV;
>
> - root = to_cxl_port(port->dev.parent);
> - while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> - root = to_cxl_port(root->dev.parent);
> - if (!is_cxl_root(root)) {
> - dev_err(dev, "Failed to acquire root port for HDM enable\n");
> - return -ENODEV;
> - }
> -
Can't say definitively, but my reading of the original ordering suggests
the intent was to bail out of enabling anything if the cxl root cannot
be found (which suggests much larger issues).
This code flow allows the device to have its bits twiddled when the root
cannot be found - is that what we want?
> if (!info->mem_enabled) {
> rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> if (rc)
> @@ -435,6 +427,14 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> return devm_cxl_enable_mem(&port->dev, cxlds);
> }
>
> + root = to_cxl_port(port->dev.parent);
> + while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> + root = to_cxl_port(root->dev.parent);
> + if (!is_cxl_root(root)) {
> + dev_err(dev, "Failed to acquire root port for HDM enable\n");
> + return -ENODEV;
> + }
> +
> for (i = 0, allowed = 0; i < info->ranges; i++) {
> struct device *cxld_dev;
>
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 02/29] cxl/pci: Moving code in cxl_hdm_decode_init()
2025-01-07 16:18 ` Gregory Price
@ 2025-01-29 12:47 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-29 12:47 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 11:18:39, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:09:48PM +0100, Robert Richter wrote:
> > Commit 3f9e07531778 ("cxl/pci: simplify the check of mem_enabled in
> > cxl_hdm_decode_init()") changed the code flow in this function. The
> > root port is determined before a check to leave the function. Since
> > the root port is not used by the check it can be moved to run the
> > check first. This improves code readability and avoids unnecessary
> > code execution.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/pci.c | 16 ++++++++--------
> > 1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index 3e8d20f8955c..d206378c4cbc 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -419,14 +419,6 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > if (!hdm)
> > return -ENODEV;
> >
> > - root = to_cxl_port(port->dev.parent);
> > - while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> > - root = to_cxl_port(root->dev.parent);
> > - if (!is_cxl_root(root)) {
> > - dev_err(dev, "Failed to acquire root port for HDM enable\n");
> > - return -ENODEV;
> > - }
> > -
>
> Can't say definitively, but my reading of the original ordering suggests
> the intent was to bail out of enabling anything if the cxl root cannot
> be found (which suggests much larger issues).
>
> This code flow allows the device to have its bits twiddled when the root
> cannot be found - is that what we want?
As soon as a port is created, the cxl root should always exist, and in
practice the lookup never fails. There is no other way to allocate the
port. Variable 'root' is used later below in the code and that is the
reason to determine it here.
>
> > if (!info->mem_enabled) {
> > rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> > if (rc)
> > @@ -435,6 +427,14 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > return devm_cxl_enable_mem(&port->dev, cxlds);
> > }
For the above reasons just enabling the memory without checking root
is safe.
-Robert
> >
> > + root = to_cxl_port(port->dev.parent);
> > + while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> > + root = to_cxl_port(root->dev.parent);
> > + if (!is_cxl_root(root)) {
> > + dev_err(dev, "Failed to acquire root port for HDM enable\n");
> > + return -ENODEV;
> > + }
> > +
> > for (i = 0, allowed = 0; i < info->ranges; i++) {
> > struct device *cxld_dev;
> >
> > --
> > 2.39.5
> >
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 03/29] cxl/pci: cxl_hdm_decode_init: Move comment
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
2025-01-07 14:09 ` [PATCH v1 01/29] cxl: Remove else after return Robert Richter
2025-01-07 14:09 ` [PATCH v1 02/29] cxl/pci: Moving code in cxl_hdm_decode_init() Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:46 ` Gregory Price
2025-01-07 14:09 ` [PATCH v1 04/29] cxl/pci: Add comments to cxl_hdm_decode_init() Robert Richter
` (26 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
The comment applies to the check, move it there.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/pci.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index d206378c4cbc..c7050c13f71a 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -419,6 +419,15 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
if (!hdm)
return -ENODEV;
+ /*
+ * Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
+ * [High,Low] when HDM operation is enabled the range register values
+ * are ignored by the device, but the spec also recommends matching the
+ * DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
+ * are expected even though Linux does not require or maintain that
+ * match. If at least one DVSEC range is enabled and allowed, skip HDM
+ * Decoder Capability Enable.
+ */
if (!info->mem_enabled) {
rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
if (rc)
@@ -454,15 +463,6 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
return -ENXIO;
}
- /*
- * Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
- * [High,Low] when HDM operation is enabled the range register values
- * are ignored by the device, but the spec also recommends matching the
- * DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
- * are expected even though Linux does not require or maintain that
- * match. If at least one DVSEC range is enabled and allowed, skip HDM
- * Decoder Capability Enable.
- */
return 0;
}
EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, "CXL");
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 03/29] cxl/pci: cxl_hdm_decode_init: Move comment
2025-01-07 14:09 ` [PATCH v1 03/29] cxl/pci: cxl_hdm_decode_init: Move comment Robert Richter
@ 2025-01-07 16:46 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:46 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:49PM +0100, Robert Richter wrote:
> The comment applies to the check, move it there.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/pci.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index d206378c4cbc..c7050c13f71a 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -419,6 +419,15 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> if (!hdm)
> return -ENODEV;
>
> + /*
> + * Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
> + * [High,Low] when HDM operation is enabled the range register values
> + * are ignored by the device, but the spec also recommends matching the
> + * DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
> + * are expected even though Linux does not require or maintain that
> + * match. If at least one DVSEC range is enabled and allowed, skip HDM
> + * Decoder Capability Enable.
> + */
I agree this comment applies to the earlier check, but wow this function
is confusing when compared to what this comment says.
For example this line
/*
* If the HDM Decoder Capability is already enabled then assume
* that some other agent like platform firmware set it up.
*/
if (global_ctrl & CXL_HDM_DECODER_ENABLE || (!hdm && info->mem_enabled))
return devm_cxl_enable_mem(&port->dev, cxlds);
It seems like range register validation logic is unreachable if HDM
decoders are enabled? It's unclear.
> if (!info->mem_enabled) {
> rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> if (rc)
> @@ -454,15 +463,6 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> return -ENXIO;
> }
>
> - /*
> - * Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
> - * [High,Low] when HDM operation is enabled the range register values
> - * are ignored by the device, but the spec also recommends matching the
> - * DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
> - * are expected even though Linux does not require or maintain that
> - * match. If at least one DVSEC range is enabled and allowed, skip HDM
> - * Decoder Capability Enable.
> - */
> return 0;
> }
> EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, "CXL");
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 04/29] cxl/pci: Add comments to cxl_hdm_decode_init()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (2 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 03/29] cxl/pci: cxl_hdm_decode_init: Move comment Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:51 ` Gregory Price
2025-01-07 14:09 ` [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region() Robert Richter
` (25 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
There are various configuration cases of HDM decoder registers causing
different code paths. Add comments to cxl_hdm_decode_init() to better
explain them.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/pci.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index c7050c13f71a..4d2154457efb 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -416,9 +416,17 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
if (global_ctrl & CXL_HDM_DECODER_ENABLE || (!hdm && info->mem_enabled))
return devm_cxl_enable_mem(&port->dev, cxlds);
+ /*
+ * If the HDM Decoder Capability does not exist and DVSEC was
+ * not setup, the DVSEC based emulation cannot be used.
+ */
if (!hdm)
return -ENODEV;
+ /*
+ * The HDM Decoder Capability exists but is globally disabled.
+ */
+
/*
* Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
* [High,Low] when HDM operation is enabled the range register values
@@ -426,7 +434,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
* DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
* are expected even though Linux does not require or maintain that
* match. If at least one DVSEC range is enabled and allowed, skip HDM
- * Decoder Capability Enable.
+ * Decoder Capability Enable. Else, use the HDM Decoder Capability and
+ * enable it.
*/
if (!info->mem_enabled) {
rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
--
2.39.5
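The configuration cases that the added comments describe can be summarized as a small decision function. This is a sketch built only from the checks visible in the hunks of patches 01 to 04. The enum, the function name classify_sketch() and the three boolean inputs are hypothetical, and hdm_enabled is assumed to be false whenever the capability is absent.

```c
#include <stdbool.h>

/* Hypothetical labels for the code paths; not kernel identifiers. */
enum hdm_path_sketch {
	PATH_ENABLE_MEM_ONLY,	/* decoders already enabled, or DVSEC emulation */
	PATH_NO_DEVICE,		/* -ENODEV in the patch */
	PATH_ENABLE_HDM,	/* enable HDM decoding, then memory */
	PATH_VALIDATE_RANGES,	/* continue to the DVSEC range checks */
};

static enum hdm_path_sketch classify_sketch(bool has_hdm, bool hdm_enabled,
					    bool mem_enabled)
{
	/* Already set up by another agent, or no capability but DVSEC set up. */
	if (hdm_enabled || (!has_hdm && mem_enabled))
		return PATH_ENABLE_MEM_ONLY;

	/* No capability and DVSEC not set up: emulation cannot be used. */
	if (!has_hdm)
		return PATH_NO_DEVICE;

	/* Capability exists but is globally disabled, DVSEC not in use. */
	if (!mem_enabled)
		return PATH_ENABLE_HDM;

	/* Capability exists, globally disabled, DVSEC ranges in use. */
	return PATH_VALIDATE_RANGES;
}
```

Read this way, the DVSEC range checks are reached only when the capability exists, is globally disabled, and info->mem_enabled is set.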
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 04/29] cxl/pci: Add comments to cxl_hdm_decode_init()
2025-01-07 14:09 ` [PATCH v1 04/29] cxl/pci: Add comments to cxl_hdm_decode_init() Robert Richter
@ 2025-01-07 16:51 ` Gregory Price
2025-01-13 16:47 ` Jonathan Cameron
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:51 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:50PM +0100, Robert Richter wrote:
> There are various configuration cases of HDM decoder registers causing
> different code paths. Add comments to cxl_hdm_decode_init() to better
> explain them.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/pci.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
This addresses some of my prior questions, but I still think this
function is worth some extra scrutiny.
Reviewed-by: Gregory Price <gourry@gourry.net>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index c7050c13f71a..4d2154457efb 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -416,9 +416,17 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> if (global_ctrl & CXL_HDM_DECODER_ENABLE || (!hdm && info->mem_enabled))
> return devm_cxl_enable_mem(&port->dev, cxlds);
>
> + /*
> + * If the HDM Decoder Capability does not exist and DVSEC was
> + * not setup, the DVSEC based emulation cannot be used.
> + */
> if (!hdm)
> return -ENODEV;
>
> + /*
> + * The HDM Decoder Capability exists but is globally disabled.
> + */
> +
> /*
> * Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
> * [High,Low] when HDM operation is enabled the range register values
> @@ -426,7 +434,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> * DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
> * are expected even though Linux does not require or maintain that
> * match. If at least one DVSEC range is enabled and allowed, skip HDM
> - * Decoder Capability Enable.
> + * Decoder Capability Enable. Else, use the HDM Decoder Capability and
> + * enable it.
> */
> if (!info->mem_enabled) {
> rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 04/29] cxl/pci: Add comments to cxl_hdm_decode_init()
2025-01-07 16:51 ` Gregory Price
@ 2025-01-13 16:47 ` Jonathan Cameron
0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-13 16:47 UTC (permalink / raw)
To: Gregory Price
Cc: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 11:51:23 -0500
Gregory Price <gourry@gourry.net> wrote:
> On Tue, Jan 07, 2025 at 03:09:50PM +0100, Robert Richter wrote:
> > There are various configuration cases of HDM decoder registers causing
> > different code paths. Add comments to cxl_hdm_decode_init() to better
> > explain them.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/pci.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
>
> This addresses some of my prior questions, but I still think this
> function is worth some extra scrutiny.
>
> Reviewed-by: Gregory Price <gourry@gourry.net>
Definitely an improvement.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index c7050c13f71a..4d2154457efb 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -416,9 +416,17 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > if (global_ctrl & CXL_HDM_DECODER_ENABLE || (!hdm && info->mem_enabled))
> > return devm_cxl_enable_mem(&port->dev, cxlds);
> >
> > + /*
> > + * If the HDM Decoder Capability does not exist and DVSEC was
> > + * not setup, the DVSEC based emulation cannot be used.
> > + */
> > if (!hdm)
> > return -ENODEV;
> >
> > + /*
> > + * The HDM Decoder Capability exists but is globally disabled.
> > + */
> > +
> > /*
> > * Per CXL 2.0 Section 8.1.3.8.3 and 8.1.3.8.4 DVSEC CXL Range 1 Base
> > * [High,Low] when HDM operation is enabled the range register values
> > @@ -426,7 +434,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > * DVSEC Range 1,2 to HDM Decoder Range 0,1. So, non-zero info->ranges
> > * are expected even though Linux does not require or maintain that
> > * match. If at least one DVSEC range is enabled and allowed, skip HDM
> > - * Decoder Capability Enable.
> > + * Decoder Capability Enable. Else, use the HDM Decoder Capability and
> > + * enable it.
> > */
> > if (!info->mem_enabled) {
> > rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> > --
> > 2.39.5
> >
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (3 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 04/29] cxl/pci: Add comments to cxl_hdm_decode_init() Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:49 ` Gregory Price
2025-01-13 16:52 ` Jonathan Cameron
2025-01-07 14:09 ` [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder Robert Richter
` (24 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
When adding an endpoint to a region, the root port is determined
first. Move this directly into cxl_add_to_region(). This is in
preparation of the initialization of endpoints that iterates the port
hierarchy from the endpoint up to the root port.
As a side-effect the root argument is removed from the argument lists
of cxl_add_to_region() and related functions. Now, the endpoint is the
only parameter to add a region. This simplifies the function
interface.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 6 ++++--
drivers/cxl/cxl.h | 6 ++----
drivers/cxl/port.c | 15 +++------------
3 files changed, 9 insertions(+), 18 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 9b3efa841c8f..752440a5c162 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3307,9 +3307,11 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
return ERR_PTR(rc);
}
-int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
+int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+ struct cxl_port *port = cxled_to_port(cxled);
+ struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
struct device *cxlrd_dev, *region_dev;
@@ -3319,7 +3321,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
bool attach = false;
int rc;
- cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
+ cxlrd_dev = device_find_child(&cxl_root->port.dev, &cxld->hpa_range,
match_root_decoder_by_range);
if (!cxlrd_dev) {
dev_err(cxlmd->dev.parent,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index fdac3ddb8635..5c1a55181e0f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -872,8 +872,7 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
#ifdef CONFIG_CXL_REGION
bool is_cxl_pmem_region(struct device *dev);
struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
-int cxl_add_to_region(struct cxl_port *root,
- struct cxl_endpoint_decoder *cxled);
+int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
#else
static inline bool is_cxl_pmem_region(struct device *dev)
@@ -884,8 +883,7 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
{
return NULL;
}
-static inline int cxl_add_to_region(struct cxl_port *root,
- struct cxl_endpoint_decoder *cxled)
+static inline int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
{
return 0;
}
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index d2bfd1ff5492..74587a403e3d 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -30,7 +30,7 @@ static void schedule_detach(void *cxlmd)
schedule_cxl_memdev_detach(cxlmd);
}
-static int discover_region(struct device *dev, void *root)
+static int discover_region(struct device *dev, void *unused)
{
struct cxl_endpoint_decoder *cxled;
int rc;
@@ -49,7 +49,7 @@ static int discover_region(struct device *dev, void *root)
* Region enumeration is opportunistic, if this add-event fails,
* continue to the next endpoint decoder.
*/
- rc = cxl_add_to_region(root, cxled);
+ rc = cxl_add_to_region(cxled);
if (rc)
dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
@@ -95,7 +95,6 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
struct cxl_dev_state *cxlds = cxlmd->cxlds;
struct cxl_hdm *cxlhdm;
- struct cxl_port *root;
int rc;
rc = cxl_dvsec_rr_decode(cxlds, &info);
@@ -126,19 +125,11 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
if (rc)
return rc;
- /*
- * This can't fail in practice as CXL root exit unregisters all
- * descendant ports and that in turn synchronizes with cxl_port_probe()
- */
- struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
-
- root = &cxl_root->port;
-
/*
* Now that all endpoint decoders are successfully enumerated, try to
* assemble regions from committed decoders
*/
- device_for_each_child(&port->dev, root, discover_region);
+ device_for_each_child(&port->dev, NULL, discover_region);
return 0;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region()
2025-01-07 14:09 ` [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region() Robert Richter
@ 2025-01-07 16:49 ` Gregory Price
2025-01-13 16:52 ` Jonathan Cameron
1 sibling, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:49 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:51PM +0100, Robert Richter wrote:
> When adding an endpoint to a region, the root port is determined
> first. Move this directly into cxl_add_to_region(). This is in
> preparation of the initialization of endpoints that iterates the port
> hierarchy from the endpoint up to the root port.
>
> As a side-effect the root argument is removed from the argument lists
> of cxl_add_to_region() and related functions. Now, the endpoint is the
> only parameter to add a region. This simplifies the function
> interface.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
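
For readers unfamiliar with the __free() annotation that this patch moves
into cxl_add_to_region(), the following is a minimal userspace sketch of
the same scope-based cleanup idea, built on the GCC/Clang cleanup
attribute. All demo_* names are hypothetical stand-ins and are not part of
the CXL code; the sketch only assumes that the lookup takes a reference
and that the reference is dropped when the variable goes out of scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cxl_root: only a reference count. */
struct demo_root {
	int refcount;
};

static struct demo_root the_root = { .refcount = 1 };

/* Stand-in for find_cxl_root(): returns the root with an extra reference. */
struct demo_root *demo_find_root(void)
{
	the_root.refcount++;
	return &the_root;
}

/* Stand-in for put_cxl_root(). The cleanup attribute passes a pointer
 * to the annotated variable, hence the double pointer. */
void demo_put_root(struct demo_root **rootp)
{
	if (*rootp)
		(*rootp)->refcount--;
}

/* Assumed helper macro, modelled loosely on the kernel's __free(). */
#define demo_free(fn) __attribute__((cleanup(fn)))

/* The caller no longer passes a root; it is looked up and released here. */
int demo_add_to_region(int hpa_start)
{
	struct demo_root *root demo_free(demo_put_root) = demo_find_root();

	if (hpa_start < 0)
		return -1;	/* early return still drops the reference */

	/* The return value is computed while the reference is still held. */
	return root->refcount;
}

int demo_root_refcount(void)
{
	return the_root.refcount;
}
```

Because the reference is tied to the variable's scope, every return path
in the function releases it, which is what allows the root argument to be
dropped from the callers.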
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region()
2025-01-07 14:09 ` [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region() Robert Richter
2025-01-07 16:49 ` Gregory Price
@ 2025-01-13 16:52 ` Jonathan Cameron
1 sibling, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-13 16:52 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 15:09:51 +0100
Robert Richter <rrichter@amd.com> wrote:
> When adding an endpoint to a region, the root port is determined
> first. Move this directly into cxl_add_to_region(). This is in
> preparation of the initialization of endpoints that iterates the port
> hierarchy from the endpoint up to the root port.
>
> As a side-effect the root argument is removed from the argument lists
> of cxl_add_to_region() and related functions. Now, the endpoint is the
> only parameter to add a region. This simplifies the function
> interface.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Seems reasonable to me.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/cxl/core/region.c | 6 ++++--
> drivers/cxl/cxl.h | 6 ++----
> drivers/cxl/port.c | 15 +++------------
> 3 files changed, 9 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 9b3efa841c8f..752440a5c162 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3307,9 +3307,11 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> return ERR_PTR(rc);
> }
>
> -int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> +int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct cxl_port *port = cxled_to_port(cxled);
> + struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
> struct device *cxlrd_dev, *region_dev;
> @@ -3319,7 +3321,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> bool attach = false;
> int rc;
>
> - cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
> + cxlrd_dev = device_find_child(&cxl_root->port.dev, &cxld->hpa_range,
> match_root_decoder_by_range);
> if (!cxlrd_dev) {
> dev_err(cxlmd->dev.parent,
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index fdac3ddb8635..5c1a55181e0f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -872,8 +872,7 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
> #ifdef CONFIG_CXL_REGION
> bool is_cxl_pmem_region(struct device *dev);
> struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
> -int cxl_add_to_region(struct cxl_port *root,
> - struct cxl_endpoint_decoder *cxled);
> +int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
> struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
> #else
> static inline bool is_cxl_pmem_region(struct device *dev)
> @@ -884,8 +883,7 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
> {
> return NULL;
> }
> -static inline int cxl_add_to_region(struct cxl_port *root,
> - struct cxl_endpoint_decoder *cxled)
> +static inline int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> {
> return 0;
> }
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index d2bfd1ff5492..74587a403e3d 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -30,7 +30,7 @@ static void schedule_detach(void *cxlmd)
> schedule_cxl_memdev_detach(cxlmd);
> }
>
> -static int discover_region(struct device *dev, void *root)
> +static int discover_region(struct device *dev, void *unused)
> {
> struct cxl_endpoint_decoder *cxled;
> int rc;
> @@ -49,7 +49,7 @@ static int discover_region(struct device *dev, void *root)
> * Region enumeration is opportunistic, if this add-event fails,
> * continue to the next endpoint decoder.
> */
> - rc = cxl_add_to_region(root, cxled);
> + rc = cxl_add_to_region(cxled);
> if (rc)
> dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
> cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
> @@ -95,7 +95,6 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
> struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
> struct cxl_dev_state *cxlds = cxlmd->cxlds;
> struct cxl_hdm *cxlhdm;
> - struct cxl_port *root;
> int rc;
>
> rc = cxl_dvsec_rr_decode(cxlds, &info);
> @@ -126,19 +125,11 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
> if (rc)
> return rc;
>
> - /*
> - * This can't fail in practice as CXL root exit unregisters all
> - * descendant ports and that in turn synchronizes with cxl_port_probe()
> - */
> - struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> -
> - root = &cxl_root->port;
> -
> /*
> * Now that all endpoint decoders are successfully enumerated, try to
> * assemble regions from committed decoders
> */
> - device_for_each_child(&port->dev, root, discover_region);
> + device_for_each_child(&port->dev, NULL, discover_region);
>
> return 0;
> }
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (4 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 05/29] cxl/region: Move find_cxl_root() to cxl_add_to_region() Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:57 ` Gregory Price
2025-01-13 16:59 ` Jonathan Cameron
2025-01-07 14:09 ` [PATCH v1 07/29] cxl/region: Factor out code to find a root decoder's region Robert Richter
` (23 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
In function cxl_add_to_region() there is code to determine the root
decoder associated to an endpoint decoder. Factor out that code for
later reuse. This also simplifies the function cxl_add_to_region() as
the change reduces its size and the number of used variables.
The reference of cxlrd_dev can be freed earlier. Since the root
decoder exists as long as the root port exists and the endpoint
already holds a reference to the root port, this additional reference
is not needed. Though it looks obvious to use __free() for the
reference of cxlrd_dev here too, this is done in a later rework. So
just move the code.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++-------------
1 file changed, 37 insertions(+), 18 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 752440a5c162..448408918def 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3202,6 +3202,38 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
return range_contains(r1, r2);
}
+static struct cxl_root_decoder *
+cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
+{
+ struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+ struct cxl_port *port = cxled_to_port(cxled);
+ struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
+ struct range *hpa = &cxled->cxld.hpa_range;
+ struct cxl_decoder *cxld = &cxled->cxld;
+ struct device *cxlrd_dev;
+
+ cxlrd_dev = device_find_child(&cxl_root->port.dev, hpa,
+ match_root_decoder_by_range);
+ if (!cxlrd_dev) {
+ dev_err(cxlmd->dev.parent,
+ "%s:%s no CXL window for range %#llx:%#llx\n",
+ dev_name(&cxlmd->dev), dev_name(&cxld->dev),
+ cxld->hpa_range.start, cxld->hpa_range.end);
+ return NULL;
+ }
+
+ /*
+ * device_find_child() created a reference to the root
+ * decoder. Since the root decoder exists as long as the root
+ * port exists and the endpoint already holds a reference to
+ * the root port, this additional reference is not needed.
+ * Free it here.
+ */
+ put_device(cxlrd_dev);
+
+ return to_cxl_root_decoder(cxlrd_dev);
+}
+
static int match_region_by_range(struct device *dev, void *data)
{
struct cxl_region_params *p;
@@ -3309,29 +3341,17 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
{
- struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
- struct cxl_port *port = cxled_to_port(cxled);
- struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
struct range *hpa = &cxled->cxld.hpa_range;
- struct cxl_decoder *cxld = &cxled->cxld;
- struct device *cxlrd_dev, *region_dev;
+ struct device *region_dev;
struct cxl_root_decoder *cxlrd;
struct cxl_region_params *p;
struct cxl_region *cxlr;
bool attach = false;
int rc;
- cxlrd_dev = device_find_child(&cxl_root->port.dev, &cxld->hpa_range,
- match_root_decoder_by_range);
- if (!cxlrd_dev) {
- dev_err(cxlmd->dev.parent,
- "%s:%s no CXL window for range %#llx:%#llx\n",
- dev_name(&cxlmd->dev), dev_name(&cxld->dev),
- cxld->hpa_range.start, cxld->hpa_range.end);
+ cxlrd = cxl_find_root_decoder(cxled);
+ if (!cxlrd)
return -ENXIO;
- }
-
- cxlrd = to_cxl_root_decoder(cxlrd_dev);
/*
* Ensure that if multiple threads race to construct_region() for @hpa
@@ -3349,7 +3369,7 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
rc = PTR_ERR_OR_ZERO(cxlr);
if (rc)
- goto out;
+ return rc;
attach_target(cxlr, cxled, -1, TASK_UNINTERRUPTIBLE);
@@ -3370,8 +3390,7 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
}
put_device(region_dev);
-out:
- put_device(cxlrd_dev);
+
return rc;
}
EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder
2025-01-07 14:09 ` [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder Robert Richter
@ 2025-01-07 16:57 ` Gregory Price
2025-01-13 16:59 ` Jonathan Cameron
1 sibling, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:57 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:52PM +0100, Robert Richter wrote:
> In function cxl_add_to_region() there is code to determine the root
> decoder associated to an endpoint decoder. Factor out that code for
> later reuse. This also simplifies the function cxl_add_to_region() as
> the change reduces its size and the number of used variables.
>
> The reference of cxlrd_dev can be freed earlier. Since the root
> decoder exists as long as the root port exists and the endpoint
> already holds a reference to the root port, this additional reference
> is not needed. Though it looks obvious to use __free() for the
> reference of cxlrd_dev here too, this is done in a later rework. So
> just move the code.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++-------------
> 1 file changed, 37 insertions(+), 18 deletions(-)
>
Reviewed-by: Gregory Price <gourry@gourry.net>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder
2025-01-07 14:09 ` [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder Robert Richter
2025-01-07 16:57 ` Gregory Price
@ 2025-01-13 16:59 ` Jonathan Cameron
2025-01-29 13:13 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-13 16:59 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 15:09:52 +0100
Robert Richter <rrichter@amd.com> wrote:
> In function cxl_add_to_region() there is code to determine the root
> decoder associated to an endpoint decoder. Factor out that code for
> later reuse. This also simplifies the function cxl_add_to_region() as
> the change reduces its size and the number of used variables.
>
> The reference of cxlrd_dev can be freed earlier. Since the root
> decoder exists as long as the root port exists and the endpoint
> already holds a reference to the root port, this additional reference
> is not needed. Though it looks obvious to use __free() for the
> reference of cxlrd_dev here too, this is done in a later rework. So
> just move the code.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
One trivial comment inline.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++-------------
> 1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 752440a5c162..448408918def 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3202,6 +3202,38 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
> return range_contains(r1, r2);
> }
>
> +static struct cxl_root_decoder *
> +cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> +{
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct cxl_port *port = cxled_to_port(cxled);
> + struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> + struct range *hpa = &cxled->cxld.hpa_range;
> + struct cxl_decoder *cxld = &cxled->cxld;
Flip this and the line above and you can have
struct range *hpa = cxld->hpa_range;
which is both shorter and matches original code save a few moments of
thinking.
> + struct device *cxlrd_dev;
> +
> + cxlrd_dev = device_find_child(&cxl_root->port.dev, hpa,
> + match_root_decoder_by_range);
> + if (!cxlrd_dev) {
> + dev_err(cxlmd->dev.parent,
> + "%s:%s no CXL window for range %#llx:%#llx\n",
> + dev_name(&cxlmd->dev), dev_name(&cxld->dev),
> + cxld->hpa_range.start, cxld->hpa_range.end);
> + return NULL;
> + }
> +
> + /*
> + * device_find_child() created a reference to the root
> + * decoder. Since the root decoder exists as long as the root
> + * port exists and the endpoint already holds a reference to
> + * the root port, this additional reference is not needed.
> + * Free it here.
> + */
> + put_device(cxlrd_dev);
> +
> + return to_cxl_root_decoder(cxlrd_dev);
> +}
> +
> static int match_region_by_range(struct device *dev, void *data)
> {
> struct cxl_region_params *p;
> @@ -3309,29 +3341,17 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>
> int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> {
> - struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct cxl_port *port = cxled_to_port(cxled);
> - struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> struct range *hpa = &cxled->cxld.hpa_range;
> - struct cxl_decoder *cxld = &cxled->cxld;
> - struct device *cxlrd_dev, *region_dev;
> + struct device *region_dev;
> struct cxl_root_decoder *cxlrd;
> struct cxl_region_params *p;
> struct cxl_region *cxlr;
> bool attach = false;
> int rc;
>
> - cxlrd_dev = device_find_child(&cxl_root->port.dev, &cxld->hpa_range,
> - match_root_decoder_by_range);
> - if (!cxlrd_dev) {
> - dev_err(cxlmd->dev.parent,
> - "%s:%s no CXL window for range %#llx:%#llx\n",
> - dev_name(&cxlmd->dev), dev_name(&cxld->dev),
> - cxld->hpa_range.start, cxld->hpa_range.end);
> + cxlrd = cxl_find_root_decoder(cxled);
> + if (!cxlrd)
> return -ENXIO;
> - }
> -
> - cxlrd = to_cxl_root_decoder(cxlrd_dev);
>
> /*
> * Ensure that if multiple threads race to construct_region() for @hpa
> @@ -3349,7 +3369,7 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
>
> rc = PTR_ERR_OR_ZERO(cxlr);
> if (rc)
> - goto out;
> + return rc;
>
> attach_target(cxlr, cxled, -1, TASK_UNINTERRUPTIBLE);
>
> @@ -3370,8 +3390,7 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> }
>
> put_device(region_dev);
> -out:
> - put_device(cxlrd_dev);
> +
> return rc;
> }
> EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder
2025-01-13 16:59 ` Jonathan Cameron
@ 2025-01-29 13:13 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-29 13:13 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 13.01.25 16:59:35, Jonathan Cameron wrote:
> On Tue, 7 Jan 2025 15:09:52 +0100
> Robert Richter <rrichter@amd.com> wrote:
>
> > In function cxl_add_to_region() there is code to determine the root
> > decoder associated to an endpoint decoder. Factor out that code for
> > later reuse. This also simplifies the function cxl_add_to_region() as
> > the change reduces its size and the number of used variables.
> >
> > The reference of cxlrd_dev can be freed earlier. Since the root
> > decoder exists as long as the root port exists and the endpoint
> > already holds a reference to the root port, this additional reference
> > is not needed. Though it looks obvious to use __free() for the
> > reference of cxlrd_dev here too, this is done in a later rework. So
> > just move the code.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> One trivial comment inline.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> > ---
> > drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++-------------
> > 1 file changed, 37 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 752440a5c162..448408918def 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -3202,6 +3202,38 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
> > return range_contains(r1, r2);
> > }
> >
> > +static struct cxl_root_decoder *
> > +cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> > +{
> > + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > + struct cxl_port *port = cxled_to_port(cxled);
> > + struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> > + struct range *hpa = &cxled->cxld.hpa_range;
> > + struct cxl_decoder *cxld = &cxled->cxld;
>
> Flip this and the line above and you can have
>
> struct range *hpa = cxld->hpa_range;
Changed that to:
struct range *hpa = &cxld->hpa_range;
-Robert
>
> which is both shorter and matches original code save a few moments of
> thinking.
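
The address-of operator in the corrected line matters because hpa_range is
an embedded struct, not a pointer. A small sketch with hypothetical demo_*
types (simplified stand-ins, not the real CXL structures) shows that the
reordered declarations yield the same pointer as the original expression:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the CXL structures. */
struct demo_range {
	unsigned long long start, end;
};

struct demo_decoder {
	int id;
	struct demo_range hpa_range;	/* embedded struct, not a pointer */
};

struct demo_endpoint_decoder {
	struct demo_decoder cxld;
};

/* Returns true if both spellings point at the same embedded range. */
bool demo_same_range_pointer(struct demo_endpoint_decoder *cxled)
{
	struct demo_decoder *cxld = &cxled->cxld;
	struct demo_range *hpa = &cxld->hpa_range;	/* reordered form */

	return hpa == &cxled->cxld.hpa_range;		/* original form */
}
```

Without the '&', the initializer would have type struct demo_range rather
than a pointer to it, and the declaration would not compile.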
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 07/29] cxl/region: Factor out code to find a root decoder's region
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (5 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 06/29] cxl/region: Factor out code to find the root decoder Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 16:59 ` Gregory Price
2025-01-07 14:09 ` [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part Robert Richter
` (22 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
In function cxl_add_to_region() there is code to determine a root
decoder's region. Factor that code out. This is in preparation to
further rework and simplify function cxl_add_to_region().
No functional changes.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 448408918def..d5dcc94df0a5 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3255,6 +3255,19 @@ static int match_region_by_range(struct device *dev, void *data)
return rc;
}
+static struct cxl_region *
+cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
+{
+ struct device *region_dev;
+
+ region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
+ match_region_by_range);
+ if (!region_dev)
+ return NULL;
+
+ return to_cxl_region(region_dev);
+}
+
/* Establish an empty region covering the given HPA range */
static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
struct cxl_endpoint_decoder *cxled)
@@ -3342,7 +3355,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
{
struct range *hpa = &cxled->cxld.hpa_range;
- struct device *region_dev;
struct cxl_root_decoder *cxlrd;
struct cxl_region_params *p;
struct cxl_region *cxlr;
@@ -3358,13 +3370,9 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
* one does the construction and the others add to that.
*/
mutex_lock(&cxlrd->range_lock);
- region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
- match_region_by_range);
- if (!region_dev) {
+ cxlr = cxl_find_region_by_range(cxlrd, hpa);
+ if (!cxlr)
cxlr = construct_region(cxlrd, cxled);
- region_dev = &cxlr->dev;
- } else
- cxlr = to_cxl_region(region_dev);
mutex_unlock(&cxlrd->range_lock);
rc = PTR_ERR_OR_ZERO(cxlr);
@@ -3389,7 +3397,7 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
p->res);
}
- put_device(region_dev);
+ put_device(&cxlr->dev); /* cxl_find_region_by_range() */
return rc;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 07/29] cxl/region: Factor out code to find a root decoder's region
2025-01-07 14:09 ` [PATCH v1 07/29] cxl/region: Factor out code to find a root decoder's region Robert Richter
@ 2025-01-07 16:59 ` Gregory Price
2025-01-30 16:43 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 16:59 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:53PM +0100, Robert Richter wrote:
> In function cxl_add_to_region() there is code to determine a root
> decoder's region. Factor that code out. This is in preparation to
> further rework and simplify function cxl_add_to_region().
>
> No functional changes.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 24 ++++++++++++++++--------
> 1 file changed, 16 insertions(+), 8 deletions(-)
>
... snip ...
> * one does the construction and the others add to that.
> */
> mutex_lock(&cxlrd->range_lock);
If the function must be called with the cxlrd range lock held, then the
function should have a comment/contract that states this.
> - region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
> - match_region_by_range);
> - if (!region_dev) {
> + cxlr = cxl_find_region_by_range(cxlrd, hpa);
> + if (!cxlr)
> cxlr = construct_region(cxlrd, cxled);
> - region_dev = &cxlr->dev;
> - } else
> - cxlr = to_cxl_region(region_dev);
> mutex_unlock(&cxlrd->range_lock);
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 07/29] cxl/region: Factor out code to find a root decoder's region
2025-01-07 16:59 ` Gregory Price
@ 2025-01-30 16:43 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-30 16:43 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 11:59:43AM -0500, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:09:53PM +0100, Robert Richter wrote:
> > In function cxl_add_to_region() there is code to determine a root
> > decoder's region. Factor that code out. This is in preparation to
> > further rework and simplify function cxl_add_to_region().
> >
> > No functional changes.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 24 ++++++++++++++++--------
> > 1 file changed, 16 insertions(+), 8 deletions(-)
> >
> ... snip ...
> > * one does the construction and the others add to that.
> > */
> > mutex_lock(&cxlrd->range_lock);
>
> If the function must be called with the cxlrd range lock held, then the
> function should have a comment/contract that states this.
No, the mutex locks the check for cxlr and construct_region().
cxl_find_region_by_range() itself doesn't need a lock.
-Robert
>
> > - region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
> > - match_region_by_range);
> > - if (!region_dev) {
> > + cxlr = cxl_find_region_by_range(cxlrd, hpa);
> > + if (!cxlr)
> > cxlr = construct_region(cxlrd, cxled);
> > - region_dev = &cxlr->dev;
> > - } else
> > - cxlr = to_cxl_region(region_dev);
> > mutex_unlock(&cxlrd->range_lock);
> >
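
The point about the lock can be illustrated with a userspace sketch of the
"look up, construct if missing" pattern. It uses hypothetical demo_* names
and a pthread mutex in place of range_lock. The mutex exists to make the
lookup and the construction one atomic step so that two racing callers
cannot both construct a region for the same range. One difference from the
kernel is an assumption of this sketch: the plain array used here has no
protection of its own, so the lookup helper is only ever called with the
mutex held, whereas device_find_child() iterates children safely without
the caller taking range_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_REGIONS 8

struct demo_region {
	unsigned long long start, end;
	int targets;
};

struct demo_root_decoder {
	pthread_mutex_t range_lock;
	struct demo_region regions[DEMO_MAX_REGIONS];
	int nr_regions;
};

/* Pure lookup: returns the region covering exactly [start, end] or NULL. */
static struct demo_region *
demo_find_region(struct demo_root_decoder *rd,
		 unsigned long long start, unsigned long long end)
{
	for (int i = 0; i < rd->nr_regions; i++)
		if (rd->regions[i].start == start && rd->regions[i].end == end)
			return &rd->regions[i];
	return NULL;
}

/* Creates an empty region for the range, or returns NULL if full. */
static struct demo_region *
demo_construct_region(struct demo_root_decoder *rd,
		      unsigned long long start, unsigned long long end)
{
	struct demo_region *r;

	if (rd->nr_regions == DEMO_MAX_REGIONS)
		return NULL;

	r = &rd->regions[rd->nr_regions++];
	r->start = start;
	r->end = end;
	r->targets = 0;
	return r;
}

/* The mutex covers the check and the construction together. */
struct demo_region *
demo_add_to_region(struct demo_root_decoder *rd,
		   unsigned long long start, unsigned long long end)
{
	struct demo_region *r;

	pthread_mutex_lock(&rd->range_lock);
	r = demo_find_region(rd, start, end);
	if (!r)
		r = demo_construct_region(rd, start, end);
	if (r)
		r->targets++;	/* simplified: the driver attaches later */
	pthread_mutex_unlock(&rd->range_lock);

	return r;
}
```

Two calls for the same range return the same region, and only one region
is ever constructed for it.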
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (6 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 07/29] cxl/region: Factor out code to find a root decoder's region Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 18:29 ` Gregory Price
2025-01-09 1:08 ` Li Ming
2025-01-07 14:09 ` [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder() Robert Richter
` (21 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Before adding an endpoint to a region, the endpoint is initialized
first. Move that part to a new function cxl_endpoint_initialize().
The function is in preparation of adding more parameters that need to
be determined in a setup.
The split also helps to separate the code better. After initialization
the addition of an endpoint may fail with an error code and all the
data would need to be reverted to not leave the endpoint in an
undefined state. With separate functions the init part can succeed
even if the endpoint cannot be added.
Function naming follows the style of device_register() etc. Thus,
rename function cxl_add_to_region() to cxl_endpoint_register().
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 36 ++++++++++++++++++++++++++++--------
drivers/cxl/cxl.h | 5 +++--
drivers/cxl/port.c | 9 +++++----
3 files changed, 36 insertions(+), 14 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index d5dcc94df0a5..5132c689b1f2 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3340,7 +3340,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
dev_name(&cxlr->dev), p->res, p->interleave_ways,
p->interleave_granularity);
- /* ...to match put_device() in cxl_add_to_region() */
+ /* ...to match put_device() in cxl_endpoint_add() */
get_device(&cxlr->dev);
up_write(&cxl_region_rwsem);
@@ -3352,19 +3352,28 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
return ERR_PTR(rc);
}
-int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
+static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
{
- struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_root_decoder *cxlrd;
- struct cxl_region_params *p;
- struct cxl_region *cxlr;
- bool attach = false;
- int rc;
cxlrd = cxl_find_root_decoder(cxled);
if (!cxlrd)
return -ENXIO;
+ cxled->cxlrd = cxlrd;
+
+ return 0;
+}
+
+static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
+{
+ struct range *hpa = &cxled->cxld.hpa_range;
+ struct cxl_root_decoder *cxlrd = cxled->cxlrd;
+ struct cxl_region_params *p;
+ struct cxl_region *cxlr;
+ bool attach = false;
+ int rc;
+
/*
* Ensure that if multiple threads race to construct_region() for @hpa
* one does the construction and the others add to that.
@@ -3401,7 +3410,18 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
return rc;
}
-EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
+
+int cxl_endpoint_register(struct cxl_endpoint_decoder *cxled)
+{
+ int rc;
+
+ rc = cxl_endpoint_initialize(cxled);
+ if (rc)
+ return rc;
+
+ return cxl_endpoint_add(cxled);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_endpoint_register, "CXL");
static int is_system_ram(struct resource *res, void *arg)
{
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5c1a55181e0f..b3989dc58ed1 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -417,6 +417,7 @@ enum cxl_decoder_state {
*/
struct cxl_endpoint_decoder {
struct cxl_decoder cxld;
+ struct cxl_root_decoder *cxlrd;
struct resource *dpa_res;
resource_size_t skip;
enum cxl_decoder_mode mode;
@@ -872,7 +873,7 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_port *port);
#ifdef CONFIG_CXL_REGION
bool is_cxl_pmem_region(struct device *dev);
struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
-int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
+int cxl_endpoint_register(struct cxl_endpoint_decoder *cxled);
struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
#else
static inline bool is_cxl_pmem_region(struct device *dev)
@@ -883,7 +884,7 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
{
return NULL;
}
-static inline int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
+static inline int cxl_endpoint_register(struct cxl_endpoint_decoder *cxled)
{
return 0;
}
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 74587a403e3d..6eb82a118bd5 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -46,13 +46,14 @@ static int discover_region(struct device *dev, void *unused)
return 0;
/*
- * Region enumeration is opportunistic, if this add-event fails,
+ * Region enumeration is opportunistic, ignore errors and
* continue to the next endpoint decoder.
*/
- rc = cxl_add_to_region(cxled);
+ rc = cxl_endpoint_register(cxled);
if (rc)
- dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
- cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
+ dev_warn(cxled->cxld.dev.parent,
+ "failed to register %s: %d\n",
+ dev_name(&cxled->cxld.dev), rc);
return 0;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part
2025-01-07 14:09 ` [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part Robert Richter
@ 2025-01-07 18:29 ` Gregory Price
2025-01-30 16:53 ` Robert Richter
2025-01-09 1:08 ` Li Ming
1 sibling, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 18:29 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:54PM +0100, Robert Richter wrote:
> Before adding an endpoint to a region, the endpoint is initialized
> first. Move that part to a new function cxl_endpoint_initialize().
> The function is in preparation of adding more parameters that need to
> be determined in a setup.
>
> The split also helps better separating the code. After initialization
> the addition of an endpoint may fail with an error code and all the
> data would need to be reverted to not leave the endpoint in an
> undefined state. With separate functions the init part can succeed
> even if the endpoint cannot be added.
>
> Function naming follows the style of device_register() etc. Thus,
> rename function cxl_add_to_region() to cxl_endpoint_register().
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
A little bit difficult to read since it mixes style and functional
changes, but I like this update and I understand why. One inline question.
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/region.c | 36 ++++++++++++++++++++++++++++--------
> drivers/cxl/cxl.h | 5 +++--
> drivers/cxl/port.c | 9 +++++----
> 3 files changed, 36 insertions(+), 14 deletions(-)
>
... snip ...
> + rc = cxl_endpoint_register(cxled);
> if (rc)
> - dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
> - cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
> + dev_warn(cxled->cxld.dev.parent,
> + "failed to register %s: %d\n",
> + dev_name(&cxled->cxld.dev), rc);
Is it worth differentiating obvious failures here for a better warning?
I'm fine either way.
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part
2025-01-07 18:29 ` Gregory Price
@ 2025-01-30 16:53 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-30 16:53 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 01:29:03PM -0500, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:09:54PM +0100, Robert Richter wrote:
> ... snip ...
> > + rc = cxl_endpoint_register(cxled);
> > if (rc)
> > - dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
> > - cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
> > + dev_warn(cxled->cxld.dev.parent,
> > + "failed to register %s: %d\n",
> > + dev_name(&cxled->cxld.dev), rc);
>
> Is it worth differentiating obvious failures here for a better warning?
> I'm fine either way.
If an endpoint cannot be registered, this will likely cause a region
probe failure too. I raised the log level to make this visible for
non-dbg logging. I have also removed access to cxled->cxld.hpa_range
as this is implementation specific to cxl_endpoint_register(). There
are other debug messages to determine the details of the failure here.
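
The init/add split and the "report rc, then carry on" caller discussed
above can be modelled in user space. This is only a sketch: the struct
layouts, the found and region_ok parameters, and the -EBUSY failure are
hypothetical stand-ins for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct root_decoder { int id; };

struct endpoint_decoder {
	struct root_decoder *cxlrd;	/* cached by the init step */
	int attached;			/* set by the add step */
};

/* Init step: only resolves and caches the root decoder. */
static int endpoint_initialize(struct endpoint_decoder *cxled,
			       struct root_decoder *found)
{
	assert(cxled != NULL);
	if (!found)
		return -ENXIO;
	cxled->cxlrd = found;
	return 0;
}

/* Add step: may fail without touching what the init step set up. */
static int endpoint_add(struct endpoint_decoder *cxled, int region_ok)
{
	if (!region_ok)
		return -EBUSY;	/* hypothetical failure reason */
	cxled->attached = 1;
	return 0;
}

/* Register = initialize, then add; the first error is passed through. */
static int endpoint_register(struct endpoint_decoder *cxled,
			     struct root_decoder *found, int region_ok)
{
	int rc = endpoint_initialize(cxled, found);

	if (rc)
		return rc;
	return endpoint_add(cxled, region_ok);
}

/* Opportunistic caller: report the error code, never propagate it. */
static int discover(struct endpoint_decoder *cxled,
		    struct root_decoder *found, int region_ok)
{
	int rc = endpoint_register(cxled, found, region_ok);

	if (rc)
		fprintf(stderr, "failed to register decoder: %d\n", rc);
	return 0;
}
```

In this model a failed add step leaves the cached root decoder in place,
which is the property the commit message argues for, and the caller only
needs rc for its warning.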
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part
2025-01-07 14:09 ` [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part Robert Richter
2025-01-07 18:29 ` Gregory Price
@ 2025-01-09 1:08 ` Li Ming
2025-01-09 10:30 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Li Ming @ 2025-01-09 1:08 UTC (permalink / raw)
To: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman
On 1/7/2025 10:09 PM, Robert Richter wrote:
> Before adding an endpoint to a region, the endpoint is initialized
> first. Move that part to a new function cxl_endpoint_initialize().
> The function is in preparation of adding more parameters that need to
> be determined in a setup.
>
> The split also helps better separating the code. After initialization
> the addition of an endpoint may fail with an error code and all the
> data would need to be reverted to not leave the endpoint in an
> undefined state. With separate functions the init part can succeed
> even if the endpoint cannot be added.
>
> Function naming follows the style of device_register() etc. Thus,
> rename function cxl_add_to_region() to cxl_endpoint_register().
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 36 ++++++++++++++++++++++++++++--------
> drivers/cxl/cxl.h | 5 +++--
> drivers/cxl/port.c | 9 +++++----
> 3 files changed, 36 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index d5dcc94df0a5..5132c689b1f2 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3340,7 +3340,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> dev_name(&cxlr->dev), p->res, p->interleave_ways,
> p->interleave_granularity);
>
> - /* ...to match put_device() in cxl_add_to_region() */
> + /* ...to match put_device() in cxl_endpoint_add() */
> get_device(&cxlr->dev);
> up_write(&cxl_region_rwsem);
>
> @@ -3352,19 +3352,28 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> return ERR_PTR(rc);
> }
>
> -int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> +static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> {
> - struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_root_decoder *cxlrd;
> - struct cxl_region_params *p;
> - struct cxl_region *cxlr;
> - bool attach = false;
> - int rc;
>
> cxlrd = cxl_find_root_decoder(cxled);
> if (!cxlrd)
> return -ENXIO;
>
> + cxled->cxlrd = cxlrd;
> +
> + return 0;
> +}
> +
> +static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
> +{
> + struct range *hpa = &cxled->cxld.hpa_range;
> + struct cxl_root_decoder *cxlrd = cxled->cxlrd;
> + struct cxl_region_params *p;
> + struct cxl_region *cxlr;
> + bool attach = false;
> + int rc;
> +
> /*
> * Ensure that if multiple threads race to construct_region() for @hpa
> * one does the construction and the others add to that.
> @@ -3401,7 +3410,18 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
>
> return rc;
> }
> -EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, "CXL");
> +
> +int cxl_endpoint_register(struct cxl_endpoint_decoder *cxled)
> +{
> + int rc;
> +
> + rc = cxl_endpoint_initialize(cxled);
> + if (rc)
> + return rc;
> +
> + return cxl_endpoint_add(cxled);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_register, "CXL");
Hi Robert,
cxl_endpoint_initialize(), cxl_endpoint_add() and cxl_endpoint_register()
read like functions that operate on an endpoint, but I think they are
for enabling an endpoint decoder. Maybe rename them to
cxl_endpoint_decoder_initialize()/add()/register()?
Ming
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part
2025-01-09 1:08 ` Li Ming
@ 2025-01-09 10:30 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-09 10:30 UTC (permalink / raw)
To: Li Ming
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Gregory Price, Fabio M. De Francesco, Terry Bowman
Ming,
On 09.01.25 09:08:32, Li Ming wrote:
> On 1/7/2025 10:09 PM, Robert Richter wrote:
> > +int cxl_endpoint_register(struct cxl_endpoint_decoder *cxled)
> > +{
> > + int rc;
> > +
> > + rc = cxl_endpoint_initialize(cxled);
> > + if (rc)
> > + return rc;
> > +
> > + return cxl_endpoint_add(cxled);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_register, "CXL");
>
> Hi Robert,
> cxl_endpoint_initialize(), cxl_endpoint_add(),
> cxl_endpoint_register() feels like some functions related to an
> endpoint, but I think they are for an endpoint decoder enabling,
> maybe rename them to
> cxl_endpoint_decoder_initialize()/add()/register()?
Yes, this handles the endpoint decoder. I noticed that too but kept
the short naming. Will rename it. This then aligns with the other
existing cxl_endpoint_decoder_*() functions.
Thanks for review,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (7 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 08/29] cxl/region: Split region registration into an initialization and adding part Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 17:23 ` Gregory Price
2025-01-13 18:11 ` Jonathan Cameron
2025-01-07 14:09 ` [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range Robert Richter
` (20 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
The function cxl_find_root_decoder() uses find_cxl_root() to find the
root port. For the implementation of support of address translation an
iterator is needed that traverses all ports from the endpoint to the
root port.
Use the iterator in find_cxl_root() and unfold it into
cxl_find_root_decoder().
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5132c689b1f2..5750ed2796a8 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3206,13 +3206,18 @@ static struct cxl_root_decoder *
cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
- struct cxl_port *port = cxled_to_port(cxled);
- struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
+ struct cxl_port *iter = cxled_to_port(cxled);
struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
struct device *cxlrd_dev;
- cxlrd_dev = device_find_child(&cxl_root->port.dev, hpa,
+ while (iter && !is_cxl_root(iter))
+ iter = to_cxl_port(iter->dev.parent);
+
+ if (!iter)
+ return NULL;
+
+ cxlrd_dev = device_find_child(&iter->dev, hpa,
match_root_decoder_by_range);
if (!cxlrd_dev) {
dev_err(cxlmd->dev.parent,
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder()
2025-01-07 14:09 ` [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder() Robert Richter
@ 2025-01-07 17:23 ` Gregory Price
2025-01-13 18:11 ` Jonathan Cameron
1 sibling, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 17:23 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:55PM +0100, Robert Richter wrote:
> The function cxl_find_root_decoder() uses find_cxl_root() to find the
> root port. For the implementation of support of address translation an
> iterator is needed that traverses all ports from the endpoint to the
> root port.
>
> Use the iterator in find_cxl_root() and unfold it into
> cxl_find_root_decoder().
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
I wonder if we can expose this mapping in the sysfs hierarchy, because
I've always been frustrated by how confusing it is to tell which
decoders/endpoints relate to each other.
(I say this not looking forward in the series to see if you did exactly
this, just spitballing)
Reviewed-by: Gregory Price <gourry@gourry.net>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 5132c689b1f2..5750ed2796a8 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3206,13 +3206,18 @@ static struct cxl_root_decoder *
> cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct cxl_port *port = cxled_to_port(cxled);
> - struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> + struct cxl_port *iter = cxled_to_port(cxled);
> struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
> struct device *cxlrd_dev;
>
> - cxlrd_dev = device_find_child(&cxl_root->port.dev, hpa,
> + while (iter && !is_cxl_root(iter))
> + iter = to_cxl_port(iter->dev.parent);
> +
> + if (!iter)
> + return NULL;
> +
> + cxlrd_dev = device_find_child(&iter->dev, hpa,
> match_root_decoder_by_range);
> if (!cxlrd_dev) {
> dev_err(cxlmd->dev.parent,
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder()
2025-01-07 14:09 ` [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder() Robert Richter
2025-01-07 17:23 ` Gregory Price
@ 2025-01-13 18:11 ` Jonathan Cameron
1 sibling, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-13 18:11 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 15:09:55 +0100
Robert Richter <rrichter@amd.com> wrote:
> The function cxl_find_root_decoder() uses find_cxl_root() to find the
> root port. For the implementation of support of address translation an
> iterator is needed that traverses all ports from the endpoint to the
> root port.
>
> Use the iterator in find_cxl_root() and unfold it into
> cxl_find_root_decoder().
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
It is functionally the same. So I'll assume it makes sense later :)
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
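
The walk that the patch open-codes can be illustrated in isolation. The
following is a minimal user-space sketch, assuming a simplified struct
port with a plain parent pointer and an is_root flag; both are
hypothetical, since the kernel's struct cxl_port reaches its parent
through the embedded struct device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified port; not the kernel's struct cxl_port. */
struct port {
	struct port *parent;	/* NULL above the topmost port */
	bool is_root;
};

/* Walk from @iter towards the root; returns NULL if no root is found. */
static struct port *walk_to_root(struct port *iter)
{
	while (iter && !iter->is_root)
		iter = iter->parent;
	return iter;
}
```

Every port visited by the loop is a point where per-level work could be
added later, which a single helper call that returns only the root does
not offer.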
> ---
> drivers/cxl/core/region.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 5132c689b1f2..5750ed2796a8 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3206,13 +3206,18 @@ static struct cxl_root_decoder *
> cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct cxl_port *port = cxled_to_port(cxled);
> - struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
> + struct cxl_port *iter = cxled_to_port(cxled);
> struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
> struct device *cxlrd_dev;
>
> - cxlrd_dev = device_find_child(&cxl_root->port.dev, hpa,
> + while (iter && !is_cxl_root(iter))
> + iter = to_cxl_port(iter->dev.parent);
> +
> + if (!iter)
> + return NULL;
> +
> + cxlrd_dev = device_find_child(&iter->dev, hpa,
> match_root_decoder_by_range);
> if (!cxlrd_dev) {
> dev_err(cxlmd->dev.parent,
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (8 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 09/29] cxl/region: Use iterator to find the root port in cxl_find_root_decoder() Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 18:38 ` Gregory Price
2025-01-17 21:31 ` Ben Cheatham
2025-01-07 14:09 ` [PATCH v1 11/29] cxl/region: Unfold cxl_find_root_decoder() into cxl_endpoint_initialize() Robert Richter
` (19 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Factor out code to find the switch decoder of a port for a specific
address range. Reuse the code to search a root decoder, create the
function cxl_port_find_switch_decoder() and rework
match_root_decoder_by_range() to be usable for switch decoders too.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 43 +++++++++++++++++++++++----------------
1 file changed, 25 insertions(+), 18 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5750ed2796a8..48add814924b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3189,19 +3189,35 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
return rc;
}
-static int match_root_decoder_by_range(struct device *dev, void *data)
+static int match_decoder_by_range(struct device *dev, void *data)
{
struct range *r1, *r2 = data;
- struct cxl_root_decoder *cxlrd;
+ struct cxl_decoder *cxld;
- if (!is_root_decoder(dev))
+ if (!is_switch_decoder(dev))
return 0;
- cxlrd = to_cxl_root_decoder(dev);
- r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ cxld = to_cxl_decoder(dev);
+ r1 = &cxld->hpa_range;
return range_contains(r1, r2);
}
+static struct cxl_decoder *
+cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
+{
+ /*
+ * device_find_child() creates a reference to the root
+ * decoder. Since the root decoder exists as long as the root
+ * port exists and the endpoint already holds a reference to
+ * the root port, this additional reference is not needed.
+ * Free it here.
+ */
+ struct device *cxld_dev __free(put_device) =
+ device_find_child(&port->dev, hpa, match_decoder_by_range);
+
+ return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
+}
+
static struct cxl_root_decoder *
cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
{
@@ -3209,7 +3225,6 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
struct cxl_port *iter = cxled_to_port(cxled);
struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
- struct device *cxlrd_dev;
while (iter && !is_cxl_root(iter))
iter = to_cxl_port(iter->dev.parent);
@@ -3217,9 +3232,8 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
if (!iter)
return NULL;
- cxlrd_dev = device_find_child(&iter->dev, hpa,
- match_root_decoder_by_range);
- if (!cxlrd_dev) {
+ cxld = cxl_port_find_switch_decoder(iter, hpa);
+ if (!cxld) {
dev_err(cxlmd->dev.parent,
"%s:%s no CXL window for range %#llx:%#llx\n",
dev_name(&cxlmd->dev), dev_name(&cxld->dev),
@@ -3227,16 +3241,9 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
return NULL;
}
- /*
- * device_find_child() created a reference to the root
- * decoder. Since the root decoder exists as long as the root
- * port exists and the endpoint already holds a reference to
- * the root port, this additional reference is not needed.
- * Free it here.
- */
- put_device(cxlrd_dev);
- return to_cxl_root_decoder(cxlrd_dev);
+
+ return to_cxl_root_decoder(&cxld->dev);
}
static int match_region_by_range(struct device *dev, void *data)
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range
2025-01-07 14:09 ` [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range Robert Richter
@ 2025-01-07 18:38 ` Gregory Price
2025-01-30 16:58 ` Robert Richter
2025-01-17 21:31 ` Ben Cheatham
1 sibling, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 18:38 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:56PM +0100, Robert Richter wrote:
> Factor out code to find the switch decoder of a port for a specific
> address range. Reuse the code to search a root decoder, create the
> function cxl_port_find_switch_decoder() and rework
> match_root_decoder_by_range() to be usable for switch decoders too.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 43 +++++++++++++++++++++++----------------
> 1 file changed, 25 insertions(+), 18 deletions(-)
... snip ...
>
> - cxlrd_dev = device_find_child(&iter->dev, hpa,
> - match_root_decoder_by_range);
> - if (!cxlrd_dev) {
> + cxld = cxl_port_find_switch_decoder(iter, hpa);
> + if (!cxld) {
Are there scenarios where this would return a different decoder than
previously? For example, is there an assumption that root decoders
will be searched first, as opposed to intermediate decoders?
The match function was changed to check is_switch_decoder instead of
is_root_decoder. I'm just worried about the case where we might have
multiple decoders in the path and the switch decoder is hit first,
resulting in the wrong decoder being returned.
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range
2025-01-07 18:38 ` Gregory Price
@ 2025-01-30 16:58 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-30 16:58 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 01:38:33PM -0500, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:09:56PM +0100, Robert Richter wrote:
> > Factor out code to find the switch decoder of a port for a specific
> > address range. Reuse the code to search a root decoder, create the
> > function cxl_port_find_switch_decoder() and rework
> > match_root_decoder_by_range() to be usable for switch decoders too.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 43 +++++++++++++++++++++++----------------
> > 1 file changed, 25 insertions(+), 18 deletions(-)
> ... snip ...
> >
> > - cxlrd_dev = device_find_child(&iter->dev, hpa,
> > - match_root_decoder_by_range);
> > - if (!cxlrd_dev) {
> > + cxld = cxl_port_find_switch_decoder(iter, hpa);
> > + if (!cxld) {
>
> Are there scenarios where this would return a different decoder than
> previously? For example, is there an assumption that root decoders
> will be searched first, as opposed to intermediate decoders?
>
> The match function was changed to check is_switch_decoder instead of
> is_root_decoder. I'm just worried about the case where we might have
> multiple decoders in the path and the switch decoder is hit first,
> resulting in the wrong decoder being returned.
Intermediate decoders never share the port with a root decoder, both
decoders always have different parents. So depending on the direction
of walking the tree, the same port is always found first. That is,
starting at the endpoint, an intermediate switch decoder would be
found first. Search is always deterministic.
Note the "root_decoder" is a subset of "switch_decoder" here.
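
To make the point about the per-port search concrete, here is a small
user-space sketch of a range match over the children of a single port.
The enum, the struct decoder layout and the array-based child list are
hypothetical simplifications; the range_contains() body is written to
match the kernel helper's semantics (r1 fully covers r2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct range { unsigned long long start, end; };	/* inclusive bounds */

/* r1 fully covers r2. */
static bool range_contains(const struct range *r1, const struct range *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/* Hypothetical type tag; the kernel distinguishes decoders by device type. */
enum decoder_type { DECODER_ROOT, DECODER_SWITCH, DECODER_ENDPOINT };

struct decoder {
	enum decoder_type type;
	struct range hpa_range;
};

/* Root decoders count as switch decoders, as noted in the reply above. */
static bool is_switch_decoder(const struct decoder *d)
{
	return d->type == DECODER_ROOT || d->type == DECODER_SWITCH;
}

/* Return the first switch decoder of one port whose range covers @hpa. */
static struct decoder *port_find_switch_decoder(struct decoder *children,
						size_t n,
						const struct range *hpa)
{
	assert(hpa != NULL);
	for (size_t i = 0; i < n; i++) {
		if (is_switch_decoder(&children[i]) &&
		    range_contains(&children[i].hpa_range, hpa))
			return &children[i];
	}
	return NULL;
}
```

Because the search only ever looks at the children of the one port it is
given, which decoder is returned depends on the port passed in and not on
any ordering between root and intermediate decoders.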
-Robert
>
> ~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range
2025-01-07 14:09 ` [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range Robert Richter
2025-01-07 18:38 ` Gregory Price
@ 2025-01-17 21:31 ` Ben Cheatham
2025-01-30 17:02 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Ben Cheatham @ 2025-01-17 21:31 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 1/7/25 8:09 AM, Robert Richter wrote:
> Factor out code to find the switch decoder of a port for a specific
> address range. Reuse the code to search a root decoder, create the
> function cxl_port_find_switch_decoder() and rework
> match_root_decoder_by_range() to be usable for switch decoders too.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 43 +++++++++++++++++++++++----------------
> 1 file changed, 25 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 5750ed2796a8..48add814924b 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3189,19 +3189,35 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> return rc;
> }
>
> -static int match_root_decoder_by_range(struct device *dev, void *data)
> +static int match_decoder_by_range(struct device *dev, void *data)
> {
> struct range *r1, *r2 = data;
> - struct cxl_root_decoder *cxlrd;
> + struct cxl_decoder *cxld;
>
> - if (!is_root_decoder(dev))
> + if (!is_switch_decoder(dev))
> return 0;
>
> - cxlrd = to_cxl_root_decoder(dev);
> - r1 = &cxlrd->cxlsd.cxld.hpa_range;
> + cxld = to_cxl_decoder(dev);
> + r1 = &cxld->hpa_range;
> return range_contains(r1, r2);
> }
>
> +static struct cxl_decoder *
> +cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> +{
> + /*
> + * device_find_child() creates a reference to the root
> + * decoder. Since the root decoder exists as long as the root
> + * port exists and the endpoint already holds a reference to
> + * the root port, this additional reference is not needed.
> + * Free it here.
> + */
Is this comment still true? I haven't read the rest of the series yet, but there's
nothing enforcing that this function is called on a root port. If it's meant to
only be used for root ports then it should probably be named that way.
Also, if it is meant to be used for a general switch decoder, can we always free
the reference? If so then all that needs to happen is a comment update, otherwise
you'll need to keep the reference and put a comment somewhere that the function
needs a matching put_device().
> + struct device *cxld_dev __free(put_device) =
> + device_find_child(&port->dev, hpa, match_decoder_by_range);
> +
> + return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
> +}
> +
> static struct cxl_root_decoder *
> cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> {
> @@ -3209,7 +3225,6 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> struct cxl_port *iter = cxled_to_port(cxled);
> struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
> - struct device *cxlrd_dev;
>
> while (iter && !is_cxl_root(iter))
> iter = to_cxl_port(iter->dev.parent);
> @@ -3217,9 +3232,8 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> if (!iter)
> return NULL;
>
> - cxlrd_dev = device_find_child(&iter->dev, hpa,
> - match_root_decoder_by_range);
> - if (!cxlrd_dev) {
> + cxld = cxl_port_find_switch_decoder(iter, hpa);
> + if (!cxld) {
> dev_err(cxlmd->dev.parent,
> "%s:%s no CXL window for range %#llx:%#llx\n",
> dev_name(&cxlmd->dev), dev_name(&cxld->dev),
> @@ -3227,16 +3241,9 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
> return NULL;
> }
>
> - /*
> - * device_find_child() created a reference to the root
> - * decoder. Since the root decoder exists as long as the root
> - * port exists and the endpoint already holds a reference to
> - * the root port, this additional reference is not needed.
> - * Free it here.
> - */
> - put_device(cxlrd_dev);
>
> - return to_cxl_root_decoder(cxlrd_dev);
> +
> + return to_cxl_root_decoder(&cxld->dev);
> }
>
> static int match_region_by_range(struct device *dev, void *data)
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range
2025-01-17 21:31 ` Ben Cheatham
@ 2025-01-30 17:02 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-30 17:02 UTC (permalink / raw)
To: Ben Cheatham
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On Fri, Jan 17, 2025 at 03:31:34PM -0600, Ben Cheatham wrote:
> On 1/7/25 8:09 AM, Robert Richter wrote:
> > Factor out code to find the switch decoder of a port for a specific
> > address range. Reuse the code to search a root decoder, create the
> > function cxl_port_find_switch_decoder() and rework
> > match_root_decoder_by_range() to be usable for switch decoders too.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 43 +++++++++++++++++++++++----------------
> > 1 file changed, 25 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 5750ed2796a8..48add814924b 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -3189,19 +3189,35 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> > return rc;
> > }
> >
> > -static int match_root_decoder_by_range(struct device *dev, void *data)
> > +static int match_decoder_by_range(struct device *dev, void *data)
> > {
> > struct range *r1, *r2 = data;
> > - struct cxl_root_decoder *cxlrd;
> > + struct cxl_decoder *cxld;
> >
> > - if (!is_root_decoder(dev))
> > + if (!is_switch_decoder(dev))
> > return 0;
> >
> > - cxlrd = to_cxl_root_decoder(dev);
> > - r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > + cxld = to_cxl_decoder(dev);
> > + r1 = &cxld->hpa_range;
> > return range_contains(r1, r2);
> > }
> >
> > +static struct cxl_decoder *
> > +cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> > +{
> > + /*
> > + * device_find_child() creates a reference to the root
> > + * decoder. Since the root decoder exists as long as the root
> > + * port exists and the endpoint already holds a reference to
> > + * the root port, this additional reference is not needed.
> > + * Free it here.
> > + */
>
> Is this comment still true? I haven't read the rest of the series yet, but there's
> nothing enforcing that this function is called on a root port. If it's meant to
> only be used for root ports then it should probably be named that way.
>
> Also, if it is meant to be used for a general switch decoder, can we always free
> the reference? If so then all that needs to happen is a comment update, otherwise
> you'll need to keep the reference and put a comment somewhere that the function
> needs a matching put_device().
In general, the assumption is true and all ports in the hierarchy
exist as long as the endpoint exists. The reference can be freed. I
have updated the comment:
/*
* device_find_child() increments the reference count of the
* switch decoder's parent port to protect the reference
* to its child. The port is already a parent of the endpoint
* decoder's port, at least indirectly in the port hierarchy.
* Thus, the endpoint already holds a reference for the parent
* port of the switch decoder. Free the unnecessary reference
* here.
*/
Thanks for catching this.
-Robert
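The reference handling discussed above can be modelled in user space. This is a minimal sketch, not kernel code: `toy_device`, `toy_get()`, `toy_put()`, `toy_find_child()` and `toy_find_decoder()` are hypothetical stand-ins invented for illustration. It assumes the lookup returns its match with one extra reference, and that the caller may drop that reference at once because a longer-lived holder (here the initial count of 1) already pins the object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device; not the kernel type. */
struct toy_device {
	int refcount;
	unsigned long long start, end;	/* address range decoded by this child */
};

static void toy_get(struct toy_device *dev)
{
	dev->refcount++;
}

static void toy_put(struct toy_device *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Return the first child whose range contains [start, end], with one
 * extra reference taken on behalf of the caller, or NULL if none match.
 */
static struct toy_device *toy_find_child(struct toy_device *children, size_t n,
					 unsigned long long start,
					 unsigned long long end)
{
	for (size_t i = 0; i < n; i++) {
		if (children[i].start <= start && end <= children[i].end) {
			toy_get(&children[i]);
			return &children[i];
		}
	}
	return NULL;
}

/*
 * Look up a child and drop the lookup reference immediately. This is
 * only safe under the stated assumption that someone else keeps the
 * child alive for as long as the returned pointer is used.
 */
static struct toy_device *toy_find_decoder(struct toy_device *children, size_t n,
					   unsigned long long start,
					   unsigned long long end)
{
	struct toy_device *dev = toy_find_child(children, n, start, end);

	if (dev)
		toy_put(dev);
	return dev;
}
```

If the lifetime assumption does not hold for some caller, the alternative raised in the review applies: keep the reference and document that a matching put is required.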
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 11/29] cxl/region: Unfold cxl_find_root_decoder() into cxl_endpoint_initialize()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (9 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 10/29] cxl/region: Add function to find a port's switch decoder by range Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 18:41 ` Gregory Price
2025-01-07 14:09 ` [PATCH v1 12/29] cxl: Modify address translation callback for generic use Robert Richter
` (18 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
To determine other endpoint parameters such as interleaving parameters
during endpoint initialization, the iterator function in
cxl_find_root_decoder() can be used. Unfold this function into
cxl_endpoint_initialize() and make the iterator available there.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 24 +++++-------------------
1 file changed, 5 insertions(+), 19 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 48add814924b..6fcf56806606 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3218,8 +3218,7 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
}
-static struct cxl_root_decoder *
-cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
+static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
struct cxl_port *iter = cxled_to_port(cxled);
@@ -3230,7 +3229,7 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
iter = to_cxl_port(iter->dev.parent);
if (!iter)
- return NULL;
+ return -ENXIO;
cxld = cxl_port_find_switch_decoder(iter, hpa);
if (!cxld) {
@@ -3238,12 +3237,12 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
"%s:%s no CXL window for range %#llx:%#llx\n",
dev_name(&cxlmd->dev), dev_name(&cxld->dev),
cxld->hpa_range.start, cxld->hpa_range.end);
- return NULL;
+ return -ENXIO;
}
+ cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
-
- return to_cxl_root_decoder(&cxld->dev);
+ return 0;
}
static int match_region_by_range(struct device *dev, void *data)
@@ -3364,19 +3363,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
return ERR_PTR(rc);
}
-static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
-{
- struct cxl_root_decoder *cxlrd;
-
- cxlrd = cxl_find_root_decoder(cxled);
- if (!cxlrd)
- return -ENXIO;
-
- cxled->cxlrd = cxlrd;
-
- return 0;
-}
-
static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
{
struct range *hpa = &cxled->cxld.hpa_range;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 11/29] cxl/region: Unfold cxl_find_root_decoder() into cxl_endpoint_initialize()
2025-01-07 14:09 ` [PATCH v1 11/29] cxl/region: Unfold cxl_find_root_decoder() into cxl_endpoint_initialize() Robert Richter
@ 2025-01-07 18:41 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 18:41 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:57PM +0100, Robert Richter wrote:
> To determine other endpoint parameters such as interleaving parameters
> during endpoint initialization, the iterator function in
> cxl_find_root_decoder() can be used. Unfold this function into
> cxl_endpoint_initialize() and make the iterator available there.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
This makes sense in the context of the prior patch, so long as the
prior patch holds.
Reviewed-by: Gregory Price <gourry@gourry.net>
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 12/29] cxl: Modify address translation callback for generic use
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (10 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 11/29] cxl/region: Unfold cxl_find_root_decoder() into cxl_endpoint_initialize() Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 18:44 ` Gregory Price
2025-01-17 21:31 ` Ben Cheatham
2025-01-07 14:09 ` [PATCH v1 13/29] cxl: Introduce callback to translate an HPA range from a port to its parent Robert Richter
` (17 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
The root decoder address translation callback could be reused for
other decoders too. For generic use of the callback, change the
function interface to use a decoder argument instead of the root
decoder.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/acpi.c | 4 ++--
drivers/cxl/core/region.c | 2 +-
drivers/cxl/cxl.h | 5 ++---
3 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index cb14829bb9be..b42cffd6751f 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -22,9 +22,9 @@ static const guid_t acpi_cxl_qtg_id_guid =
GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
-
-static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
+static u64 cxl_xor_hpa_to_spa(struct cxl_decoder *cxld, u64 hpa)
{
+ struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(&cxld->dev);
struct cxl_cxims_data *cximsd = cxlrd->platform_data;
int hbiw = cxlrd->cxlsd.nr_targets;
u64 val;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6fcf56806606..9443507ed4e1 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2925,7 +2925,7 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
/* Root decoder translation overrides typical modulo decode */
if (cxlrd->hpa_to_spa)
- hpa = cxlrd->hpa_to_spa(cxlrd, hpa);
+ hpa = cxlrd->hpa_to_spa(&cxlrd->cxlsd.cxld, hpa);
if (hpa < p->res->start || hpa > p->res->end) {
dev_dbg(&cxlr->dev,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b3989dc58ed1..be7685fe8a23 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -443,8 +443,7 @@ struct cxl_switch_decoder {
struct cxl_dport *target[];
};
-struct cxl_root_decoder;
-typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
+typedef u64 (*cxl_to_hpa_fn)(struct cxl_decoder *cxld, u64 hpa);
/**
* struct cxl_root_decoder - Static platform CXL address decoder
@@ -459,7 +458,7 @@ typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
struct cxl_root_decoder {
struct resource *res;
atomic_t region_id;
- cxl_hpa_to_spa_fn hpa_to_spa;
+ cxl_to_hpa_fn hpa_to_spa;
void *platform_data;
struct mutex range_lock;
int qos_class;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 12/29] cxl: Modify address translation callback for generic use
2025-01-07 14:09 ` [PATCH v1 12/29] cxl: Modify address translation callback for generic use Robert Richter
@ 2025-01-07 18:44 ` Gregory Price
2025-01-31 14:19 ` Robert Richter
2025-01-17 21:31 ` Ben Cheatham
1 sibling, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 18:44 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:58PM +0100, Robert Richter wrote:
> The root decoder address translation callback could be reused for
> other decoders too. For generic use of the callback, change the
> function interface to use a decoder argument instead of the root
> decoder.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/acpi.c | 4 ++--
> drivers/cxl/core/region.c | 2 +-
> drivers/cxl/cxl.h | 5 ++---
> 3 files changed, 5 insertions(+), 6 deletions(-)
>
... snip ...
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b3989dc58ed1..be7685fe8a23 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -443,8 +443,7 @@ struct cxl_switch_decoder {
> struct cxl_dport *target[];
> };
>
> -struct cxl_root_decoder;
> -typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
> +typedef u64 (*cxl_to_hpa_fn)(struct cxl_decoder *cxld, u64 hpa);
>
changed from _to_spa to _to_hpa? Was the name wrong previously? Maybe at
least comment on this in the changelog.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 12/29] cxl: Modify address translation callback for generic use
2025-01-07 18:44 ` Gregory Price
@ 2025-01-31 14:19 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-31 14:19 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 01:44:56PM -0500, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:09:58PM +0100, Robert Richter wrote:
> > The root decoder address translation callback could be reused for
> > other decoders too. For generic use of the callback, change the
> > function interface to use a decoder argument instead of the root
> > decoder.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/acpi.c | 4 ++--
> > drivers/cxl/core/region.c | 2 +-
> > drivers/cxl/cxl.h | 5 ++---
> > 3 files changed, 5 insertions(+), 6 deletions(-)
> >
> ... snip ...
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index b3989dc58ed1..be7685fe8a23 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -443,8 +443,7 @@ struct cxl_switch_decoder {
> > struct cxl_dport *target[];
> > };
> >
> > -struct cxl_root_decoder;
> > -typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
> > +typedef u64 (*cxl_to_hpa_fn)(struct cxl_decoder *cxld, u64 hpa);
> >
>
> changed from _to_spa to _to_hpa? Was the name wrong previously? Maybe at
> least comment on this in the changelog.
A root decoder's HPA is equal to its SPA, but for other decoders the
two may differ. Thus, I changed the name of the function type to
cxl_to_hpa_fn.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 12/29] cxl: Modify address translation callback for generic use
2025-01-07 14:09 ` [PATCH v1 12/29] cxl: Modify address translation callback for generic use Robert Richter
2025-01-07 18:44 ` Gregory Price
@ 2025-01-17 21:31 ` Ben Cheatham
2025-01-31 14:27 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Ben Cheatham @ 2025-01-17 21:31 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 1/7/25 8:09 AM, Robert Richter wrote:
> The root decoder address translation callback could be reused for
> other decoders too. For generic use of the callback, change the
> function interface to use a decoder argument instead of the root
> decoder.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/acpi.c | 4 ++--
> drivers/cxl/core/region.c | 2 +-
> drivers/cxl/cxl.h | 5 ++---
> 3 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index cb14829bb9be..b42cffd6751f 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -22,9 +22,9 @@ static const guid_t acpi_cxl_qtg_id_guid =
> GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
> 0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
>
> -
> -static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
> +static u64 cxl_xor_hpa_to_spa(struct cxl_decoder *cxld, u64 hpa)
> {
> + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(&cxld->dev);
I think this needs to use cxl_find_root_decoder() instead of to_cxl_root_decoder() since the
cxld is no longer guaranteed to be a root decoder (as per commit message).
> struct cxl_cxims_data *cximsd = cxlrd->platform_data;
> int hbiw = cxlrd->cxlsd.nr_targets;
> u64 val;
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6fcf56806606..9443507ed4e1 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2925,7 +2925,7 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
>
> /* Root decoder translation overrides typical modulo decode */
> if (cxlrd->hpa_to_spa)
> - hpa = cxlrd->hpa_to_spa(cxlrd, hpa);
> + hpa = cxlrd->hpa_to_spa(&cxlrd->cxlsd.cxld, hpa);
>
> if (hpa < p->res->start || hpa > p->res->end) {
> dev_dbg(&cxlr->dev,
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b3989dc58ed1..be7685fe8a23 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -443,8 +443,7 @@ struct cxl_switch_decoder {
> struct cxl_dport *target[];
> };
>
> -struct cxl_root_decoder;
> -typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
> +typedef u64 (*cxl_to_hpa_fn)(struct cxl_decoder *cxld, u64 hpa);
>
> /**
> * struct cxl_root_decoder - Static platform CXL address decoder
> @@ -459,7 +458,7 @@ typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
> struct cxl_root_decoder {
> struct resource *res;
> atomic_t region_id;
> - cxl_hpa_to_spa_fn hpa_to_spa;
> + cxl_to_hpa_fn hpa_to_spa;
> void *platform_data;
> struct mutex range_lock;
> int qos_class;
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 12/29] cxl: Modify address translation callback for generic use
2025-01-17 21:31 ` Ben Cheatham
@ 2025-01-31 14:27 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-31 14:27 UTC (permalink / raw)
To: Ben Cheatham
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On Fri, Jan 17, 2025 at 03:31:40PM -0600, Ben Cheatham wrote:
> On 1/7/25 8:09 AM, Robert Richter wrote:
> > The root decoder address translation callback could be reused for
> > other decoders too. For generic use of the callback, change the
> > function interface to use a decoder argument instead of the root
> > decoder.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/acpi.c | 4 ++--
> > drivers/cxl/core/region.c | 2 +-
> > drivers/cxl/cxl.h | 5 ++---
> > 3 files changed, 5 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index cb14829bb9be..b42cffd6751f 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -22,9 +22,9 @@ static const guid_t acpi_cxl_qtg_id_guid =
> > GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
> > 0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
> >
> > -
> > -static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
> > +static u64 cxl_xor_hpa_to_spa(struct cxl_decoder *cxld, u64 hpa)
> > {
> > + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(&cxld->dev);
>
> I think this needs to use cxl_find_root_decoder() instead of to_cxl_root_decoder() since the
> cxld is no longer guaranteed to be a root decoder (as per commit message).
cxl_xor_hpa_to_spa() is expected to be used for cxlrd->hpa_to_spa, so
the decoder arg is always a root decoder:
cxlrd->hpa_to_spa(&cxlrd->cxlsd.cxld, hpa);
The decoder must be converted back to a root decoder in that function
which is what to_cxl_root_decoder() does.
Looks correct to me.
-Robert
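The conversion described above, where a callback receives the generic decoder and recovers the enclosing root decoder, can be sketched in user space. All names here (`toy_decoder`, `toy_root_decoder`, `toy_container_of`, `toy_xor_hpa_to_spa`, `xor_mask`) are hypothetical, and the single XOR is only a placeholder for the real translation math. The sketch assumes, as in the reply, that the callback is only ever invoked with a decoder embedded in a root decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space version of the container_of idea; the kernel has its own macro. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic base object that every decoder flavour embeds. */
struct toy_decoder {
	int id;
};

/* Generic callback type: takes the base decoder, not the root decoder. */
typedef uint64_t (*toy_to_hpa_fn)(struct toy_decoder *cxld, uint64_t hpa);

struct toy_root_decoder {
	uint64_t xor_mask;		/* hypothetical platform data */
	toy_to_hpa_fn hpa_to_spa;
	struct toy_decoder cxld;	/* embedded base object */
};

/*
 * Only valid when cxld is embedded in a toy_root_decoder. The caller
 * guarantees this by always passing &rd->cxld.
 */
static uint64_t toy_xor_hpa_to_spa(struct toy_decoder *cxld, uint64_t hpa)
{
	struct toy_root_decoder *cxlrd =
		toy_container_of(cxld, struct toy_root_decoder, cxld);

	return hpa ^ cxlrd->xor_mask;	/* placeholder translation */
}
```

A caller would invoke it as `rd.hpa_to_spa(&rd.cxld, hpa)`, which mirrors the call form quoted in the reply.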
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 13/29] cxl: Introduce callback to translate an HPA range from a port to its parent
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (11 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 12/29] cxl: Modify address translation callback for generic use Robert Richter
@ 2025-01-07 14:09 ` Robert Richter
2025-01-07 18:47 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 14/29] cxl: Introduce parent_port_of() helper Robert Richter
` (16 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:09 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
To enable address translation, the endpoint's HPA range must be
translated to each of the parent port's address ranges up to the root
decoder. Traverse the decoder and port hierarchy from the endpoint up
to the root port and apply platform specific translation functions to
determine the next HPA range of the parent port where needed:
if (cxl_port->to_hpa)
hpa = cxl_port->to_hpa(cxl_decoder, hpa)
The root port's HPA range is equivalent to the system's SPA range.
Introduce a callback to translate an HPA range from a port to its
parent.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/cxl.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index be7685fe8a23..49280e0f8840 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -601,6 +601,7 @@ struct cxl_dax_region {
* @parent_dport: dport that points to this port in the parent
* @decoder_ida: allocator for decoder ids
* @reg_map: component and ras register mapping parameters
+ * @to_hpa: Callback to translate a child port's decoder address to the port's HPA address range
* @nr_dports: number of entries in @dports
* @hdm_end: track last allocated HDM decoder instance for allocation ordering
* @commit_end: cursor to track highest committed decoder for commit ordering
@@ -621,6 +622,7 @@ struct cxl_port {
struct cxl_dport *parent_dport;
struct ida decoder_ida;
struct cxl_register_map reg_map;
+ cxl_to_hpa_fn to_hpa;
int nr_dports;
int hdm_end;
int commit_end;
--
2.39.5
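The traversal described in the commit message can be illustrated with a small user-space model. This is a sketch under simplifying assumptions, not the series' implementation: `toy_port`, `toy_add_offset()` and `toy_translate_to_root()` are hypothetical, the callback takes the port rather than a decoder, and a fixed offset stands in for a platform-specific translation. Ports without a callback leave the address unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_port;

/* Optional per-port hook translating a child address into this port's domain. */
typedef uint64_t (*toy_to_hpa_fn)(const struct toy_port *port, uint64_t hpa);

struct toy_port {
	struct toy_port *parent;	/* NULL for the root */
	toy_to_hpa_fn to_hpa;		/* NULL if no translation is needed */
	uint64_t offset;		/* hypothetical translation data */
};

/* Placeholder translation: shift the address by a fixed per-port offset. */
static uint64_t toy_add_offset(const struct toy_port *port, uint64_t hpa)
{
	return hpa + port->offset;
}

/*
 * Walk from the endpoint's parent up to the root and apply each port's
 * translation hook where one is set. The value returned is the address
 * as seen at the root, which the commit message equates with the SPA.
 */
static uint64_t toy_translate_to_root(const struct toy_port *endpoint,
				      uint64_t hpa)
{
	const struct toy_port *parent;

	assert(endpoint);
	for (parent = endpoint->parent; parent; parent = parent->parent) {
		if (parent->to_hpa)
			hpa = parent->to_hpa(parent, hpa);
	}
	return hpa;
}
```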
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 13/29] cxl: Introduce callback to translate an HPA range from a port to its parent
2025-01-07 14:09 ` [PATCH v1 13/29] cxl: Introduce callback to translate an HPA range from a port to its parent Robert Richter
@ 2025-01-07 18:47 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 18:47 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:59PM +0100, Robert Richter wrote:
> To enable address translation, the endpoint's HPA range must be
> translated to each of the parent port's address ranges up to the root
> decoder. Traverse the decoder and port hierarchy from the endpoint up
> to the root port and apply platform specific translation functions to
> determine the next HPA range of the parent port where needed:
>
> if (cxl_port->to_hpa)
> hpa = cxl_port->to_hpa(cxl_decoder, hpa)
>
> The root port's HPA range is equivalent to the system's SPA range.
>
> Introduce a callback to translate an HPA range from a port to its
> parent.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Assuming prior patch holds
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/cxl.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index be7685fe8a23..49280e0f8840 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -601,6 +601,7 @@ struct cxl_dax_region {
> * @parent_dport: dport that points to this port in the parent
> * @decoder_ida: allocator for decoder ids
> * @reg_map: component and ras register mapping parameters
> + * @to_hpa: Callback to translate a child port's decoder address to the port's HPA address range
> * @nr_dports: number of entries in @dports
> * @hdm_end: track last allocated HDM decoder instance for allocation ordering
> * @commit_end: cursor to track highest committed decoder for commit ordering
> @@ -621,6 +622,7 @@ struct cxl_port {
> struct cxl_dport *parent_dport;
> struct ida decoder_ida;
> struct cxl_register_map reg_map;
> + cxl_to_hpa_fn to_hpa;
> int nr_dports;
> int hdm_end;
> int commit_end;
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 14/29] cxl: Introduce parent_port_of() helper
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (12 preceding siblings ...)
2025-01-07 14:09 ` [PATCH v1 13/29] cxl: Introduce callback to translate an HPA range from a port to its parent Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 18:50 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region Robert Richter
` (15 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Often a parent port must be determined. Introduce the parent_port_of()
helper function for this.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/port.c | 15 +++++++++------
drivers/cxl/core/region.c | 11 ++---------
drivers/cxl/cxl.h | 1 +
3 files changed, 12 insertions(+), 15 deletions(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 78a5c2c25982..901555bf4b73 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -596,17 +596,20 @@ struct cxl_port *to_cxl_port(const struct device *dev)
}
EXPORT_SYMBOL_NS_GPL(to_cxl_port, "CXL");
+struct cxl_port *parent_port_of(struct cxl_port *port)
+{
+ if (!port || !port->parent_dport)
+ return NULL;
+ return port->parent_dport->port;
+}
+EXPORT_SYMBOL_NS_GPL(parent_port_of, "CXL");
+
static void unregister_port(void *_port)
{
struct cxl_port *port = _port;
- struct cxl_port *parent;
+ struct cxl_port *parent = parent_port_of(port);
struct device *lock_dev;
- if (is_cxl_root(port))
- parent = NULL;
- else
- parent = to_cxl_port(port->dev.parent);
-
/*
* CXL root port's and the first level of ports are unregistered
* under the platform firmware device lock, all other ports are
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 9443507ed4e1..09a68e266a79 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1726,13 +1726,6 @@ static int cmp_interleave_pos(const void *a, const void *b)
return cxled_a->pos - cxled_b->pos;
}
-static struct cxl_port *next_port(struct cxl_port *port)
-{
- if (!port->parent_dport)
- return NULL;
- return port->parent_dport->port;
-}
-
static int match_switch_decoder_by_range(struct device *dev, void *data)
{
struct cxl_switch_decoder *cxlsd;
@@ -1757,7 +1750,7 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
struct device *dev;
int rc = -ENXIO;
- parent = next_port(port);
+ parent = parent_port_of(port);
if (!parent)
return rc;
@@ -1837,7 +1830,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
*/
/* Iterate from endpoint to root_port refining the position */
- for (iter = port; iter; iter = next_port(iter)) {
+ for (iter = port; iter; iter = parent_port_of(iter)) {
if (is_cxl_root(iter))
break;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 49280e0f8840..c04f66fe2a93 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -741,6 +741,7 @@ static inline bool is_cxl_root(struct cxl_port *port)
int cxl_num_decoders_committed(struct cxl_port *port);
bool is_cxl_port(const struct device *dev);
struct cxl_port *to_cxl_port(const struct device *dev);
+struct cxl_port *parent_port_of(struct cxl_port *port);
void cxl_port_commit_reap(struct cxl_decoder *cxld);
struct pci_bus;
int devm_cxl_register_pci_bus(struct device *host, struct device *uport_dev,
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 14/29] cxl: Introduce parent_port_of() helper
2025-01-07 14:10 ` [PATCH v1 14/29] cxl: Introduce parent_port_of() helper Robert Richter
@ 2025-01-07 18:50 ` Gregory Price
2025-01-13 18:20 ` Jonathan Cameron
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 18:50 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:00PM +0100, Robert Richter wrote:
> Often a parent port must be determined. Introduce the parent_port_of()
> helper function for this.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
This is a welcome cleanup for readability, wonder if it should be pulled
ahead to reduce the size of this changeset.
Reviewed-by: Gregory Price <gourry@gourry.net>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 14/29] cxl: Introduce parent_port_of() helper
2025-01-07 18:50 ` Gregory Price
@ 2025-01-13 18:20 ` Jonathan Cameron
0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-13 18:20 UTC (permalink / raw)
To: Gregory Price
Cc: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 13:50:05 -0500
Gregory Price <gourry@gourry.net> wrote:
> On Tue, Jan 07, 2025 at 03:10:00PM +0100, Robert Richter wrote:
> > Often a parent port must be determined. Introduce the parent_port_of()
> > helper function for this.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
>
> This is a welcome cleanup for readability, wonder if it should be pulled
> ahead to reduce the size of this changeset.
>
> Reviewed-by: Gregory Price <gourry@gourry.net>
Agreed.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (13 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 14/29] cxl: Introduce parent_port_of() helper Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 19:14 ` Gregory Price
` (2 more replies)
2025-01-07 14:10 ` [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position Robert Richter
` (14 subsequent siblings)
29 siblings, 3 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
To find the correct region and root port of an endpoint of a system
needing address translation, the endpoint's HPA range must be
translated to each of the parent port address ranges up to the root
decoder.
Calculate the SPA range using the newly introduced callback function
port->to_hpa() that translates the decoder's HPA range to its parent
port's HPA range of the next outer memory domain. Introduce the helper
function cxl_port_calc_hpa() for this to calculate address ranges
using the low-level port->to_hpa() callbacks. Determine the root port
SPA range by iterating all the ports up to the root. Store the
endpoint's SPA range and use it to find the endpoint's region.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
drivers/cxl/cxl.h | 1 +
2 files changed, 71 insertions(+), 15 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 09a68e266a79..007a2016760d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -824,6 +824,41 @@ static int match_free_decoder(struct device *dev, void *data)
return 1;
}
+static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
+ struct range *hpa_range)
+{
+ struct range hpa = *hpa_range;
+ u64 len = range_len(&hpa);
+
+ if (!port->to_hpa)
+ return 0;
+
+ /* Translate HPA to the next upper domain. */
+ hpa.start = port->to_hpa(cxld, hpa.start);
+ hpa.end = port->to_hpa(cxld, hpa.end);
+
+ if (!hpa.start || !hpa.end ||
+ hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
+ dev_warn(&port->dev,
+ "CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
+ hpa.start, hpa.end, hpa_range->start,
+ hpa_range->end, dev_name(&cxld->dev));
+ return -ENXIO;
+ }
+
+ if (range_len(&hpa) != len * cxld->interleave_ways) {
+ dev_warn(&port->dev,
+ "CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
+ hpa.start, hpa.end, hpa_range->start,
+ hpa_range->end, dev_name(&cxld->dev));
+ return -ENXIO;
+ }
+
+ *hpa_range = hpa;
+
+ return 0;
+}
+
static int match_auto_decoder(struct device *dev, void *data)
{
struct cxl_region_params *p = data;
@@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
- struct cxl_port *iter = cxled_to_port(cxled);
- struct range *hpa = &cxled->cxld.hpa_range;
+ struct cxl_port *parent, *iter = cxled_to_port(cxled);
+ struct range hpa = cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
- while (iter && !is_cxl_root(iter))
- iter = to_cxl_port(iter->dev.parent);
-
- if (!iter)
+ if (!iter || is_cxl_root(iter))
return -ENXIO;
- cxld = cxl_port_find_switch_decoder(iter, hpa);
- if (!cxld) {
- dev_err(cxlmd->dev.parent,
- "%s:%s no CXL window for range %#llx:%#llx\n",
- dev_name(&cxlmd->dev), dev_name(&cxld->dev),
- cxld->hpa_range.start, cxld->hpa_range.end);
- return -ENXIO;
+ while (1) {
+ parent = parent_port_of(iter);
+
+ if (is_cxl_endpoint(iter))
+ cxld = &cxled->cxld;
+ else if (!parent || parent->to_hpa)
+ cxld = cxl_port_find_switch_decoder(iter, &hpa);
+
+ if (!cxld) {
+ dev_err(cxlmd->dev.parent,
+ "%s:%s no CXL window for range %#llx:%#llx\n",
+ dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
+ hpa.start, hpa.end);
+ return -ENXIO;
+ }
+
+ /* No parent means the root port was found. */
+ if (!parent)
+ break;
+
+ /* Translate HPA to the next upper memory domain. */
+ if (cxl_port_calc_hpa(parent, cxld, &hpa))
+ return -ENXIO;
+
+ iter = parent;
}
+ dev_dbg(cxld->dev.parent,
+ "%s:%s: range:%#llx-%#llx\n",
+ dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
+ hpa.start, hpa.end);
+
cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
+ cxled->spa_range = hpa;
return 0;
}
@@ -3358,7 +3414,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
{
- struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_root_decoder *cxlrd = cxled->cxlrd;
struct cxl_region_params *p;
struct cxl_region *cxlr;
@@ -3370,7 +3425,7 @@ static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
* one does the construction and the others add to that.
*/
mutex_lock(&cxlrd->range_lock);
- cxlr = cxl_find_region_by_range(cxlrd, hpa);
+ cxlr = cxl_find_region_by_range(cxlrd, &cxled->spa_range);
if (!cxlr)
cxlr = construct_region(cxlrd, cxled);
mutex_unlock(&cxlrd->range_lock);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index c04f66fe2a93..4ccb2b3b31c9 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -419,6 +419,7 @@ struct cxl_endpoint_decoder {
struct cxl_decoder cxld;
struct cxl_root_decoder *cxlrd;
struct resource *dpa_res;
+ struct range spa_range;
resource_size_t skip;
enum cxl_decoder_mode mode;
enum cxl_decoder_state state;
--
2.39.5
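
For readers following along, the translate-and-check step in cxl_port_calc_hpa() can be sketched outside the kernel roughly as below. This is only an illustration under stated assumptions: struct range and range_len() are simplified stand-ins for the kernel definitions, to_hpa_fn and toy_offset_to_hpa() are hypothetical names, and the toy translation is a plain offset add, not what any real platform does.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's struct range and range_len(). */
struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

static uint64_t range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/* Hypothetical callback type: translate one address to the parent domain. */
typedef uint64_t (*to_hpa_fn)(uint64_t addr, void *ctx);

/* Toy translation (assumption): the parent domain is a fixed offset away. */
static uint64_t toy_offset_to_hpa(uint64_t addr, void *ctx)
{
	return addr + *(uint64_t *)ctx;
}

/*
 * Translate both ends of @r and check that the result spans
 * ways * original length. Returns 0 on success, -1 on failure.
 * @r is only updated on success, mirroring the patch.
 */
static int calc_hpa(to_hpa_fn to_hpa, void *ctx, unsigned int ways,
		    struct range *r)
{
	struct range hpa = *r;
	uint64_t len = range_len(r);

	if (!to_hpa)
		return 0;	/* no translation needed for this port */

	hpa.start = to_hpa(r->start, ctx);
	hpa.end = to_hpa(r->end, ctx);

	/* Same validity test as the patch: zero and all-ones are rejected. */
	if (!hpa.start || !hpa.end ||
	    hpa.start == UINT64_MAX || hpa.end == UINT64_MAX)
		return -1;

	/* The translated range must be contiguous across all ways. */
	if (range_len(&hpa) != len * ways)
		return -1;

	*r = hpa;
	return 0;
}
```

With the toy offset translation the length never changes, so the check only passes for ways == 1; a real interleaved translation would have to widen the range accordingly.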
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-07 14:10 ` [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region Robert Richter
@ 2025-01-07 19:14 ` Gregory Price
2025-02-05 8:48 ` Robert Richter
2025-01-14 10:59 ` Jonathan Cameron
2025-01-17 21:31 ` Ben Cheatham
2 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 19:14 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:01PM +0100, Robert Richter wrote:
> To find the correct region and root port of an endpoint of a system
> needing address translation, the endpoint's HPA range must be
> translated to each of the parent port address ranges up to the root
> decoder.
>
> Calculate the SPA range using the newly introduced callback function
> port->to_hpa() that translates the decoder's HPA range to its parent
> port's HPA range of the next outer memory domain. Introduce the helper
> function cxl_port_calc_hpa() for this to calculate address ranges
> using the low-level port->to_hpa() callbacks. Determine the root port
> SPA range by iterating all the ports up to the root. Store the
> endpoint's SPA range and use it to find the endpoint's region.
>
Some comments/questions inline.
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
> drivers/cxl/cxl.h | 1 +
> 2 files changed, 71 insertions(+), 15 deletions(-)
... snip ...
> + while (1) {
> + parent = parent_port_of(iter);
> +
I'm always suspicious of unbounded while loops. Should this simply be
while(iter) {
...
}
Instead? Given the rest of the function, I don't think this matters,
but it's at least a bit clearer.
> + if (is_cxl_endpoint(iter))
> + cxld = &cxled->cxld;
> + else if (!parent || parent->to_hpa)
> + cxld = cxl_port_find_switch_decoder(iter, &hpa);
> +
> + if (!cxld) {
> + dev_err(cxlmd->dev.parent,
> + "%s:%s no CXL window for range %#llx:%#llx\n",
> + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> + hpa.start, hpa.end);
> + return -ENXIO;
> + }
I also think we should be able to move this check out of the loop on
various break conditions (iter==NULL, cxld not found, etc).
> +
> + /* No parent means the root port was found. */
> + if (!parent)
> + break;
> +
> + /* Translate HPA to the next upper memory domain. */
> + if (cxl_port_calc_hpa(parent, cxld, &hpa))
> + return -ENXIO;
> +
> + iter = parent;
> }
>
> + dev_dbg(cxld->dev.parent,
> + "%s:%s: range:%#llx-%#llx\n",
> + dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> + hpa.start, hpa.end);
> +
> cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> + cxled->spa_range = hpa;
>
> return 0;
> }
Overall good, just think we might be able to improve the readability /
safety of this loop a bit.
Stupid question: I presume a loop in this iteration is not (generally?)
possible, but if it were to happen this while loop as-is would go infinite.
Is that something we should worry about?
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-07 19:14 ` Gregory Price
@ 2025-02-05 8:48 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-05 8:48 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 14:14:14, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:01PM +0100, Robert Richter wrote:
> > To find the correct region and root port of an endpoint of a system
> > needing address translation, the endpoint's HPA range must be
> > translated to each of the parent port address ranges up to the root
> > decoder.
> >
> > Calculate the SPA range using the newly introduced callback function
> > port->to_hpa() that translates the decoder's HPA range to its parent
> > port's HPA range of the next outer memory domain. Introduce the helper
> > function cxl_port_calc_hpa() for this to calculate address ranges
> > using the low-level port->to_hpa() callbacks. Determine the root port
> > SPA range by iterating all the ports up to the root. Store the
> > endpoint's SPA range and use it to find the endpoint's region.
> >
>
> Some comments/questions inline.
>
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
> > drivers/cxl/cxl.h | 1 +
> > 2 files changed, 71 insertions(+), 15 deletions(-)
> ... snip ...
> > + while (1) {
> > + parent = parent_port_of(iter);
> > +
>
> I'm always suspicious of unbounded while loops. Should this simply be
>
> while(iter) {
> ...
> }
>
> Instead? Given the rest of the function, I don't think this matters,
> but it's at least a bit clearer.
>
> > + if (is_cxl_endpoint(iter))
> > + cxld = &cxled->cxld;
> > + else if (!parent || parent->to_hpa)
> > + cxld = cxl_port_find_switch_decoder(iter, &hpa);
> > +
> > + if (!cxld) {
> > + dev_err(cxlmd->dev.parent,
> > + "%s:%s no CXL window for range %#llx:%#llx\n",
> > + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> > + hpa.start, hpa.end);
> > + return -ENXIO;
> > + }
>
> I also think we should be able to move this check out of the loop on
> various break conditions (iter==NULL, cxld not found, etc).
>
> > +
> > + /* No parent means the root port was found. */
> > + if (!parent)
> > + break;
> > +
> > + /* Translate HPA to the next upper memory domain. */
> > + if (cxl_port_calc_hpa(parent, cxld, &hpa))
> > + return -ENXIO;
We must check for !parent first, and it is then clearer to leave the
loop directly.
That is, using
while(iter) {}
does not improve the loop because that would introduce duplicate
checks. It is not possible otherwise to break at the beginning or end
of the loop, as there is code to run, either
cxl_port_find_switch_decoder() or cxl_port_calc_hpa().
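
The loop shape under discussion can be sketched in isolation as follows. This is a simplified illustration, not the driver code: struct port is a hypothetical stand-in with only a parent link, and the two counters stand in for the per-port decoder lookup and the per-step address translation.

```c
#include <stddef.h>

/* Hypothetical, simplified port: only the parent link matters here. */
struct port {
	const char *name;
	struct port *parent;	/* NULL for the root */
};

/*
 * Walk from @iter up to the root. Every port is visited once
 * (*visited counts the decoder lookups), and every step to a parent
 * is translated once (*translated counts those). Returns the root.
 */
static struct port *walk_to_root(struct port *iter, int *visited,
				 int *translated)
{
	struct port *parent;

	if (!iter)
		return NULL;

	while (1) {
		parent = iter->parent;

		/* Work that must also run for the root port. */
		(*visited)++;

		/* No parent means the root was found: leave mid-loop. */
		if (!parent)
			break;

		/* Work that only makes sense when a parent exists. */
		(*translated)++;

		iter = parent;
	}

	return iter;
}
```

For a chain of N ports this does N lookups and N - 1 translations, which is why the exit sits between the two steps. The walk ends as long as the parent links form a finite chain that reaches a port without a parent.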
> > +
> > + iter = parent;
> > }
> >
> > + dev_dbg(cxld->dev.parent,
> > + "%s:%s: range:%#llx-%#llx\n",
> > + dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> > + hpa.start, hpa.end);
> > +
> > cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> > + cxled->spa_range = hpa;
> >
> > return 0;
> > }
>
> Overall good, just think we might be able to improve the readability /
> safety of this loop a bit.
>
> Stupid question: I presume a loop in this iteration is not (generally?)
> possible, but if it were to happen this while loop as-is would go infinite.
> Is that something we should worry about?
It is safe, this kind of iterator is used elsewhere too.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-07 14:10 ` [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region Robert Richter
2025-01-07 19:14 ` Gregory Price
@ 2025-01-14 10:59 ` Jonathan Cameron
2025-01-31 15:46 ` Robert Richter
2025-01-17 21:31 ` Ben Cheatham
2 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-14 10:59 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 15:10:01 +0100
Robert Richter <rrichter@amd.com> wrote:
> To find the correct region and root port of an endpoint of a system
> needing address translation, the endpoint's HPA range must be
> translated to each of the parent port address ranges up to the root
> decoder.
>
> Calculate the SPA range using the newly introduced callback function
> port->to_hpa() that translates the decoder's HPA range to its parent
> port's HPA range of the next outer memory domain. Introduce the helper
> function cxl_port_calc_hpa() for this to calculate address ranges
> using the low-level port->to_hpa() callbacks. Determine the root port
> SPA range by iterating all the ports up to the root. Store the
> endpoint's SPA range and use it to find the endpoint's region.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
> drivers/cxl/cxl.h | 1 +
> 2 files changed, 71 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 09a68e266a79..007a2016760d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -824,6 +824,41 @@ static int match_free_decoder(struct device *dev, void *data)
> return 1;
> }
>
> +static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> + struct range *hpa_range)
> +{
> + struct range hpa = *hpa_range;
> + u64 len = range_len(&hpa);
> +
> + if (!port->to_hpa)
> + return 0;
> +
> + /* Translate HPA to the next upper domain. */
> + hpa.start = port->to_hpa(cxld, hpa.start);
> + hpa.end = port->to_hpa(cxld, hpa.end);
> +
> + if (!hpa.start || !hpa.end ||
On general basis, why can't hpa.start be 0?
It is an unusual physical memory map, but technically possible on some
architectures.
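
A minimal sketch of a check that tolerates a zero start address might look like the following. It assumes that the translation callback signals failure only with an all-ones value; ADDR_INVALID and translated_range_valid() are hypothetical names, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed error sentinel; the patch compares against ULLONG_MAX. */
#define ADDR_INVALID UINT64_MAX

/*
 * Accept any translated range whose ends are not the error sentinel
 * and which is not inverted. A start address of 0 is allowed.
 */
static bool translated_range_valid(uint64_t start, uint64_t end)
{
	if (start == ADDR_INVALID || end == ADDR_INVALID)
		return false;

	return start <= end;
}
```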
> + hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
> + dev_warn(&port->dev,
> + "CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
> + hpa.start, hpa.end, hpa_range->start,
> + hpa_range->end, dev_name(&cxld->dev));
> + return -ENXIO;
> + }
> +
> + if (range_len(&hpa) != len * cxld->interleave_ways) {
> + dev_warn(&port->dev,
> + "CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
> + hpa.start, hpa.end, hpa_range->start,
> + hpa_range->end, dev_name(&cxld->dev));
> + return -ENXIO;
> + }
> +
> + *hpa_range = hpa;
> +
> + return 0;
> +}
> +
> static int match_auto_decoder(struct device *dev, void *data)
> {
> struct cxl_region_params *p = data;
> @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct cxl_port *iter = cxled_to_port(cxled);
> - struct range *hpa = &cxled->cxld.hpa_range;
> + struct cxl_port *parent, *iter = cxled_to_port(cxled);
I'd prefer that split into two lines. Mixing cases that allocate and ones
that don't isn't great for readability. Would also reduce the diff a little
which is always nice!
> + struct range hpa = cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
>
...
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-14 10:59 ` Jonathan Cameron
@ 2025-01-31 15:46 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-31 15:46 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 14.01.25 10:59:27, Jonathan Cameron wrote:
> On Tue, 7 Jan 2025 15:10:01 +0100
> Robert Richter <rrichter@amd.com> wrote:
> > +static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> > + struct range *hpa_range)
> > +{
> > + struct range hpa = *hpa_range;
> > + u64 len = range_len(&hpa);
> > +
> > + if (!port->to_hpa)
> > + return 0;
> > +
> > + /* Translate HPA to the next upper domain. */
> > + hpa.start = port->to_hpa(cxld, hpa.start);
> > + hpa.end = port->to_hpa(cxld, hpa.end);
> > +
> > + if (!hpa.start || !hpa.end ||
>
> On general basis, why can't hpa.start be 0?
> It is an unusual physical memory map, but technically possible on some
> architectures.
>
> > + hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
> > + dev_warn(&port->dev,
> > + "CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
> > + hpa.start, hpa.end, hpa_range->start,
> > + hpa_range->end, dev_name(&cxld->dev));
> > + return -ENXIO;
> > + }
> > @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> > static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> > {
> > struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > - struct cxl_port *iter = cxled_to_port(cxled);
> > - struct range *hpa = &cxled->cxld.hpa_range;
> > + struct cxl_port *parent, *iter = cxled_to_port(cxled);
>
> I'd prefer that split into two lines. Mixing cases that allocate and ones
> that don't isn't great for readability. Would also reduce the diff a little
> which is always nice!
>
> > + struct range hpa = cxled->cxld.hpa_range;
> > struct cxl_decoder *cxld = &cxled->cxld;
> >
Changed both, thanks.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-07 14:10 ` [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region Robert Richter
2025-01-07 19:14 ` Gregory Price
2025-01-14 10:59 ` Jonathan Cameron
@ 2025-01-17 21:31 ` Ben Cheatham
2025-02-05 9:00 ` Robert Richter
2 siblings, 1 reply; 117+ messages in thread
From: Ben Cheatham @ 2025-01-17 21:31 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 1/7/25 8:10 AM, Robert Richter wrote:
> To find the correct region and root port of an endpoint of a system
> needing address translation, the endpoint's HPA range must be
> translated to each of the parent port address ranges up to the root
> decoder.
>
> Calculate the SPA range using the newly introduced callback function
> port->to_hpa() that translates the decoder's HPA range to its parent
> port's HPA range of the next outer memory domain. Introduce the helper
> function cxl_port_calc_hpa() for this to calculate address ranges
> using the low-level port->to_hpa() callbacks. Determine the root port
> SPA range by iterating all the ports up to the root. Store the
> endpoint's SPA range and use it to find the endpoint's region.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
> drivers/cxl/cxl.h | 1 +
> 2 files changed, 71 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 09a68e266a79..007a2016760d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -824,6 +824,41 @@ static int match_free_decoder(struct device *dev, void *data)
> return 1;
> }
>
> +static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> + struct range *hpa_range)
> +{
> + struct range hpa = *hpa_range;
> + u64 len = range_len(&hpa);
> +
> + if (!port->to_hpa)
> + return 0;
> +
> + /* Translate HPA to the next upper domain. */
> + hpa.start = port->to_hpa(cxld, hpa.start);
> + hpa.end = port->to_hpa(cxld, hpa.end);
> +
> + if (!hpa.start || !hpa.end ||
> + hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
> + dev_warn(&port->dev,
> + "CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
> + hpa.start, hpa.end, hpa_range->start,
> + hpa_range->end, dev_name(&cxld->dev));
> + return -ENXIO;
> + }
> +
> + if (range_len(&hpa) != len * cxld->interleave_ways) {
> + dev_warn(&port->dev,
> + "CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
> + hpa.start, hpa.end, hpa_range->start,
> + hpa_range->end, dev_name(&cxld->dev));
> + return -ENXIO;
> + }
> +
> + *hpa_range = hpa;
> +
> + return 0;
> +}
> +
> static int match_auto_decoder(struct device *dev, void *data)
> {
> struct cxl_region_params *p = data;
> @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct cxl_port *iter = cxled_to_port(cxled);
> - struct range *hpa = &cxled->cxld.hpa_range;
> + struct cxl_port *parent, *iter = cxled_to_port(cxled);
> + struct range hpa = cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
Can cut down on the dereferencing a bit here by doing:
struct cxl_decoder *cxld = &cxled->cxld;
struct range hpa = cxld->hpa_range;
instead.
>
> - while (iter && !is_cxl_root(iter))
> - iter = to_cxl_port(iter->dev.parent);
> -
> - if (!iter)
> + if (!iter || is_cxl_root(iter))
> return -ENXIO;
>
> - cxld = cxl_port_find_switch_decoder(iter, hpa);
> - if (!cxld) {
> - dev_err(cxlmd->dev.parent,
> - "%s:%s no CXL window for range %#llx:%#llx\n",
> - dev_name(&cxlmd->dev), dev_name(&cxld->dev),
> - cxld->hpa_range.start, cxld->hpa_range.end);
> - return -ENXIO;
> + while (1) {
> + parent = parent_port_of(iter);
> +
> + if (is_cxl_endpoint(iter))
> + cxld = &cxled->cxld;
> + else if (!parent || parent->to_hpa)
> + cxld = cxl_port_find_switch_decoder(iter, &hpa);
> +
> + if (!cxld) {
> + dev_err(cxlmd->dev.parent,
> + "%s:%s no CXL window for range %#llx:%#llx\n",
> + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> + hpa.start, hpa.end);
> + return -ENXIO;
> + }
> +
> + /* No parent means the root port was found. */
> + if (!parent)
> + break;
> +
> + /* Translate HPA to the next upper memory domain. */
> + if (cxl_port_calc_hpa(parent, cxld, &hpa))
> + return -ENXIO;
> +
> + iter = parent;
> }
>
> + dev_dbg(cxld->dev.parent,
> + "%s:%s: range:%#llx-%#llx\n",
> + dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> + hpa.start, hpa.end);
> +
> cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> + cxled->spa_range = hpa;
>
> return 0;
> }
> @@ -3358,7 +3414,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>
> static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
> {
> - struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_root_decoder *cxlrd = cxled->cxlrd;
> struct cxl_region_params *p;
> struct cxl_region *cxlr;
> @@ -3370,7 +3425,7 @@ static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
> * one does the construction and the others add to that.
> */
> mutex_lock(&cxlrd->range_lock);
> - cxlr = cxl_find_region_by_range(cxlrd, hpa);
> + cxlr = cxl_find_region_by_range(cxlrd, &cxled->spa_range);
> if (!cxlr)
> cxlr = construct_region(cxlrd, cxled);
> mutex_unlock(&cxlrd->range_lock);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index c04f66fe2a93..4ccb2b3b31c9 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -419,6 +419,7 @@ struct cxl_endpoint_decoder {
> struct cxl_decoder cxld;
> struct cxl_root_decoder *cxlrd;
> struct resource *dpa_res;
> + struct range spa_range;
> resource_size_t skip;
> enum cxl_decoder_mode mode;
> enum cxl_decoder_state state;
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region
2025-01-17 21:31 ` Ben Cheatham
@ 2025-02-05 9:00 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-05 9:00 UTC (permalink / raw)
To: Ben Cheatham
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 17.01.25 15:31:43, Ben Cheatham wrote:
> On 1/7/25 8:10 AM, Robert Richter wrote:
> > static int match_auto_decoder(struct device *dev, void *data)
> > {
> > struct cxl_region_params *p = data;
> > @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> > static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> > {
> > struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > - struct cxl_port *iter = cxled_to_port(cxled);
> > - struct range *hpa = &cxled->cxld.hpa_range;
> > + struct cxl_port *parent, *iter = cxled_to_port(cxled);
> > + struct range hpa = cxled->cxld.hpa_range;
> > struct cxl_decoder *cxld = &cxled->cxld;
>
> Can cut down on the dereferencing a bit here by doing:
> struct cxl_decoder *cxld = &cxled->cxld;
> struct range hpa = cxld->hpa_range;
>
> instead.
Changed that, thanks.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (14 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 15/29] cxl/region: Use an endpoint's SPA range to find a region Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 22:01 ` Gregory Price
2025-01-17 21:31 ` Ben Cheatham
2025-01-07 14:10 ` [PATCH v1 17/29] cxl/region: Rename function to cxl_find_decoder_early() Robert Richter
` (13 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
To enable address translation, the calculation of the endpoint
position must use translated HPA ranges. The function
cxl_endpoint_initialize() already uses translation which could be
reused to calculate the endpoint position.
Use translated HPA address ranges for the calculation of endpoint
position by moving it to cxl_endpoint_initialize(). Create a function
cxl_port_calc_pos() for use in the iterator there, but keep a
simplified version of cxl_calc_interleave_pos() for the
non-auto-discovery code path without address translation since it is
not supported there.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 66 ++++++++++++++++++++++++++-------------
1 file changed, 44 insertions(+), 22 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 007a2016760d..c1e384e80d10 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1813,26 +1813,29 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
}
/**
- * cxl_calc_interleave_pos() - calculate an endpoint position in a region
- * @cxled: endpoint decoder member of given region
+ * cxl_port_calc_pos() - calculate an endpoint position for @port
+ * @port: Port the new position is calculated for.
+ * @range: The HPA address range for the port.
+ * @pos: Current position in the topology.
*
- * The endpoint position is calculated by traversing the topology from
- * the endpoint to the root decoder and iteratively applying this
- * calculation:
+ * The endpoint position for the next port is calculated by applying
+ * this calculation:
*
* position = position * parent_ways + parent_pos;
*
* ...where @position is inferred from switch and root decoder target lists.
*
+ * The endpoint position of region can be calculated by traversing the
+ * topology from the endpoint to the root decoder and iteratively
+ * applying the function for each port.
+ *
* Return: position >= 0 on success
* -ENXIO on failure
*/
-static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
+static int cxl_port_calc_pos(struct cxl_port *port, struct range *range,
+ int pos)
{
- struct cxl_port *iter, *port = cxled_to_port(cxled);
- struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
- struct range *range = &cxled->cxld.hpa_range;
- int parent_ways = 0, parent_pos = 0, pos = 0;
+ int parent_ways = 0, parent_pos = 0;
int rc;
/*
@@ -1864,17 +1867,30 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
* complex topologies, including those with switches.
*/
- /* Iterate from endpoint to root_port refining the position */
- for (iter = port; iter; iter = parent_port_of(iter)) {
- if (is_cxl_root(iter))
- break;
+ if (is_cxl_root(port))
+ return pos;
- rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
- if (rc)
- return rc;
+ rc = find_pos_and_ways(port, range, &parent_pos, &parent_ways);
+ if (rc)
+ return rc;
- pos = pos * parent_ways + parent_pos;
- }
+ return pos * parent_ways + parent_pos;
+}
+
+static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
+{
+ struct cxl_port *iter, *port = cxled_to_port(cxled);
+ struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+ struct range *range = &cxled->cxld.hpa_range;
+ int pos = 0;
+
+ /*
+ * Address translation is only supported for auto-discovery of
+ * decoders. There is no need to support address translation
+ * here.
+ */
+ for (iter = cxled_to_port(cxled); iter; iter = parent_port_of(iter))
+ pos = cxl_port_calc_pos(iter, range, pos);
dev_dbg(&cxlmd->dev,
"decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
@@ -1892,7 +1908,6 @@ static int cxl_region_sort_targets(struct cxl_region *cxlr)
for (i = 0; i < p->nr_targets; i++) {
struct cxl_endpoint_decoder *cxled = p->targets[i];
- cxled->pos = cxl_calc_interleave_pos(cxled);
/*
* Record that sorting failed, but still continue to calc
* cxled->pos so that follow-on code paths can reliably
@@ -3252,6 +3267,7 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
struct cxl_port *parent, *iter = cxled_to_port(cxled);
struct range hpa = cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
+ int pos = 0;
if (!iter || is_cxl_root(iter))
return -ENXIO;
@@ -3280,16 +3296,22 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
if (cxl_port_calc_hpa(parent, cxld, &hpa))
return -ENXIO;
+ /* Iterate from endpoint to root_port refining the position */
+ pos = cxl_port_calc_pos(iter, &hpa, pos);
+ if (pos < 0)
+ return pos;
+
iter = parent;
}
dev_dbg(cxld->dev.parent,
- "%s:%s: range:%#llx-%#llx\n",
+ "%s:%s: range:%#llx-%#llx pos:%d\n",
dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
- hpa.start, hpa.end);
+ hpa.start, hpa.end, pos);
cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
cxled->spa_range = hpa;
+ cxled->pos = pos;
return 0;
}
--
2.39.5
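
As a worked example of the position formula quoted in the kernel-doc comment above (position = position * parent_ways + parent_pos), the per-port step and the endpoint-to-root iteration can be sketched as below. struct level, calc_pos_step() and calc_pos() are hypothetical names for this illustration; in the driver the ways and position values come from find_pos_and_ways().

```c
/* Hypothetical per-level interleave info, ordered endpoint -> root. */
struct level {
	int parent_ways;	/* interleave ways of the parent decoder */
	int parent_pos;		/* this port's index in the parent's target list */
};

/* One refinement step, as in the formula from the patch comment. */
static int calc_pos_step(int pos, const struct level *l)
{
	return pos * l->parent_ways + l->parent_pos;
}

/* Apply the step for every level, starting at the endpoint with pos 0. */
static int calc_pos(const struct level *levels, int nr_levels)
{
	int pos = 0;
	int i;

	for (i = 0; i < nr_levels; i++)
		pos = calc_pos_step(pos, &levels[i]);

	return pos;
}
```

For an endpoint that is target 1 below a 2-way switch decoder, whose host bridge is in turn target 1 of a 2-way root decoder, the steps are 0 * 2 + 1 = 1 and then 1 * 2 + 1 = 3.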
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position
2025-01-07 14:10 ` [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position Robert Richter
@ 2025-01-07 22:01 ` Gregory Price
2025-02-05 10:38 ` Robert Richter
2025-01-17 21:31 ` Ben Cheatham
1 sibling, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 22:01 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:02PM +0100, Robert Richter wrote:
> To enable address translation, the calculation of the endpoint
> position must use translated HPA ranges. The function
> cxl_endpoint_initialize() already uses translation which could be
> reused to calculate the endpoint position.
>
> Use translated HPA address ranges for the calculation of endpoint
> position by moving it to cxl_endpoint_initialize(). Create a function
> cxl_port_calc_pos() for use in the iterator there, but keep a
> simplified version of cxl_calc_interleave_pos() for the
> non-auto-discovery code path without address translation since it is
> not supported there.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
just one inline question
> +static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> +{
> + struct cxl_port *iter, *port = cxled_to_port(cxled);
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct range *range = &cxled->cxld.hpa_range;
> + int pos = 0;
> +
> + /*
> + * Address translation is only supported for auto-discovery of
> + * decoders. There is no need to support address translation
> + * here.
> + */
Just clarifying - it's only supported for discovery of already
programmed decoders (programmed in BIOS)? i.e. driver-programmed
decoders shouldn't need this translation / won't support this type of
interleaving?
Comment here begs some questions but not necessarily a review blocker.
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position
2025-01-07 22:01 ` Gregory Price
@ 2025-02-05 10:38 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-05 10:38 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 17:01:22, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:02PM +0100, Robert Richter wrote:
> > To enable address translation, the calculation of the endpoint
> > position must use translated HPA ranges. The function
> > cxl_endpoint_initialize() already uses translation which could be
> > reused to calculate the endpoint position.
> >
> > Use translated HPA address ranges for the calculation of endpoint
> > position by moving it to cxl_endpoint_initialize(). Create a function
> > cxl_port_calc_pos() for use in the iterator there, but keep a
> > simplified version of cxl_calc_interleave_pos() for the
> > non-auto-discovery code path without address translation since it is
> > not supported there.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
>
> just one inline question
>
> > +static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> > +{
> > + struct cxl_port *iter, *port = cxled_to_port(cxled);
> > + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > + struct range *range = &cxled->cxld.hpa_range;
> > + int pos = 0;
> > +
> > + /*
> > + * Address translation is only supported for auto-discovery of
> > + * decoders. There is no need to support address translation
> > + * here.
> > + */
>
> Just clarifying - it's only supported for discovery of already
> programmed decoders (programmed in BIOS)? i.e. driver-programmed
> decoders shouldn't need this translation / won't support this type of
> interleaving?
Right now only translation from endpoint to root is
supported/implemented, not the other direction root to endpoint. That
is, only address ranges of firmware programmed decoders can be
translated.
Support for the other direction could be added, but that is not the
focus of this series. The current implementation can determine the
endpoint's base SPA, so it might be feasible to support
driver-programmed decoders using only endpoint-to-root translation,
but I haven't tried that yet.
Another use case is RAS, where endpoint addresses are translated to SPA.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position
2025-01-07 14:10 ` [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position Robert Richter
2025-01-07 22:01 ` Gregory Price
@ 2025-01-17 21:31 ` Ben Cheatham
2025-02-05 10:43 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Ben Cheatham @ 2025-01-17 21:31 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 1/7/25 8:10 AM, Robert Richter wrote:
> To enable address translation, the calculation of the endpoint
> position must use translated HPA ranges. The function
> cxl_endpoint_initialize() already uses translation which could be
> reused to calculate the endpoint position.
>
> Use translated HPA address ranges for the calculation of endpoint
s/HPA address ranges/HPA ranges/
> position by moving it to cxl_endpoint_initialize(). Create a function
> cxl_port_calc_pos() for use in the iterator there, but keep a
> simplified version of cxl_calc_interleave_pos() for the
> non-auto-discovery code path without address translation since it is
> not support there.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 66 ++++++++++++++++++++++++++-------------
> 1 file changed, 44 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 007a2016760d..c1e384e80d10 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1813,26 +1813,29 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> }
>
> /**
> - * cxl_calc_interleave_pos() - calculate an endpoint position in a region
> - * @cxled: endpoint decoder member of given region
> + * cxl_port_calc_pos() - calculate an endpoint position for @port
> + * @port: Port the new position is calculated for.
> + * @range: The HPA address range for the port.
Same as above, but it may be better to write "Host Physical Address
range" instead for the extra context.
> + * @pos: Current position in the topology.
> *
> - * The endpoint position is calculated by traversing the topology from
> - * the endpoint to the root decoder and iteratively applying this
> - * calculation:
> + * The endpoint position for the next port is calculated by applying
> + * this calculation:
> *
> * position = position * parent_ways + parent_pos;
> *
> * ...where @position is inferred from switch and root decoder target lists.
> *
> + * The endpoint position of region can be calculated by traversing the
I would word this as "The endpoint's position in a region can be..." instead since
I think that's what you are doing below. This currently reads like the region's
position is being found.
> + * topology from the endpoint to the root decoder and iteratively
> + * applying the function for each port.
> + *
> * Return: position >= 0 on success
> * -ENXIO on failure
> */
> -static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> +static int cxl_port_calc_pos(struct cxl_port *port, struct range *range,
> + int pos)
> {
> - struct cxl_port *iter, *port = cxled_to_port(cxled);
> - struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct range *range = &cxled->cxld.hpa_range;
> - int parent_ways = 0, parent_pos = 0, pos = 0;
> + int parent_ways = 0, parent_pos = 0;
> int rc;
>
> /*
> @@ -1864,17 +1867,30 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> * complex topologies, including those with switches.
> */
>
> - /* Iterate from endpoint to root_port refining the position */
> - for (iter = port; iter; iter = parent_port_of(iter)) {
> - if (is_cxl_root(iter))
> - break;
> + if (is_cxl_root(port))
> + return pos;
>
> - rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
> - if (rc)
> - return rc;
> + rc = find_pos_and_ways(port, range, &parent_pos, &parent_ways);
> + if (rc)
> + return rc;
>
> - pos = pos * parent_ways + parent_pos;
> - }
> + return pos * parent_ways + parent_pos;
> +}
> +
> +static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> +{
> + struct cxl_port *iter, *port = cxled_to_port(cxled);
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct range *range = &cxled->cxld.hpa_range;
> + int pos = 0;
> +
> + /*
> + * Address translation is only supported for auto-discovery of
> + * decoders. There is no need to support address translation
> + * here.
> + */
> + for (iter = cxled_to_port(cxled); iter; iter = parent_port_of(iter))
> + pos = cxl_port_calc_pos(iter, range, pos);
>
> dev_dbg(&cxlmd->dev,
> "decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
> @@ -1892,7 +1908,6 @@ static int cxl_region_sort_targets(struct cxl_region *cxlr)
> for (i = 0; i < p->nr_targets; i++) {
> struct cxl_endpoint_decoder *cxled = p->targets[i];
>
> - cxled->pos = cxl_calc_interleave_pos(cxled);
> /*
> * Record that sorting failed, but still continue to calc
> * cxled->pos so that follow-on code paths can reliably
> @@ -3252,6 +3267,7 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> struct cxl_port *parent, *iter = cxled_to_port(cxled);
> struct range hpa = cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
> + int pos = 0;
>
> if (!iter || is_cxl_root(iter))
> return -ENXIO;
> @@ -3280,16 +3296,22 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> if (cxl_port_calc_hpa(parent, cxld, &hpa))
> return -ENXIO;
>
> + /* Iterate from endpoint to root_port refining the position */
> + pos = cxl_port_calc_pos(iter, &hpa, pos);
> + if (pos < 0)
> + return pos;
> +
> iter = parent;
> }
>
> dev_dbg(cxld->dev.parent,
> - "%s:%s: range:%#llx-%#llx\n",
> + "%s:%s: range:%#llx-%#llx pos:%d\n",
> dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> - hpa.start, hpa.end);
> + hpa.start, hpa.end, pos);
>
> cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> cxled->spa_range = hpa;
> + cxled->pos = pos;
>
> return 0;
> }
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position
2025-01-17 21:31 ` Ben Cheatham
@ 2025-02-05 10:43 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-05 10:43 UTC (permalink / raw)
To: Ben Cheatham
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 17.01.25 15:31:48, Ben Cheatham wrote:
> On 1/7/25 8:10 AM, Robert Richter wrote:
> > To enable address translation, the calculation of the endpoint
> > position must use translated HPA ranges. The function
> > cxl_endpoint_initialize() already uses translation which could be
> > reused to calculate the endpoint position.
> >
> > Use translated HPA address ranges for the calculation of endpoint
>
> s/HPA address ranges/HPA ranges/
>
> > position by moving it to cxl_endpoint_initialize(). Create a function
> > cxl_port_calc_pos() for use in the iterator there, but keep a
> > simplified version of cxl_calc_interleave_pos() for the
> > non-auto-discovery code path without address translation since it is
> > not support there.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 66 ++++++++++++++++++++++++++-------------
> > 1 file changed, 44 insertions(+), 22 deletions(-)
Updated all the wording, thanks.
-Robert
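
For reference, the per-port step described in the patch's kernel-doc
(position = position * parent_ways + parent_pos) can be sketched as a
standalone program. This is only a minimal illustration under
assumptions: struct level, port_calc_pos() and calc_interleave_pos()
are hypothetical stand-ins, and each port is reduced to a precomputed
(parent_ways, parent_pos) pair instead of being looked up via
find_pos_and_ways().

```c
#include <stddef.h>

/*
 * Hypothetical per-port data: the interleave ways of the parent decoder
 * and this port's position in the parent's target list.
 */
struct level {
	int parent_ways;
	int parent_pos;
};

/* One refinement step: position = position * parent_ways + parent_pos */
static int port_calc_pos(const struct level *lvl, int pos)
{
	if (lvl->parent_ways <= 0 || lvl->parent_pos < 0 ||
	    lvl->parent_pos >= lvl->parent_ways)
		return -1;	/* stand-in for an error code such as -ENXIO */

	return pos * lvl->parent_ways + lvl->parent_pos;
}

/* Walk from the endpoint (index 0) towards the root, refining pos. */
static int calc_interleave_pos(const struct level *levels, size_t n)
{
	int pos = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		pos = port_calc_pos(&levels[i], pos);
		if (pos < 0)
			return pos;	/* stop at the first error */
	}

	return pos;
}
```

With the levels {2, 1} and {2, 0}, listed from the endpoint towards the
root, the result is 0 * 2 + 1 = 1 and then 1 * 2 + 0 = 2. The sketch
stops at the first negative value so that an error code is never fed
back into the multiplication.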
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 17/29] cxl/region: Rename function to cxl_find_decoder_early()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (15 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 16/29] cxl/region: Use translated HPA ranges to calculate the endpoint position Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 22:06 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 18/29] cxl/region: Avoid duplicate call of cxl_find_decoder_early() Robert Richter
` (12 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Current function cxl_region_find_decoder() is used to find a port's
decoder during region setup. The decoder is later used to attach the
port to a region.
Rename function to cxl_find_decoder_early() to emphasize its use only
during region setup in the early setup stage. Once a port is attached
to a region, the region reference can be used to lookup a region's
port and decoder configuration (see struct cxl_region_ref).
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c1e384e80d10..2bc2028988d3 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -878,9 +878,9 @@ static int match_auto_decoder(struct device *dev, void *data)
}
static struct cxl_decoder *
-cxl_region_find_decoder(struct cxl_port *port,
- struct cxl_endpoint_decoder *cxled,
- struct cxl_region *cxlr)
+cxl_find_decoder_early(struct cxl_port *port,
+ struct cxl_endpoint_decoder *cxled,
+ struct cxl_region *cxlr)
{
struct device *dev;
@@ -944,7 +944,7 @@ alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr,
if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
struct cxl_decoder *cxld;
- cxld = cxl_region_find_decoder(port, cxled, cxlr);
+ cxld = cxl_find_decoder_early(port, cxled, cxlr);
if (auto_order_ok(port, iter->region, cxld))
continue;
}
@@ -1032,7 +1032,7 @@ static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
{
struct cxl_decoder *cxld;
- cxld = cxl_region_find_decoder(port, cxled, cxlr);
+ cxld = cxl_find_decoder_early(port, cxled, cxlr);
if (!cxld) {
dev_dbg(&cxlr->dev, "%s: no decoder available\n",
dev_name(&port->dev));
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 17/29] cxl/region: Rename function to cxl_find_decoder_early()
2025-01-07 14:10 ` [PATCH v1 17/29] cxl/region: Rename function to cxl_find_decoder_early() Robert Richter
@ 2025-01-07 22:06 ` Gregory Price
2025-02-05 10:56 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 22:06 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:03PM +0100, Robert Richter wrote:
> Current function cxl_region_find_decoder() is used to find a port's
> decoder during region setup. The decoder is later used to attach the
> port to a region.
>
> Rename function to cxl_find_decoder_early() to emphasize its use only
> during region setup in the early setup stage. Once a port is attached
> to a region, the region reference can be used to lookup a region's
> port and decoder configuration (see struct cxl_region_ref).
>
May want to add a comment preceding the function to record this in the
code along with the changelog.
> Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/region.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c1e384e80d10..2bc2028988d3 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -878,9 +878,9 @@ static int match_auto_decoder(struct device *dev, void *data)
> }
>
> static struct cxl_decoder *
> -cxl_region_find_decoder(struct cxl_port *port,
> - struct cxl_endpoint_decoder *cxled,
> - struct cxl_region *cxlr)
> +cxl_find_decoder_early(struct cxl_port *port,
> + struct cxl_endpoint_decoder *cxled,
> + struct cxl_region *cxlr)
> {
> struct device *dev;
>
> @@ -944,7 +944,7 @@ alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr,
> if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
> struct cxl_decoder *cxld;
>
> - cxld = cxl_region_find_decoder(port, cxled, cxlr);
> + cxld = cxl_find_decoder_early(port, cxled, cxlr);
> if (auto_order_ok(port, iter->region, cxld))
> continue;
> }
> @@ -1032,7 +1032,7 @@ static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
> {
> struct cxl_decoder *cxld;
>
> - cxld = cxl_region_find_decoder(port, cxled, cxlr);
> + cxld = cxl_find_decoder_early(port, cxled, cxlr);
> if (!cxld) {
> dev_dbg(&cxlr->dev, "%s: no decoder available\n",
> dev_name(&port->dev));
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 17/29] cxl/region: Rename function to cxl_find_decoder_early()
2025-01-07 22:06 ` Gregory Price
@ 2025-02-05 10:56 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-05 10:56 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 17:06:04, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:03PM +0100, Robert Richter wrote:
> > Current function cxl_region_find_decoder() is used to find a port's
> > decoder during region setup. The decoder is later used to attach the
> > port to a region.
> >
> > Rename function to cxl_find_decoder_early() to emphasize its use only
> > during region setup in the early setup stage. Once a port is attached
> > to a region, the region reference can be used to lookup a region's
> > port and decoder configuration (see struct cxl_region_ref).
> >
>
> May want to add a comment preceding the function to record this in the
> code along with the changelog.
>
> > Signed-off-by: Robert Richter <rrichter@amd.com>
>
> Reviewed-by: Gregory Price <gourry@gourry.net>
I added a comment. Thanks,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 18/29] cxl/region: Avoid duplicate call of cxl_find_decoder_early()
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (16 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 17/29] cxl/region: Rename function to cxl_find_decoder_early() Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 22:11 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder Robert Richter
` (11 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Function cxl_find_decoder_early() is called twice, in
alloc_region_ref() and cxl_rr_alloc_decoder(). Move it out there and
instead pass the decoder as function argument to both.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 2bc2028988d3..b7f6d8a83e4e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -928,7 +928,8 @@ static bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_iter,
static struct cxl_region_ref *
alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr,
- struct cxl_endpoint_decoder *cxled)
+ struct cxl_endpoint_decoder *cxled,
+ struct cxl_decoder *cxld)
{
struct cxl_region_params *p = &cxlr->params;
struct cxl_region_ref *cxl_rr, *iter;
@@ -942,9 +943,6 @@ alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr,
continue;
if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
- struct cxl_decoder *cxld;
-
- cxld = cxl_find_decoder_early(port, cxled, cxlr);
if (auto_order_ok(port, iter->region, cxld))
continue;
}
@@ -1028,17 +1026,9 @@ static int cxl_rr_ep_add(struct cxl_region_ref *cxl_rr,
static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
struct cxl_endpoint_decoder *cxled,
- struct cxl_region_ref *cxl_rr)
+ struct cxl_region_ref *cxl_rr,
+ struct cxl_decoder *cxld)
{
- struct cxl_decoder *cxld;
-
- cxld = cxl_find_decoder_early(port, cxled, cxlr);
- if (!cxld) {
- dev_dbg(&cxlr->dev, "%s: no decoder available\n",
- dev_name(&port->dev));
- return -EBUSY;
- }
-
if (cxld->region) {
dev_dbg(&cxlr->dev, "%s: %s already attached to %s\n",
dev_name(&port->dev), dev_name(&cxld->dev),
@@ -1129,7 +1119,16 @@ static int cxl_port_attach_region(struct cxl_port *port,
nr_targets_inc = true;
}
} else {
- cxl_rr = alloc_region_ref(port, cxlr, cxled);
+ struct cxl_decoder *cxld;
+
+ cxld = cxl_find_decoder_early(port, cxled, cxlr);
+ if (!cxld) {
+ dev_dbg(&cxlr->dev, "%s: no decoder available\n",
+ dev_name(&port->dev));
+ return -EBUSY;
+ }
+
+ cxl_rr = alloc_region_ref(port, cxlr, cxled, cxld);
if (IS_ERR(cxl_rr)) {
dev_dbg(&cxlr->dev,
"%s: failed to allocate region reference\n",
@@ -1138,7 +1137,7 @@ static int cxl_port_attach_region(struct cxl_port *port,
}
nr_targets_inc = true;
- rc = cxl_rr_alloc_decoder(port, cxlr, cxled, cxl_rr);
+ rc = cxl_rr_alloc_decoder(port, cxlr, cxled, cxl_rr, cxld);
if (rc)
goto out_erase;
}
--
2.39.5
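
The shape of this refactoring, one lookup in the caller with the result
handed to both helpers, can be sketched outside the driver as follows.
This is a minimal illustration only: struct decoder,
find_free_decoder(), order_ok(), claim_decoder() and attach() are
hypothetical names, and the lookups counter exists purely to make the
number of lookups observable.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced decoder object */
struct decoder {
	int id;
	int busy;
};

static int lookups;	/* counts how often the lookup runs */

/* Hypothetical lookup: return the first free decoder, or NULL */
static struct decoder *find_free_decoder(struct decoder *decs, size_t n)
{
	size_t i;

	lookups++;
	for (i = 0; i < n; i++)
		if (!decs[i].busy)
			return &decs[i];

	return NULL;
}

/* First consumer: only inspects the decoder passed in */
static int order_ok(const struct decoder *cxld, int min_id)
{
	return cxld->id >= min_id;
}

/* Second consumer: claims the decoder passed in */
static int claim_decoder(struct decoder *cxld)
{
	if (cxld->busy)
		return -1;

	cxld->busy = 1;
	return 0;
}

/* Caller: look up once, check for NULL once, pass the result on */
static int attach(struct decoder *decs, size_t n, int min_id)
{
	struct decoder *cxld = find_free_decoder(decs, n);

	if (!cxld)
		return -1;	/* stand-in for -EBUSY */

	if (!order_ok(cxld, min_id))
		return -1;

	return claim_decoder(cxld);
}
```

As in the diff above, the NULL check sits in the caller, so neither
helper has to handle a missing decoder on its own.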
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 18/29] cxl/region: Avoid duplicate call of cxl_find_decoder_early()
2025-01-07 14:10 ` [PATCH v1 18/29] cxl/region: Avoid duplicate call of cxl_find_decoder_early() Robert Richter
@ 2025-01-07 22:11 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 22:11 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:04PM +0100, Robert Richter wrote:
> Function cxl_find_decoder_early() is called twice, in
> alloc_region_ref() and cxl_rr_alloc_decoder(). Move it out there and
> instead pass the decoder as function argument to both.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 31 +++++++++++++++----------------
> 1 file changed, 15 insertions(+), 16 deletions(-)
>
Reviewed-by: Gregory Price <gourry@gourry.net>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 2bc2028988d3..b7f6d8a83e4e 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -928,7 +928,8 @@ static bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_iter,
>
> static struct cxl_region_ref *
> alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr,
> - struct cxl_endpoint_decoder *cxled)
> + struct cxl_endpoint_decoder *cxled,
> + struct cxl_decoder *cxld)
> {
> struct cxl_region_params *p = &cxlr->params;
> struct cxl_region_ref *cxl_rr, *iter;
> @@ -942,9 +943,6 @@ alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr,
> continue;
>
> if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
> - struct cxl_decoder *cxld;
> -
> - cxld = cxl_find_decoder_early(port, cxled, cxlr);
> if (auto_order_ok(port, iter->region, cxld))
> continue;
> }
> @@ -1028,17 +1026,9 @@ static int cxl_rr_ep_add(struct cxl_region_ref *cxl_rr,
>
> static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
> struct cxl_endpoint_decoder *cxled,
> - struct cxl_region_ref *cxl_rr)
> + struct cxl_region_ref *cxl_rr,
> + struct cxl_decoder *cxld)
> {
> - struct cxl_decoder *cxld;
> -
> - cxld = cxl_find_decoder_early(port, cxled, cxlr);
> - if (!cxld) {
> - dev_dbg(&cxlr->dev, "%s: no decoder available\n",
> - dev_name(&port->dev));
> - return -EBUSY;
> - }
> -
> if (cxld->region) {
> dev_dbg(&cxlr->dev, "%s: %s already attached to %s\n",
> dev_name(&port->dev), dev_name(&cxld->dev),
> @@ -1129,7 +1119,16 @@ static int cxl_port_attach_region(struct cxl_port *port,
> nr_targets_inc = true;
> }
> } else {
> - cxl_rr = alloc_region_ref(port, cxlr, cxled);
> + struct cxl_decoder *cxld;
> +
> + cxld = cxl_find_decoder_early(port, cxled, cxlr);
> + if (!cxld) {
> + dev_dbg(&cxlr->dev, "%s: no decoder available\n",
> + dev_name(&port->dev));
> + return -EBUSY;
> + }
> +
> + cxl_rr = alloc_region_ref(port, cxlr, cxled, cxld);
> if (IS_ERR(cxl_rr)) {
> dev_dbg(&cxlr->dev,
> "%s: failed to allocate region reference\n",
> @@ -1138,7 +1137,7 @@ static int cxl_port_attach_region(struct cxl_port *port,
> }
> nr_targets_inc = true;
>
> - rc = cxl_rr_alloc_decoder(port, cxlr, cxled, cxl_rr);
> + rc = cxl_rr_alloc_decoder(port, cxlr, cxled, cxl_rr, cxld);
> if (rc)
> goto out_erase;
> }
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (17 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 18/29] cxl/region: Avoid duplicate call of cxl_find_decoder_early() Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 22:18 ` Gregory Price
2025-01-17 21:31 ` Ben Cheatham
2025-01-07 14:10 ` [PATCH v1 20/29] cxl/region: Use translated HPA ranges " Robert Richter
` (10 subsequent siblings)
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
For the implementation of address translation it might not be possible
to determine the root decoder in the early enumeration state since the
SPA range is still unknown. Instead, the endpoint's HPA range is known
and from there the topology can be traversed up to the root port while
the memory range is adjusted from one memory domain to the next up to
the root port.
In a first step, use endpoint's HPA range to find the port's decoder.
Without address translation there is HPA == SPA. Then, the HPA range
of the endpoint can be used instead of the root decoder's range as
both are the same.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index b7f6d8a83e4e..23b86de3d4e7 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -861,9 +861,8 @@ static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
static int match_auto_decoder(struct device *dev, void *data)
{
- struct cxl_region_params *p = data;
+ struct range *r, *hpa = data;
struct cxl_decoder *cxld;
- struct range *r;
if (!is_switch_decoder(dev))
return 0;
@@ -871,7 +870,7 @@ static int match_auto_decoder(struct device *dev, void *data)
cxld = to_cxl_decoder(dev);
r = &cxld->hpa_range;
- if (p->res && p->res->start == r->start && p->res->end == r->end)
+ if (hpa && hpa->start == r->start && hpa->end == r->end)
return 1;
return 0;
@@ -888,7 +887,7 @@ cxl_find_decoder_early(struct cxl_port *port,
return &cxled->cxld;
if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
- dev = device_find_child(&port->dev, &cxlr->params,
+ dev = device_find_child(&port->dev, &cxled->cxld.hpa_range,
match_auto_decoder);
else
dev = device_find_child(&port->dev, NULL, match_free_decoder);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder
2025-01-07 14:10 ` [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder Robert Richter
@ 2025-01-07 22:18 ` Gregory Price
2025-02-06 10:50 ` Robert Richter
2025-01-17 21:31 ` Ben Cheatham
1 sibling, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 22:18 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:05PM +0100, Robert Richter wrote:
> For the implementation of address translation it might not be possible
> to determine the root decoder in the early enumeration state since the
> SPA range is still unknown. Instead, the endpoint's HPA range is known
> and from there the topology can be traversed up to the root port while
> the memory range is adjusted from one memory domain to the next up to
> the root port.
>
> In a first step, use endpoint's HPA range to find the port's decoder.
> Without address translation there is HPA == SPA. Then, the HPA range
> of the endpoint can be used instead of the root decoder's range as
> both are the same.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
I'll make a note to test up to this patch in particular in HPA==SPA
mode without the follow-ons.
The functional change here is the move from cxlr->params to
cxled->cxld.hpa_range.
What was the likely value of cxlr->params previously? Undefined?
Otherwise I don't immediately see any issues.
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/region.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index b7f6d8a83e4e..23b86de3d4e7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -861,9 +861,8 @@ static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
>
> static int match_auto_decoder(struct device *dev, void *data)
> {
> - struct cxl_region_params *p = data;
> + struct range *r, *hpa = data;
> struct cxl_decoder *cxld;
> - struct range *r;
>
> if (!is_switch_decoder(dev))
> return 0;
> @@ -871,7 +870,7 @@ static int match_auto_decoder(struct device *dev, void *data)
> cxld = to_cxl_decoder(dev);
> r = &cxld->hpa_range;
>
> - if (p->res && p->res->start == r->start && p->res->end == r->end)
> + if (hpa && hpa->start == r->start && hpa->end == r->end)
> return 1;
>
> return 0;
> @@ -888,7 +887,7 @@ cxl_find_decoder_early(struct cxl_port *port,
> return &cxled->cxld;
>
> if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
> - dev = device_find_child(&port->dev, &cxlr->params,
> + dev = device_find_child(&port->dev, &cxled->cxld.hpa_range,
> match_auto_decoder);
> else
> dev = device_find_child(&port->dev, NULL, match_free_decoder);
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder
2025-01-07 22:18 ` Gregory Price
@ 2025-02-06 10:50 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-06 10:50 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 17:18:49, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:05PM +0100, Robert Richter wrote:
> > For the implementation of address translation it might not be possible
> > to determine the root decoder in the early enumeration state since the
> > SPA range is still unknown. Instead, the endpoint's HPA range is known
> > and from there the topology can be traversed up to the root port while
> > the memory range is adjusted from one memory domain to the next up to
> > the root port.
> >
> > In a first step, use endpoint's HPA range to find the port's decoder.
> > Without address translation there is HPA == SPA. Then, the HPA range
> > of the endpoint can be used instead of the root decoder's range as
> > both are the same.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
>
> I'll make a note to test up to this patch in particular in HPA==SPA
> mode without the follow ons.
>
> The functional change here is the move from cxlr->params to
> cxled->cxld.hpa_range.
>
> What was the likely value of cxlr->params previously? Undefined?
The region's HPA range was set up with the same values as the
endpoint's, so the two can be used interchangeably.
Call chain:
cxl_endpoint_decoder_add
hpa = &cxled->cxld.hpa_range;
cxl_find_region_by_range(..., hpa);
match_region_by_range(..., hpa);
construct_region
hpa = &cxled->cxld.hpa_range;
*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa), ...)
attach_target
cxl_region_attach
cxl_region_attach_position
cxl_port_attach_region
cxl_find_decoder_early
hpa = &cxled->cxld.hpa_range
match_auto_decoder(..., hpa)
Note that I have moved the switch to spa_range out of "[PATCH v1 15/29]
cxl/region: Use an endpoint's SPA range to find a region" into a later
patch to make the changes easier to follow. Technically nothing
changes, since spa == hpa still holds at this point.
>
> Otherwise I don't immediately see any issues.
>
> Reviewed-by: Gregory Price <gourry@gourry.net>
Thanks,
-Robert
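
The equivalence argued in the call chain above can be checked with a
small standalone sketch. It assumes an inclusive start/end range and a
resource built from start plus length. struct range, struct resource
and range_len() below are simplified stand-ins for the kernel types,
and res_from_range(), match_by_res() and match_by_hpa() are
hypothetical helpers that only mirror the old and new comparison in
match_auto_decoder().

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins; both use an inclusive end address */
struct range {
	uint64_t start;
	uint64_t end;
};

struct resource {
	uint64_t start;
	uint64_t end;
};

static uint64_t range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/* Build a resource from start and size, as construct_region does */
static struct resource res_from_range(const struct range *hpa)
{
	struct resource res = {
		.start = hpa->start,
		.end = hpa->start + range_len(hpa) - 1,
	};

	return res;
}

/* Old comparison: region resource against the decoder's range */
static bool match_by_res(const struct resource *res, const struct range *r)
{
	return res && res->start == r->start && res->end == r->end;
}

/* New comparison: endpoint HPA range against the decoder's range */
static bool match_by_hpa(const struct range *hpa, const struct range *r)
{
	return hpa && hpa->start == r->start && hpa->end == r->end;
}
```

Because the resource is derived from the same start and length, both
comparisons accept and reject the same decoder ranges as long as no
address translation is applied in between.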
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder
2025-01-07 14:10 ` [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder Robert Richter
2025-01-07 22:18 ` Gregory Price
@ 2025-01-17 21:31 ` Ben Cheatham
2025-02-06 11:03 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Ben Cheatham @ 2025-01-17 21:31 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 1/7/25 8:10 AM, Robert Richter wrote:
> For the implementation of address translation it might not be possible
> to determine the root decoder in the early enumeration state since the
> SPA range is still unknown. Instead, the endpoint's HPA range is known
> and from there the topology can be traversed up to the root port while
> the memory range is adjusted from one memory domain to the next up to
> the root port.
>
> In a first step, use endpoint's HPA range to find the port's decoder.
> Without address translation there is HPA == SPA. Then, the HPA range
> of the endpoint can be used instead of the root decoder's range as
> both are the same.
I think this can be clearer. Something like:
"In a first step, use endpoint's HPA range to find the port's decoder.
Without address translation HPA == SPA, so the endpoint's HPA range can
be used since it is the same as the root decoder's."
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index b7f6d8a83e4e..23b86de3d4e7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -861,9 +861,8 @@ static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
>
> static int match_auto_decoder(struct device *dev, void *data)
> {
> - struct cxl_region_params *p = data;
> + struct range *r, *hpa = data;
> struct cxl_decoder *cxld;
> - struct range *r;
>
> if (!is_switch_decoder(dev))
> return 0;
> @@ -871,7 +870,7 @@ static int match_auto_decoder(struct device *dev, void *data)
> cxld = to_cxl_decoder(dev);
> r = &cxld->hpa_range;
>
> - if (p->res && p->res->start == r->start && p->res->end == r->end)
> + if (hpa && hpa->start == r->start && hpa->end == r->end)
> return 1;
>
> return 0;
> @@ -888,7 +887,7 @@ cxl_find_decoder_early(struct cxl_port *port,
> return &cxled->cxld;
>
> if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
> - dev = device_find_child(&port->dev, &cxlr->params,
> + dev = device_find_child(&port->dev, &cxled->cxld.hpa_range,
> match_auto_decoder);
> else
> dev = device_find_child(&port->dev, NULL, match_free_decoder);
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder
2025-01-17 21:31 ` Ben Cheatham
@ 2025-02-06 11:03 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-06 11:03 UTC (permalink / raw)
To: Ben Cheatham
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso
On 17.01.25 15:31:51, Ben Cheatham wrote:
> On 1/7/25 8:10 AM, Robert Richter wrote:
> > For the implementation of address translation it might not be possible
> > to determine the root decoder in the early enumeration state since the
> > SPA range is still unknown. Instead, the endpoint's HPA range is known
> > and from there the topology can be traversed up to the root port while
> > the memory range is adjusted from one memory domain to the next up to
> > the root port.
> >
> > In a first step, use endpoint's HPA range to find the port's decoder.
> > Without address translation there is HPA == SPA. Then, the HPA range
> > of the endpoint can be used instead of the root decoder's range as
> > both are the same.
>
> I think this can be clearer. Something like:
>
> "In a first step, use endpoint's HPA range to find the port's decoder.
> Without address translation HPA == SPA, so the endpoint's HPA range can
> be used since it is the same as the root decoder's.
Changed, thanks.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
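The matching rule discussed for patch 19 (a switch decoder is selected
only if its HPA range equals the given range exactly) can be sketched in
user space as follows. This is a minimal illustration, not the kernel
code: struct range is redeclared locally, and range_matches() and
find_decoder() are hypothetical names that stand in for
match_auto_decoder() and the device_find_child() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's struct range (inclusive bounds). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical helper: true only when both bounds are identical. */
static bool range_matches(const struct range *want, const struct range *have)
{
	return want && have &&
	       want->start == have->start && want->end == have->end;
}

/*
 * Hypothetical helper: return the index of the first decoder range that
 * matches @want exactly, or -1 if none does.
 */
static int find_decoder(const struct range *decoders, size_t n,
			const struct range *want)
{
	for (size_t i = 0; i < n; i++) {
		if (range_matches(want, &decoders[i]))
			return (int)i;
	}

	return -1;
}
```

A range that merely overlaps or is contained in a decoder's range does
not match, which is why the later patches translate the endpoint's range
before doing the lookup.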
* [PATCH v1 20/29] cxl/region: Use translated HPA ranges to find the port's decoder
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (18 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 19/29] cxl/region: Use endpoint's HPA range to find the port's decoder Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 22:33 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 21/29] cxl/region: Lock decoders that need address translation Robert Richter
` (9 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
This is the second step to find the port's decoder with address
translation enabled. The translated HPA range must be used to find a
decoder. The port's HPA range is determined by applying address
translation when crossing memory domains for the HPA range to each
port while traversing the topology from the endpoint up to the port.
Introduce a function cxl_find_auto_decoder() that calculates the
port's translated address range to determine the corresponding
decoder. Use the existing helper function cxl_port_calc_hpa() for HPA
range calculation.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 63 +++++++++++++++++++++++++++++++++++++--
1 file changed, 61 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 23b86de3d4e7..8d7893878362 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -876,6 +876,66 @@ static int match_auto_decoder(struct device *dev, void *data)
return 0;
}
+static struct device *
+cxl_find_auto_decoder(struct cxl_port *port, struct cxl_endpoint_decoder *cxled,
+ struct cxl_region *cxlr)
+{
+ struct cxl_port *parent, *iter = cxled_to_port(cxled);
+ struct cxl_decoder *cxld = &cxled->cxld;
+ struct range hpa = cxld->hpa_range;
+ struct cxl_region_ref *rr;
+
+ while (1) {
+ parent = parent_port_of(iter);
+ if (!parent) {
+ dev_warn(&port->dev,
+ "port not a parent of endpoint decoder %s\n",
+ dev_name(&cxled->cxld.dev));
+ return NULL;
+ }
+
+ if (!parent->to_hpa) {
+ iter = parent;
+ continue;
+ }
+
+ /* Lower domain decoders are already attached. */
+ rr = cxl_rr_load(iter, cxlr);
+ cxld = rr ? rr->decoder : NULL;
+ if (!cxld) {
+ dev_warn(&iter->dev,
+ "no decoder found for region %s\n",
+ dev_name(&cxlr->dev));
+ return NULL;
+ }
+
+ /* Check switch decoder range. */
+ if (cxld != &cxled->cxld &&
+ !match_auto_decoder(&cxld->dev, &hpa)) {
+ dev_warn(&iter->dev,
+ "decoder %s out of range %#llx-%#llx:%#llx-%#llx(%s)\n",
+ dev_name(&cxld->dev), cxld->hpa_range.start,
+ cxld->hpa_range.end, hpa.start, hpa.end,
+ dev_name(&cxled->cxld.dev));
+ return NULL;
+ }
+
+ if (cxl_port_calc_hpa(parent, cxld, &hpa))
+ return NULL;
+
+ if (parent == port)
+ break;
+
+ iter = parent;
+ }
+
+ dev_dbg(cxld->dev.parent, "%s: range: %#llx-%#llx iw: %d ig: %d\n",
+ dev_name(&cxld->dev), hpa.start, hpa.end,
+ cxld->interleave_ways, cxld->interleave_granularity);
+
+ return device_find_child(&port->dev, &hpa, match_auto_decoder);
+}
+
static struct cxl_decoder *
cxl_find_decoder_early(struct cxl_port *port,
struct cxl_endpoint_decoder *cxled,
@@ -887,8 +947,7 @@ cxl_find_decoder_early(struct cxl_port *port,
return &cxled->cxld;
if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
- dev = device_find_child(&port->dev, &cxled->cxld.hpa_range,
- match_auto_decoder);
+ dev = cxl_find_auto_decoder(port, cxled, cxlr);
else
dev = device_find_child(&port->dev, NULL, match_free_decoder);
if (!dev)
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 20/29] cxl/region: Use translated HPA ranges to find the port's decoder
2025-01-07 14:10 ` [PATCH v1 20/29] cxl/region: Use translated HPA ranges " Robert Richter
@ 2025-01-07 22:33 ` Gregory Price
2025-02-06 11:31 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 22:33 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:06PM +0100, Robert Richter wrote:
> This is the second step to find the port's decoder with address
> translation enabled. The translated HPA range must be used to find a
> decoder. The port's HPA range is determined by applying address
> translation when crossing memory domains for the HPA range to each
> port while traversing the topology from the endpoint up to the port.
>
> Introduce a function cxl_find_auto_decoder() that calculates the
> port's translated address range to determine the corresponding
> decoder. Use the existing helper function cxl_port_calc_hpa() for HPA
> range calculation.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 63 +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 61 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 23b86de3d4e7..8d7893878362 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -876,6 +876,66 @@ static int match_auto_decoder(struct device *dev, void *data)
> return 0;
> }
>
> +static struct device *
> +cxl_find_auto_decoder(struct cxl_port *port, struct cxl_endpoint_decoder *cxled,
> + struct cxl_region *cxlr)
> +{
> + struct cxl_port *parent, *iter = cxled_to_port(cxled);
> + struct cxl_decoder *cxld = &cxled->cxld;
> + struct range hpa = cxld->hpa_range;
> + struct cxl_region_ref *rr;
> +
> + while (1) {
Similar to prior patch, probably we should have while(parent) instead of
while(1) and be at least a bit defensive against (extremely unlikely) loops.
At the very least maybe we can express the condition more explicitly.
> + parent = parent_port_of(iter);
> + if (!parent) {
> + dev_warn(&port->dev,
> + "port not a parent of endpoint decoder %s\n",
> + dev_name(&cxled->cxld.dev));
> + return NULL;
> + }
... snip ...
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 20/29] cxl/region: Use translated HPA ranges to find the port's decoder
2025-01-07 22:33 ` Gregory Price
@ 2025-02-06 11:31 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-06 11:31 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 17:33:09, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:06PM +0100, Robert Richter wrote:
> > This is the second step to find the port's decoder with address
> > translation enabled. The translated HPA range must be used to find a
> > decoder. The port's HPA range is determined by applying address
> > translation when crossing memory domains for the HPA range to each
> > port while traversing the topology from the endpoint up to the port.
> >
> > Introduce a function cxl_find_auto_decoder() that calculates the
> > port's translated address range to determine the corresponding
> > decoder. Use the existing helper function cxl_port_calc_hpa() for HPA
> > range calculation.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 63 +++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 61 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 23b86de3d4e7..8d7893878362 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -876,6 +876,66 @@ static int match_auto_decoder(struct device *dev, void *data)
> > return 0;
> > }
> >
> > +static struct device *
> > +cxl_find_auto_decoder(struct cxl_port *port, struct cxl_endpoint_decoder *cxled,
> > + struct cxl_region *cxlr)
> > +{
> > + struct cxl_port *parent, *iter = cxled_to_port(cxled);
> > + struct cxl_decoder *cxld = &cxled->cxld;
> > + struct range hpa = cxld->hpa_range;
> > + struct cxl_region_ref *rr;
> > +
> > + while (1) {
>
> Similar to prior patch, probably we should have while(parent) instead of
> while(1) and be at least a bit defensive against (extremely unlikely) loops.
>
> At the very least maybe we can express the condition more explicitly.
My review showed there is a bug around the (!parent->to_hpa) check:
that path is missing the break if (parent == port).
I changed the code to use while (iter != port) {}, which also fixes
this issue.
-Robert
>
> > + parent = parent_port_of(iter);
> > + if (!parent) {
> > + dev_warn(&port->dev,
> > + "port not a parent of endpoint decoder %s\n",
> > + dev_name(&cxled->cxld.dev));
> > + return NULL;
> > + }
> ... snip ...
^ permalink raw reply [flat|nested] 117+ messages in thread
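The loop restructuring described in the reply above can be illustrated
with a small user-space sketch. It is not the code from the next version
of the patch: struct port, its translates flag (modelling "parent->to_hpa
is set") and count_translating_hops() are hypothetical. The point is only
that a while (iter != port) condition with a single advance step cannot
skip the termination check the way the early 'continue' in the v1 loop
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cxl_port; only what the walk needs. */
struct port {
	struct port *parent;	/* NULL at the top of the hierarchy */
	bool translates;	/* models "parent->to_hpa is set" */
};

/*
 * Walk from @iter up to @port and count the hops whose parent translates.
 * Returns -1 when @port is not an ancestor of @iter.
 */
static int count_translating_hops(struct port *iter, const struct port *port)
{
	int hops = 0;

	assert(iter && port);

	while (iter != port) {
		struct port *parent = iter->parent;

		if (!parent)
			return -1;	/* ran off the top of the topology */

		if (parent->translates)
			hops++;		/* real code would adjust the range here */

		iter = parent;		/* single advance point, no 'continue' */
	}

	return hops;
}
```

Because the loop condition is evaluated on every iteration, a parent
without translation that happens to be the target port still terminates
the walk.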
* [PATCH v1 21/29] cxl/region: Lock decoders that need address translation
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (19 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 20/29] cxl/region: Use translated HPA ranges " Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 22:35 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 22/29] cxl/region: Use translated HPA ranges to create a region Robert Richter
` (8 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
There is only support to translate from the endpoint to its parent
port, but not in the opposite direction from the parent to the
endpoint. Thus, the endpoint address range cannot be determined and
setup manually. If the parent implements the ->to_hpa() callback and
needs address translation, forbit reprogramming of the decoders and
lock them.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 8d7893878362..681c26abc26e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3353,6 +3353,17 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
if (cxl_port_calc_hpa(parent, cxld, &hpa))
return -ENXIO;
+ /*
+ * There is only support to translate from the endpoint to its
+ * parent port, but not in the opposite direction from the
+ * parent to the endpoint. Thus, the endpoint address range
+ * cannot be determined and setup manually. If the parent
+ * implements the ->to_hpa() callback and needs address
+ * translation, forbit reprogramming of the decoders and lock
+ * them.
+ */
+ cxld->flags |= CXL_DECODER_F_LOCK;
+
/* Iterate from endpoint to root_port refining the position */
pos = cxl_port_calc_pos(iter, &hpa, pos);
if (pos < 0)
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 21/29] cxl/region: Lock decoders that need address translation
2025-01-07 14:10 ` [PATCH v1 21/29] cxl/region: Lock decoders that need address translation Robert Richter
@ 2025-01-07 22:35 ` Gregory Price
2025-02-06 13:23 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 22:35 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:07PM +0100, Robert Richter wrote:
> There is only support to translate from the endpoint to its parent
> port, but not in the opposite direction from the parent to the
> endpoint. Thus, the endpoint address range cannot be determined and
> setup manually. If the parent implements the ->to_hpa() callback and
> needs address translation, forbit reprogramming of the decoders and
^^^^^^
forbid?
> lock them.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 8d7893878362..681c26abc26e 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3353,6 +3353,17 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> if (cxl_port_calc_hpa(parent, cxld, &hpa))
> return -ENXIO;
>
> + /*
> + * There is only support to translate from the endpoint to its
> + * parent port, but not in the opposite direction from the
> + * parent to the endpoint. Thus, the endpoint address range
> + * cannot be determined and setup manually. If the parent
> + * implements the ->to_hpa() callback and needs address
> + * translation, forbit reprogramming of the decoders and lock
^^^^^^
forbid?
> + * them.
> + */
> + cxld->flags |= CXL_DECODER_F_LOCK;
> +
> /* Iterate from endpoint to root_port refining the position */
> pos = cxl_port_calc_pos(iter, &hpa, pos);
> if (pos < 0)
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 21/29] cxl/region: Lock decoders that need address translation
2025-01-07 22:35 ` Gregory Price
@ 2025-02-06 13:23 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-06 13:23 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 17:35:00, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:07PM +0100, Robert Richter wrote:
> > There is only support to translate from the endpoint to its parent
> > port, but not in the opposite direction from the parent to the
> > endpoint. Thus, the endpoint address range cannot be determined and
> > setup manually. If the parent implements the ->to_hpa() callback and
> > needs address translation, forbit reprogramming of the decoders and
> ^^^^^^
> forbid?
>
> > lock them.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 8d7893878362..681c26abc26e 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -3353,6 +3353,17 @@ static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> > if (cxl_port_calc_hpa(parent, cxld, &hpa))
> > return -ENXIO;
> >
> > + /*
> > + * There is only support to translate from the endpoint to its
> > + * parent port, but not in the opposite direction from the
> > + * parent to the endpoint. Thus, the endpoint address range
> > + * cannot be determined and setup manually. If the parent
> > + * implements the ->to_hpa() callback and needs address
> > + * translation, forbit reprogramming of the decoders and lock
> ^^^^^^
> forbid?
Updated both.
-Robert
>
> > + * them.
> > + */
> > + cxld->flags |= CXL_DECODER_F_LOCK;
> > +
> > /* Iterate from endpoint to root_port refining the position */
> > pos = cxl_port_calc_pos(iter, &hpa, pos);
> > if (pos < 0)
> > --
> > 2.39.5
> >
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 22/29] cxl/region: Use translated HPA ranges to create a region
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (20 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 21/29] cxl/region: Lock decoders that need address translation Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 23:08 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 23/29] cxl/region: Use root decoders interleaving parameters " Robert Richter
` (7 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
To create a region, SPA ranges must be used. With address translation
the endpoint's HPA range is not the same as the SPA range. Use the
previously calculated SPA range instead.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 681c26abc26e..e218f0be2409 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3424,7 +3424,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
struct cxl_port *port = cxlrd_to_port(cxlrd);
- struct range *hpa = &cxled->cxld.hpa_range;
+ struct range *spa = &cxled->spa_range;
struct cxl_region_params *p;
struct cxl_region *cxlr;
struct resource *res;
@@ -3462,7 +3462,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
goto err;
}
- *res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
+ *res = DEFINE_RES_MEM_NAMED(spa->start, range_len(spa),
dev_name(&cxlr->dev));
rc = insert_resource(cxlrd->res, res);
if (rc) {
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 22/29] cxl/region: Use translated HPA ranges to create a region
2025-01-07 14:10 ` [PATCH v1 22/29] cxl/region: Use translated HPA ranges to create a region Robert Richter
@ 2025-01-07 23:08 ` Gregory Price
2025-02-06 13:25 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 23:08 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:08PM +0100, Robert Richter wrote:
> To create a region, SPA ranges must be used. With address translation
> the endpoint's HPA range is not the same as the SPA range. Use the
> previously calculated SPA range instead.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Patches 22-25 seem reliant on each other. Should I expect errors if I
were to test them individually?
The individual changes in 22-24 seem ok, but is spa->* expected to be
correct in the absence of the PRMT translation functions when HPA==SPA?
~Gregory
> ---
> drivers/cxl/core/region.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 681c26abc26e..e218f0be2409 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3424,7 +3424,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> struct cxl_port *port = cxlrd_to_port(cxlrd);
> - struct range *hpa = &cxled->cxld.hpa_range;
> + struct range *spa = &cxled->spa_range;
> struct cxl_region_params *p;
> struct cxl_region *cxlr;
> struct resource *res;
> @@ -3462,7 +3462,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> goto err;
> }
>
> - *res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> + *res = DEFINE_RES_MEM_NAMED(spa->start, range_len(spa),
> dev_name(&cxlr->dev));
> rc = insert_resource(cxlrd->res, res);
> if (rc) {
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 22/29] cxl/region: Use translated HPA ranges to create a region
2025-01-07 23:08 ` Gregory Price
@ 2025-02-06 13:25 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-06 13:25 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On 07.01.25 18:08:59, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:08PM +0100, Robert Richter wrote:
> > To create a region, SPA ranges must be used. With address translation
> > the endpoint's HPA range is not the same as the SPA range. Use the
> > previously calculated SPA range instead.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
>
> Patches 22-25 seem reliant on each other. Should I expect errors if I
> were to test them individually?
>
> The individual changes in 22-24 seem ok, but is spa->* expected to be
> correct in the absence of the PRMT translation functions when HPA==SPA?
Exactly. As long as ->to_hpa() is not enabled, HPA == SPA holds and the
code should work as before.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
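The answer above (no ->to_hpa() callback means HPA == SPA) can be
sketched in user space as follows. Everything here is an assumption made
for illustration: the callback type to_hpa_fn, the helper
translate_range() and the fixed offset in example_offset() are
hypothetical and do not reflect the signature used in the series or how
a real platform translates addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's struct range (inclusive bounds). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical callback type: translate one address into the parent domain. */
typedef uint64_t (*to_hpa_fn)(uint64_t addr);

/* Hypothetical example translation: a fixed offset, for illustration only. */
#define EXAMPLE_OFFSET 0x1000000000ULL

static uint64_t example_offset(uint64_t addr)
{
	return addr + EXAMPLE_OFFSET;
}

/*
 * Return the range as seen by the parent domain. Without a callback the
 * range is returned unchanged, i.e. HPA == SPA.
 */
static struct range translate_range(struct range r, to_hpa_fn to_hpa)
{
	if (!to_hpa)
		return r;

	r.start = to_hpa(r.start);
	r.end = to_hpa(r.end);
	return r;
}
```

With a NULL callback the spa_range used in patches 22-24 is just a copy
of the endpoint's HPA range, which is why those patches are expected to
behave as before on platforms without translation.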
* [PATCH v1 23/29] cxl/region: Use root decoders interleaving parameters to create a region
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (21 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 22/29] cxl/region: Use translated HPA ranges to create a region Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-13 17:48 ` Alison Schofield
2025-01-07 14:10 ` [PATCH v1 24/29] cxl/region: Use endpoint's SPA range to check " Robert Richter
` (6 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Endpoints requiring address translation might not be aware of the
system's interleaving configuration. Instead, interleaving can be
configured on an upper memory domain (from an endpoint view) and thus
is not visible to the endpoint. For region creation this might cause
an invalid interleaving config that does not match the CFMWS entries.
Use the interleaving configuration of the root decoders to create a
region which bases on CFMWS entries. This always matches the system's
interleaving configuration and is independent of the underlying memory
topology.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index e218f0be2409..c3322bae05b9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3477,8 +3477,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
}
p->res = res;
- p->interleave_ways = cxled->cxld.interleave_ways;
- p->interleave_granularity = cxled->cxld.interleave_granularity;
+ p->interleave_ways = cxlrd->cxlsd.cxld.interleave_ways;
+ p->interleave_granularity = cxlrd->cxlsd.cxld.interleave_granularity;
p->state = CXL_CONFIG_INTERLEAVE_ACTIVE;
rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 23/29] cxl/region: Use root decoders interleaving parameters to create a region
2025-01-07 14:10 ` [PATCH v1 23/29] cxl/region: Use root decoders interleaving parameters " Robert Richter
@ 2025-01-13 17:48 ` Alison Schofield
2025-02-14 13:06 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Alison Schofield @ 2025-01-13 17:48 UTC (permalink / raw)
To: Robert Richter
Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:09PM +0100, Robert Richter wrote:
> Endpoints requiring address translation might not be aware of the
> system's interleaving configuration. Instead, interleaving can be
> configured on an upper memory domain (from an endpoint view) and thus
> is not visible to the endpoint. For region creation this might cause
> an invalid interleaving config that does not match the CFMWS entries.
>
> Use the interleaving configuration of the root decoders to create a
> region which bases on CFMWS entries. This always matches the system's
> interleaving configuration and is independent of the underlying memory
> topology.
This sounds like a restriction, more restrictive than present.
Won't it block all region interleave ways greater than root decoder
interleave ways? ie. disallows 2, 2+2, 2+2+2, 4, etc.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index e218f0be2409..c3322bae05b9 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3477,8 +3477,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> }
>
> p->res = res;
> - p->interleave_ways = cxled->cxld.interleave_ways;
> - p->interleave_granularity = cxled->cxld.interleave_granularity;
> + p->interleave_ways = cxlrd->cxlsd.cxld.interleave_ways;
> + p->interleave_granularity = cxlrd->cxlsd.cxld.interleave_granularity;
> p->state = CXL_CONFIG_INTERLEAVE_ACTIVE;
>
> rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 23/29] cxl/region: Use root decoders interleaving parameters to create a region
2025-01-13 17:48 ` Alison Schofield
@ 2025-02-14 13:06 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-14 13:06 UTC (permalink / raw)
To: Alison Schofield
Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 13.01.25 09:48:47, Alison Schofield wrote:
> On Tue, Jan 07, 2025 at 03:10:09PM +0100, Robert Richter wrote:
> > Endpoints requiring address translation might not be aware of the
> > system's interleaving configuration. Instead, interleaving can be
> > configured on an upper memory domain (from an endpoint view) and thus
> > is not visible to the endpoint. For region creation this might cause
> > an invalid interleaving config that does not match the CFMWS entries.
> >
> > Use the interleaving configuration of the root decoders to create a
> > region which bases on CFMWS entries. This always matches the system's
> > interleaving configuration and is independent of the underlying memory
> > topology.
>
> This sounds like a restriction, more restrictive than present.
>
> Won't it block all region interleave ways greater than root decoder
> interleave ways? ie. disallows 2, 2+2, 2+2+2, 4, etc.
Yes, the combination of all decoders from the endpoint up to the root
results in the interleaving config. This will be fixed in the next
version.
Thanks for catching this,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
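Alison's concern can be made concrete with a small sketch. It assumes a
symmetric topology in which the region's effective interleave ways is the
product of the ways configured at each decoder level between the root and
the endpoint. effective_ways() is a hypothetical helper written for this
illustration and is not the fix announced for the next version.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: multiply the interleave ways of every decoder
 * level on the path from the root decoder down to the endpoint.
 * An empty path yields 1 (no interleaving).
 */
static unsigned int effective_ways(const unsigned int *ways, size_t levels)
{
	unsigned int total = 1;

	assert(ways || levels == 0);

	for (size_t i = 0; i < levels; i++)
		total *= ways[i];

	return total;
}
```

For example, a root decoder interleaving across two host bridges, each
of which interleaves across two endpoints, gives a 4-way region even
though the root decoder alone reports 2 ways. Taking only the root
decoder's value would therefore under-report the region's ways.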
* [PATCH v1 24/29] cxl/region: Use endpoint's SPA range to check a region
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (22 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 23/29] cxl/region: Use root decoders interleaving parameters " Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-13 17:38 ` Alison Schofield
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
` (5 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Endpoints or switches requiring address translation might not be aware
of the system's interleaving configuration. Then, the configured
endpoint's address range might not match the expected range. In
contrast, the SPA range of an endpoint is calculated applying platform
specific address translation. That range is correct and can be used to
check a region range.
Adjust the region range check and use the endpoint's SPA range to
check it.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c3322bae05b9..1dae7d36d37c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1516,22 +1516,26 @@ static int cxl_port_setup_targets(struct cxl_port *port,
if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
if (cxld->interleave_ways != iw ||
cxld->interleave_granularity != ig ||
- cxld->hpa_range.start != p->res->start ||
- cxld->hpa_range.end != p->res->end ||
+ cxled->spa_range.start != p->res->start ||
+ cxled->spa_range.end != p->res->end ||
((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
dev_err(&cxlr->dev,
"%s:%s %s expected iw: %d ig: %d %pr\n",
dev_name(port->uport_dev), dev_name(&port->dev),
__func__, iw, ig, p->res);
dev_err(&cxlr->dev,
- "%s:%s %s got iw: %d ig: %d state: %s %#llx:%#llx\n",
+ "%s:%s %s got iw: %d ig: %d state: %s %#llx-%#llx:%#llx-%#llx(%s):%#llx-%#llx(%s)\n",
dev_name(port->uport_dev), dev_name(&port->dev),
__func__, cxld->interleave_ways,
cxld->interleave_granularity,
(cxld->flags & CXL_DECODER_F_ENABLE) ?
"enabled" :
"disabled",
- cxld->hpa_range.start, cxld->hpa_range.end);
+ p->res->start, p->res->end,
+ cxled->spa_range.start, cxled->spa_range.end,
+ dev_name(&cxled->cxld.dev),
+ cxld->hpa_range.start, cxld->hpa_range.end,
+ dev_name(&cxld->dev));
return -ENXIO;
}
} else {
@@ -2051,13 +2055,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
return -ENXIO;
}
- if (resource_size(cxled->dpa_res) * p->interleave_ways !=
- resource_size(p->res)) {
+ if (range_len(&cxled->spa_range) != resource_size(p->res)) {
dev_dbg(&cxlr->dev,
- "%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
+ "%s:%s: SPA size mismatch: %#llx-%#llx:%#llx-%#llx\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
- (u64)resource_size(cxled->dpa_res), p->interleave_ways,
- (u64)resource_size(p->res));
+ p->res->start, p->res->end,
+ cxled->spa_range.start, cxled->spa_range.end);
return -EINVAL;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 24/29] cxl/region: Use endpoint's SPA range to check a region
2025-01-07 14:10 ` [PATCH v1 24/29] cxl/region: Use endpoint's SPA range to check " Robert Richter
@ 2025-01-13 17:38 ` Alison Schofield
2025-02-14 13:09 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Alison Schofield @ 2025-01-13 17:38 UTC (permalink / raw)
To: Robert Richter
Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:10PM +0100, Robert Richter wrote:
> Endpoints or switches requiring address translation might not be aware
> of the system's interleaving configuration. Then, the configured
> endpoint's address range might not match the expected range. In
> contrast, the SPA range of an endpoint is calculated applying platform
> specific address translation. That range is correct and can be used to
> check a region range.
>
> Adjust the region range check and use the endpoint's SPA range to
> check it.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 21 ++++++++++++---------
> 1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c3322bae05b9..1dae7d36d37c 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1516,22 +1516,26 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
> if (cxld->interleave_ways != iw ||
> cxld->interleave_granularity != ig ||
> - cxld->hpa_range.start != p->res->start ||
> - cxld->hpa_range.end != p->res->end ||
> + cxled->spa_range.start != p->res->start ||
> + cxled->spa_range.end != p->res->end ||
> ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
> dev_err(&cxlr->dev,
> "%s:%s %s expected iw: %d ig: %d %pr\n",
> dev_name(port->uport_dev), dev_name(&port->dev),
> __func__, iw, ig, p->res);
> dev_err(&cxlr->dev,
> - "%s:%s %s got iw: %d ig: %d state: %s %#llx:%#llx\n",
> + "%s:%s %s got iw: %d ig: %d state: %s %#llx-%#llx:%#llx-%#llx(%s):%#llx-%#llx(%s)\n",
> dev_name(port->uport_dev), dev_name(&port->dev),
> __func__, cxld->interleave_ways,
> cxld->interleave_granularity,
> (cxld->flags & CXL_DECODER_F_ENABLE) ?
> "enabled" :
> "disabled",
> - cxld->hpa_range.start, cxld->hpa_range.end);
> + p->res->start, p->res->end,
> + cxled->spa_range.start, cxled->spa_range.end,
> + dev_name(&cxled->cxld.dev),
> + cxld->hpa_range.start, cxld->hpa_range.end,
> + dev_name(&cxld->dev));
> return -ENXIO;
> }
> } else {
> @@ -2051,13 +2055,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> return -ENXIO;
> }
>
> - if (resource_size(cxled->dpa_res) * p->interleave_ways !=
> - resource_size(p->res)) {
> + if (range_len(&cxled->spa_range) != resource_size(p->res)) {
> dev_dbg(&cxlr->dev,
> - "%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
> + "%s:%s: SPA size mismatch: %#llx-%#llx:%#llx-%#llx\n",
The cxled->spa_range is only set in the auto region path, yet this
path is taken by both auto and user created regions. User created regions
die here.
> dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> - (u64)resource_size(cxled->dpa_res), p->interleave_ways,
> - (u64)resource_size(p->res));
> + p->res->start, p->res->end,
> + cxled->spa_range.start, cxled->spa_range.end);
> return -EINVAL;
> }
>
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 24/29] cxl/region: Use endpoint's SPA range to check a region
2025-01-13 17:38 ` Alison Schofield
@ 2025-02-14 13:09 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-14 13:09 UTC (permalink / raw)
To: Alison Schofield
Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 13.01.25 09:38:04, Alison Schofield wrote:
> On Tue, Jan 07, 2025 at 03:10:10PM +0100, Robert Richter wrote:
> > @@ -2051,13 +2055,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> > return -ENXIO;
> > }
> >
> > - if (resource_size(cxled->dpa_res) * p->interleave_ways !=
> > - resource_size(p->res)) {
> > + if (range_len(&cxled->spa_range) != resource_size(p->res)) {
> > dev_dbg(&cxlr->dev,
> > - "%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
> > + "%s:%s: SPA size mismatch: %#llx-%#llx:%#llx-%#llx\n",
>
> The cxled->spa_range is only set in the auto region path, yet this
> path is taken by both auto and user created regions. User created regions
> die here.
The original check at this point should still work and .spa_range will
not be needed then. Fixed in next version.
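For reference, the original check only relates sizes: every endpoint
contributes its decoder's DPA size, so that size times the interleave
ways has to equal the region size. A minimal userspace sketch of that
relation follows. The function name is hypothetical and plain integers
stand in for resource_size(cxled->dpa_res), p->interleave_ways and
resource_size(p->res); it is an illustration, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the size check in cxl_region_attach():
 * each of the 'ways' endpoints contributes 'decoder_size' bytes, so
 * together they must cover exactly 'region_size' bytes.
 */
static bool region_size_matches(uint64_t decoder_size, unsigned int ways,
				uint64_t region_size)
{
	return decoder_size * ways == region_size;
}
```

Since this relation does not depend on .spa_range, it holds for auto
and user created regions alike.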
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (23 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 24/29] cxl/region: Use endpoint's SPA range to check " Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 16:32 ` Robert Richter
` (5 more replies)
2025-01-07 14:10 ` [PATCH v1 26/29] MAINTAINERS: CXL: Add entry for AMD platform support (CXL_AMD) Robert Richter
` (4 subsequent siblings)
29 siblings, 6 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Robert Richter,
Terry Bowman
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco
Add AMD platform specific Zen5 support for address translation.
Zen5 systems may be configured to use 'Normalized addresses'. Then,
CXL endpoints use their own physical address space and Host Physical
Addresses (HPAs) need address translation from the endpoint to its CXL
host bridge. The HPA of a CXL host bridge is equivalent to the System
Physical Address (SPA).
ACPI Platform Runtime Mechanism (PRM) is used to translate the CXL
Device Physical Address (DPA) to its System Physical Address. This is
documented in:
AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
ACPI v6.5 Porting Guide, Publication # 58088
https://www.amd.com/en/search/documentation/hub.html
Note that DPA and HPA of an endpoint may differ depending on the
interleaving configuration. That is, an additional calculation between
DPA and HPA is needed.
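As an illustration of that additional calculation, here is a minimal
userspace sketch. It assumes a plain round-robin (modulo) interleave
and HPA offsets relative to the region base; XOR-based or other
platform specific decode is not covered. The function names are
hypothetical and not part of this patch.

```c
#include <stdint.h>

/* DPA that backs HPA offset 'hpa', for 'ways' devices and chunk size 'gran'. */
static uint64_t hpa_to_dpa(uint64_t hpa, unsigned int ways, uint64_t gran)
{
	/* Whole interleave sets below 'hpa', plus the offset inside the chunk. */
	return hpa / (gran * ways) * gran + hpa % gran;
}

/* HPA offset of 'dpa' for the endpoint at interleave position 'pos'. */
static uint64_t dpa_to_hpa(uint64_t dpa, unsigned int ways, unsigned int pos,
			   uint64_t gran)
{
	/* Each device chunk expands to 'ways' chunks in HPA space. */
	return dpa / gran * gran * ways + (uint64_t)pos * gran + dpa % gran;
}
```

With ways = 1 both functions are the identity, which is the case where
DPA and HPA of an endpoint do not differ.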
To implement AMD Zen5 address translation the following steps are
needed:
Introduce the generic function cxl_port_platform_setup() that allows
to apply platform specific changes to each port where necessary.
Add a function cxl_port_setup_amd() to implement AMD platform specific
code. Use Kbuild and Kconfig options respectively to enable the code
depending on architecture and platform options. Create a new file
core/amd.c for this.
Introduce a function cxl_zen5_init() to handle Zen5 specific
enablement. Zen5 platforms are detected using the PCIe vendor and
device ID of the corresponding CXL root port.
Apply cxl_zen5_to_hpa() as cxl_port->to_hpa() callback to Zen5 CXL
host bridges to enable platform specific address translation.
Use ACPI PRM DPA to SPA translation to determine an endpoint's
interleaving configuration and base address during the early
initialization process. This is used to determine an endpoint's SPA
range.
Since the PRM translates DPA->SPA, but HPA->SPA is needed, determine
the interleaving config and base address of the endpoint first, then
calculate the SPA based on the given HPA using the address base.
The config can be determined by calling the PRM for specific DPAs.
Since the interleaving configuration is still unknown, choose
DPAs starting at 0xd20000. This address is a multiple of 16k times
each value from 1 to 8 and thus valid for all possible interleaving
configurations. The resulting SPAs are used to calculate the
interleaving parameters and the SPA base address of the endpoint. The
maximum granularity (chunk size) is 16k, the minimum is 256. Use the
following calculation for a given DPA:
ways = hpa_len(SZ_16K) / SZ_16K
gran = (hpa_len(SZ_16K) - hpa_len(SZ_16K - SZ_256) - SZ_256)
/ (ways - 1)
pos = (hpa_len(SZ_16K) - ways * SZ_16K) / gran
Once the endpoint is attached to a region and its SPA range is known,
calling the PRM is no longer needed, the SPA base can be used.
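The ways and granularity calculation above can be tried out in
userspace with the sketch below. It is only an illustration under
stated assumptions: model_dpa_to_spa() is a hypothetical model of a
plain round-robin interleave that stands in for the PRM call, struct
ilv and probe() are made-up names, and the position is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_256    0x100ULL
#define SZ_16K    0x4000ULL
#define DPA_MAGIC 0xd20000ULL	/* 16k * 840, 840 is divisible by 1..8 */

/* Hypothetical interleave setup that the probe has to rediscover. */
struct ilv {
	uint64_t base;
	unsigned int ways;
	unsigned int pos;
	uint64_t gran;
};

/* Model of the firmware's DPA -> SPA translation (stands in for the PRM). */
static uint64_t model_dpa_to_spa(const struct ilv *c, uint64_t dpa)
{
	return c->base + dpa / c->gran * c->gran * c->ways +
	       c->pos * c->gran + dpa % c->gran;
}

/* Recover ways and granularity from three translations. */
static void probe(const struct ilv *hw, unsigned int *ways, uint64_t *gran)
{
	uint64_t base = model_dpa_to_spa(hw, DPA_MAGIC);
	uint64_t len = model_dpa_to_spa(hw, DPA_MAGIC + SZ_16K) - base;
	uint64_t len2 = model_dpa_to_spa(hw, DPA_MAGIC + SZ_16K - SZ_256) - base;

	/* 16k of DPA must span a whole number of 16k blocks in SPA space. */
	assert(len >= SZ_16K && len % SZ_16K == 0);

	*ways = len / SZ_16K;
	/* Without interleaving there is no chunk size; report 0 as the patch does. */
	*gran = *ways > 1 ? (len - len2 - SZ_256) / (*ways - 1) : 0;
}
```

In this model len - len2 equals gran * (ways - 1) + SZ_256, which is
why subtracting SZ_256 and dividing by (ways - 1) yields the
granularity.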
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/Kconfig | 4 +
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/amd.c | 227 ++++++++++++++++++++++++++++++++++++++
drivers/cxl/core/core.h | 6 +
drivers/cxl/core/port.c | 7 ++
5 files changed, 245 insertions(+)
create mode 100644 drivers/cxl/core/amd.c
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 876469e23f7a..e576028dd983 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -146,4 +146,8 @@ config CXL_REGION_INVALIDATION_TEST
If unsure, or if this kernel is meant for production environments,
say N.
+config CXL_AMD
+ def_bool y
+ depends on AMD_NB
+
endif
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 9259bcc6773c..dc368e61d281 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -16,3 +16,4 @@ cxl_core-y += pmu.o
cxl_core-y += cdat.o
cxl_core-$(CONFIG_TRACING) += trace.o
cxl_core-$(CONFIG_CXL_REGION) += region.o
+cxl_core-$(CONFIG_CXL_AMD) += amd.o
diff --git a/drivers/cxl/core/amd.c b/drivers/cxl/core/amd.c
new file mode 100644
index 000000000000..553b7d0caefd
--- /dev/null
+++ b/drivers/cxl/core/amd.c
@@ -0,0 +1,227 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include <linux/prmt.h>
+#include <linux/pci.h>
+
+#include "cxlmem.h"
+#include "core.h"
+
+#define PCI_DEVICE_ID_AMD_ZEN5_ROOT 0x153e
+
+static const struct pci_device_id zen5_root_port_ids[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_ZEN5_ROOT) },
+ {},
+};
+
+static int is_zen5_root_port(struct device *dev, void *unused)
+{
+ if (!dev_is_pci(dev))
+ return 0;
+
+ return !!pci_match_id(zen5_root_port_ids, to_pci_dev(dev));
+}
+
+static bool is_zen5(struct cxl_port *port)
+{
+ if (!IS_ENABLED(CONFIG_ACPI_PRMT))
+ return false;
+
+ /* To get the CXL root port, find the CXL host bridge first. */
+ if (is_cxl_root(port) ||
+ !port->host_bridge ||
+ !is_cxl_root(to_cxl_port(port->dev.parent)))
+ return false;
+
+ return !!device_for_each_child(port->host_bridge, NULL,
+ is_zen5_root_port);
+}
+
+/*
+ * PRM Address Translation - CXL DPA to System Physical Address
+ *
+ * Reference:
+ *
+ * AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
+ * ACPI v6.5 Porting Guide, Publication # 58088
+ */
+
+static const guid_t prm_cxl_dpa_spa_guid =
+ GUID_INIT(0xee41b397, 0x25d4, 0x452c, 0xad, 0x54, 0x48, 0xc6, 0xe3,
+ 0x48, 0x0b, 0x94);
+
+struct prm_cxl_dpa_spa_data {
+ u64 dpa;
+ u8 reserved;
+ u8 devfn;
+ u8 bus;
+ u8 segment;
+ void *out;
+} __packed;
+
+static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
+{
+ struct prm_cxl_dpa_spa_data data;
+ u64 spa;
+ int rc;
+
+ data = (struct prm_cxl_dpa_spa_data) {
+ .dpa = dpa,
+ .devfn = pci_dev->devfn,
+ .bus = pci_dev->bus->number,
+ .segment = pci_domain_nr(pci_dev->bus),
+ .out = &spa,
+ };
+
+ rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
+ if (rc) {
+ pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
+ return ULLONG_MAX;
+ }
+
+ pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
+
+ return spa;
+}
+
+static u64 cxl_zen5_to_hpa(struct cxl_decoder *cxld, u64 hpa)
+{
+ struct cxl_memdev *cxlmd;
+ struct pci_dev *pci_dev;
+ struct cxl_port *port;
+ u64 dpa, base, spa, spa2, len, len2, offset, granularity;
+ int ways, pos;
+
+ /*
+ * Nothing to do if base is non-zero and Normalized Addressing
+ * is disabled.
+ */
+ if (cxld->hpa_range.start)
+ return hpa;
+
+ /* Only translate from endpoint to its parent port. */
+ if (!is_endpoint_decoder(&cxld->dev))
+ return hpa;
+
+ if (hpa > cxld->hpa_range.end) {
+ dev_dbg(&cxld->dev, "hpa addr %#llx out of range %#llx-%#llx\n",
+ hpa, cxld->hpa_range.start, cxld->hpa_range.end);
+ return ULLONG_MAX;
+ }
+
+ /*
+ * If the decoder is already attached, the region's base can
+ * be used.
+ */
+ if (cxld->region)
+ return cxld->region->params.res->start + hpa;
+
+ port = to_cxl_port(cxld->dev.parent);
+ cxlmd = port ? to_cxl_memdev(port->uport_dev) : NULL;
+ if (!port || !dev_is_pci(cxlmd->dev.parent)) {
+ dev_dbg(&cxld->dev, "No endpoint found: %s, range %#llx-%#llx\n",
+ dev_name(cxld->dev.parent), cxld->hpa_range.start,
+ cxld->hpa_range.end);
+ return ULLONG_MAX;
+ }
+ pci_dev = to_pci_dev(cxlmd->dev.parent);
+
+ /*
+ * The PRM translates DPA->SPA, but we need HPA->SPA.
+ * Determine the interleaving config first, then calculate the
+ * DPA. Maximum granularity (chunk size) is 16k, minimum is
+ * 256. Calculated with:
+ *
+ * ways = hpa_len(SZ_16K) / SZ_16K
+ * gran = (hpa_len(SZ_16K) - hpa_len(SZ_16K - SZ_256) - SZ_256)
+ * / (ways - 1)
+ * pos = (hpa_len(SZ_16K) - ways * SZ_16K) / gran
+ */
+
+ /*
+ * DPA magic:
+ *
+ * Position and granularity are unknown yet, use an always
+ * valid DPA:
+ *
+ * 0xd20000 = 13762560 = 16k * 2 * 3 * 2 * 5 * 7 * 2
+ *
+ * It is divisible by all positions 1 to 8. The DPA is valid
+ * for all positions and granularities.
+ */
+#define DPA_MAGIC 0xd20000
+ base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
+ spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
+ spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
+
+ /* Includes checks to avoid div by zero */
+ if (!base || base == ULLONG_MAX || spa == ULLONG_MAX ||
+ spa2 == ULLONG_MAX || spa < base + SZ_16K || spa2 <= base ||
+ (spa > base + SZ_16K && spa - spa2 < SZ_256 * 2)) {
+ dev_dbg(&cxld->dev, "Error translating HPA: base %#llx, spa %#llx, spa2 %#llx\n",
+ base, spa, spa2);
+ return ULLONG_MAX;
+ }
+
+ len = spa - base;
+ len2 = spa2 - base;
+
+ /* offset = pos * granularity */
+ if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
+ ways = 1;
+ offset = 0;
+ granularity = 0;
+ pos = 0;
+ } else {
+ ways = len / SZ_16K;
+ offset = spa & (SZ_16K - 1);
+ granularity = (len - len2 - SZ_256) / (ways - 1);
+ pos = offset / granularity;
+ }
+
+ base = base - DPA_MAGIC * ways - pos * granularity;
+ spa = base + hpa;
+
+ /*
+ * Check SPA using a PRM call for the closest DPA calculated
+ * for the HPA. If the HPA matches a different interleaving
+ * position other than the decoder's, determine its offset to
+ * adjust the SPA.
+ */
+
+ dpa = (hpa & ~(granularity * ways - 1)) / ways
+ + (hpa & (granularity - 1));
+ offset = hpa & (granularity * ways - 1) & ~(granularity - 1);
+ offset -= pos * granularity;
+ spa2 = prm_cxl_dpa_spa(pci_dev, dpa) + offset;
+
+ dev_dbg(&cxld->dev,
+ "address mapping found for %s (dpa -> hpa -> spa): %#llx -> %#llx -> %#llx base: %#llx ways: %d pos: %d granularity: %llu\n",
+ pci_name(pci_dev), dpa, hpa, spa, base, ways, pos, granularity);
+
+ if (spa != spa2) {
+ dev_dbg(&cxld->dev, "SPA calculation failed: %#llx:%#llx\n",
+ spa, spa2);
+ return ULLONG_MAX;
+ }
+
+ return spa;
+}
+
+static void cxl_zen5_init(struct cxl_port *port)
+{
+ if (!is_zen5(port))
+ return;
+
+ port->to_hpa = cxl_zen5_to_hpa;
+
+ dev_dbg(port->host_bridge, "PRM address translation enabled for %s.\n",
+ dev_name(&port->dev));
+}
+
+void cxl_port_setup_amd(struct cxl_port *port)
+{
+ cxl_zen5_init(port);
+}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 800466f96a68..efe34ae6943e 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -115,4 +115,10 @@ bool cxl_need_node_perf_attrs_update(int nid);
int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
struct access_coordinate *c);
+#ifdef CONFIG_CXL_AMD
+void cxl_port_setup_amd(struct cxl_port *port);
+#else
+static inline void cxl_port_setup_amd(struct cxl_port *port) {};
+#endif
+
#endif /* __CXL_CORE_H__ */
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 901555bf4b73..c8176265c15c 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
&cxl_einj_inject_fops);
}
+static void cxl_port_platform_setup(struct cxl_port *port)
+{
+ cxl_port_setup_amd(port);
+}
+
static int cxl_port_add(struct cxl_port *port,
resource_size_t component_reg_phys,
struct cxl_dport *parent_dport)
@@ -868,6 +873,8 @@ static int cxl_port_add(struct cxl_port *port,
return rc;
}
+ cxl_port_platform_setup(port);
+
rc = device_add(dev);
if (rc)
return rc;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
@ 2025-01-07 16:32 ` Robert Richter
2025-01-07 23:28 ` Gregory Price
` (4 subsequent siblings)
5 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 16:32 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco
On 07.01.25 15:10:11, Robert Richter wrote:
> +#include <linux/prmt.h>
Note: linux/prmt.h is broken and causes a build error:
./include/linux/prmt.h:5:27: error: unknown type name ‘guid_t’
5 | int acpi_call_prm_handler(guid_t handler_guid, void *param_buffer);
| ^~~~~~
A fix for this is handled through linux-acpi:
https://patchwork.kernel.org/project/linux-acpi/patch/20250107161923.3387552-1-rrichter@amd.com/
I guess it will have been applied already once those cxl patches go
in.
Anyway, I will add it to a v2 to make testing easier.
Thanks,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
2025-01-07 16:32 ` Robert Richter
@ 2025-01-07 23:28 ` Gregory Price
2025-01-08 14:52 ` Robert Richter
2025-01-08 15:48 ` Gregory Price
` (3 subsequent siblings)
5 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-07 23:28 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
>
> Add a function cxl_port_setup_amd() to implement AMD platform specific
> code. Use Kbuild and Kconfig options respectively to enable the code
> depending on architecture and platform options. Create a new file
> core/amd.c for this.
>
A build option here specific to AMD doesn't seem the best. At Meta,
we try to maintain a platform agnostic kernel for our fleet (at least
for build options), and this would necessitate us maintaining separate
builds for AMD systems vs other vendors.
Is there a reason to simply not include it by default and just report
whether translation is required or not? (i.e. no build option)
Or maybe generalize to CXL_PLATFORM_QUIRKS rather than CXL_AMD?
~Gregory
> Introduce a function cxl_zen5_init() to handle Zen5 specific
> enablement. Zen5 platforms are detected using the PCIe vendor and
> device ID of the corresponding CXL root port.
>
> Apply cxl_zen5_to_hpa() as cxl_port->to_hpa() callback to Zen5 CXL
> host bridges to enable platform specific address translation.
>
> Use ACPI PRM DPA to SPA translation to determine an endpoint's
> interleaving configuration and base address during the early
> initialization process. This is used to determine an endpoint's SPA
> range.
>
> Since the PRM translates DPA->SPA, but HPA->SPA is needed, determine
> the interleaving config and base address of the endpoint first, then
> calculate the SPA based on the given HPA using the address base.
>
> The config can be determined calling the PRM for specific DPAs
> given. Since the interleaving configuration is still unknown, choose
> DPAs starting at 0xd20000. This address is a multiple of all values from
> 1 to 8 and thus valid for all possible interleaving configurations.
> The resulting SPAs are used to calculate interleaving parameters and
> the SPA base address of the endpoint. The maximum granularity (chunk
> size) is 16k, minimum is 256. Use the following calculation for a
> given DPA:
>
> ways = hpa_len(SZ_16K) / SZ_16K
> gran = (hpa_len(SZ_16K) - hpa_len(SZ_16K - SZ_256) - SZ_256)
> / (ways - 1)
> pos = (hpa_len(SZ_16K) - ways * SZ_16K) / gran
>
> Once the endpoint is attached to a region and its SPA range is known,
> calling the PRM is no longer needed, the SPA base can be used.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/Kconfig | 4 +
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/amd.c | 227 ++++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/core.h | 6 +
> drivers/cxl/core/port.c | 7 ++
> 5 files changed, 245 insertions(+)
> create mode 100644 drivers/cxl/core/amd.c
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 876469e23f7a..e576028dd983 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -146,4 +146,8 @@ config CXL_REGION_INVALIDATION_TEST
> If unsure, or if this kernel is meant for production environments,
> say N.
>
> +config CXL_AMD
> + def_bool y
> + depends on AMD_NB
> +
> endif
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 9259bcc6773c..dc368e61d281 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -16,3 +16,4 @@ cxl_core-y += pmu.o
> cxl_core-y += cdat.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +cxl_core-$(CONFIG_CXL_AMD) += amd.o
> diff --git a/drivers/cxl/core/amd.c b/drivers/cxl/core/amd.c
> new file mode 100644
> index 000000000000..553b7d0caefd
> --- /dev/null
> +++ b/drivers/cxl/core/amd.c
> @@ -0,0 +1,227 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
> + */
> +
> +#include <linux/prmt.h>
> +#include <linux/pci.h>
> +
> +#include "cxlmem.h"
> +#include "core.h"
> +
> +#define PCI_DEVICE_ID_AMD_ZEN5_ROOT 0x153e
> +
> +static const struct pci_device_id zen5_root_port_ids[] = {
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_ZEN5_ROOT) },
> + {},
> +};
> +
> +static int is_zen5_root_port(struct device *dev, void *unused)
> +{
> + if (!dev_is_pci(dev))
> + return 0;
> +
> + return !!pci_match_id(zen5_root_port_ids, to_pci_dev(dev));
> +}
> +
> +static bool is_zen5(struct cxl_port *port)
> +{
> + if (!IS_ENABLED(CONFIG_ACPI_PRMT))
> + return false;
> +
> + /* To get the CXL root port, find the CXL host bridge first. */
> + if (is_cxl_root(port) ||
> + !port->host_bridge ||
> + !is_cxl_root(to_cxl_port(port->dev.parent)))
> + return false;
> +
> + return !!device_for_each_child(port->host_bridge, NULL,
> + is_zen5_root_port);
> +}
> +
> +/*
> + * PRM Address Translation - CXL DPA to System Physical Address
> + *
> + * Reference:
> + *
> + * AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
> + * ACPI v6.5 Porting Guide, Publication # 58088
> + */
> +
> +static const guid_t prm_cxl_dpa_spa_guid =
> + GUID_INIT(0xee41b397, 0x25d4, 0x452c, 0xad, 0x54, 0x48, 0xc6, 0xe3,
> + 0x48, 0x0b, 0x94);
> +
> +struct prm_cxl_dpa_spa_data {
> + u64 dpa;
> + u8 reserved;
> + u8 devfn;
> + u8 bus;
> + u8 segment;
> + void *out;
> +} __packed;
> +
> +static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
> +{
> + struct prm_cxl_dpa_spa_data data;
> + u64 spa;
> + int rc;
> +
> + data = (struct prm_cxl_dpa_spa_data) {
> + .dpa = dpa,
> + .devfn = pci_dev->devfn,
> + .bus = pci_dev->bus->number,
> + .segment = pci_domain_nr(pci_dev->bus),
> + .out = &spa,
> + };
> +
> + rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> + if (rc) {
> + pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
> + return ULLONG_MAX;
> + }
> +
> + pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
> +
> + return spa;
> +}
> +
> +static u64 cxl_zen5_to_hpa(struct cxl_decoder *cxld, u64 hpa)
> +{
> + struct cxl_memdev *cxlmd;
> + struct pci_dev *pci_dev;
> + struct cxl_port *port;
> + u64 dpa, base, spa, spa2, len, len2, offset, granularity;
> + int ways, pos;
> +
> + /*
> + * Nothing to do if base is non-zero and Normalized Addressing
> + * is disabled.
> + */
> + if (cxld->hpa_range.start)
> + return hpa;
> +
> + /* Only translate from endpoint to its parent port. */
> + if (!is_endpoint_decoder(&cxld->dev))
> + return hpa;
> +
> + if (hpa > cxld->hpa_range.end) {
> + dev_dbg(&cxld->dev, "hpa addr %#llx out of range %#llx-%#llx\n",
> + hpa, cxld->hpa_range.start, cxld->hpa_range.end);
> + return ULLONG_MAX;
> + }
> +
> + /*
> + * If the decoder is already attached, the region's base can
> + * be used.
> + */
> + if (cxld->region)
> + return cxld->region->params.res->start + hpa;
> +
> + port = to_cxl_port(cxld->dev.parent);
> + cxlmd = port ? to_cxl_memdev(port->uport_dev) : NULL;
> + if (!port || !dev_is_pci(cxlmd->dev.parent)) {
> + dev_dbg(&cxld->dev, "No endpoint found: %s, range %#llx-%#llx\n",
> + dev_name(cxld->dev.parent), cxld->hpa_range.start,
> + cxld->hpa_range.end);
> + return ULLONG_MAX;
> + }
> + pci_dev = to_pci_dev(cxlmd->dev.parent);
> +
> + /*
> + * The PRM translates DPA->SPA, but we need HPA->SPA.
> + * Determine the interleaving config first, then calculate the
> + * DPA. Maximum granularity (chunk size) is 16k, minimum is
> + * 256. Calculated with:
> + *
> + * ways = hpa_len(SZ_16K) / SZ_16K
> + * gran = (hpa_len(SZ_16K) - hpa_len(SZ_16K - SZ_256) - SZ_256)
> + * / (ways - 1)
> + * pos = (hpa_len(SZ_16K) - ways * SZ_16K) / gran
> + */
> +
> + /*
> + * DPA magic:
> + *
> + * Position and granularity are unknown yet, use an always
> + * valid DPA:
> + *
> + * 0xd20000 = 13762560 = 16k * 2 * 3 * 2 * 5 * 7 * 2
> + *
> + * It is divisible by all positions 1 to 8. The DPA is valid
> + * for all positions and granularities.
> + */
> +#define DPA_MAGIC 0xd20000
> + base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
> + spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
> + spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
> +
> + /* Includes checks to avoid div by zero */
> + if (!base || base == ULLONG_MAX || spa == ULLONG_MAX ||
> + spa2 == ULLONG_MAX || spa < base + SZ_16K || spa2 <= base ||
> + (spa > base + SZ_16K && spa - spa2 < SZ_256 * 2)) {
> + dev_dbg(&cxld->dev, "Error translating HPA: base %#llx, spa %#llx, spa2 %#llx\n",
> + base, spa, spa2);
> + return ULLONG_MAX;
> + }
> +
> + len = spa - base;
> + len2 = spa2 - base;
> +
> + /* offset = pos * granularity */
> + if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
> + ways = 1;
> + offset = 0;
> + granularity = 0;
> + pos = 0;
> + } else {
> + ways = len / SZ_16K;
> + offset = spa & (SZ_16K - 1);
> + granularity = (len - len2 - SZ_256) / (ways - 1);
> + pos = offset / granularity;
> + }
> +
> + base = base - DPA_MAGIC * ways - pos * granularity;
> + spa = base + hpa;
> +
> + /*
> + * Check SPA using a PRM call for the closest DPA calculated
> + * for the HPA. If the HPA matches a different interleaving
> + * position other than the decoder's, determine its offset to
> + * adjust the SPA.
> + */
> +
> + dpa = (hpa & ~(granularity * ways - 1)) / ways
> + + (hpa & (granularity - 1));
> + offset = hpa & (granularity * ways - 1) & ~(granularity - 1);
> + offset -= pos * granularity;
> + spa2 = prm_cxl_dpa_spa(pci_dev, dpa) + offset;
> +
> + dev_dbg(&cxld->dev,
> + "address mapping found for %s (dpa -> hpa -> spa): %#llx -> %#llx -> %#llx base: %#llx ways: %d pos: %d granularity: %llu\n",
> + pci_name(pci_dev), dpa, hpa, spa, base, ways, pos, granularity);
> +
> + if (spa != spa2) {
> + dev_dbg(&cxld->dev, "SPA calculation failed: %#llx:%#llx\n",
> + spa, spa2);
> + return ULLONG_MAX;
> + }
> +
> + return spa;
> +}
> +
> +static void cxl_zen5_init(struct cxl_port *port)
> +{
> + if (!is_zen5(port))
> + return;
> +
> + port->to_hpa = cxl_zen5_to_hpa;
> +
> + dev_dbg(port->host_bridge, "PRM address translation enabled for %s.\n",
> + dev_name(&port->dev));
> +}
> +
> +void cxl_port_setup_amd(struct cxl_port *port)
> +{
> + cxl_zen5_init(port);
> +}
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 800466f96a68..efe34ae6943e 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -115,4 +115,10 @@ bool cxl_need_node_perf_attrs_update(int nid);
> int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
> struct access_coordinate *c);
>
> +#ifdef CONFIG_CXL_AMD
> +void cxl_port_setup_amd(struct cxl_port *port);
> +#else
> +static inline void cxl_port_setup_amd(struct cxl_port *port) {};
> +#endif
> +
> #endif /* __CXL_CORE_H__ */
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 901555bf4b73..c8176265c15c 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> &cxl_einj_inject_fops);
> }
>
> +static void cxl_port_platform_setup(struct cxl_port *port)
> +{
> + cxl_port_setup_amd(port);
> +}
> +
> static int cxl_port_add(struct cxl_port *port,
> resource_size_t component_reg_phys,
> struct cxl_dport *parent_dport)
> @@ -868,6 +873,8 @@ static int cxl_port_add(struct cxl_port *port,
> return rc;
> }
>
> + cxl_port_platform_setup(port);
> +
> rc = device_add(dev);
> if (rc)
> return rc;
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 23:28 ` Gregory Price
@ 2025-01-08 14:52 ` Robert Richter
2025-01-08 15:49 ` Gregory Price
0 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-08 14:52 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 07.01.25 18:28:57, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> >
> > Add a function cxl_port_setup_amd() to implement AMD platform specific
> > code. Use Kbuild and Kconfig options respectively to enable the code
> > depending on architecture and platform options. Create a new file
> > core/amd.c for this.
> >
>
> A build option here specific to AMD doesn't seem the best. At Meta,
> we try to maintain a platform agnostic kernel for our fleet (at least
> for build options), and this would necessitate us maintaining separate
> builds for AMD systems vs other vendors.
>
> Is there a reason to simply not include it by default and just report
> whether translation is required or not? (i.e. no build option)
There is no (menu) option for CXL_AMD; it just checks the
dependencies on AMD_NB (and indirectly arch, platform and vendor). If
those are met, it is always enabled.
Thanks for review.
-Robert
>
> Or maybe generalize to CXL_PLATFORM_QUIRKS rather than CXL_AMD?
>
> ~Gregory
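A minimal userspace sketch of the stub pattern discussed here, assuming
CONFIG_CXL_AMD is a symbol set purely by dependencies and left undefined
in this build. The struct cxl_port stand-in and its platform_hooks_run
field are hypothetical and exist only for this illustration; they are
not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the kernel's struct cxl_port. */
struct cxl_port {
	int platform_hooks_run;
};

/* CONFIG_CXL_AMD would come from Kconfig; it is left undefined here. */
#ifdef CONFIG_CXL_AMD
void cxl_port_setup_amd(struct cxl_port *port);
#else
/* Empty inline stub: the call below compiles away on other platforms. */
static inline void cxl_port_setup_amd(struct cxl_port *port) { (void)port; }
#endif

static void cxl_port_platform_setup(struct cxl_port *port)
{
	assert(port != NULL);
	cxl_port_setup_amd(port);
}
```

With the symbol unset, the caller needs no #ifdef of its own and the
port is left untouched.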
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-08 14:52 ` Robert Richter
@ 2025-01-08 15:49 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-08 15:49 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Wed, Jan 08, 2025 at 03:52:35PM +0100, Robert Richter wrote:
> On 07.01.25 18:28:57, Gregory Price wrote:
> > On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> > >
> > > Add a function cxl_port_setup_amd() to implement AMD platform specific
> > > code. Use Kbuild and Kconfig options respectively to enable the code
> > > depending on architecture and platform options. Create a new file
> > > core/amd.c for this.
> > >
> >
> > A build option here specific to AMD doesn't seem the best. At Meta,
> > we try to maintain a platform agnostic kernel for our fleet (at least
> > for build options), and this would necessitate us maintaining separate
> > builds for AMD systems vs other vendors.
> >
> > Is there a reason to simply not include it by default and just report
> > whether translation is required or not? (i.e. no build option)
>
> There is no (menu) option for CXL_AMD, it is just checking the
> dependencies to AMD_NB (and indirectly arch, platform and vendor). In
> that case it is always enabled.
>
Ah! I completely misunderstood the build option, my bad. This makes
sense, sorry about that
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
2025-01-07 16:32 ` Robert Richter
2025-01-07 23:28 ` Gregory Price
@ 2025-01-08 15:48 ` Gregory Price
2025-01-09 10:14 ` Robert Richter
2025-01-09 22:25 ` Gregory Price
` (2 subsequent siblings)
5 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-08 15:48 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> Add AMD platform specific Zen5 support for address translation.
>
... snip ...
>
> Once the endpoint is attached to a region and its SPA range is known,
> calling the PRM is no longer needed, the SPA base can be used.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
One inline question, but not a blocker
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/Kconfig | 4 +
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/amd.c | 227 ++++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/core.h | 6 +
> drivers/cxl/core/port.c | 7 ++
> 5 files changed, 245 insertions(+)
> create mode 100644 drivers/cxl/core/amd.c
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 901555bf4b73..c8176265c15c 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> &cxl_einj_inject_fops);
> }
>
> +static void cxl_port_platform_setup(struct cxl_port *port)
> +{
> + cxl_port_setup_amd(port);
> +}
> +
Assuming this gets expanded (which it may not), should we expect this
function to end up like so?
static void cxl_port_platform_setup(struct cxl_port *port)
{
cxl_port_setup_amd(port);
cxl_port_setup_intel(port);
cxl_port_setup_arm(port);
... etc ...
}
I suppose this logic has to exist somewhere in some form, just want to make
sure this is what we want. Either way, this is easily modifiable, so
not a blocker as I said.
> static int cxl_port_add(struct cxl_port *port,
> resource_size_t component_reg_phys,
> struct cxl_dport *parent_dport)
> @@ -868,6 +873,8 @@ static int cxl_port_add(struct cxl_port *port,
> return rc;
> }
>
> + cxl_port_platform_setup(port);
> +
> rc = device_add(dev);
> if (rc)
> return rc;
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-08 15:48 ` Gregory Price
@ 2025-01-09 10:14 ` Robert Richter
2025-01-14 11:13 ` Jonathan Cameron
0 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-09 10:14 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 08.01.25 10:48:23, Gregory Price wrote:
> > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > index 901555bf4b73..c8176265c15c 100644
> > --- a/drivers/cxl/core/port.c
> > +++ b/drivers/cxl/core/port.c
> > @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> > &cxl_einj_inject_fops);
> > }
> >
> > +static void cxl_port_platform_setup(struct cxl_port *port)
> > +{
> > + cxl_port_setup_amd(port);
> > +}
> > +
>
> Assuming this gets expanded (which it may not), should we expect this
> function to end up like so?
>
> static void cxl_port_platform_setup(struct cxl_port *port)
> {
> cxl_port_setup_amd(port);
> cxl_port_setup_intel(port);
> cxl_port_setup_arm(port);
> ... etc ...
> }
>
> I suppose this logic has to exist somewhere in some form, just want to make
> sure this is what we want. Either way, this is easily modifiable, so
> not a blocker as I said.
Yes, it is exactly designed like that. I will update the patch
description.
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-09 10:14 ` Robert Richter
@ 2025-01-14 11:13 ` Jonathan Cameron
2025-01-17 7:59 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-14 11:13 UTC (permalink / raw)
To: Robert Richter
Cc: Gregory Price, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Thu, 9 Jan 2025 11:14:46 +0100
Robert Richter <rrichter@amd.com> wrote:
> On 08.01.25 10:48:23, Gregory Price wrote:
>
> > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > > index 901555bf4b73..c8176265c15c 100644
> > > --- a/drivers/cxl/core/port.c
> > > +++ b/drivers/cxl/core/port.c
> > > @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> > > &cxl_einj_inject_fops);
> > > }
> > >
> > > +static void cxl_port_platform_setup(struct cxl_port *port)
> > > +{
> > > + cxl_port_setup_amd(port);
> > > +}
> > > +
> >
> > Assuming this gets expanded (which it may not), should we expect this
> > function to end up like so?
> >
> > static void cxl_port_platform_setup(struct cxl_port *port)
> > {
> > cxl_port_setup_amd(port);
> > cxl_port_setup_intel(port);
> > cxl_port_setup_arm(port);
> > ... etc ...
> > }
> >
> > I suppose this logic has to exist somewhere in some form, just want to make
> > sure this is what we want. Either way, this is easily modifiable, so
> > not a blocker as I said.
>
> Yes, it is exactly designed like that. I will update the patch
> description.
If we need it on ARM then we might wrap this in an arch_cxl_port_platform_setup(),
as we are never building a kernel that does both x86 and arm. We could rely on
stubs, but that tends to get ugly as things grow.
Other than that, all makes sense.
Jonathan
>
> -Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-14 11:13 ` Jonathan Cameron
@ 2025-01-17 7:59 ` Robert Richter
2025-01-17 11:46 ` Jonathan Cameron
0 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-17 7:59 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Gregory Price, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 14.01.25 11:13:07, Jonathan Cameron wrote:
> On Thu, 9 Jan 2025 11:14:46 +0100
> Robert Richter <rrichter@amd.com> wrote:
>
> > On 08.01.25 10:48:23, Gregory Price wrote:
> >
> > > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > > > index 901555bf4b73..c8176265c15c 100644
> > > > --- a/drivers/cxl/core/port.c
> > > > +++ b/drivers/cxl/core/port.c
> > > > @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> > > > &cxl_einj_inject_fops);
> > > > }
> > > >
> > > > +static void cxl_port_platform_setup(struct cxl_port *port)
> > > > +{
> > > > + cxl_port_setup_amd(port);
> > > > +}
> > > > +
> > >
> > > Assuming this gets expanded (which it may not), should we expect this
> > > function to end up like so?
> > >
> > > static void cxl_port_platform_setup(struct cxl_port *port)
> > > {
> > > cxl_port_setup_amd(port);
> > > cxl_port_setup_intel(port);
> > > cxl_port_setup_arm(port);
> > > ... etc ...
> > > }
> > >
> > > I suppose this logic has to exist somewhere in some form, just want to make
> > > sure this is what we want. Either way, this is easily modifiable, so
> > > not a blocker as I said.
> >
> > Yes, it is exactly designed like that. I will update the patch
> > description.
>
> If we need it on ARM then we might wrap this in an arch_cxl_port_platform_setup()
> as never building a kernel that does x86 and arm. Could rely on stubs but that
> tends to get ugly as things grow.
I could move the function and file to core/x86/amd.c already and add
a:
void __weak arch_cxl_port_platform_setup(struct cxl_port *port) { }
-Robert
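To illustrate the weak default, here is a minimal userspace sketch. It
assumes GCC or Clang; the struct cxl_port stand-in and its
hooks_installed field are hypothetical and only serve this example. An
architecture that needs platform setup would provide a strong
definition of arch_cxl_port_platform_setup() in its own file, and the
linker would pick that one over the empty default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the kernel's struct cxl_port. */
struct cxl_port {
	int hooks_installed;
};

/*
 * Empty default. In the kernel this attribute is spelled __weak; a
 * strong definition elsewhere (e.g. in an x86-only file) overrides it
 * at link time.
 */
__attribute__((weak)) void arch_cxl_port_platform_setup(struct cxl_port *port)
{
	(void)port;
}

static void cxl_port_platform_setup(struct cxl_port *port)
{
	assert(port != NULL);
	arch_cxl_port_platform_setup(port);
}
```

Built without any strong override, the call resolves to the empty
default, so an arm build carries no x86 call at all.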
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-17 7:59 ` Robert Richter
@ 2025-01-17 11:46 ` Jonathan Cameron
2025-01-17 14:10 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-17 11:46 UTC (permalink / raw)
To: Robert Richter
Cc: Gregory Price, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Fri, 17 Jan 2025 08:59:00 +0100
Robert Richter <rrichter@amd.com> wrote:
> On 14.01.25 11:13:07, Jonathan Cameron wrote:
> > On Thu, 9 Jan 2025 11:14:46 +0100
> > Robert Richter <rrichter@amd.com> wrote:
> >
> > > On 08.01.25 10:48:23, Gregory Price wrote:
> > >
> > > > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > > > > index 901555bf4b73..c8176265c15c 100644
> > > > > --- a/drivers/cxl/core/port.c
> > > > > +++ b/drivers/cxl/core/port.c
> > > > > @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> > > > > &cxl_einj_inject_fops);
> > > > > }
> > > > >
> > > > > +static void cxl_port_platform_setup(struct cxl_port *port)
> > > > > +{
> > > > > + cxl_port_setup_amd(port);
> > > > > +}
> > > > > +
> > > >
> > > > Assuming this gets expanded (which it may not), should we expect this
> > > > function to end up like so?
> > > >
> > > > static void cxl_port_platform_setup(struct cxl_port *port)
> > > > {
> > > > cxl_port_setup_amd(port);
> > > > cxl_port_setup_intel(port);
> > > > cxl_port_setup_arm(port);
> > > > ... etc ...
> > > > }
> > > >
> > > > I suppose this logic has to exist somewhere in some form, just want to make
> > > > sure this is what we want. Either way, this is easily modifiable, so
> > > > not a blocker as I said.
> > >
> > > Yes, it is exactly designed like that. I will update the patch
> > > description.
> >
> > If we need it on ARM then we might wrap this in an arch_cxl_port_platform_setup()
> > as never building a kernel that does x86 and arm. Could rely on stubs but that
> > tends to get ugly as things grow.
>
> I could move the function and file to core/x86/amd.c already and add
> a:
>
> void __weak arch_cxl_port_platform_setup(struct cxl_port *port) { }
Something like that probably makes sense. I don't like x86 calls in what
I'm building for arm, even if they are stubbed out ;)
Jonathan
>
> -Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-17 11:46 ` Jonathan Cameron
@ 2025-01-17 14:10 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-17 14:10 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Gregory Price, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 17.01.25 11:46:42, Jonathan Cameron wrote:
> On Fri, 17 Jan 2025 08:59:00 +0100
> Robert Richter <rrichter@amd.com> wrote:
> > On 14.01.25 11:13:07, Jonathan Cameron wrote:
> > > > > static void cxl_port_platform_setup(struct cxl_port *port)
> > > > > {
> > > > > cxl_port_setup_amd(port);
> > > > > cxl_port_setup_intel(port);
> > > > > cxl_port_setup_arm(port);
> > > > > ... etc ...
> > > > > }
> > > > >
> > > > > I suppose this logic has to exist somewhere in some form, just want to make
> > > > > sure this is what we want. Either way, this is easily modifiable, so
> > > > > not a blocker as I said.
> > > >
> > > > Yes, it is exactly designed like that. I will update the patch
> > > > description.
> > >
> > > If we need it on ARM then we might wrap this in an arch_cxl_port_platform_setup()
> > > as never building a kernel that does x86 and arm. Could rely on stubs but that
> > > tends to get ugly as things grow.
> >
> > I could move the function and file to core/x86/amd.c already and add
> > a:
> >
> > void __weak arch_cxl_port_platform_setup(struct cxl_port *port) { }
> Something like that probably makes sense. I don't like x86 calls in what
> I'm building for arm, even if they are stubbed out ;)
Sure, will change that.
Thanks for review,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
` (2 preceding siblings ...)
2025-01-08 15:48 ` Gregory Price
@ 2025-01-09 22:25 ` Gregory Price
2025-01-15 15:05 ` Robert Richter
2025-01-10 22:48 ` Gregory Price
2025-01-17 21:32 ` Ben Cheatham
5 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-09 22:25 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> Add AMD platform specific Zen5 support for address translation.
Doing some testing here and I'm seeing some odd results, also noticing
some naming inconsistencies
>
> +static u64 cxl_zen5_to_hpa(struct cxl_decoder *cxld, u64 hpa)
> +{
Function name is _to_hpa, but hpa is an argument?
Should be dpa as argument? Confusing to convert an hpa to an hpa.
... snip ...
> +#define DPA_MAGIC 0xd20000
> + base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
> + spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
> + spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
For two devices interleaved, the base should be the same, correct?
example: 2 128GB devices interleaved/normalized:
dev0: base(0xc051a40000) spa(0xc051a48000) spa2(0xc051a47e00)
dev1: base(0xc051a40100) spa(0xc051a48100) spa2(0xc051a47f00)
I believe these numbers are correct.
(Note: Using PRMT emulation because I don't have a BIOS with this blob,
but this is the same emulation I have been using for about 4 months now
with operational hardware, so unless the translation contract changed
and this code expects something different, it should be correct).
... snip ...
> + len = spa - base;
> + len2 = spa2 - base;
> +
> + /* offset = pos * granularity */
> + if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
> + ways = 1;
> + offset = 0;
> + granularity = 0;
> + pos = 0;
> + } else {
> + ways = len / SZ_16K;
> + offset = spa & (SZ_16K - 1);
> + granularity = (len - len2 - SZ_256) / (ways - 1);
> + pos = offset / granularity;
> + }
the interleave ways and such calculate out correctly
dev0: ways(0x2) offset(0x0) granularity(0x100) pos(0x0)
dev1: ways(0x2) offset(0x100) granularity(0x100) pos(0x1)
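The probe arithmetic can be reproduced in userspace. The sketch below
is not the patch itself: emu_dpa_to_spa() is a hypothetical stand-in
for the PRM handler that assumes a plain modulo interleave, and struct
emu_dev and struct probe_result are made up for this illustration.
With the values from this test setup it yields the ways, granularity,
position and base quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_256    0x100ULL
#define SZ_16K    0x4000ULL
#define DPA_MAGIC 0xd20000ULL

/* Hypothetical emulated endpoint; not the real PRM interface. */
struct emu_dev {
	uint64_t region_base;
	uint64_t granularity;
	unsigned int ways;
	unsigned int pos;
};

/* Assumed translation contract: simple modulo interleave. */
static uint64_t emu_dpa_to_spa(const struct emu_dev *d, uint64_t dpa)
{
	assert(d->granularity && d->ways);
	return d->region_base
		+ (dpa / d->granularity) * d->granularity * d->ways
		+ (uint64_t)d->pos * d->granularity
		+ dpa % d->granularity;
}

struct probe_result {
	uint64_t base;
	uint64_t granularity;
	unsigned int ways;
	unsigned int pos;
};

/* Same three probes and arithmetic as in the quoted hunk. */
static struct probe_result probe_interleave(const struct emu_dev *d)
{
	uint64_t base = emu_dpa_to_spa(d, DPA_MAGIC);
	uint64_t spa  = emu_dpa_to_spa(d, DPA_MAGIC + SZ_16K);
	uint64_t spa2 = emu_dpa_to_spa(d, DPA_MAGIC + SZ_16K - SZ_256);
	uint64_t len = spa - base, len2 = spa2 - base;
	struct probe_result r = { 0 };

	if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
		r.ways = 1;	/* not interleaved */
	} else {
		uint64_t offset = spa & (SZ_16K - 1);

		r.ways = (unsigned int)(len / SZ_16K);
		r.granularity = (len - len2 - SZ_256) / (r.ways - 1);
		r.pos = (unsigned int)(offset / r.granularity);
	}
	r.base = base - DPA_MAGIC * r.ways - (uint64_t)r.pos * r.granularity;
	return r;
}
```

The granularity term works because stepping the DPA back by 256 bytes
moves the SPA back by granularity * (ways - 1) + 256 under this
interleave model.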
> +
> + base = base - DPA_MAGIC * ways - pos * granularity;
> + spa = base + hpa;
DPA(0)
dev0: base(0xc050000000) spa(0xc050000000)
dev1: base(0xc050000000) spa(0xc050000000)
DPA(0x1fffffffff)
dev0: base(0xc050000000) spa(0xe04fffffff)
dev1: base(0xc050000000) spa(0xe04fffffff)
The bases seem correct, the SPAs look suspect.
dev1 should have a very different SPA shouldn't it?
> +
> + /*
> + * Check SPA using a PRM call for the closest DPA calculated
> + * for the HPA. If the HPA matches a different interleaving
> + * position other than the decoder's, determine its offset to
> + * adjust the SPA.
> + */
> +
> + dpa = (hpa & ~(granularity * ways - 1)) / ways
> + + (hpa & (granularity - 1));
I do not understand this chunk here, we seem to just be chopping the HPA
in half to acquire the DPA. But the value passed in is already a DPA.
dpa = (0x1fffffffff & ~(256 * 2 - 1)) / 2 + (0x1fffffffff & (256 - 1))
= 0xfffffffff
I don't understand why the DPA address is suddenly half (64GB boundary).
> + offset = hpa & (granularity * ways - 1) & ~(granularity - 1);
> + offset -= pos * granularity;
> + spa2 = prm_cxl_dpa_spa(pci_dev, dpa) + offset;
> +
> + dev_dbg(&cxld->dev,
> + "address mapping found for %s (dpa -> hpa -> spa): %#llx -> %#llx -> %#llx base: %#llx ways: %d pos: %d granularity: %llu\n",
> + pci_name(pci_dev), dpa, hpa, spa, base, ways, pos, granularity);
> +
This results in a translation that appears to be wrong:
dev0:
cxl decoder5.0: address mapping found for 0000:e1:00.0
(dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
base: 0xc050000000 ways: 2 pos: 0 granularity: 256
cxl decoder5.0: address mapping found for 0000:e1:00.0
(dpa -> hpa -> spa): 0xfffffffff -> 0x1fffffffff -> 0xe04fffffff
base: 0xc050000000 ways: 2 pos: 0 granularity: 256
dev1:
cxl decoder6.0: address mapping found for 0000:c1:00.0
(dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
base: 0xc050000000 ways: 2 pos: 1 granularity: 256
cxl decoder6.0: address mapping found for 0000:c1:00.0
(dpa -> hpa -> spa): 0xfffffffff -> 0x1fffffffff -> 0xe04fffffff
base: 0xc050000000 ways: 2 pos: 1 granularity: 256
These do not look correct.
Is my understanding of the PRMT translation incorrect?
I expect the following: (assuming one contiguous CFMW)
dev0 (dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
dev1 (dpa -> hpa -> spa): 0x0 -> 0x100 -> 0xc050000100
dev0 (dpa -> hpa -> spa): 0x1fffffffff -> 0x3ffffffeff -> 0x1004ffffeff
dev1 (dpa -> hpa -> spa): 0x1fffffffff -> 0x3fffffffff -> 0x1004fffffff
Extra data: here are the programmed endpoint decoder values
[endpoint5/decoder5.0]# cat start size dpa_size interleave_ways interleave_granularity
0x0
0x2000000000
0x0000002000000000
1
256
[endpoint6/decoder6.0]# cat start size dpa_size interleave_ways interleave_granularity
0x0
0x2000000000
0x0000002000000000
1
256
Anyway, yeah I'm a bit confused how this is all supposed to actually
work given that both devices translate to the same addresses.
In theory this *should* work since the root decoder covers the whole
space - as this has been working for me previously with some hacked up
PRMT emulation code.
[decoder0.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000
2
256
[decoder1.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000
1
256
[decoder3.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000
1
256
[decoder5.0]# cat start size interleave_ways interleave_granularity
0x0
0x2000000000
1
256
[decoder6.0]# cat start size interleave_ways interleave_granularity
0x0
0x2000000000
1
256
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-09 22:25 ` Gregory Price
@ 2025-01-15 15:05 ` Robert Richter
2025-01-15 17:05 ` Gregory Price
2025-01-15 22:24 ` Gregory Price
0 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-15 15:05 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 09.01.25 17:25:13, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> > Add AMD platform specific Zen5 support for address translation.
>
> Doing some testing here and I'm seeing some odd results, also noticing
> some naming inconsistencies
>
> >
> > +static u64 cxl_zen5_to_hpa(struct cxl_decoder *cxld, u64 hpa)
> > +{
>
> Function name is _to_hpa, but hpa is an argument?
Conversion is always done from the (old) HPA to the (new) HPA of the
parent port. Note that the HPA of the root port/host bridge is the same
as the SPA. Ports in between may have their own HPA range.
>
> Should be dpa as argument? Confusing to convert an hpa to an hpa.
We need to handle the decoder address ranges, the argument is always
the HPA range the decoder belongs to. The DPA is only on the device
side which is a different address range compared to the decoders. The
decoders do the interleaving arithmetic too and DPA range may be
different. E.g. the decoders may split requests to different endpoints
depending on the number of interleaving ways and endpoints have their
own (smaller) DPA address ranges then.
>
> ... snip ...
>
> > +#define DPA_MAGIC 0xd20000
> > + base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
> > + spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
> > + spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
>
> For two devices interleaved, the base should be the same, correct?
Same except for the interleaving offset, which is seen below (dev1
shows *100). At this stage we don't know the interleaving position of
the endpoint yet.
>
> example: 2 128GB devices interleaved/normalized:
>
> dev0: base(0xc051a40000) spa(0xc051a48000) spa2(0xc051a47e00)
> dev1: base(0xc051a40100) spa(0xc051a48100) spa2(0xc051a47f00)
>
> I believe these numbers are correct.
Looks good.
>
> (Note: Using PRMT emulation because I don't have a BIOS with this blob,
> but this is the same emulation I have been using for about 4 months now
> with operational hardware, so unless the translation contract changed
> and this code expects something different, it should be correct).
>
> ... snip ...
> > + len = spa - base;
> > + len2 = spa2 - base;
> > +
> > + /* offset = pos * granularity */
> > + if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
> > + ways = 1;
> > + offset = 0;
> > + granularity = 0;
> > + pos = 0;
> > + } else {
> > + ways = len / SZ_16K;
> > + offset = spa & (SZ_16K - 1);
> > + granularity = (len - len2 - SZ_256) / (ways - 1);
> > + pos = offset / granularity;
> > + }
>
> the interleave ways and such calculate out correctly
>
> dev0: ways(0x2) offset(0x0) granularity(0x100) pos(0x0)
> dev1: ways(0x2) offset(0x100) granularity(0x100) pos(0x1)
>
> > +
> > + base = base - DPA_MAGIC * ways - pos * granularity;
> > + spa = base + hpa;
>
> DPA(0)
> dev0: base(0xc050000000) spa(0xc050000000)
> dev1: base(0xc050000000) spa(0xc050000000)
>
> DPA(0x1fffffffff)
> dev0: base(0xc050000000) spa(0xe04fffffff)
> dev1: base(0xc050000000) spa(0xe04fffffff)
>
> The bases seem correct, the SPAs look suspect.
SPA range length must be 0x4000000000 (2x 128G). That is, upper SPA
must be 0x1004fffffff (0xc050000000 + 0x4000000000 - 1). This one is
too short.
The decoder range lengths below look correct (0x2000000000), the
interleaving configuration should be checked for the decoders.
>
> dev1 should have a very different SPA shouldn't it?
No, the HPA range is calculated, not the DPA range. Both endpoints
have the same HPA range, it must be equal and this looks correct. In
the end we calculate the following here (see cxl_find_auto_decoder()):
hpa = cxld->hpa_range;
// endpoint's hpa range is zero-based, equivalent to:
// hpa->start = 0;
// hpa->end = range_len(&hpa) - 1;
base = hpa.start = port->to_hpa(cxld, hpa.start); // HPA(0)
spa = hpa.end = port->to_hpa(cxld, hpa.end); // HPA(decoder_size - 1)
Again, the HPA is the address the decoder is programmed with. HPA
length is 0x2000000000 (spa - base + 1). The DPA range is (for 2 way)
half its size. The PRM translates DPA to SPA, but we want to translate
HPA to SPA. That is what we need the calculation for.
>
> > +
> > + /*
> > + * Check SPA using a PRM call for the closest DPA calculated
> > + * for the HPA. If the HPA matches a different interleaving
> > + * position other than the decoder's, determine its offset to
> > + * adjust the SPA.
> > + */
> > +
> > + dpa = (hpa & ~(granularity * ways - 1)) / ways
> > + + (hpa & (granularity - 1));
>
> I do not understand this chunk here, we seem to just be chopping the HPA
> in half to acquire the DPA. But the value passed in is already a DPA.
>
> dpa = (0x1fffffffff & ~(256 * 2 - 1)) / 2 + (0x1fffffffff & (256 - 1))
> = 0xfffffffff
HPA is:
HPA = 2 * 0x2000000000 - 1 = 0x3fffffffff
Should calculate for a 2-way config to:
DPA = 0x1fffffffff.
Actual formula:
dpa = HPA div (granularity * ways) * granularity + HPA mod granularity
pos = (HPA mod (granularity * ways)) div granularity
Bits used (e.g. HPA length: 0x4000000000 = 2^38, ways: 2):
hpa = 00000000000000000000000000XXXXXXXXXXXXXXXXXXXXXXXXXXXXXYZZZZZZZZ
dpa = 000000000000000000000000000XXXXXXXXXXXXXXXXXXXXXXXXXXXXXZZZZZZZZ
pos = Y
With:
X ... base part of the address
Y ... interleaving position
Z ... address offset
For DPA the positional bits are removed.
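The two formulas above can be written out as a short sketch. The
helper names hpa_to_dpa() and hpa_to_pos() are made up for this
illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* dpa = HPA div (granularity * ways) * granularity + HPA mod granularity */
static uint64_t hpa_to_dpa(uint64_t hpa, uint64_t granularity,
			   unsigned int ways)
{
	assert(granularity && ways);
	return hpa / (granularity * ways) * granularity + hpa % granularity;
}

/* pos = (HPA mod (granularity * ways)) div granularity */
static unsigned int hpa_to_pos(uint64_t hpa, uint64_t granularity,
			       unsigned int ways)
{
	assert(granularity && ways);
	return (unsigned int)(hpa % (granularity * ways) / granularity);
}
```

For the 2-way, 256 byte case, HPA 0x3fffffffff gives DPA 0x1fffffffff
at position 1, while HPA 0x1fffffffff gives the halved DPA 0xfffffffff
seen in the test output.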
>
> I don't understand why the DPA address is suddenly half (64GB boundary).
There is probably a broken interleaving config causing half the size
of total device mem.
>
> > + offset = hpa & (granularity * ways - 1) & ~(granularity - 1);
> > + offset -= pos * granularity;
> > + spa2 = prm_cxl_dpa_spa(pci_dev, dpa) + offset;
> > +
> > + dev_dbg(&cxld->dev,
> > + "address mapping found for %s (dpa -> hpa -> spa): %#llx -> %#llx -> %#llx base: %#llx ways: %d pos: %d granularity: %llu\n",
> > + pci_name(pci_dev), dpa, hpa, spa, base, ways, pos, granularity);
> > +
>
> This results in a translation that appears to be wrong:
>
> dev0:
> cxl decoder5.0: address mapping found for 0000:e1:00.0
> (dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
> base: 0xc050000000 ways: 2 pos: 0 granularity: 256
> cxl decoder5.0: address mapping found for 0000:e1:00.0
> (dpa -> hpa -> spa): 0xfffffffff -> 0x1fffffffff -> 0xe04fffffff
> base: 0xc050000000 ways: 2 pos: 0 granularity: 256
>
> dev1:
> cxl decoder6.0: address mapping found for 0000:c1:00.0
> (dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
> base: 0xc050000000 ways: 2 pos: 1 granularity: 256
> cxl decoder6.0: address mapping found for 0000:c1:00.0
> (dpa -> hpa -> spa): 0xfffffffff -> 0x1fffffffff -> 0xe04fffffff
> base: 0xc050000000 ways: 2 pos: 1 granularity: 256
>
> These do not look correct.
>
> Is my understanding of the PRMT translation incorrect?
> I expect the following: (assuming one contiguous CFMW)
>
> dev0 (dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
> dev1 (dpa -> hpa -> spa): 0x0 -> 0x100 -> 0xc050000100
> dev0 (dpa -> hpa -> spa): 0x1fffffffff -> 0x3ffffffeff -> 0x1004ffffeff
> dev1 (dpa -> hpa -> spa): 0x1fffffffff -> 0x3fffffffff -> 0x1004fffffff
Yes, that would be the result without the offset applied for spa2 above.
The check above calculates the *total* length of hpa and spa without
considering the interleaving position. This is corrected using the
offset. There is no call prm_cxl_dpa_spa(dev0, 0x1fffffffff) that
returns 0x1004fffffff, but we want to check the upper boundary of the
SPA range.
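A userspace sketch of that offset correction, under the same
assumption as the emulated setup in this thread: emu_dpa_to_spa() is a
hypothetical modulo-interleave stand-in for the PRM call, and struct
emu_dev and check_spa() are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulated endpoint; not the real PRM interface. */
struct emu_dev {
	uint64_t region_base;
	uint64_t granularity;
	unsigned int ways;
	unsigned int pos;
};

/* Assumed translation contract: simple modulo interleave. */
static uint64_t emu_dpa_to_spa(const struct emu_dev *d, uint64_t dpa)
{
	return d->region_base
		+ (dpa / d->granularity) * d->granularity * d->ways
		+ (uint64_t)d->pos * d->granularity
		+ dpa % d->granularity;
}

/* SPA for @hpa as cross-checked through the closest DPA of this device. */
static uint64_t check_spa(const struct emu_dev *d, uint64_t hpa)
{
	uint64_t gran = d->granularity;
	uint64_t stripe = gran * d->ways;
	uint64_t dpa, offset;

	/* The masks below require a power-of-two stripe size. */
	assert(stripe && (stripe & (stripe - 1)) == 0);

	dpa = (hpa & ~(stripe - 1)) / d->ways + (hpa & (gran - 1));
	offset = hpa & (stripe - 1) & ~(gran - 1);
	offset -= (uint64_t)d->pos * gran;	/* may wrap; the sum undoes it */

	return emu_dpa_to_spa(d, dpa) + offset;
}
```

Both endpoints then agree on the upper boundary: for HPA 0x3fffffffff
the result is region base + HPA on dev0 and dev1 alike, although the
raw translation of DPA 0x1fffffffff differs between them by one
granule.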
>
> Extra data: here are the programmed endpoint decoder values
>
> [endpoint5/decoder5.0]# cat start size dpa_size interleave_ways interleave_granularity
> 0x0
> 0x2000000000
> 0x0000002000000000
> 1
> 256
>
> [endpoint6/decoder6.0]# cat start size dpa_size interleave_ways interleave_granularity
> 0x0
> 0x2000000000
> 0x0000002000000000
> 1
> 256
This is correct and must be half the size of the HPA window.
Thanks for testing.
-Robert
>
>
> Anyway, yeah I'm a bit confused how this is all supposed to actually
> work given that both devices translate to the same addresses.
>
> In theory this *should* work since the root decoder covers the whole
> space - as this has been working for me previously with some hacked up
> PRMT emulation code.
>
> [decoder0.0]# cat start size interleave_ways interleave_granularity
> 0xc050000000
> 0x4000000000
> 2
> 256
>
> [decoder1.0]# cat start size interleave_ways interleave_granularity
> 0xc050000000
> 0x4000000000
> 1
> 256
>
> [decoder3.0]# cat start size interleave_ways interleave_granularity
> 0xc050000000
> 0x4000000000
> 1
> 256
>
> [decoder5.0]# cat start size interleave_ways interleave_granularity
> 0x0
> 0x2000000000
> 1
> 256
>
> [decoder6.0]# cat start size interleave_ways interleave_granularity
> 0x0
> 0x2000000000
> 1
> 256
>
> ~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-15 15:05 ` Robert Richter
@ 2025-01-15 17:05 ` Gregory Price
2025-01-15 22:24 ` Gregory Price
1 sibling, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-15 17:05 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Wed, Jan 15, 2025 at 04:05:16PM +0100, Robert Richter wrote:
> >
> > Should be dpa as argument? Confusing to convert an hpa to an hpa.
>
> We need to handle the decoder address ranges, the argument is always
> the HPA range the decoder belongs to.
I see, and this is where my confusion stems from. Basically these
addresses are considered "HPA" because they are programmed to decoders,
and decoder addresses are "always HPA".
i.e. 2 interleaved devices (endpoint decoders) with normalized addresses:
dev0: base(0x0) len(0x2000000000)
dev1: base(0x0) len(0x2000000000)
These are HPAs because decoders are programmed with HPAs.
It's just that in this (specific) case HPA=DPA, while root decoders and
host bridge decoders will always have HPA=SPA. We're just translating
up the stack from HPA range to HPA range.
I've been dealing with virtualization for a long time and this has been
painful for me to follow - but I think I'm getting there.
> > DPA(0)
> > dev0: base(0xc050000000) spa(0xc050000000)
> > dev1: base(0xc050000000) spa(0xc050000000)
> >
> > DPA(0x1fffffffff)
> > dev0: base(0xc050000000) spa(0xe04fffffff)
> > dev1: base(0xc050000000) spa(0xe04fffffff)
> >
> > The bases seem correct, the SPAs look suspect.
>
> SPA range length must be 0x4000000000 (2x 128G). That is, upper SPA
> must be 0x1004fffffff (0xc050000000 + 0x4000000000 - 1). This one is
> too short.
>
> The decoder range lengths below look correct (0x2000000000), the
> interleaving configuration should be checked for the decoders.
>
If I understand correctly, this configuration may be suspect:
[decoder0.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000
2 <----- root decoder reports interleave ways = 2
256
[decoder1.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000
1 <----- host bridge decoder reports interleave ways = 1
256
[decoder3.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000
1 <----- host bridge decoder reports interleave ways = 1
256
> > I do not understand this chunk here, we seem to just be chopping the HPA
> > in half to acquire the DPA. But the value passed in is already a DPA.
> >
> > dpa = (0x1fffffffff & ~(256 * 2 - 1)) / 2 + (0x1fffffffff & (256 - 1))
> > = 0xfffffffff
>
> HPA is:
>
> HPA = 2 * 0x2000000000 - 1 = 0x3fffffffff
>
... snip ...
> There is probably a broken interleaving config causing half the size
> of total device mem.
>
In my case, I never see 0x3fffffffff passed in. The value 0x1fffffffff
from the endpoint decoders is always passed in. This suggests the host
bridge interleave ways should be 2.
I can force this and figure out why it's reporting 1 and get back to you.
> > dev0 (dpa -> hpa -> spa): 0x0 -> 0x0 -> 0xc050000000
> > dev1 (dpa -> hpa -> spa): 0x0 -> 0x100 -> 0xc050000100
> > dev0 (dpa -> hpa -> spa): 0x1fffffffff -> 0x3ffffffeff -> 0x1004ffffeff
> > dev1 (dpa -> hpa -> spa): 0x1fffffffff -> 0x3fffffffff -> 0x1004fffffff
>
> Yes, would be the result without the offset applied for spa2 above.
> The check above calculates the *total* length of hpa and spa without
> considering the interleaving position. This is corrected using the
> offset. There is no call prm_cxl_dpa_spa(dev0, 0x1fffffffff) that
> returns 0x1004fffffff, but we want to check the upper boundary of the
> SPA range.
>
This makes sense now, there's no dpa->spa direct translation because you
may have to go through multiple layers of translation to get there - so
the best you can do is calculate the highest possible endpoint and say
"Yeah this range is in there somewhere".
Thank you for taking the time to walk me through this, I'm sorry I've
been confused on DPA/HPA/SPA for so long - it's been a bit of a
struggle.
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-15 15:05 ` Robert Richter
2025-01-15 17:05 ` Gregory Price
@ 2025-01-15 22:24 ` Gregory Price
2025-01-17 14:06 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-15 22:24 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Wed, Jan 15, 2025 at 04:05:16PM +0100, Robert Richter wrote:
> On 09.01.25 17:25:13, Gregory Price wrote:
> > > + dpa = (hpa & ~(granularity * ways - 1)) / ways
> > > + + (hpa & (granularity - 1));
> >
> > I do not understand this chunk here, we seem to just be chopping the HPA
> > in half to acquire the DPA. But the value passed in is already a DPA.
> >
> > dpa = (0x1fffffffff & ~(256 * 2 - 1)) / 2 + (0x1fffffffff & (256 - 1))
> > = 0xfffffffff
>
> HPA is:
>
> HPA = 2 * 0x2000000000 - 1 = 0x3fffffffff
>
> Should calculate for a 2-way config to:
>
> DPA = 0x1fffffffff.
>
I'm looking back through all of this again, and I'm not seeing how the
current code is ever capable of ending up with hpa=0x3fffffffff.
Taking an example endpoint in my setup:
[decoder5.0]# cat start size interleave_ways interleave_granularity
0x0
0x2000000000 <- 128GB (half the total 256GB interleaved range)
1 <- this decoder does not apply interleave
256
translating up to a root decoder:
[decoder0.0]# cat start size interleave_ways interleave_granularity
0xc050000000
0x4000000000 <- 256GB (total interleaved capacity)
2 <- interleaved 2 ways, this decoder applies interleave
256
Now looking at the code that actually invokes the translation
static struct device *
cxl_find_auto_decoder(struct cxl_port *port, struct cxl_endpoint_decoder *cxled,
struct cxl_region *cxlr)
{
struct cxl_decoder *cxld = &cxled->cxld;
struct range hpa = cxld->hpa_range;
... snip ...
if (cxl_port_calc_hpa(parent, cxld, &hpa))
return NULL;
}
or
static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
struct cxl_port *parent, *iter = cxled_to_port(cxled);
struct range hpa = cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
while (1) {
if (is_cxl_endpoint(iter))
cxld = &cxled->cxld;
...
/* Translate HPA to the next upper memory domain. */
if (cxl_port_calc_hpa(parent, cxld, &hpa)) {
}
...
}
....
}
Both of these will call cxl_port_calc_hpa with
hpa = [0, 0x1fffffffff]
Resulting in the following
static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
struct range *hpa_range)
{
...
/* Translate HPA to the next upper domain. */
hpa.start = port->to_hpa(cxld, hpa.start); <---- 0x0
hpa.end = port->to_hpa(cxld, hpa.end); <---- 0x1fffffffff
}
So we call:
to_hpa(decoder5.0, hpa.end)
to_hpa(decoder5.0, 0x1fffffffff)
^^^^^^^^^^^^^ --- hpa will never be 0x3fffffffff
Should the to_hpa() code be taking a decoder length as an argument?
to_hpa(decoder5.0, range_length, addr) ?
This would actually let us calculate the end of region with the
interleave ways and granularity:
upper_hpa_base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC) - DPA_MAGIC;
upper_hpa_end = upper_hpa_base + (range_length * ways) - 1
Without this, you don't have enough information to actually calculate
the upper_hpa_end as you suggested. The result is the math ends up
chopping the endpoint decoder's range (128GB) in half (64GB).
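A quick size check may help here. This is only a sketch of the arithmetic, not the proposed kernel change: it uses decoder0.0's start as the SPA base, which is an assumption standing in for the PRM-derived base in the expression above.

```python
range_length = 0x2000000000  # decoder5.0 size (128GB)
ways = 2                     # decoder0.0 interleave_ways
base = 0xC050000000          # decoder0.0 start (assumed SPA base)

# Scaling the endpoint range by the interleave ways covers the whole window.
upper_hpa_end = base + range_length * ways - 1
print(hex(upper_hpa_end))

# Without the scaling, the computed end falls 128GB short of the window end.
unscaled_end = base + range_length - 1
print(hex(unscaled_end))
```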
Below I walk through the translation code with these inputs step by step.
~Gregory
------------------------------------------------------------------------
Walking through the translation code by hand here:
[decoder5.0]# cat start size
0x0
0x2000000000
call: to_hpa(decoder5.0, 0x1fffffffff)
--- code
#define DPA_MAGIC 0xd20000
base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
len = spa - base;
len2 = spa2 - base;
/* offset = pos * granularity */
if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
... snip - not taken ...
} else {
ways = len / SZ_16K;
offset = spa & (SZ_16K - 1);
granularity = (len - len2 - SZ_256) / (ways - 1);
pos = offset / granularity;
}
--- end code
At this point in the code I have the following values:
base = 0xc051a40100
spa = 0xc051a48100
spa2 = 0xc051a47f00
len = 0x8000
len2 = 0x7E00
ways = 2
offset = 256
granularity = 256
pos = 1
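The values above can be reproduced outside the kernel with a small Python sketch. It replaces the real PRM call with the formula from the emulation patch later in this message and assumes a single CFMWS window at 0xc050000000; the function names emulated_prm and probe are hypothetical.

```python
SZ_256 = 0x100
SZ_16K = 0x4000
DPA_MAGIC = 0xD20000
CFMWS_BASE = 0xC050000000  # assumed: one window, starting at decoder0.0's base

def emulated_prm(dev, dpa):
    """Stand-in for prm_cxl_dpa_spa(): 2-way, 256-byte interleave only."""
    return CFMWS_BASE + 0x100 * (((dpa >> 8) * 2) + dev) + (dpa & 0xFF)

def probe(dev):
    """Derive (base, ways, granularity, pos) from three translated addresses."""
    base = emulated_prm(dev, DPA_MAGIC)
    spa = emulated_prm(dev, DPA_MAGIC + SZ_16K)
    spa2 = emulated_prm(dev, DPA_MAGIC + SZ_16K - SZ_256)
    length, length2 = spa - base, spa2 - base
    if length == SZ_16K and length2 == SZ_16K - SZ_256:
        return base - DPA_MAGIC, 1, 0, 0  # no interleaving detected
    ways = length // SZ_16K
    offset = spa & (SZ_16K - 1)           # offset = pos * granularity
    gran = (length - length2 - SZ_256) // (ways - 1)
    pos = offset // gran
    return base - DPA_MAGIC * ways - pos * gran, ways, gran, pos

for dev in (0, 1):
    base, ways, gran, pos = probe(dev)
    print(f"dev{dev}: base={base:#x} ways={ways} granularity={gran} pos={pos}")
```

With this emulation both devices resolve to the same region base and differ only in their interleave position, which is consistent with the detection step being correct.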
--- code
base = base - DPA_MAGIC * ways - pos * granularity;
spa = base + hpa;
--- end code
base = 0xc051a40100 - 0xd20000 * 2 - 1 * 256
= 0xc050000000
spa = base + hpa
= 0xc050000000 + 0x1fffffffff <-----
= 0xe04fffffff |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Isn't this just incorrect? Should it be something like:
base + ((hpa & ~(granularity - 1)) * pos)
+ (hpa & (ways * granularity - 1))
--- code
dpa = (hpa & ~(granularity * ways - 1)) / ways
+ (hpa & (granularity - 1));
--- end code
dpa = (hpa & ~(granularity * ways - 1)) / ways + (hpa & (granularity - 1));
^^^ ^^^
0x1fffffffff / 2
We are dividing the endpoint decoder HPA by interleave ways.
This is how we end up with the truncated size.
Full math
dpa = (0x1fffffffff & ~(256 * 2 - 1)) / 2 + (0x1fffffffff & (256 - 1))
= (0x1fffffffff & 0xF...E00) / 2 + (0x1fffffffff & 0xFF)
= (0x1ffffffe00) / 2 + (0xFF)
= 0xfffffff00 + 0xff
= 0xfffffffff
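The same arithmetic as a short Python check, using only the values from this walkthrough (the variable names aligned and within are just labels for the two terms):

```python
hpa = 0x1FFFFFFFFF
ways, gran = 2, 256

aligned = hpa & ~(gran * ways - 1)  # clear the bits below one full interleave stripe
within = hpa & (gran - 1)           # byte offset inside a single chunk
dpa = aligned // ways + within

print(hex(aligned), hex(within), hex(dpa))
```

The division by ways is what halves the 128GB endpoint range when the input is already a device-local address.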
--- code
offset = hpa & (granularity * ways - 1) & ~(granularity - 1);
offset -= pos * granularity;
---
offset = (0x1fffffffff & (256 * 2 - 1) & ~(256 - 1)) - (1 * 256)
(0x1fffffffff & 0x1ff & 0xffffffffffffff00) - 0x100
(0x1ff & 0xffffffffffffff00) - 0x100
0x100 - 0x100
0x0
Final Result:
--- code
spa2 = prm_cxl_dpa_spa(pci_dev, dpa) + offset;
---
= prm_cxl_dpa_spa(pci_dev, 0xfffffffff) + 0
= 0xe04fffffff + 0
^^^ note that in this case my emulation gives the exact address
you seem to suggest that I'll get the closest granularity
So if not for my emulation, the offset calculation is wrong?
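To see why the final spa != spa2 check does not catch the short range, the last steps can be replayed in Python. This is a sketch under the same assumptions as the emulation patch below (2-way, 256-byte interleave, one CFMWS window at 0xc050000000); emulated_prm is a hypothetical stand-in for the PRM call.

```python
CFMWS_BASE = 0xC050000000  # assumed: a single CFMWS window starting here

def emulated_prm(dev, dpa):
    """Stand-in for prm_cxl_dpa_spa(), following the emulation patch's formula."""
    return CFMWS_BASE + 0x100 * (((dpa >> 8) * 2) + dev) + (dpa & 0xFF)

hpa = 0x1FFFFFFFFF
ways, gran, pos = 2, 256, 1

spa = CFMWS_BASE + hpa  # v1 calculation: spa = base + hpa
dpa = (hpa & ~(gran * ways - 1)) // ways + (hpa & (gran - 1))
offset = (hpa & (gran * ways - 1) & ~(gran - 1)) - pos * gran
spa2 = emulated_prm(pos, dpa) + offset

print(hex(spa), hex(spa2), spa == spa2)
```

Both values agree, so the consistency check passes even though the resulting upper bound is 128GB below the end of the 256GB window.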
--------------------------------------------------------------------
For the sake of completeness, here is my PRMT emulation code to show
you that it is doing the translation as-expected.
All this does is just force translation for a particular set of PCI
devices based on the known static CFMW regions.
Note for onlookers: This patch is extremely dangerous and only applies
to my specific system / interleave configuration.
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index ac74b6f6dad7..8ccf2d5638ed 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -432,6 +432,31 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
return 0;
}
+#define MAX_CFMWS (32)
+static unsigned int num_cfmws;
+static unsigned long cfmws_bases[MAX_CFMWS];
+static unsigned long cfmws_sizes[MAX_CFMWS];
+unsigned int cxl_get_num_cfmws(void)
+{
+ return num_cfmws;
+}
+
+unsigned long cxl_get_cfmws_base(unsigned int idx)
+{
+ if (idx >= MAX_CFMWS || idx >= num_cfmws)
+ return ~0;
+
+ return cfmws_bases[idx];
+}
+
+unsigned long cxl_get_cfmws_size(unsigned int idx)
+{
+ if (idx >= MAX_CFMWS || idx >= num_cfmws)
+ return ~0;
+
+ return cfmws_sizes[idx];
+}
+
static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
const unsigned long end)
{
@@ -446,10 +471,16 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
"Failed to add decode range: [%#llx - %#llx] (%d)\n",
cfmws->base_hpa,
cfmws->base_hpa + cfmws->window_size - 1, rc);
- else
+ else {
dev_dbg(dev, "decode range: node: %d range [%#llx - %#llx]\n",
phys_to_target_node(cfmws->base_hpa), cfmws->base_hpa,
cfmws->base_hpa + cfmws->window_size - 1);
+ if (num_cfmws < MAX_CFMWS) {
+ cfmws_bases[num_cfmws] = cfmws->base_hpa;
+ cfmws_sizes[num_cfmws] = cfmws->window_size;
+ num_cfmws++;
+ }
+ }
/* never fail cxl_acpi load for a single window failure */
return 0;
diff --git a/drivers/cxl/core/amd.c b/drivers/cxl/core/amd.c
index 553b7d0caefd..08a5bfb9fbd6 100644
--- a/drivers/cxl/core/amd.c
+++ b/drivers/cxl/core/amd.c
@@ -64,6 +64,10 @@ struct prm_cxl_dpa_spa_data {
static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
{
struct prm_cxl_dpa_spa_data data;
+ unsigned int cfmws_nr;
+ unsigned int idx;
+ unsigned long offset, size;
+ unsigned int dev;
u64 spa;
int rc;
@@ -75,12 +79,35 @@ static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
.out = &spa,
};
+ cfmws_nr = cxl_get_num_cfmws();
+ if (!cfmws_nr)
+ goto try_prmt;
+
+ /* HACK: Calculate the interleaved offset and find the matching base */
+ if (pci_dev->bus->number != 0xe1 && pci_dev->bus->number != 0xc1)
+ goto try_prmt;
+
+ dev = pci_dev->bus->number == 0xe1 ? 0 : 1;
+ offset = (0x100 * (((dpa >> 8) * 2) + dev)) + (dpa & 0xff);
+
+ for (idx = 0; idx < cfmws_nr; idx++) {
+ size = cxl_get_cfmws_size(idx);
+ if (offset < size) {
+ spa = cxl_get_cfmws_base(idx) + offset;
+ goto out;
+ }
+ offset -= size;
+ }
+ /* We failed, fall back to calling the PRMT */
+try_prmt:
+
rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
if (rc) {
pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
return ULLONG_MAX;
}
+out:
pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
return spa;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f93eb464fc97..48cbfa68d739 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -919,6 +919,10 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
struct cxl_mem_command *cxl_find_feature_command(u16 opcode);
+unsigned int cxl_get_num_cfmws(void);
+unsigned long cxl_get_cfmws_base(unsigned int idx);
+unsigned long cxl_get_cfmws_size(unsigned int idx);
+
/*
* Unit test builds overrides this to __weak, find the 'strong' version
* of these symbols in tools/testing/cxl/.
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-15 22:24 ` Gregory Price
@ 2025-01-17 14:06 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-17 14:06 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 15.01.25 17:24:54, Gregory Price wrote:
> On Wed, Jan 15, 2025 at 04:05:16PM +0100, Robert Richter wrote:
> > On 09.01.25 17:25:13, Gregory Price wrote:
> > > > + dpa = (hpa & ~(granularity * ways - 1)) / ways
> > > > + + (hpa & (granularity - 1));
> > >
> > > I do not understand this chunk here, we seem to just be chopping the HPA
> > > in half to acquire the DPA. But the value passed in is already a DPA.
> > >
> > > dpa = (0x1fffffffff & ~(256 * 2 - 1)) / 2 + (0x1fffffffff & (256 - 1))
> > > = 0xfffffffff
> >
> > HPA is:
> >
> > HPA = 2 * 0x2000000000 - 1 = 0x3fffffffff
> >
> > Should calculate for a 2-way config to:
> >
> > DPA = 0x1fffffffff.
> >
>
> I'm looking back through all of this again, and I'm not seeing how the
> current code is ever capable of ending up with hpa=0x3fffffffff.
>
> Taking an example endpoint in my setup:
>
> [decoder5.0]# cat start size interleave_ways interleave_granularity
> 0x0
> 0x2000000000 <- 128GB (half the total 256GB interleaved range)
> 1 <- this decoder does not apply interleave
> 256
>
> translating up to a root decoder:
>
> [decoder0.0]# cat start size interleave_ways interleave_granularity
> 0xc050000000
> 0x4000000000 <- 256GB (total interleaved capacity)
> 2 <- interleaved 2 ways, this decoder applies interleave
> 256
>
>
> Now looking at the code that actually invokes the translation
>
> static struct device *
> cxl_find_auto_decoder(struct cxl_port *port, struct cxl_endpoint_decoder *cxled,
> struct cxl_region *cxlr)
> {
> struct cxl_decoder *cxld = &cxled->cxld;
> struct range hpa = cxld->hpa_range;
> ... snip ...
> if (cxl_port_calc_hpa(parent, cxld, &hpa))
> return NULL;
> }
>
> or
>
> static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> struct cxl_port *parent, *iter = cxled_to_port(cxled);
> struct range hpa = cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
>
> while (1) {
> if (is_cxl_endpoint(iter))
> cxld = &cxled->cxld;
> ...
> /* Translate HPA to the next upper memory domain. */
> if (cxl_port_calc_hpa(parent, cxld, &hpa)) {
> }
> ...
> }
> ....
> }
>
> Both of these will call cxl_port_calc_hpa with
> hpa = [0, 0x1fffffffff]
>
> Resulting in the following
>
> static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> struct range *hpa_range)
> {
> ...
> /* Translate HPA to the next upper domain. */
> hpa.start = port->to_hpa(cxld, hpa.start); <---- 0x0
> hpa.end = port->to_hpa(cxld, hpa.end); <---- 0x1fffffffff
> }
>
> So we call:
> to_hpa(decoder5.0, hpa.end)
> to_hpa(decoder5.0, 0x1fffffffff)
> ^^^^^^^^^^^^^ --- hpa will never be 0x3fffffffff
Depending on the endpoint the PRM call returns the following here
(2-way interleaving with 256 gran):
hpa = 0x1fffffff00 * 2 + pos * 0x100 + 0xff;
It will either return 0x3ffffffeff or 0x3fffffffff.
But the implementation in cxl_zen5_to_hpa() is not correct, see below.
>
>
> Should the to_hpa() code be taking an decoder length as an argument?
>
> to_hpa(decoder5.0, range_length, addr) ?
>
> This would actually let us calculate the end of region with the
> interleave ways and granularity:
>
> upper_hpa_base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC) - DPA_MAGIC;
> upper_hpa_end = upper_hpa_base + (range_length * ways) - 1
>
> Without this, you don't have enough information to actually calculate
> the upper_hpa_end as you suggested. The result is the math ends up
> chopping the endpoint decoder's range (128GB) in half (64GB).
>
> Below I walk through the translation code with these inputs step by step.
>
> ~Gregory
>
> ------------------------------------------------------------------------
>
> Walking through the translation code by hand here:
>
> [decoder5.0]# cat start size
> 0x0
> 0x2000000000
>
> call: to_hpa(decoder5.0, 0x1fffffffff)
>
> --- code
> #define DPA_MAGIC 0xd20000
> base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
> spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
> spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
>
> len = spa - base;
> len2 = spa2 - base;
>
> /* offset = pos * granularity */
> if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
> ... snip - not taken ...
> } else {
> ways = len / SZ_16K;
> offset = spa & (SZ_16K - 1);
> granularity = (len - len2 - SZ_256) / (ways - 1);
> pos = offset / granularity;
> }
> --- end code
>
> At this point in the code i have the following values:
>
> base = 0xc051a40100
> spa = 0xc051a48100
> spa2 = 0xc051a47f00
> len = 0x8000
> len2 = 0x7E00
> ways = 2
> offset = 256
> granularity = 256
> pos = 1
>
> --- code
> base = base - DPA_MAGIC * ways - pos * granularity;
> spa = base + hpa;
> --- end code
>
> base = 0xc051a40100 - 0xd20000 * 2 - 1 * 256
> = 0xc050000000
>
> spa = base + hpa
This is the wrong part, the interleaving parameters must be considered
for SPA too, like:
spa = base + hpa div gran * gran * ways + pos * gran + hpa mod gran
This caused a wrong range size and this went unnoticed due to an
overlaying issue while testing. The detection of the interleaving
config still is correct.
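For comparison, a minimal Python sketch of the v1 calculation next to the interleave-aware formula given here. The function names are hypothetical, and the inputs are the values from Gregory's walkthrough (base 0xc050000000, 2 ways, granularity 256, position 1).

```python
def spa_v1(base, hpa):
    """v1 behaviour: ignores the interleaving parameters."""
    return base + hpa

def spa_interleaved(base, hpa, ways, gran, pos):
    """spa = base + hpa div gran * gran * ways + pos * gran + hpa mod gran"""
    return base + hpa // gran * gran * ways + pos * gran + hpa % gran

base, hpa = 0xC050000000, 0x1FFFFFFFFF
print(hex(spa_v1(base, hpa)))
print(hex(spa_interleaved(base, hpa, ways=2, gran=256, pos=1)))
```

With these inputs the interleave-aware result reaches the last byte of the 256GB window, while the v1 result stops 128GB earlier.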
> = 0xc050000000 + 0x1fffffffff <-----
> = 0xe04fffffff |
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Isn't this just incorrect? Should it be something like:
> --------------------------------------------------------------------
>
> For the sake of completeness, here is my PRMT emulation code to show
> you that it is doing the translation as-expected.
>
> All this does is just force translation for a particular set of PCI
> devices based on the known static CFMW regions.
> @@ -75,12 +79,35 @@ static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
> .out = &spa,
> };
>
> + cfmws_nr = cxl_get_num_cfmws();
> + if (!cfmws_nr)
> + goto try_prmt;
> +
> + /* HACK: Calculate the interleaved offset and find the matching base */
> + if (pci_dev->bus->number != 0xe1 && pci_dev->bus->number != 0xc1)
> + goto try_prmt;
> +
> + dev = pci_dev->bus->number == 0xe1 ? 0 : 1;
> + offset = (0x100 * (((dpa >> 8) * 2) + dev)) + (dpa & 0xff);
That looks correct to me, a test too:
>>> dpa = 0x1fffffffff
>>> dev = 1
>>> offset = (0x100 * (((dpa >> 8) * 2) + dev)) + (dpa & 0xff)
>>> "0x%x" % offset
'0x3fffffffff'
>>> dev = 0
>>> offset = (0x100 * (((dpa >> 8) * 2) + dev)) + (dpa & 0xff)
>>> "0x%x" % offset
'0x3ffffffeff'
Thanks for review and testing.
Will fix in v2.
-Robert
> +
> + for (idx = 0; idx < cfmws_nr; idx++) {
> + size = cxl_get_cfmws_size(idx);
> + if (offset < size) {
> + spa = cxl_get_cfmws_base(idx) + offset;
> + goto out;
> + }
> + offset -= size;
> + }
> + /* We failed, fall back to calling the PRMT */
> +try_prmt:
> +
> rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> if (rc) {
> pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
> return ULLONG_MAX;
> }
>
> +out:
> pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
>
> return spa;
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
` (3 preceding siblings ...)
2025-01-09 22:25 ` Gregory Price
@ 2025-01-10 22:48 ` Gregory Price
2025-01-17 8:41 ` Robert Richter
2025-01-17 21:32 ` Ben Cheatham
5 siblings, 1 reply; 117+ messages in thread
From: Gregory Price @ 2025-01-10 22:48 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> Add AMD platform specific Zen5 support for address translation.
>
> Zen5 systems may be configured to use 'Normalized addresses'. Then,
> CXL endpoints use their own physical address space and Host Physical
> Addresses (HPAs) need address translation from the endpoint to its CXL
> host bridge. The HPA of a CXL host bridge is equivalent to the System
> Physical Address (SPA).
>
Just adding the note that I've tested this patch set for HPA==SPA and
found it causes no regressions in my setup.
Still working on testing the normalized address mode due to a few BIOS
quirks I'm running up against.
~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-10 22:48 ` Gregory Price
@ 2025-01-17 8:41 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-17 8:41 UTC (permalink / raw)
To: Gregory Price
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman,
linux-cxl, linux-kernel, Fabio M. De Francesco
On 10.01.25 17:48:35, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:11PM +0100, Robert Richter wrote:
> > Add AMD platform specific Zen5 support for address translation.
> >
> > Zen5 systems may be configured to use 'Normalized addresses'. Then,
> > CXL endpoints use their own physical address space and Host Physical
> > Addresses (HPAs) need address translation from the endpoint to its CXL
> > host bridge. The HPA of a CXL host bridge is equivalent to the System
> > Physical Address (SPA).
> >
>
> Just adding the note that I've tested this patch set for HPA==SPA and
> found it causes no regressions in my setup.
Thanks for testing HPA==SPA.
-Robert
>
> Still working on testing the normalized address mode due to a few BIOS
> quirks I'm running up against.
>
> ~Gregory
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
` (4 preceding siblings ...)
2025-01-10 22:48 ` Gregory Price
@ 2025-01-17 21:32 ` Ben Cheatham
2025-01-28 9:29 ` Robert Richter
5 siblings, 1 reply; 117+ messages in thread
From: Ben Cheatham @ 2025-01-17 21:32 UTC (permalink / raw)
To: Robert Richter
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman
On 1/7/25 8:10 AM, Robert Richter wrote:
> Add AMD platform specific Zen5 support for address translation.
>
> Zen5 systems may be configured to use 'Normalized addresses'. Then,
> CXL endpoints use their own physical address space and Host Physical
> Addresses (HPAs) need address translation from the endpoint to its CXL
> host bridge. The HPA of a CXL host bridge is equivalent to the System
> Physical Address (SPA).
>
> ACPI Platform Runtime Mechanism (PRM) is used to translate the CXL
> Device Physical Address (DPA) to its System Physical Address. This is
> documented in:
>
> AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
> ACPI v6.5 Porting Guide, Publication # 58088
> https://www.amd.com/en/search/documentation/hub.html
>
> Note that DPA and HPA of an endpoint may differ depending on the
> interleaving configuration. That is, an additional calculation between
> DPA and HPA is needed.
>
> To implement AMD Zen5 address translation the following steps are
> needed:
>
> Introduce the generic function cxl_port_platform_setup() that allows
> to apply platform specific changes to each port where necessary.
>
> Add a function cxl_port_setup_amd() to implement AMD platform specific
> code. Use Kbuild and Kconfig options respectively to enable the code
> depending on architecture and platform options. Create a new file
> core/amd.c for this.
>
> Introduce a function cxl_zen5_init() to handle Zen5 specific
> enablement. Zen5 platforms are detected using the PCIe vendor and
> device ID of the corresponding CXL root port.
>
> Apply cxl_zen5_to_hpa() as cxl_port->to_hpa() callback to Zen5 CXL
> host bridges to enable platform specific address translation.
>
> Use ACPI PRM DPA to SPA translation to determine an endpoint's
> interleaving configuration and base address during the early
> initialization process. This is used to determine an endpoint's SPA
> range.
>
> Since the PRM translates DPA->SPA, but HPA->SPA is needed, determine
> the interleaving config and base address of the endpoint first, then
> calculate the SPA based on the given HPA using the address base.
>
> The config can be determined calling the PRM for specific DPAs
> given. Since the interleaving configuration is still unknown, choose
> DPAs starting at 0xd20000. This address is a multiple of all values from
> 1 to 8 and thus valid for all possible interleaving configurations.
> The resulting SPAs are used to calculate interleaving parameters and
> the SPA base address of the endpoint. The maximum granularity (chunk
> size) is 16k, minimum is 256. Use the following calculation for a
> given DPA:
>
> ways = hpa_len(SZ_16K) / SZ_16K
> gran = (hpa_len(SZ_16K) - hpa_len(SZ_16K - SZ_256) - SZ_256)
> / (ways - 1)
> pos = (hpa_len(SZ_16K) - ways * SZ_16K) / gran
>
> Once the endpoint is attached to a region and its SPA range is known,
> calling the PRM is no longer needed, the SPA base can be used.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/Kconfig | 4 +
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/amd.c | 227 ++++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/core.h | 6 +
> drivers/cxl/core/port.c | 7 ++
> 5 files changed, 245 insertions(+)
> create mode 100644 drivers/cxl/core/amd.c
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 876469e23f7a..e576028dd983 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -146,4 +146,8 @@ config CXL_REGION_INVALIDATION_TEST
> If unsure, or if this kernel is meant for production environments,
> say N.
>
> +config CXL_AMD
> + def_bool y
> + depends on AMD_NB
> +
> endif
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 9259bcc6773c..dc368e61d281 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -16,3 +16,4 @@ cxl_core-y += pmu.o
> cxl_core-y += cdat.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +cxl_core-$(CONFIG_CXL_AMD) += amd.o
> diff --git a/drivers/cxl/core/amd.c b/drivers/cxl/core/amd.c
> new file mode 100644
> index 000000000000..553b7d0caefd
> --- /dev/null
> +++ b/drivers/cxl/core/amd.c
> @@ -0,0 +1,227 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2024 Advanced Micro Devices, Inc.
Make sure to update the year. Don't know if it's supposed to be 2024-2025 or just 2025.
> + */
> +
> +#include <linux/prmt.h>
> +#include <linux/pci.h>
> +
> +#include "cxlmem.h"
> +#include "core.h"
> +
> +#define PCI_DEVICE_ID_AMD_ZEN5_ROOT 0x153e
> +
> +static const struct pci_device_id zen5_root_port_ids[] = {
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_ZEN5_ROOT) },
> + {},
> +};
> +
> +static int is_zen5_root_port(struct device *dev, void *unused)
> +{
> + if (!dev_is_pci(dev))
> + return 0;
> +
> + return !!pci_match_id(zen5_root_port_ids, to_pci_dev(dev));
> +}
> +
> +static bool is_zen5(struct cxl_port *port)
> +{
> + if (!IS_ENABLED(CONFIG_ACPI_PRMT))
> + return false;
> +
> + /* To get the CXL root port, find the CXL host bridge first. */
> + if (is_cxl_root(port) ||
> + !port->host_bridge ||
> + !is_cxl_root(to_cxl_port(port->dev.parent)))
> + return false;
> +
> + return !!device_for_each_child(port->host_bridge, NULL,
> + is_zen5_root_port);
> +}
> +
> +/*
> + * PRM Address Translation - CXL DPA to System Physical Address
> + *
> + * Reference:
> + *
> + * AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
> + * ACPI v6.5 Porting Guide, Publication # 58088
> + */
> +
> +static const guid_t prm_cxl_dpa_spa_guid =
> + GUID_INIT(0xee41b397, 0x25d4, 0x452c, 0xad, 0x54, 0x48, 0xc6, 0xe3,
> + 0x48, 0x0b, 0x94);
> +
> +struct prm_cxl_dpa_spa_data {
> + u64 dpa;
> + u8 reserved;
> + u8 devfn;
> + u8 bus;
> + u8 segment;
> + void *out;
> +} __packed;
> +
> +static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
> +{
> + struct prm_cxl_dpa_spa_data data;
> + u64 spa;
> + int rc;
> +
> + data = (struct prm_cxl_dpa_spa_data) {
> + .dpa = dpa,
> + .devfn = pci_dev->devfn,
> + .bus = pci_dev->bus->number,
> + .segment = pci_domain_nr(pci_dev->bus),
> + .out = &spa,
> + };
> +
> + rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> + if (rc) {
> + pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
> + return ULLONG_MAX;
> + }
> +
> + pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
> +
> + return spa;
> +}
> +
> +static u64 cxl_zen5_to_hpa(struct cxl_decoder *cxld, u64 hpa)
After reading through your discussion with Gregory I'm not confused about the function name
anymore, but I would definitely include a comment specifying the expected usage and return value.
Thanks,
Ben
> +{
> + struct cxl_memdev *cxlmd;
> + struct pci_dev *pci_dev;
> + struct cxl_port *port;
> + u64 dpa, base, spa, spa2, len, len2, offset, granularity;
> + int ways, pos;
> +
> + /*
> + * Nothing to do if base is non-zero and Normalized Addressing
> + * is disabled.
> + */
> + if (cxld->hpa_range.start)
> + return hpa;
> +
> + /* Only translate from endpoint to its parent port. */
> + if (!is_endpoint_decoder(&cxld->dev))
> + return hpa;
> +
> + if (hpa > cxld->hpa_range.end) {
> + dev_dbg(&cxld->dev, "hpa addr %#llx out of range %#llx-%#llx\n",
> + hpa, cxld->hpa_range.start, cxld->hpa_range.end);
> + return ULLONG_MAX;
> + }
> +
> + /*
> + * If the decoder is already attached, the region's base can
> + * be used.
> + */
> + if (cxld->region)
> + return cxld->region->params.res->start + hpa;
> +
> + port = to_cxl_port(cxld->dev.parent);
> + cxlmd = port ? to_cxl_memdev(port->uport_dev) : NULL;
> + if (!port || !dev_is_pci(cxlmd->dev.parent)) {
> + dev_dbg(&cxld->dev, "No endpoint found: %s, range %#llx-%#llx\n",
> + dev_name(cxld->dev.parent), cxld->hpa_range.start,
> + cxld->hpa_range.end);
> + return ULLONG_MAX;
> + }
> + pci_dev = to_pci_dev(cxlmd->dev.parent);
> +
> + /*
> + * The PRM translates DPA->SPA, but we need HPA->SPA.
> + * Determine the interleaving config first, then calculate the
> + * DPA. Maximum granularity (chunk size) is 16k, minimum is
> + * 256. Calculated with:
> + *
> + * ways = hpa_len(SZ_16K) / SZ_16K
> + * gran = (hpa_len(SZ_16K) - hpa_len(SZ_16K - SZ_256) - SZ_256)
> + * / (ways - 1)
> + * pos = (hpa_len(SZ_16K) - ways * SZ_16K) / gran
> + */
> +
> + /*
> + * DPA magic:
> + *
> + * Position and granularity are unknown yet, use an always
> + * valid DPA:
> + *
> + * 0xd20000 = 13762560 = 16k * 2 * 3 * 2 * 5 * 7 * 2
> + *
> + * It is divisible by all positions 1 to 8. The DPA is valid
> + * for all positions and granularities.
> + */
> +#define DPA_MAGIC 0xd20000
> + base = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC);
> + spa = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K);
> + spa2 = prm_cxl_dpa_spa(pci_dev, DPA_MAGIC + SZ_16K - SZ_256);
> +
> + /* Includes checks to avoid div by zero */
> + if (!base || base == ULLONG_MAX || spa == ULLONG_MAX ||
> + spa2 == ULLONG_MAX || spa < base + SZ_16K || spa2 <= base ||
> + (spa > base + SZ_16K && spa - spa2 < SZ_256 * 2)) {
> + dev_dbg(&cxld->dev, "Error translating HPA: base %#llx, spa %#llx, spa2 %#llx\n",
> + base, spa, spa2);
> + return ULLONG_MAX;
> + }
> +
> + len = spa - base;
> + len2 = spa2 - base;
> +
> + /* offset = pos * granularity */
> + if (len == SZ_16K && len2 == SZ_16K - SZ_256) {
> + ways = 1;
> + offset = 0;
> + granularity = 0;
> + pos = 0;
> + } else {
> + ways = len / SZ_16K;
> + offset = spa & (SZ_16K - 1);
> + granularity = (len - len2 - SZ_256) / (ways - 1);
> + pos = offset / granularity;
> + }
> +
> + base = base - DPA_MAGIC * ways - pos * granularity;
> + spa = base + hpa;
> +
> + /*
> + * Check SPA using a PRM call for the closest DPA calculated
> + * for the HPA. If the HPA matches a different interleaving
> + * position other than the decoder's, determine its offset to
> + * adjust the SPA.
> + */
> +
> + dpa = (hpa & ~(granularity * ways - 1)) / ways
> + + (hpa & (granularity - 1));
> + offset = hpa & (granularity * ways - 1) & ~(granularity - 1);
> + offset -= pos * granularity;
> + spa2 = prm_cxl_dpa_spa(pci_dev, dpa) + offset;
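The mask arithmetic above relies on granularity and granularity * ways both being powers of two. A standalone sketch of the same two expressions, treating hpa as an offset from the start of the interleaved range (an assumption of this sketch); hpa_to_dpa() and hpa_chunk_offset() are hypothetical names, not functions from the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Device-local offset for an HPA offset in a modulo-interleaved range.
 * Assumes granularity and ways are powers of two.
 */
static uint64_t hpa_to_dpa(uint64_t hpa, uint64_t granularity, uint64_t ways)
{
	uint64_t stripe = granularity * ways;

	/* full stripes below hpa, shrunk by ways, plus offset in the chunk */
	return (hpa & ~(stripe - 1)) / ways + (hpa & (granularity - 1));
}

/*
 * Byte offset of the chunk holding hpa within its stripe, that is
 * (interleave position of that chunk) * granularity.
 */
static uint64_t hpa_chunk_offset(uint64_t hpa, uint64_t granularity,
				 uint64_t ways)
{
	return hpa & (granularity * ways - 1) & ~(granularity - 1);
}
```

For example, with 2 ways and a granularity of 256, HPA offset 0x300 falls into the second chunk of the second stripe, so it maps to device offset 0x100 at chunk offset 0x100 (position 1).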
> +
> + dev_dbg(&cxld->dev,
> + "address mapping found for %s (dpa -> hpa -> spa): %#llx -> %#llx -> %#llx base: %#llx ways: %d pos: %d granularity: %llu\n",
> + pci_name(pci_dev), dpa, hpa, spa, base, ways, pos, granularity);
> +
> + if (spa != spa2) {
> + dev_dbg(&cxld->dev, "SPA calculation failed: %#llx:%#llx\n",
> + spa, spa2);
> + return ULLONG_MAX;
> + }
> +
> + return spa;
> +}
> +
> +static void cxl_zen5_init(struct cxl_port *port)
> +{
> + if (!is_zen5(port))
> + return;
> +
> + port->to_hpa = cxl_zen5_to_hpa;
> +
> + dev_dbg(port->host_bridge, "PRM address translation enabled for %s.\n",
> + dev_name(&port->dev));
> +}
> +
> +void cxl_port_setup_amd(struct cxl_port *port)
> +{
> + cxl_zen5_init(port);
> +}
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 800466f96a68..efe34ae6943e 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -115,4 +115,10 @@ bool cxl_need_node_perf_attrs_update(int nid);
> int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
> struct access_coordinate *c);
>
> +#ifdef CONFIG_CXL_AMD
> +void cxl_port_setup_amd(struct cxl_port *port);
> +#else
> +static inline void cxl_port_setup_amd(struct cxl_port *port) {};
> +#endif
> +
> #endif /* __CXL_CORE_H__ */
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 901555bf4b73..c8176265c15c 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -831,6 +831,11 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> &cxl_einj_inject_fops);
> }
>
> +static void cxl_port_platform_setup(struct cxl_port *port)
> +{
> + cxl_port_setup_amd(port);
> +}
> +
> static int cxl_port_add(struct cxl_port *port,
> resource_size_t component_reg_phys,
> struct cxl_dport *parent_dport)
> @@ -868,6 +873,8 @@ static int cxl_port_add(struct cxl_port *port,
> return rc;
> }
>
> + cxl_port_platform_setup(port);
> +
> rc = device_add(dev);
> if (rc)
> return rc;
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT
2025-01-17 21:32 ` Ben Cheatham
@ 2025-01-28 9:29 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-28 9:29 UTC (permalink / raw)
To: Ben Cheatham
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, Terry Bowman
On 17.01.25 15:32:25, Ben Cheatham wrote:
> On 1/7/25 8:10 AM, Robert Richter wrote:
> > +static u64 cxl_zen5_to_hpa(struct cxl_decoder *cxld, u64 hpa)
>
> After reading through your discussion with Gregory I'm not confused about the function name
> anymore, but I would definitely include a comment specifying the expected usage and return value.
I am going to add that description to struct cxl_port where the
callback is implemented.
Thanks for review,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 26/29] MAINTAINERS: CXL: Add entry for AMD platform support (CXL_AMD)
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (24 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 25/29] cxl/amd: Enable Zen5 address translation using ACPI PRMT Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 14:10 ` [PATCH v1 27/29] cxl/region: Show message on registration failure Robert Richter
` (3 subsequent siblings)
29 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Add a maintainers entry for AMD platform-specific CXL support.
Cc: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
MAINTAINERS | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 910305c11e8a..13791005995e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5778,6 +5778,13 @@ F: include/cxl/
F: include/uapi/linux/cxl_mem.h
F: tools/testing/cxl/
+COMPUTE EXPRESS LINK - AMD (CXL_AMD)
+M: Robert Richter <rrichter@amd.com>
+M: Terry Bowman <terry.bowman@amd.com>
+L: linux-cxl@vger.kernel.org
+S: Maintained
+F: drivers/cxl/core/amd.c
+
COMPUTE EXPRESS LINK PMU (CPMU)
M: Jonathan Cameron <jonathan.cameron@huawei.com>
L: linux-cxl@vger.kernel.org
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [PATCH v1 27/29] cxl/region: Show message on registration failure
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (25 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 26/29] MAINTAINERS: CXL: Add entry for AMD platform support (CXL_AMD) Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 23:11 ` Gregory Price
2025-01-07 14:10 ` [PATCH v1 28/29] cxl/region: Show message on broken target list Robert Richter
` (2 subsequent siblings)
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Especially in complex system configurations with multiple endpoints
and interleaving setups, region setup failures are hard to detect
because region registration may fail silently. Add messages to show
registration failures.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 1dae7d36d37c..775450a1a887 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2240,6 +2240,12 @@ static int attach_target(struct cxl_region *cxlr,
rc = cxl_region_attach(cxlr, cxled, pos);
up_read(&cxl_dpa_rwsem);
up_write(&cxl_region_rwsem);
+
+ if (rc)
+ dev_warn(cxled->cxld.dev.parent,
+ "failed to attach %s to %s: %d\n",
+ dev_name(&cxled->cxld.dev), dev_name(&cxlr->dev), rc);
+
return rc;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 27/29] cxl/region: Show message on registration failure
2025-01-07 14:10 ` [PATCH v1 27/29] cxl/region: Show message on registration failure Robert Richter
@ 2025-01-07 23:11 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 23:11 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:13PM +0100, Robert Richter wrote:
> Esp. in complex system configurations with multiple endpoints and
> interleaving setups it is hard to detect region setup failures as its
> registration may silently fail. Add messages to show registration
> failures.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/region.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 1dae7d36d37c..775450a1a887 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2240,6 +2240,12 @@ static int attach_target(struct cxl_region *cxlr,
> rc = cxl_region_attach(cxlr, cxled, pos);
> up_read(&cxl_dpa_rwsem);
> up_write(&cxl_region_rwsem);
> +
> + if (rc)
> + dev_warn(cxled->cxld.dev.parent,
> + "failed to attach %s to %s: %d\n",
> + dev_name(&cxled->cxld.dev), dev_name(&cxlr->dev), rc);
> +
> return rc;
> }
>
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 28/29] cxl/region: Show message on broken target list
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (26 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 27/29] cxl/region: Show message on registration failure Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 23:12 ` Gregory Price
2025-01-14 11:16 ` Jonathan Cameron
2025-01-07 14:10 ` [PATCH v1 29/29] cxl: Show message when a decoder was added to a port Robert Richter
2025-01-13 18:41 ` [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Alison Schofield
29 siblings, 2 replies; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Broken target lists are hard to discover as the driver fails at a
later initialization stage. Add an error message for this.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/core/region.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 775450a1a887..2af3b6c14f46 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1870,6 +1870,13 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
}
put_device(dev);
+ if (rc)
+ dev_err(port->uport_dev,
+ "failed to find %s:%s in target list of %s\n",
+ dev_name(&port->dev),
+ dev_name(port->parent_dport->dport_dev),
+ dev_name(&cxlsd->cxld.dev));
+
return rc;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 28/29] cxl/region: Show message on broken target list
2025-01-07 14:10 ` [PATCH v1 28/29] cxl/region: Show message on broken target list Robert Richter
@ 2025-01-07 23:12 ` Gregory Price
2025-01-14 11:16 ` Jonathan Cameron
1 sibling, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 23:12 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:14PM +0100, Robert Richter wrote:
> Broken target lists are hard to discover as the driver fails at a
> later initialization stage. Add an error message for this.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/core/region.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 775450a1a887..2af3b6c14f46 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1870,6 +1870,13 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> }
> put_device(dev);
>
> + if (rc)
> + dev_err(port->uport_dev,
> + "failed to find %s:%s in target list of %s\n",
> + dev_name(&port->dev),
> + dev_name(port->parent_dport->dport_dev),
> + dev_name(&cxlsd->cxld.dev));
> +
> return rc;
> }
>
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 28/29] cxl/region: Show message on broken target list
2025-01-07 14:10 ` [PATCH v1 28/29] cxl/region: Show message on broken target list Robert Richter
2025-01-07 23:12 ` Gregory Price
@ 2025-01-14 11:16 ` Jonathan Cameron
2025-02-06 21:23 ` Robert Richter
1 sibling, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-01-14 11:16 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, 7 Jan 2025 15:10:14 +0100
Robert Richter <rrichter@amd.com> wrote:
> Broken target lists are hard to discover as the driver fails at a
> later initialization stage. Add an error message for this.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
> drivers/cxl/core/region.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 775450a1a887..2af3b6c14f46 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1870,6 +1870,13 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> }
> put_device(dev);
>
> + if (rc)
> + dev_err(port->uport_dev,
> + "failed to find %s:%s in target list of %s\n",
> + dev_name(&port->dev),
> + dev_name(port->parent_dport->dport_dev),
> + dev_name(&cxlsd->cxld.dev));
> +
> return rc;
> }
This function would benefit from some __free() magic dust.
Then we could return in the good path in the loop and not need the if (rc)
check here.
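For readers unfamiliar with it: the kernel's __free() is built on the compiler's cleanup variable attribute. A minimal userspace sketch of the early-return pattern, assuming GCC or Clang; struct obj, obj_get(), obj_put(), AUTO_PUT and find_pos() are hypothetical stand-ins, not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for struct device */
struct obj {
	int refs;
};

static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refs++;
	return o;
}

static void obj_put(struct obj *o)
{
	if (o)
		o->refs--;
}

/* Cleanup handlers receive a pointer to the annotated variable */
static void obj_put_cleanup(struct obj **o)
{
	obj_put(*o);
}

#define AUTO_PUT __attribute__((cleanup(obj_put_cleanup)))

/* Return the index of needle in targets[], or -1 if it is not listed */
static int find_pos(struct obj *parent, const int *targets, int ways,
		    int needle)
{
	struct obj *ref AUTO_PUT = obj_get(parent);

	if (!ref)
		return -1;

	for (int i = 0; i < ways; i++) {
		if (targets[i] == needle)
			return i;	/* reference dropped on scope exit */
	}

	return -1;			/* same on the failure path */
}
```

Every return path drops the reference without an explicit put, which is what allows returning from inside the loop.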
Otherwise looks fine.
Jonathan
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 28/29] cxl/region: Show message on broken target list
2025-01-14 11:16 ` Jonathan Cameron
@ 2025-02-06 21:23 ` Robert Richter
2025-02-07 17:51 ` Jonathan Cameron
0 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-02-06 21:23 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 14.01.25 11:16:41, Jonathan Cameron wrote:
> On Tue, 7 Jan 2025 15:10:14 +0100
> Robert Richter <rrichter@amd.com> wrote:
>
> > Broken target lists are hard to discover as the driver fails at a
> > later initialization stage. Add an error message for this.
> >
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> > drivers/cxl/core/region.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 775450a1a887..2af3b6c14f46 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -1870,6 +1870,13 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> > }
> > put_device(dev);
> >
> > + if (rc)
> > + dev_err(port->uport_dev,
> > + "failed to find %s:%s in target list of %s\n",
> > + dev_name(&port->dev),
> > + dev_name(port->parent_dport->dport_dev),
> > + dev_name(&cxlsd->cxld.dev));
> > +
> > return rc;
> > }
> This function would benefit from some __free() magic dust.
> Then we could return in the good path in the loop and not need the if (rc)
> check here.
That does not really simplify the code. It would just save this one
level of indentation. On the other hand, there is a central exit for
the code and we need only that one put_device(). Plus, I like to have
the 'success' code path returning at the end of the block.
-Robert
>
> Otherwise looks fine.
>
> Jonathan
>
> >
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 28/29] cxl/region: Show message on broken target list
2025-02-06 21:23 ` Robert Richter
@ 2025-02-07 17:51 ` Jonathan Cameron
2025-02-12 9:08 ` Robert Richter
0 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-02-07 17:51 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Thu, 6 Feb 2025 22:23:40 +0100
Robert Richter <rrichter@amd.com> wrote:
> On 14.01.25 11:16:41, Jonathan Cameron wrote:
> > On Tue, 7 Jan 2025 15:10:14 +0100
> > Robert Richter <rrichter@amd.com> wrote:
> >
> > > Broken target lists are hard to discover as the driver fails at a
> > > later initialization stage. Add an error message for this.
> > >
> > > Signed-off-by: Robert Richter <rrichter@amd.com>
> > > ---
> > > drivers/cxl/core/region.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > > index 775450a1a887..2af3b6c14f46 100644
> > > --- a/drivers/cxl/core/region.c
> > > +++ b/drivers/cxl/core/region.c
> > > @@ -1870,6 +1870,13 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> > > }
> > > put_device(dev);
> > >
> > > + if (rc)
> > > + dev_err(port->uport_dev,
> > > + "failed to find %s:%s in target list of %s\n",
> > > + dev_name(&port->dev),
> > > + dev_name(port->parent_dport->dport_dev),
> > > + dev_name(&cxlsd->cxld.dev));
> > > +
> > > return rc;
> > > }
> > This function would benefit from some __free() magic dust.
> > Then we could return in the good path in the loop and not need the if (rc)
> > check here.
>
> That does not really simplify the code. It would just save this one
> indentation. On the other side there is a central exit for the code
> and we just need only that one put_device(). Plus, I like to have the
> 'success' code path returning at the end of block.
Seems simpler to me to return early on finding a match.

static int find_pos_and_ways(struct cxl_port *port, struct range *range,
			     int *pos, int *ways)
{
	struct cxl_switch_decoder *cxlsd;
	struct cxl_port *parent;

	parent = next_port(port);
	if (!parent)
		return -ENXIO;

	struct device *dev __free(device) =
		device_find_child(&parent->dev, range,
				  match_switch_decoder_by_range);
	if (!dev) {
		dev_err(port->uport_dev,
			"failed to find decoder mapping %#llx-%#llx\n",
			range->start, range->end);
		return -ENODEV;
	}
	cxlsd = to_cxl_switch_decoder(dev);
	*ways = cxlsd->cxld.interleave_ways;

	for (int i = 0; i < *ways; i++) {
		if (cxlsd->target[i] == port->parent_dport) {
			*pos = i;
			return 0;
		}
	}
	dev_err(port->uport_dev,
		"failed to find %s:%s in target list of %s\n",
		dev_name(&port->dev),
		dev_name(port->parent_dport->dport_dev),
		dev_name(&cxlsd->cxld.dev));

	return -ENXIO;
}

I don't mind that much though. I'd also suggest that -ENXIO doesn't
seem the right return value for failing to find something.
>
> -Robert
>
> >
> > Otherwise looks fine.
> >
> > Jonathan
> >
> > >
> >
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 28/29] cxl/region: Show message on broken target list
2025-02-07 17:51 ` Jonathan Cameron
@ 2025-02-12 9:08 ` Robert Richter
0 siblings, 0 replies; 117+ messages in thread
From: Robert Richter @ 2025-02-12 9:08 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On 07.02.25 17:51:39, Jonathan Cameron wrote:
> On Thu, 6 Feb 2025 22:23:40 +0100
> Robert Richter <rrichter@amd.com> wrote:
>
> > On 14.01.25 11:16:41, Jonathan Cameron wrote:
> > > On Tue, 7 Jan 2025 15:10:14 +0100
> > > Robert Richter <rrichter@amd.com> wrote:
> > >
> > > > Broken target lists are hard to discover as the driver fails at a
> > > > later initialization stage. Add an error message for this.
> > > >
> > > > Signed-off-by: Robert Richter <rrichter@amd.com>
> > > > ---
> > > > drivers/cxl/core/region.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > > > index 775450a1a887..2af3b6c14f46 100644
> > > > --- a/drivers/cxl/core/region.c
> > > > +++ b/drivers/cxl/core/region.c
> > > > @@ -1870,6 +1870,13 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> > > > }
> > > > put_device(dev);
> > > >
> > > > + if (rc)
> > > > + dev_err(port->uport_dev,
> > > > + "failed to find %s:%s in target list of %s\n",
> > > > + dev_name(&port->dev),
> > > > + dev_name(port->parent_dport->dport_dev),
> > > > + dev_name(&cxlsd->cxld.dev));
> > > > +
> > > > return rc;
> > > > }
> > > This function would benefit from some __free() magic dust.
> > > Then we could return in the good path in the loop and not need the if (rc)
> > > check here.
> >
> > That does not really simplify the code. It would just save this one
> > indentation. On the other side there is a central exit for the code
> > and we just need only that one put_device(). Plus, I like to have the
> > 'success' code path returning at the end of block.
>
> Seems simpler to me to return early on finding a match.
>
> static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> int *pos, int *ways)
> {
> struct cxl_switch_decoder *cxlsd;
> struct cxl_port *parent;
>
> parent = next_port(port);
> if (!parent)
> return -ENXIO;
>
> struct device *dev __free(device) =
> device_find_child(&parent->dev, range,
> match_switch_decoder_by_range);
> if (!dev) {
> dev_err(port->uport_dev,
> "failed to find decoder mapping %#llx-%#llx\n",
> range->start, range->end);
> return -ENODEV;
> }
> cxlsd = to_cxl_switch_decoder(dev);
> *ways = cxlsd->cxld.interleave_ways;
>
> for (int i = 0; i < *ways; i++) {
> if (cxlsd->target[i] == port->parent_dport) {
> *pos = i;
> return 0;
> }
> }
> dev_err(port->uport_dev,
> "failed to find %s:%s in target list of %s\n",
> dev_name(&port->dev),
> dev_name(port->parent_dport->dport_dev),
> dev_name(&cxlsd->cxld.dev));
>
> return -ENXIO;
> }
>
> I don't mind that much though. I'd also suggest returning -ENXIO
> doesn't seem the right choice for failing to find something.
Right now I just want to add the dev_err() here. I could consider
changing to __free() and the error code in a follow-up cleanup series.
I would rather avoid adding (those 2) more patches to this series as
that increases the time to upstream acceptance.
Thanks,
-Robert
^ permalink raw reply [flat|nested] 117+ messages in thread
* [PATCH v1 29/29] cxl: Show message when a decoder was added to a port
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (27 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 28/29] cxl/region: Show message on broken target list Robert Richter
@ 2025-01-07 14:10 ` Robert Richter
2025-01-07 23:15 ` Gregory Price
2025-01-13 18:41 ` [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Alison Schofield
29 siblings, 1 reply; 117+ messages in thread
From: Robert Richter @ 2025-01-07 14:10 UTC (permalink / raw)
To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso
Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
Terry Bowman, Robert Richter
Improve debugging by adding and unifying the messages emitted whenever
a decoder is added to a port. This is especially useful for getting
the decoder mapping of the involved CXL host bridge or PCI device, and
it avoids a complex lookup of the decoder/port/device mappings in sysfs.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/acpi.c | 10 +++++++++-
drivers/cxl/core/hdm.c | 3 ++-
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index b42cffd6751f..ac74b6f6dad7 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -421,7 +421,15 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
rc = cxl_decoder_add(cxld, target_map);
if (rc)
return rc;
- return cxl_root_decoder_autoremove(dev, no_free_ptr(cxlrd));
+
+ rc = cxl_root_decoder_autoremove(dev, no_free_ptr(cxlrd));
+ if (rc)
+ return rc;
+
+ dev_dbg(root_port->dev.parent, "%s added to %s\n",
+ dev_name(&cxld->dev), dev_name(&root_port->dev));
+
+ return 0;
}
static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 28edd5822486..7d6778f908c8 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -34,7 +34,8 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
if (rc)
return rc;
- dev_dbg(&cxld->dev, "Added to port %s\n", dev_name(&port->dev));
+ dev_dbg(port->uport_dev, "%s added to %s\n",
+ dev_name(&cxld->dev), dev_name(&port->dev));
return 0;
}
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [PATCH v1 29/29] cxl: Show message when a decoder was added to a port
2025-01-07 14:10 ` [PATCH v1 29/29] cxl: Show message when a decoder was added to a port Robert Richter
@ 2025-01-07 23:15 ` Gregory Price
0 siblings, 0 replies; 117+ messages in thread
From: Gregory Price @ 2025-01-07 23:15 UTC (permalink / raw)
To: Robert Richter
Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
linux-kernel, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:10:15PM +0100, Robert Richter wrote:
> Improve debugging by adding and unifying messages whenever a decoder
> was added to a port. It is especially useful to get the decoder
> mapping of the involved CXL host bridge or PCI device. This avoids a
> complex lookup of the decoder/port/device mappings in sysfs.
It would be nice if we could create some of these links in the absence
of the full configuration, but I understand why that's difficult.
>
> Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/cxl/acpi.c | 10 +++++++++-
> drivers/cxl/core/hdm.c | 3 ++-
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index b42cffd6751f..ac74b6f6dad7 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -421,7 +421,15 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
> rc = cxl_decoder_add(cxld, target_map);
> if (rc)
> return rc;
> - return cxl_root_decoder_autoremove(dev, no_free_ptr(cxlrd));
> +
> + rc = cxl_root_decoder_autoremove(dev, no_free_ptr(cxlrd));
> + if (rc)
> + return rc;
> +
> + dev_dbg(root_port->dev.parent, "%s added to %s\n",
> + dev_name(&cxld->dev), dev_name(&root_port->dev));
> +
> + return 0;
> }
>
> static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 28edd5822486..7d6778f908c8 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -34,7 +34,8 @@ static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> if (rc)
> return rc;
>
> - dev_dbg(&cxld->dev, "Added to port %s\n", dev_name(&port->dev));
> + dev_dbg(port->uport_dev, "%s added to %s\n",
> + dev_name(&cxld->dev), dev_name(&port->dev));
>
> return 0;
> }
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms
2025-01-07 14:09 [PATCH v1 00/29] cxl: Add address translation support and enable AMD Zen5 platforms Robert Richter
` (28 preceding siblings ...)
2025-01-07 14:10 ` [PATCH v1 29/29] cxl: Show message when a decoder was added to a port Robert Richter
@ 2025-01-13 18:41 ` Alison Schofield
29 siblings, 0 replies; 117+ messages in thread
From: Alison Schofield @ 2025-01-13 18:41 UTC (permalink / raw)
To: Robert Richter
Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
Gregory Price, Fabio M. De Francesco, Terry Bowman
On Tue, Jan 07, 2025 at 03:09:46PM +0100, Robert Richter wrote:
> This patch set adds support of address translation and enables this
> for AMD Zen5 platforms. This is a new approach in response to an
> earlier attempt to implement CXL address translation [1] and the
> comments on it, esp. Dan's [2]. Dan suggested to solve this by walking
> the port hierarchy from the host port to the host bridge. When
> crossing memory domains from one port to the other, HPA translations
> are applied using a callback function to handle platform specifics.
>
> The CXL driver currently does not implement address translation which
> assumes the host physical addresses (HPA) and system physical
> addresses (SPA) are equal.
Hi Robert,
I tried this out and have some sporadic review. (It's a big set so please
pardon that I'm not giving a comprehensive review on v1.)
I commented directly on a couple of patches, which were things that I touched
to get user created regions working. So that leads to my main feedback - let's
keep user created regions alive.
This set made me wonder if there may be an advantage in further delineating the
auto vs user region creation path. This set is a special auto method and we
have hetero-interleave coming which will introduce more auto only special-ness.
Just a thought that was triggered by seeing the refactoring of the interleave
calculation path.
Suggest ditching the term 'address translation' and specifically stating what
is being translated. And that may lead to some clean up in code/comments that
I added during the DPA->HPA->SPA translation when I thought it was the only
translation in town.
DPA to HPA: exists to report HPAs in trace events when a device reports a DPA
media error.
HPA to SPA: exists to report the above HPAs correctly when XOR interleave
arithmetic is used.
HPA to SPA: introduced here
The latter two HPA to SPA translations are not equivalent. The root decoder
callback added for XOR doesn't apply on the front-end the way the one
introduced here does. The HB decoder (hardware) performs the XOR translation
itself. The callback (as it exists today) only comes into play when DPA
addresses are directly reported in events. So some renaming may be needed there.
That's all for now. I'll review, try out more, in the v2.
--Alison
>
> Systems with different HPA and SPA addresses need address translation.
> If this is the case, the hardware addresses esp. used in the HDM
> decoder configurations are different to the system's or parent port
> address ranges. E.g. AMD Zen5 systems may be configured to use
> 'Normalized addresses'. Then, CXL endpoints have their own physical
> address base which is not the same as the SPA used by the CXL host
> bridge. Thus, addresses need to be translated from the endpoint's to
> its CXL host bridge's address range.
>
> To enable address translation, the endpoint's HPA range must be
> translated to each of the parent port's address ranges up to the root
> decoder. This is implemented by traversing the decoder and port
> hierarchy from the endpoint up to the root port and applying platform
> specific translation functions to determine the next HPA range of the
> parent port where needed:
>
> if (cxl_port->to_hpa)
> hpa = cxl_port->to_hpa(cxl_decoder, hpa)
>
> A callback is introduced to translate an HPA range from a port to its
> parent.
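The walk described above can be sketched in userspace with simplified stand-in types. Everything prefixed mock_ or MOCK_ is hypothetical. Only the shape of the to_hpa hook and the ULLONG_MAX error value are taken from the series, and unlike the real code this sketch reuses one decoder for every level:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef unsigned long long u64;

/* Simplified stand-ins for the kernel structures */
struct mock_decoder {
	u64 hpa_start;
	u64 hpa_end;
};

struct mock_port {
	struct mock_port *parent;
	/*
	 * Translate hpa from this port's address domain into the
	 * parent's. Returns ULLONG_MAX on failure. NULL means both
	 * ports share one address domain.
	 */
	u64 (*to_hpa)(struct mock_decoder *cxld, u64 hpa);
};

#define MOCK_BASE 0x1000000000ULL

/* Hypothetical platform hook: range check plus a fixed base offset */
static u64 mock_to_hpa(struct mock_decoder *cxld, u64 hpa)
{
	if (hpa < cxld->hpa_start || hpa > cxld->hpa_end)
		return ULLONG_MAX;
	return MOCK_BASE + hpa;
}

/* Walk from port up to the root, translating wherever a hook is set */
static u64 mock_translate_to_root(struct mock_port *port,
				  struct mock_decoder *cxld, u64 hpa)
{
	for (; port; port = port->parent) {
		if (!port->to_hpa)
			continue;
		hpa = port->to_hpa(cxld, hpa);
		if (hpa == ULLONG_MAX)
			return ULLONG_MAX;
	}
	return hpa;
}
```

Ports without a hook pass the address through unchanged, so platforms that need no translation see no difference in the result.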
>
> The root port's HPA range is equivalent to the system's SPA range and
> can then be used to find an endpoint's root port and region.
>
> Also, translated HPA ranges must be used to calculate the endpoint
> position in the region.
>
> Once the region was found, the decoders of all ports between the
> endpoint and the root port need to be found based on the translated
> HPA. Configuration checks and interleaving setup must be modified as
> necessary to support address translation.
>
> Note that only auto-discovery of decoders is supported. Thus, decoders
> are locked and cannot be configured manually.
>
> Finally, Zen5 address translation is enabled using ACPI PRMT.
>
> Purpose of patches:
>
> * Patches #1-#4: Minor cleanups and updates separated from the actual
> implementation
>
> * Patches #5-#12, #14, #17, #18: Code rework and refactoring.
>
> * Patches #13, #15, #16, #19-#24: Functional changes for address
> translation (common code).
>
> * Patches #25, #26: AMD Zen5 address translation.
>
> * Patches #27-#29: Changes to improve debug messages for better debugging.
>
> [1] https://lore.kernel.org/linux-cxl/20240701174754.967954-1-rrichter@amd.com/
> [2] https://lore.kernel.org/linux-cxl/669086821f136_5fffa29473@dwillia2-xfh.jf.intel.com.notmuch/
>
>
> Robert Richter (29):
> cxl: Remove else after return
> cxl/pci: Moving code in cxl_hdm_decode_init()
> cxl/pci: cxl_hdm_decode_init: Move comment
> cxl/pci: Add comments to cxl_hdm_decode_init()
> cxl/region: Move find_cxl_root() to cxl_add_to_region()
> cxl/region: Factor out code to find the root decoder
> cxl/region: Factor out code to find a root decoder's region
> cxl/region: Split region registration into an initialization and
> adding part
> cxl/region: Use iterator to find the root port in
> cxl_find_root_decoder()
> cxl/region: Add function to find a port's switch decoder by range
> cxl/region: Unfold cxl_find_root_decoder() into
> cxl_endpoint_initialize()
> cxl: Modify address translation callback for generic use
> cxl: Introduce callback to translate an HPA range from a port to its
> parent
> cxl: Introduce parent_port_of() helper
> cxl/region: Use an endpoint's SPA range to find a region
> cxl/region: Use translated HPA ranges to calculate the endpoint
> position
> cxl/region: Rename function to cxl_find_decoder_early()
> cxl/region: Avoid duplicate call of cxl_find_decoder_early()
> cxl/region: Use endpoint's HPA range to find the port's decoder
> cxl/region: Use translated HPA ranges to find the port's decoder
> cxl/region: Lock decoders that need address translation
> cxl/region: Use translated HPA ranges to create a region
> cxl/region: Use root decoders interleaving parameters to create a
> region
> cxl/region: Use endpoint's SPA range to check a region
> cxl/amd: Enable Zen5 address translation using ACPI PRMT
> MAINTAINERS: CXL: Add entry for AMD platform support (CXL_AMD)
> cxl/region: Show message on registration failure
> cxl/region: Show message on broken target list
> cxl: Show message when a decoder was added to a port
>
> MAINTAINERS | 7 +
> drivers/cxl/Kconfig | 4 +
> drivers/cxl/acpi.c | 14 +-
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/amd.c | 227 +++++++++++++++++++++
> drivers/cxl/core/cdat.c | 2 +-
> drivers/cxl/core/core.h | 6 +
> drivers/cxl/core/hdm.c | 3 +-
> drivers/cxl/core/pci.c | 44 +++--
> drivers/cxl/core/port.c | 22 ++-
> drivers/cxl/core/region.c | 407 ++++++++++++++++++++++++++++----------
> drivers/cxl/cxl.h | 16 +-
> drivers/cxl/port.c | 22 +--
> 13 files changed, 623 insertions(+), 152 deletions(-)
> create mode 100644 drivers/cxl/core/amd.c
>
>
> base-commit: 2f84d072bdcb7d6ec66cc4d0de9f37a3dc394cd2
> --
> 2.39.5
>