* [PATCH v2 0/3] cxl/region: Autodiscovery position repair
@ 2023-10-16 6:02 alison.schofield
2023-10-16 6:02 ` [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse alison.schofield
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: alison.schofield @ 2023-10-16 6:02 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl
From: Alison Schofield <alison.schofield@intel.com>
Changes in v2:
- Use a 'skip', which is a number of siblings to skip over, rather than
an 'index' when finding a child's position in a parent interleave.
- Tidy up commit messages for clarity and grammar. (DaveJ)
- Update this cover letter with added testing configs that led to the
'skip' change in the calculation.
Begin original cover letter:
Some region configurations fail to assemble through the auto-discovered
region path. These are valid region configurations that can be assembled
correctly if presented as user defined regions.
The difference being that user defined regions arrive at the driver
with their targets in interleave order, whereas with autodiscovered
regions, the driver needs to assign each target in the interleave
set a correct position. And, in some cases, that fails.
cxl_region_sort_targets() uses the kernel sort() function to put the
targets in relative order. Once the relative ordering is complete,
positions are assigned based on each targets index in the sorted list.
That relative sort doesn't consider the offset of a port into its
parent port. In the failure case, a 2 + 2 config (2 host bridges each
with 2 endpoints), this causes the sort to put all targets of one port
ahead of another port, when they were expected to be interleaved.
While examining the problem and weighing the option of repairing the
existing sort algorithm with assigning positions another way, I chose
the latter. Each endpoint can be examined individually to discover its
position in the region interleave.
The presentation of this patchset was a challenge. While the changes
are essentially a replacement the resulting diff is horrible. (I did
try multiple git diff algs). So, after a simple preparation patch,
its presented like this:
Patch 2:The new method, calc_interleave_pos(), is introduced and used
in a dev_dbg() exercise on user defined regions.
Patch 3:Calc_interleave_pos() replaces the relative sort() in
cxl_region_sort_targets() for auto-discoverd regions
and the now obsolete sort helpers are removed.
The only function that seems useful for a side by side diff viewing
is cxl_region_sort_targets() and it is in Patch 3.
Tested with interleave at the Host Bridges, no switches:
1, 1+1, 1+1+1, 1+1+1+1, 2, 2+2, 2+2+2, 2+2+2+2, 4, 4+4
Tested with switches, 8 way interleave using 2 HB 2 Switches.
Tested against all regions created in CXL unit test runs.
Alison Schofield (3):
cxl/region: Prepare the decoder match range helper for reuse
cxl/region: Calculate a target position in a region interleave
cxl/region: Use calc_interleave_pos() with autodiscovered regions
drivers/cxl/core/region.c | 207 ++++++++++++++++++++------------------
1 file changed, 108 insertions(+), 99 deletions(-)
base-commit: 8a749fd1a8720d4619c91c8b6e7528c0a355c0aa
--
2.37.3
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-16 6:02 [PATCH v2 0/3] cxl/region: Autodiscovery position repair alison.schofield
@ 2023-10-16 6:02 ` alison.schofield
2023-10-17 16:21 ` Jim Harris
2023-10-23 20:54 ` Dan Williams
2023-10-16 6:02 ` [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave alison.schofield
2023-10-16 6:02 ` [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions alison.schofield
2 siblings, 2 replies; 19+ messages in thread
From: alison.schofield @ 2023-10-16 6:02 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Dmytro Adamenko, Jonathan Cameron
From: Alison Schofield <alison.schofield@intel.com>
match_decoder_by_range() and decoder_match_range() both determine
if an HPA range matches a decoder. The first does it for root
decoders and the second one operates on switch decoders.
Tidy these up with clear naming and make the switch helper more
like the root decoder helper in style and functionality. Make it
take the actual range, rather than an endpoint decoder from which
it extracts the range.
Aside from aesthetics and maintainability, this is in preparation
for reuse.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
drivers/cxl/core/region.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6d63b8798c29..64206fc4d99b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1487,16 +1487,19 @@ static struct cxl_port *next_port(struct cxl_port *port)
return port->parent_dport->port;
}
-static int decoder_match_range(struct device *dev, void *data)
+static int match_switch_decoder_by_range(struct device *dev, void *data)
{
- struct cxl_endpoint_decoder *cxled = data;
+ struct range *r1, *r2 = data;
struct cxl_switch_decoder *cxlsd;
if (!is_switch_decoder(dev))
return 0;
cxlsd = to_cxl_switch_decoder(dev);
- return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
+ r1 = &cxlsd->cxld.hpa_range;
+ return range_contains(r1, r2);
+}
+
}
static void find_positions(const struct cxl_switch_decoder *cxlsd,
@@ -1565,7 +1568,8 @@ static int cmp_decode_pos(const void *a, const void *b)
goto err;
}
- dev = device_find_child(&port->dev, cxled_a, decoder_match_range);
+ dev = device_find_child(&port->dev, &cxled_a->cxld.hpa_range,
+ match_switch_decoder_by_range);
if (!dev) {
struct range *range = &cxled_a->cxld.hpa_range;
@@ -2696,7 +2700,7 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
return rc;
}
-static int match_decoder_by_range(struct device *dev, void *data)
+static int match_root_decoder_by_range(struct device *dev, void *data)
{
struct range *r1, *r2 = data;
struct cxl_root_decoder *cxlrd;
@@ -2827,7 +2831,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
int rc;
cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
- match_decoder_by_range);
+ match_root_decoder_by_range);
if (!cxlrd_dev) {
dev_err(cxlmd->dev.parent,
"%s:%s no CXL window for range %#llx:%#llx\n",
--
2.37.3
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave
2023-10-16 6:02 [PATCH v2 0/3] cxl/region: Autodiscovery position repair alison.schofield
2023-10-16 6:02 ` [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse alison.schofield
@ 2023-10-16 6:02 ` alison.schofield
2023-10-17 17:33 ` Jim Harris
2023-10-23 21:47 ` Dan Williams
2023-10-16 6:02 ` [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions alison.schofield
2 siblings, 2 replies; 19+ messages in thread
From: alison.schofield @ 2023-10-16 6:02 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Dmytro Adamenko
From: Alison Schofield <alison.schofield@intel.com>
Introduce a calculation that determines a targets position in a region
interleave. Perform a selftest of the calculation on user-defined
regions.
The region driver uses the kernel sort() function to put region
targets in relative order. Positions are assigned based on each
targets index in that sorted list. That relative sort doesn't
consider the offset of a port into its parent port which causes
some auto-discovered regions to fail creation. In one failure case,
a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
puts all the targets of one port ahead of another port when they
were expected to be interleaved.
In preparation for repairing the autodiscovery region assembly,
introduce a new method for discovering a target position in the
region interleave.
cxl_interleave_pos() offers a method to determine a targets position
by ascending from an endpoint to a root decoder. The calculation starts
with the endpoints local position and its position in its parents port.
It traverses towards the root decoder and examines both position and
ways in order to allow the position to be refined all the way to the
root decoder.
This calculation, applied iteratively, yields the correct position:
position = position * parent_ways + parent_pos;
...with these rules:
Rule #1 - When (parent_ways == region_ways), Stop!
position = parent_position;
This rule is applied in calc_interleave_pos()
Rule #2 - Skip over siblings that come before this memdev in
the decoder list when searching for the parent position.
This rule is applied in the helper find_pos_and_ways().
Include a selftest that exercises this new position calculation against
every successfully configured user-defined region.
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
drivers/cxl/core/region.c | 102 ++++++++++++++++++++++++++++++++++++++
1 file changed, 102 insertions(+)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 64206fc4d99b..b451d215c3c5 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1500,6 +1500,93 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
return range_contains(r1, r2);
}
+/* Find the position of a port in it's parent and the parents ways */
+static int find_pos_and_ways(struct cxl_port *port, struct range *range,
+ int *pos, int *ways)
+{
+ struct cxl_switch_decoder *cxlsd;
+ struct cxl_port *parent;
+ int child_ways = *ways;
+ int child_pos = *pos;
+ struct device *dev;
+ int skip = 0;
+ int rc = -1;
+
+ parent = next_port(port);
+ if (!parent)
+ return rc;
+
+ dev = device_find_child(&parent->dev, range,
+ match_switch_decoder_by_range);
+ if (!dev) {
+ dev_err(port->uport_dev,
+ "failed to find decoder mapping %#llx-%#llx\n",
+ range->start, range->end);
+ return rc;
+ }
+ cxlsd = to_cxl_switch_decoder(dev);
+ *ways = cxlsd->cxld.interleave_ways;
+
+ /* Skip over this many siblings in the target list */
+ if (*ways > child_ways)
+ skip = child_pos;
+
+ for (int i = 0; i < *ways; i++) {
+ if (cxlsd->target[i] == port->parent_dport) {
+ if (skip--)
+ continue;
+ *pos = i;
+ rc = 0;
+ break;
+ }
+ }
+ put_device(dev);
+
+ return rc;
+}
+
+static int calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
+ int region_ways)
+{
+ struct cxl_port *iter, *port = cxled_to_port(cxled);
+ struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+ struct range *range = &cxled->cxld.hpa_range;
+ int parent_ways = 0;
+ int parent_pos = 0;
+ int rc, pos;
+
+ /* Initialize pos to its local position */
+ rc = find_pos_and_ways(port, range, &parent_pos, &parent_ways);
+ if (rc)
+ return -ENXIO;
+
+ pos = parent_pos;
+
+ if (parent_ways == region_ways)
+ goto out;
+
+ /* Iterate up the ancestral tree refining the position */
+ for (iter = next_port(port); iter; iter = next_port(iter)) {
+ if (is_cxl_root(iter))
+ break;
+
+ rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
+ if (rc)
+ return -ENXIO;
+
+ if (parent_ways == region_ways) {
+ pos = parent_pos;
+ break;
+ }
+ pos = pos * parent_ways + parent_pos;
+ }
+out:
+ dev_dbg(&cxlmd->dev,
+ "decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
+ dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
+ dev_name(&port->dev), range->start, range->end, pos);
+
+ return pos;
}
static void find_positions(const struct cxl_switch_decoder *cxlsd,
@@ -1765,6 +1852,21 @@ static int cxl_region_attach(struct cxl_region *cxlr,
.end = p->res->end,
};
+ if (p->nr_targets != p->interleave_ways)
+ return 0;
+
+ /* Exercise position calculator on user-defined regions */
+ for (int i = 0; i < p->nr_targets; i++) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+ int test_pos;
+
+ test_pos = calc_interleave_pos(cxled, p->interleave_ways);
+ dev_dbg(&cxled->cxld.dev,
+ "Interleave calc match %s test_pos:%d cxled->pos:%d\n",
+ (test_pos == cxled->pos) ? "Success" : "Fail",
+ test_pos, cxled->pos);
+ }
+
return 0;
err_decrement:
--
2.37.3
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions
2023-10-16 6:02 [PATCH v2 0/3] cxl/region: Autodiscovery position repair alison.schofield
2023-10-16 6:02 ` [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse alison.schofield
2023-10-16 6:02 ` [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave alison.schofield
@ 2023-10-16 6:02 ` alison.schofield
2023-10-17 17:40 ` Jim Harris
2023-10-23 21:58 ` Dan Williams
2 siblings, 2 replies; 19+ messages in thread
From: alison.schofield @ 2023-10-16 6:02 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Dmytro Adamenko
From: Alison Schofield <alison.schofield@intel.com>
For auto-discovered regions, the driver must assign each target to
the correct position in the region interleave set.
cxl_region_sort_targets() uses the kernel sort() function to put the
targets in relative order. Once the relative ordering is complete,
positions are assigned based on each targets index in that sorted list.
The sort() compare function does not consider the child offset into a
parent port. The sort put all targets of one port ahead of another
port when an interleave was expected, causing the region assembly to
fail.
Replace the relative sort, with calc_interleave_pos() on each target
in the region target list. That will find the exact position for each
target based on a walk up the ancestral tree from endpoint to root
decoder.
calc_interleave_pos() was introduced in a prior patch, so the work
here is to use in cxl_region_sort_targets().
Cleanup the obsolete helper functions from the prior sort().
Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/core/region.c | 127 +++++---------------------------------
1 file changed, 15 insertions(+), 112 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index b451d215c3c5..fc261c82ea4e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1480,6 +1480,14 @@ static int cxl_region_attach_auto(struct cxl_region *cxlr,
return 0;
}
+static int cmp_interleave_pos(const void *a, const void *b)
+{
+ struct cxl_endpoint_decoder *cxled_a = *(typeof(cxled_a) *)a;
+ struct cxl_endpoint_decoder *cxled_b = *(typeof(cxled_b) *)b;
+
+ return cxled_a->pos - cxled_b->pos;
+}
+
static struct cxl_port *next_port(struct cxl_port *port)
{
if (!port->parent_dport)
@@ -1589,131 +1597,26 @@ static int calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
return pos;
}
-static void find_positions(const struct cxl_switch_decoder *cxlsd,
- const struct cxl_port *iter_a,
- const struct cxl_port *iter_b, int *a_pos,
- int *b_pos)
-{
- int i;
-
- for (i = 0, *a_pos = -1, *b_pos = -1; i < cxlsd->nr_targets; i++) {
- if (cxlsd->target[i] == iter_a->parent_dport)
- *a_pos = i;
- else if (cxlsd->target[i] == iter_b->parent_dport)
- *b_pos = i;
- if (*a_pos >= 0 && *b_pos >= 0)
- break;
- }
-}
-
-static int cmp_decode_pos(const void *a, const void *b)
-{
- struct cxl_endpoint_decoder *cxled_a = *(typeof(cxled_a) *)a;
- struct cxl_endpoint_decoder *cxled_b = *(typeof(cxled_b) *)b;
- struct cxl_memdev *cxlmd_a = cxled_to_memdev(cxled_a);
- struct cxl_memdev *cxlmd_b = cxled_to_memdev(cxled_b);
- struct cxl_port *port_a = cxled_to_port(cxled_a);
- struct cxl_port *port_b = cxled_to_port(cxled_b);
- struct cxl_port *iter_a, *iter_b, *port = NULL;
- struct cxl_switch_decoder *cxlsd;
- struct device *dev;
- int a_pos, b_pos;
- unsigned int seq;
-
- /* Exit early if any prior sorting failed */
- if (cxled_a->pos < 0 || cxled_b->pos < 0)
- return 0;
-
- /*
- * Walk up the hierarchy to find a shared port, find the decoder that
- * maps the range, compare the relative position of those dport
- * mappings.
- */
- for (iter_a = port_a; iter_a; iter_a = next_port(iter_a)) {
- struct cxl_port *next_a, *next_b;
-
- next_a = next_port(iter_a);
- if (!next_a)
- break;
-
- for (iter_b = port_b; iter_b; iter_b = next_port(iter_b)) {
- next_b = next_port(iter_b);
- if (next_a != next_b)
- continue;
- port = next_a;
- break;
- }
-
- if (port)
- break;
- }
-
- if (!port) {
- dev_err(cxlmd_a->dev.parent,
- "failed to find shared port with %s\n",
- dev_name(cxlmd_b->dev.parent));
- goto err;
- }
-
- dev = device_find_child(&port->dev, &cxled_a->cxld.hpa_range,
- match_switch_decoder_by_range);
- if (!dev) {
- struct range *range = &cxled_a->cxld.hpa_range;
-
- dev_err(port->uport_dev,
- "failed to find decoder that maps %#llx-%#llx\n",
- range->start, range->end);
- goto err;
- }
-
- cxlsd = to_cxl_switch_decoder(dev);
- do {
- seq = read_seqbegin(&cxlsd->target_lock);
- find_positions(cxlsd, iter_a, iter_b, &a_pos, &b_pos);
- } while (read_seqretry(&cxlsd->target_lock, seq));
-
- put_device(dev);
-
- if (a_pos < 0 || b_pos < 0) {
- dev_err(port->uport_dev,
- "failed to find shared decoder for %s and %s\n",
- dev_name(cxlmd_a->dev.parent),
- dev_name(cxlmd_b->dev.parent));
- goto err;
- }
-
- dev_dbg(port->uport_dev, "%s comes %s %s\n",
- dev_name(cxlmd_a->dev.parent),
- a_pos - b_pos < 0 ? "before" : "after",
- dev_name(cxlmd_b->dev.parent));
-
- return a_pos - b_pos;
-err:
- cxled_a->pos = -1;
- return 0;
-}
-
static int cxl_region_sort_targets(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
int i, rc = 0;
- sort(p->targets, p->nr_targets, sizeof(p->targets[0]), cmp_decode_pos,
- NULL);
-
for (i = 0; i < p->nr_targets; i++) {
struct cxl_endpoint_decoder *cxled = p->targets[i];
+ cxled->pos = calc_interleave_pos(cxled, p->interleave_ways);
/*
- * Record that sorting failed, but still continue to restore
- * cxled->pos with its ->targets[] position so that follow-on
- * code paths can reliably do p->targets[cxled->pos] to
- * self-reference their entry.
+ * Record that sorting failed, but still continue to calc
+ * cxled->pos so that follow-on code paths can reliably
+ * do p->targets[cxled->pos] to self-reference their entry.
*/
if (cxled->pos < 0)
rc = -ENXIO;
- cxled->pos = i;
}
+ /* Keep the cxlr target list in interleave position order */
+ sort(p->targets, p->nr_targets, sizeof(p->targets[0]),
+ cmp_interleave_pos, NULL);
dev_dbg(&cxlr->dev, "region sort %s\n", rc ? "failed" : "successful");
return rc;
--
2.37.3
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-16 6:02 ` [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse alison.schofield
@ 2023-10-17 16:21 ` Jim Harris
2023-10-17 17:24 ` Jim Harris
` (2 more replies)
2023-10-23 20:54 ` Dan Williams
1 sibling, 3 replies; 19+ messages in thread
From: Jim Harris @ 2023-10-17 16:21 UTC (permalink / raw)
To: alison.schofield@intel.com
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
> On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
>
> From: Alison Schofield <alison.schofield@intel.com>
>
> match_decoder_by_range() and decoder_match_range() both determine
> if an HPA range matches a decoder. The first does it for root
> decoders and the second one operates on switch decoders.
>
> Tidy these up with clear naming and make the switch helper more
> like the root decoder helper in style and functionality. Make it
> take the actual range, rather than an endpoint decoder from which
> it extracts the range.
>
> Aside from aesthetics and maintainability, this is in preparation
> for reuse.
>
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/cxl/core/region.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..64206fc4d99b 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1487,16 +1487,19 @@ static struct cxl_port *next_port(struct cxl_port *port)
> return port->parent_dport->port;
> }
>
> -static int decoder_match_range(struct device *dev, void *data)
> +static int match_switch_decoder_by_range(struct device *dev, void *data)
> {
> - struct cxl_endpoint_decoder *cxled = data;
> + struct range *r1, *r2 = data;
> struct cxl_switch_decoder *cxlsd;
>
> if (!is_switch_decoder(dev))
> return 0;
>
> cxlsd = to_cxl_switch_decoder(dev);
> - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
> + r1 = &cxlsd->cxld.hpa_range;
> + return range_contains(r1, r2);
> +}
Hi Alison,
This stray closing brace needs to be removed from this patch.
Rest of the patch looks good, I agree the naming is much better with these
changes.
Reviewed-by: Jim Harris <jim.harris@samsung.com>
> +
> }
>
> static void find_positions(const struct cxl_switch_decoder *cxlsd,
> @@ -1565,7 +1568,8 @@ static int cmp_decode_pos(const void *a, const void *b)
> goto err;
> }
>
> - dev = device_find_child(&port->dev, cxled_a, decoder_match_range);
> + dev = device_find_child(&port->dev, &cxled_a->cxld.hpa_range,
> + match_switch_decoder_by_range);
> if (!dev) {
> struct range *range = &cxled_a->cxld.hpa_range;
>
> @@ -2696,7 +2700,7 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> return rc;
> }
>
> -static int match_decoder_by_range(struct device *dev, void *data)
> +static int match_root_decoder_by_range(struct device *dev, void *data)
> {
> struct range *r1, *r2 = data;
> struct cxl_root_decoder *cxlrd;
> @@ -2827,7 +2831,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> int rc;
>
> cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
> - match_decoder_by_range);
> + match_root_decoder_by_range);
> if (!cxlrd_dev) {
> dev_err(cxlmd->dev.parent,
> "%s:%s no CXL window for range %#llx:%#llx\n",
> --
> 2.37.3
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-17 16:21 ` Jim Harris
@ 2023-10-17 17:24 ` Jim Harris
2023-10-23 23:22 ` Alison Schofield
2023-10-17 20:43 ` Alison Schofield
2023-10-23 17:51 ` Alison Schofield
2 siblings, 1 reply; 19+ messages in thread
From: Jim Harris @ 2023-10-17 17:24 UTC (permalink / raw)
To: alison.schofield@intel.com
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
> On Oct 17, 2023, at 9:20 AM, Jim Harris <jim.harris@samsung.com> wrote:
>
>
>
>> On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
>>
>> From: Alison Schofield <alison.schofield@intel.com>
>>
>>
>> -static int decoder_match_range(struct device *dev, void *data)
>> +static int match_switch_decoder_by_range(struct device *dev, void *data)
>> {
>> - struct cxl_endpoint_decoder *cxled = data;
>> + struct range *r1, *r2 = data;
>> struct cxl_switch_decoder *cxlsd;
>>
>> if (!is_switch_decoder(dev))
>> return 0;
>>
>> cxlsd = to_cxl_switch_decoder(dev);
>> - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
>> + r1 = &cxlsd->cxld.hpa_range;
>> + return range_contains(r1, r2);
>> +}
>
> Hi Alison,
>
> This stray closing brace needs to be removed from this patch.
>
> Rest of the patch looks good, I agree the naming is much better with these
> changes.
>
> Reviewed-by: Jim Harris <jim.harris@samsung.com>
>
Looking at this again, I think this also needs to check if the switch
decoder is active. Non-active decoders will have range 0 to UINT64_MAX
which would pass the range_contains() check.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave
2023-10-16 6:02 ` [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave alison.schofield
@ 2023-10-17 17:33 ` Jim Harris
2023-10-23 18:10 ` Alison Schofield
2023-10-23 21:47 ` Dan Williams
1 sibling, 1 reply; 19+ messages in thread
From: Jim Harris @ 2023-10-17 17:33 UTC (permalink / raw)
To: alison.schofield@intel.com
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
> On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
>
> From: Alison Schofield <alison.schofield@intel.com>
>
> Introduce a calculation that determines a targets position in a region
> interleave. Perform a selftest of the calculation on user-defined
> regions.
>
> The region driver uses the kernel sort() function to put region
> targets in relative order. Positions are assigned based on each
> targets index in that sorted list. That relative sort doesn't
> consider the offset of a port into its parent port which causes
> some auto-discovered regions to fail creation. In one failure case,
> a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
> puts all the targets of one port ahead of another port when they
> were expected to be interleaved.
>
> In preparation for repairing the autodiscovery region assembly,
> introduce a new method for discovering a target position in the
> region interleave.
>
> cxl_interleave_pos() offers a method to determine a targets position
> by ascending from an endpoint to a root decoder. The calculation starts
> with the endpoints local position and its position in its parents port.
> It traverses towards the root decoder and examines both position and
> ways in order to allow the position to be refined all the way to the
> root decoder.
>
> This calculation, applied iteratively, yields the correct position:
>
> position = position * parent_ways + parent_pos;
>
> ...with these rules:
>
> Rule #1 - When (parent_ways == region_ways), Stop!
> position = parent_position;
> This rule is applied in calc_interleave_pos()
>
> Rule #2 - Skip over siblings that come before this memdev in
> the decoder list when searching for the parent position.
> This rule is applied in the helper find_pos_and_ways().
>
> Include a selftest that exercises this new position calculation against
> every successfully configured user-defined region.
>
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
> drivers/cxl/core/region.c | 102 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 102 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 64206fc4d99b..b451d215c3c5 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1500,6 +1500,93 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> return range_contains(r1, r2);
> }
>
> +/* Find the position of a port in it's parent and the parents ways */
> +static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> + int *pos, int *ways)
> +{
> + struct cxl_switch_decoder *cxlsd;
> + struct cxl_port *parent;
> + int child_ways = *ways;
> + int child_pos = *pos;
> + struct device *dev;
> + int skip = 0;
> + int rc = -1;
> +
> + parent = next_port(port);
> + if (!parent)
> + return rc;
> +
> + dev = device_find_child(&parent->dev, range,
> + match_switch_decoder_by_range);
> + if (!dev) {
> + dev_err(port->uport_dev,
> + "failed to find decoder mapping %#llx-%#llx\n",
> + range->start, range->end);
> + return rc;
> + }
> + cxlsd = to_cxl_switch_decoder(dev);
> + *ways = cxlsd->cxld.interleave_ways;
> +
> + /* Skip over this many siblings in the target list */
> + if (*ways > child_ways)
> + skip = child_pos;
> +
> + for (int i = 0; i < *ways; i++) {
> + if (cxlsd->target[i] == port->parent_dport) {
> + if (skip--)
> + continue;
> + *pos = i;
> + rc = 0;
> + break;
> + }
> + }
> + put_device(dev);
> +
> + return rc;
> +}
> +
> +static int calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
> + int region_ways)
> +{
> + struct cxl_port *iter, *port = cxled_to_port(cxled);
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct range *range = &cxled->cxld.hpa_range;
> + int parent_ways = 0;
> + int parent_pos = 0;
> + int rc, pos;
> +
> + /* Initialize pos to its local position */
> + rc = find_pos_and_ways(port, range, &parent_pos, &parent_ways);
> + if (rc)
> + return -ENXIO;
> +
> + pos = parent_pos;
> +
> + if (parent_ways == region_ways)
> + goto out;
> +
> + /* Iterate up the ancestral tree refining the position */
> + for (iter = next_port(port); iter; iter = next_port(iter)) {
I think this can be simplified by initializing with ‘iter = port’
here, and then the earlier lines can be removed, as well as the
‘out’ label.
> + if (is_cxl_root(iter))
> + break;
> +
> + rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
> + if (rc)
> + return -ENXIO;
> +
> + if (parent_ways == region_ways) {
> + pos = parent_pos;
> + break;
> + }
> + pos = pos * parent_ways + parent_pos;
> + }
> +out:
> + dev_dbg(&cxlmd->dev,
> + "decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
> + dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
> + dev_name(&port->dev), range->start, range->end, pos);
> +
> + return pos;
> }
>
> static void find_positions(const struct cxl_switch_decoder *cxlsd,
> @@ -1765,6 +1852,21 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> .end = p->res->end,
> };
>
> + if (p->nr_targets != p->interleave_ways)
> + return 0;
There is code just before this that compares nr_targets and interleave_ways
before calling cxl_region_setup_targets().
Decoder fields are set between that comparison and here, but I see no reason
these fields shouldn’t be set before that earlier comparison. It actually
makes the code more consistent, otherwise in the case where
cxl_region_setup_targets() fails, the first n-1 decoders have their fields
set, but the last one would not.
Then this return 0 can just be an else clause on the earlier comparison.
> +
> + /* Exercise position calculator on user-defined regions */
> + for (int i = 0; i < p->nr_targets; i++) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> + int test_pos;
> +
> + test_pos = calc_interleave_pos(cxled, p->interleave_ways);
> + dev_dbg(&cxled->cxld.dev,
> + "Interleave calc match %s test_pos:%d cxled->pos:%d\n",
> + (test_pos == cxled->pos) ? "Success" : "Fail",
> + test_pos, cxled->pos);
> + }
> +
> return 0;
>
> err_decrement:
> --
> 2.37.3
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions
2023-10-16 6:02 ` [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions alison.schofield
@ 2023-10-17 17:40 ` Jim Harris
2023-10-23 21:58 ` Dan Williams
1 sibling, 0 replies; 19+ messages in thread
From: Jim Harris @ 2023-10-17 17:40 UTC (permalink / raw)
To: alison.schofield@intel.com
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
> On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
>
> From: Alison Schofield <alison.schofield@intel.com>
>
> For auto-discovered regions, the driver must assign each target to
> the correct position in the region interleave set.
>
> cxl_region_sort_targets() uses the kernel sort() function to put the
> targets in relative order. Once the relative ordering is complete,
> positions are assigned based on each targets index in that sorted list.
>
> The sort() compare function does not consider the child offset into a
> parent port. The sort put all targets of one port ahead of another
> port when an interleave was expected, causing the region assembly to
> fail.
>
> Replace the relative sort, with calc_interleave_pos() on each target
> in the region target list. That will find the exact position for each
> target based on a walk up the ancestral tree from endpoint to root
> decoder.
>
> calc_interleave_pos() was introduced in a prior patch, so the work
> here is to use in cxl_region_sort_targets().
>
> Cleanup the obsolete helper functions from the prior sort().
>
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/core/region.c | 127 +++++---------------------------------
> 1 file changed, 15 insertions(+), 112 deletions(-)
Nice.
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index b451d215c3c5..fc261c82ea4e 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1480,6 +1480,14 @@ static int cxl_region_attach_auto(struct cxl_region *cxlr,
> return 0;
> }
>
> +static int cmp_interleave_pos(const void *a, const void *b)
> +{
> + struct cxl_endpoint_decoder *cxled_a = *(typeof(cxled_a) *)a;
> + struct cxl_endpoint_decoder *cxled_b = *(typeof(cxled_b) *)b;
> +
> + return cxled_a->pos - cxled_b->pos;
> +}
> +
> static struct cxl_port *next_port(struct cxl_port *port)
> {
> if (!port->parent_dport)
> @@ -1589,131 +1597,26 @@ static int calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
> return pos;
> }
>
> -static void find_positions(const struct cxl_switch_decoder *cxlsd,
> - const struct cxl_port *iter_a,
> - const struct cxl_port *iter_b, int *a_pos,
> - int *b_pos)
> -{
> - int i;
> -
> - for (i = 0, *a_pos = -1, *b_pos = -1; i < cxlsd->nr_targets; i++) {
> - if (cxlsd->target[i] == iter_a->parent_dport)
> - *a_pos = i;
> - else if (cxlsd->target[i] == iter_b->parent_dport)
> - *b_pos = i;
> - if (*a_pos >= 0 && *b_pos >= 0)
> - break;
> - }
> -}
> -
> -static int cmp_decode_pos(const void *a, const void *b)
> -{
> - struct cxl_endpoint_decoder *cxled_a = *(typeof(cxled_a) *)a;
> - struct cxl_endpoint_decoder *cxled_b = *(typeof(cxled_b) *)b;
> - struct cxl_memdev *cxlmd_a = cxled_to_memdev(cxled_a);
> - struct cxl_memdev *cxlmd_b = cxled_to_memdev(cxled_b);
> - struct cxl_port *port_a = cxled_to_port(cxled_a);
> - struct cxl_port *port_b = cxled_to_port(cxled_b);
> - struct cxl_port *iter_a, *iter_b, *port = NULL;
> - struct cxl_switch_decoder *cxlsd;
> - struct device *dev;
> - int a_pos, b_pos;
> - unsigned int seq;
> -
> - /* Exit early if any prior sorting failed */
> - if (cxled_a->pos < 0 || cxled_b->pos < 0)
> - return 0;
> -
> - /*
> - * Walk up the hierarchy to find a shared port, find the decoder that
> - * maps the range, compare the relative position of those dport
> - * mappings.
> - */
> - for (iter_a = port_a; iter_a; iter_a = next_port(iter_a)) {
> - struct cxl_port *next_a, *next_b;
> -
> - next_a = next_port(iter_a);
> - if (!next_a)
> - break;
> -
> - for (iter_b = port_b; iter_b; iter_b = next_port(iter_b)) {
> - next_b = next_port(iter_b);
> - if (next_a != next_b)
> - continue;
> - port = next_a;
> - break;
> - }
> -
> - if (port)
> - break;
> - }
> -
> - if (!port) {
> - dev_err(cxlmd_a->dev.parent,
> - "failed to find shared port with %s\n",
> - dev_name(cxlmd_b->dev.parent));
> - goto err;
> - }
> -
> - dev = device_find_child(&port->dev, &cxled_a->cxld.hpa_range,
> - match_switch_decoder_by_range);
> - if (!dev) {
> - struct range *range = &cxled_a->cxld.hpa_range;
> -
> - dev_err(port->uport_dev,
> - "failed to find decoder that maps %#llx-%#llx\n",
> - range->start, range->end);
> - goto err;
> - }
> -
> - cxlsd = to_cxl_switch_decoder(dev);
> - do {
> - seq = read_seqbegin(&cxlsd->target_lock);
> - find_positions(cxlsd, iter_a, iter_b, &a_pos, &b_pos);
> - } while (read_seqretry(&cxlsd->target_lock, seq));
> -
> - put_device(dev);
> -
> - if (a_pos < 0 || b_pos < 0) {
> - dev_err(port->uport_dev,
> - "failed to find shared decoder for %s and %s\n",
> - dev_name(cxlmd_a->dev.parent),
> - dev_name(cxlmd_b->dev.parent));
> - goto err;
> - }
> -
> - dev_dbg(port->uport_dev, "%s comes %s %s\n",
> - dev_name(cxlmd_a->dev.parent),
> - a_pos - b_pos < 0 ? "before" : "after",
> - dev_name(cxlmd_b->dev.parent));
> -
> - return a_pos - b_pos;
> -err:
> - cxled_a->pos = -1;
> - return 0;
> -}
> -
> static int cxl_region_sort_targets(struct cxl_region *cxlr)
> {
> struct cxl_region_params *p = &cxlr->params;
> int i, rc = 0;
>
> - sort(p->targets, p->nr_targets, sizeof(p->targets[0]), cmp_decode_pos,
> - NULL);
> -
> for (i = 0; i < p->nr_targets; i++) {
> struct cxl_endpoint_decoder *cxled = p->targets[i];
>
> + cxled->pos = calc_interleave_pos(cxled, p->interleave_ways);
> /*
> - * Record that sorting failed, but still continue to restore
> - * cxled->pos with its ->targets[] position so that follow-on
> - * code paths can reliably do p->targets[cxled->pos] to
> - * self-reference their entry.
> + * Record that sorting failed, but still continue to calc
> + * cxled->pos so that follow-on code paths can reliably
> + * do p->targets[cxled->pos] to self-reference their entry.
> */
> if (cxled->pos < 0)
> rc = -ENXIO;
> - cxled->pos = i;
> }
> + /* Keep the cxlr target list in interleave position order */
> + sort(p->targets, p->nr_targets, sizeof(p->targets[0]),
> + cmp_interleave_pos, NULL);
>
> dev_dbg(&cxlr->dev, "region sort %s\n", rc ? "failed" : "successful");
> return rc;
> --
> 2.37.3
>
>
Reviewed-by: Jim Harris <jim.harris@samsung.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-17 16:21 ` Jim Harris
2023-10-17 17:24 ` Jim Harris
@ 2023-10-17 20:43 ` Alison Schofield
2023-10-17 22:59 ` Jim Harris
2023-10-23 17:51 ` Alison Schofield
2 siblings, 1 reply; 19+ messages in thread
From: Alison Schofield @ 2023-10-17 20:43 UTC (permalink / raw)
To: Jim Harris
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
On Tue, Oct 17, 2023 at 04:21:06PM +0000, Jim Harris wrote:
>
>
> > On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
> >
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > match_decoder_by_range() and decoder_match_range() both determine
> > if an HPA range matches a decoder. The first does it for root
> > decoders and the second one operates on switch decoders.
> >
> > Tidy these up with clear naming and make the switch helper more
> > like the root decoder helper in style and functionality. Make it
> > take the actual range, rather than an endpoint decoder from which
> > it extracts the range.
> >
> > Aside from aesthetics and maintainability, this is in preparation
> > for reuse.
> >
> > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> > Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > drivers/cxl/core/region.c | 16 ++++++++++------
> > 1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 6d63b8798c29..64206fc4d99b 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -1487,16 +1487,19 @@ static struct cxl_port *next_port(struct cxl_port *port)
> > return port->parent_dport->port;
> > }
> >
> > -static int decoder_match_range(struct device *dev, void *data)
> > +static int match_switch_decoder_by_range(struct device *dev, void *data)
> > {
> > - struct cxl_endpoint_decoder *cxled = data;
> > + struct range *r1, *r2 = data;
> > struct cxl_switch_decoder *cxlsd;
> >
> > if (!is_switch_decoder(dev))
> > return 0;
> >
> > cxlsd = to_cxl_switch_decoder(dev);
> > - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
> > + r1 = &cxlsd->cxld.hpa_range;
> > + return range_contains(r1, r2);
> > +}
>
> Hi Alison,
>
> This stray closing brace needs to be removed from this patch.
>
> Rest of the patch looks good, I agree the naming is much better with these
> changes.
>
> Reviewed-by: Jim Harris <jim.harris@samsung.com>
>
Thanks for the review.
Your replies are removing the leading indentation from the original
patch. I see it viewing in my mailer and w browser:
https://lore.kernel.org/linux-cxl/cover.1697433770.git.alison.schofield@intel.com/T/#m1d4fba325a175520b8a43964ae44a28424177448
So, I think it's something on your end.
???
Alison
> > +
> > }
> >
> > static void find_positions(const struct cxl_switch_decoder *cxlsd,
> > @@ -1565,7 +1568,8 @@ static int cmp_decode_pos(const void *a, const void *b)
> > goto err;
> > }
> >
> > - dev = device_find_child(&port->dev, cxled_a, decoder_match_range);
> > + dev = device_find_child(&port->dev, &cxled_a->cxld.hpa_range,
> > + match_switch_decoder_by_range);
> > if (!dev) {
> > struct range *range = &cxled_a->cxld.hpa_range;
> >
> > @@ -2696,7 +2700,7 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> > return rc;
> > }
> >
> > -static int match_decoder_by_range(struct device *dev, void *data)
> > +static int match_root_decoder_by_range(struct device *dev, void *data)
> > {
> > struct range *r1, *r2 = data;
> > struct cxl_root_decoder *cxlrd;
> > @@ -2827,7 +2831,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> > int rc;
> >
> > cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
> > - match_decoder_by_range);
> > + match_root_decoder_by_range);
> > if (!cxlrd_dev) {
> > dev_err(cxlmd->dev.parent,
> > "%s:%s no CXL window for range %#llx:%#llx\n",
> > --
> > 2.37.3
> >
> >
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-17 20:43 ` Alison Schofield
@ 2023-10-17 22:59 ` Jim Harris
0 siblings, 0 replies; 19+ messages in thread
From: Jim Harris @ 2023-10-17 22:59 UTC (permalink / raw)
To: Alison Schofield
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
On Tue, Oct 17, 2023 at 01:43:15PM -0700, Alison Schofield wrote:
> On Tue, Oct 17, 2023 at 04:21:06PM +0000, Jim Harris wrote:
> >
> >
> > Hi Alison,
> >
> > This stray closing brace needs to be removed from this patch.
> >
> > Rest of the patch looks good, I agree the naming is much better with these
> > changes.
> >
> > Reviewed-by: Jim Harris <jim.harris@samsung.com>
> >
>
> Thanks for the review.
>
> Your replies are removing the leading indentation from the original
> patch. I see it viewing in my mailer and w browser:
>
> So, I think it's something on your end.
>
> ???
>
> Alison
>
My apologies, I'll stop trying to get macOS Mail to work with the
mailing list.
(Sent with mutt)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-17 16:21 ` Jim Harris
2023-10-17 17:24 ` Jim Harris
2023-10-17 20:43 ` Alison Schofield
@ 2023-10-23 17:51 ` Alison Schofield
2 siblings, 0 replies; 19+ messages in thread
From: Alison Schofield @ 2023-10-23 17:51 UTC (permalink / raw)
To: Jim Harris
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
On Tue, Oct 17, 2023 at 04:21:06PM +0000, Jim Harris wrote:
>
>
> > On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
> >
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > match_decoder_by_range() and decoder_match_range() both determine
> > if an HPA range matches a decoder. The first does it for root
> > decoders and the second one operates on switch decoders.
> >
> > Tidy these up with clear naming and make the switch helper more
> > like the root decoder helper in style and functionality. Make it
> > take the actual range, rather than an endpoint decoder from which
> > it extracts the range.
> >
> > Aside from aesthetics and maintainability, this is in preparation
> > for reuse.
> >
> > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> > Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > drivers/cxl/core/region.c | 16 ++++++++++------
> > 1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 6d63b8798c29..64206fc4d99b 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -1487,16 +1487,19 @@ static struct cxl_port *next_port(struct cxl_port *port)
> > return port->parent_dport->port;
> > }
> >
> > -static int decoder_match_range(struct device *dev, void *data)
> > +static int match_switch_decoder_by_range(struct device *dev, void *data)
> > {
> > - struct cxl_endpoint_decoder *cxled = data;
> > + struct range *r1, *r2 = data;
> > struct cxl_switch_decoder *cxlsd;
> >
> > if (!is_switch_decoder(dev))
> > return 0;
> >
> > cxlsd = to_cxl_switch_decoder(dev);
> > - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
> > + r1 = &cxlsd->cxld.hpa_range;
> > + return range_contains(r1, r2);
> > +}
>
> Hi Alison,
>
> This stray closing brace needs to be removed from this patch.
Got it, thanks!
>
> Rest of the patch looks good, I agree the naming is much better with these
> changes.
>
> Reviewed-by: Jim Harris <jim.harris@samsung.com>
>
> > +
> > }
> >
> > static void find_positions(const struct cxl_switch_decoder *cxlsd,
> > @@ -1565,7 +1568,8 @@ static int cmp_decode_pos(const void *a, const void *b)
> > goto err;
> > }
> >
> > - dev = device_find_child(&port->dev, cxled_a, decoder_match_range);
> > + dev = device_find_child(&port->dev, &cxled_a->cxld.hpa_range,
> > + match_switch_decoder_by_range);
> > if (!dev) {
> > struct range *range = &cxled_a->cxld.hpa_range;
> >
> > @@ -2696,7 +2700,7 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> > return rc;
> > }
> >
> > -static int match_decoder_by_range(struct device *dev, void *data)
> > +static int match_root_decoder_by_range(struct device *dev, void *data)
> > {
> > struct range *r1, *r2 = data;
> > struct cxl_root_decoder *cxlrd;
> > @@ -2827,7 +2831,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> > int rc;
> >
> > cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
> > - match_decoder_by_range);
> > + match_root_decoder_by_range);
> > if (!cxlrd_dev) {
> > dev_err(cxlmd->dev.parent,
> > "%s:%s no CXL window for range %#llx:%#llx\n",
> > --
> > 2.37.3
> >
> >
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave
2023-10-17 17:33 ` Jim Harris
@ 2023-10-23 18:10 ` Alison Schofield
2023-10-23 18:34 ` Jim Harris
0 siblings, 1 reply; 19+ messages in thread
From: Alison Schofield @ 2023-10-23 18:10 UTC (permalink / raw)
To: Jim Harris
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
On Tue, Oct 17, 2023 at 05:33:54PM +0000, Jim Harris wrote:
>
>
> > On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
> >
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > Introduce a calculation that determines a targets position in a region
> > interleave. Perform a selftest of the calculation on user-defined
> > regions.
> >
> > The region driver uses the kernel sort() function to put region
> > targets in relative order. Positions are assigned based on each
> > targets index in that sorted list. That relative sort doesn't
> > consider the offset of a port into its parent port which causes
> > some auto-discovered regions to fail creation. In one failure case,
> > a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
> > puts all the targets of one port ahead of another port when they
> > were expected to be interleaved.
> >
> > In preparation for repairing the autodiscovery region assembly,
> > introduce a new method for discovering a target position in the
> > region interleave.
> >
> > cxl_interleave_pos() offers a method to determine a targets position
> > by ascending from an endpoint to a root decoder. The calculation starts
> > with the endpoints local position and its position in its parents port.
> > It traverses towards the root decoder and examines both position and
> > ways in order to allow the position to be refined all the way to the
> > root decoder.
> >
> > This calculation, applied iteratively, yields the correct position:
> >
> > position = position * parent_ways + parent_pos;
> >
> > ...with these rules:
> >
> > Rule #1 - When (parent_ways == region_ways), Stop!
> > position = parent_position;
> > This rule is applied in calc_interleave_pos()
> >
> > Rule #2 - Skip over siblings that come before this memdev in
> > the decoder list when searching for the parent position.
> > This rule is applied in the helper find_pos_and_ways().
> >
> > Include a selftest that exercises this new position calculation against
> > every successfully configured user-defined region.
> >
> > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> > Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> > drivers/cxl/core/region.c | 102 ++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 102 insertions(+)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index 64206fc4d99b..b451d215c3c5 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -1500,6 +1500,93 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> > return range_contains(r1, r2);
> > }
> >
> > +/* Find the position of a port in it's parent and the parents ways */
> > +static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> > + int *pos, int *ways)
> > +{
> > + struct cxl_switch_decoder *cxlsd;
> > + struct cxl_port *parent;
> > + int child_ways = *ways;
> > + int child_pos = *pos;
> > + struct device *dev;
> > + int skip = 0;
> > + int rc = -1;
> > +
> > + parent = next_port(port);
> > + if (!parent)
> > + return rc;
> > +
> > + dev = device_find_child(&parent->dev, range,
> > + match_switch_decoder_by_range);
> > + if (!dev) {
> > + dev_err(port->uport_dev,
> > + "failed to find decoder mapping %#llx-%#llx\n",
> > + range->start, range->end);
> > + return rc;
> > + }
> > + cxlsd = to_cxl_switch_decoder(dev);
> > + *ways = cxlsd->cxld.interleave_ways;
> > +
> > + /* Skip over this many siblings in the target list */
> > + if (*ways > child_ways)
> > + skip = child_pos;
> > +
> > + for (int i = 0; i < *ways; i++) {
> > + if (cxlsd->target[i] == port->parent_dport) {
> > + if (skip--)
> > + continue;
> > + *pos = i;
> > + rc = 0;
> > + break;
> > + }
> > + }
> > + put_device(dev);
> > +
> > + return rc;
> > +}
> > +
> > +static int calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
> > + int region_ways)
> > +{
> > + struct cxl_port *iter, *port = cxled_to_port(cxled);
> > + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > + struct range *range = &cxled->cxld.hpa_range;
> > + int parent_ways = 0;
> > + int parent_pos = 0;
> > + int rc, pos;
> > +
> > + /* Initialize pos to its local position */
> > + rc = find_pos_and_ways(port, range, &parent_pos, &parent_ways);
> > + if (rc)
> > + return -ENXIO;
> > +
> > + pos = parent_pos;
> > +
> > + if (parent_ways == region_ways)
> > + goto out;
> > +
> > + /* Iterate up the ancestral tree refining the position */
> > + for (iter = next_port(port); iter; iter = next_port(iter)) {
>
> I think this can be simplified by initializing with ‘iter = port’
> here, and then the earlier lines can be removed, as well as the
> ‘out’ label.
Thanks for drawing my attn to this again. I had tried and failed to
do that previously, but in trying again, I must not have init'd pos
'pos' to 0. I've made this change now and it's passing. Much nicer.
Thanks!
>
> > + if (is_cxl_root(iter))
> > + break;
> > +
> > + rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
> > + if (rc)
> > + return -ENXIO;
> > +
> > + if (parent_ways == region_ways) {
> > + pos = parent_pos;
> > + break;
> > + }
> > + pos = pos * parent_ways + parent_pos;
> > + }
> > +out:
> > + dev_dbg(&cxlmd->dev,
> > + "decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
> > + dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
> > + dev_name(&port->dev), range->start, range->end, pos);
> > +
> > + return pos;
> > }
> >
> > static void find_positions(const struct cxl_switch_decoder *cxlsd,
> > @@ -1765,6 +1852,21 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> > .end = p->res->end,
> > };
> >
> > + if (p->nr_targets != p->interleave_ways)
> > + return 0;
>
> There is code just before this that compares nr_targets and interleave_ways
> before calling cxl_region_setup_targets().
>
> Decoder fields are set between that comparison and here, but I see no reason
> these fields shouldn’t be set before that earlier comparison. It actually
> makes the code more consistent, otherwise in the case where
> cxl_region_setup_targets() fails, the first n-1 decoders have their fields
> set, but the last one would not.
>
> Then this return 0 can just be an else clause on the earlier comparison.
>
I see what you mean and my first thought was 'sure this could be a little
cleanup before this patch', but then I struggle to justify it. Those first
n-1 decoders have iw,ig,hpa fields set because we had hope that this region
would create. On the last decoder, is there any value in setting up the
decoder when we know it's a failed setup?
Alison
> > +
> > + /* Exercise position calculator on user-defined regions */
> > + for (int i = 0; i < p->nr_targets; i++) {
> > + struct cxl_endpoint_decoder *cxled = p->targets[i];
> > + int test_pos;
> > +
> > + test_pos = calc_interleave_pos(cxled, p->interleave_ways);
> > + dev_dbg(&cxled->cxld.dev,
> > + "Interleave calc match %s test_pos:%d cxled->pos:%d\n",
> > + (test_pos == cxled->pos) ? "Success" : "Fail",
> > + test_pos, cxled->pos);
> > + }
> > +
> > return 0;
> >
> > err_decrement:
> > --
> > 2.37.3
> >
> >
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave
2023-10-23 18:10 ` Alison Schofield
@ 2023-10-23 18:34 ` Jim Harris
0 siblings, 0 replies; 19+ messages in thread
From: Jim Harris @ 2023-10-23 18:34 UTC (permalink / raw)
To: Alison Schofield
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
On Mon, Oct 23, 2023 at 11:10:29AM -0700, Alison Schofield wrote:
> On Tue, Oct 17, 2023 at 05:33:54PM +0000, Jim Harris wrote:
> >
> > > On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
> > >
> > > + if (p->nr_targets != p->interleave_ways)
> > > + return 0;
> >
> > There is code just before this that compares nr_targets and interleave_ways
> > before calling cxl_region_setup_targets().
> >
> > Decoder fields are set between that comparison and here, but I see no reason
> > these fields shouldn’t be set before that earlier comparison. It actually
> > makes the code more consistent, otherwise in the case where
> > cxl_region_setup_targets() fails, the first n-1 decoders have their fields
> > set, but the last one would not.
> >
> > Then this return 0 can just be an else clause on the earlier comparison.
> >
>
> I see what you mean and my first thought was 'sure this could be a little
> cleanup before this patch', but then I struggle to justify it. Those first
> n-1 decoders have iw,ig,hpa fields set because we had hope that this region
> would create. On the last decoder, is there any value in setting up the
> decoder when we know it's a failed setup?
>
>
I have a patch outstanding in this same area, which may provide some
additional context.
https://lore.kernel.org/linux-cxl/169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in/
I think the value is that the all of the decoders get treated the same.
Note that before the nr_targets == interleave_ways check, it sets
cxled->pos, as well as put the decoder in the targets array and
increments the nr_targets.
But ultimately my suggested changes probably belong more in my patch
(or a follow-up to my patch), so I think it's fine to skip my
feedback on this part.
Jim
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-16 6:02 ` [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse alison.schofield
2023-10-17 16:21 ` Jim Harris
@ 2023-10-23 20:54 ` Dan Williams
2023-10-23 23:30 ` Alison Schofield
1 sibling, 1 reply; 19+ messages in thread
From: Dan Williams @ 2023-10-23 20:54 UTC (permalink / raw)
To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Dmytro Adamenko, Jonathan Cameron
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> match_decoder_by_range() and decoder_match_range() both determine
> if an HPA range matches a decoder. The first does it for root
> decoders and the second one operates on switch decoders.
>
> Tidy these up with clear naming and make the switch helper more
> like the root decoder helper in style and functionality. Make it
> take the actual range, rather than an endpoint decoder from which
> it extracts the range.
>
> Aside from aesthetics and maintainability, this is in preparation
> for reuse.
Some minor changelog comments:
>
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Save "Fixes:" for commits that need to be backported to address an
issue. This will be a dependency for backporting patch2, but on its own
it does not fix anything.
> Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
Along the same lines this probably moves to the patch that fixes the
report.
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/cxl/core/region.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..64206fc4d99b 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1487,16 +1487,19 @@ static struct cxl_port *next_port(struct cxl_port *port)
> return port->parent_dport->port;
> }
>
> -static int decoder_match_range(struct device *dev, void *data)
> +static int match_switch_decoder_by_range(struct device *dev, void *data)
> {
> - struct cxl_endpoint_decoder *cxled = data;
> + struct range *r1, *r2 = data;
> struct cxl_switch_decoder *cxlsd;
>
> if (!is_switch_decoder(dev))
> return 0;
>
> cxlsd = to_cxl_switch_decoder(dev);
> - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
> + r1 = &cxlsd->cxld.hpa_range;
> + return range_contains(r1, r2);
I know this is just maintaining the status quo, but thinking about it a
bit deeper I think range_contains() is only correct for root decoders.
For a given region the root decoder will contain the region, but all the
switch decoders must be a 1:1 mapping. Linux only supports scenarios
where all endpoint decoders map the same HPA range and identify only 1
decoder per-switch port to map that region.
So "range_contains(r1, r2)" becomes "return *r1 == *r2".
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave
2023-10-16 6:02 ` [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave alison.schofield
2023-10-17 17:33 ` Jim Harris
@ 2023-10-23 21:47 ` Dan Williams
1 sibling, 0 replies; 19+ messages in thread
From: Dan Williams @ 2023-10-23 21:47 UTC (permalink / raw)
To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Dmytro Adamenko
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> Introduce a calculation that determines a targets position in a region
> interleave. Perform a selftest of the calculation on user-defined
> regions.
>
> The region driver uses the kernel sort() function to put region
> targets in relative order. Positions are assigned based on each
> targets index in that sorted list. That relative sort doesn't
> consider the offset of a port into its parent port which causes
> some auto-discovered regions to fail creation. In one failure case,
> a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
> puts all the targets of one port ahead of another port when they
> were expected to be interleaved.
>
> In preparation for repairing the autodiscovery region assembly,
> introduce a new method for discovering a target position in the
> region interleave.
>
> cxl_interleave_pos() offers a method to determine a targets position
> by ascending from an endpoint to a root decoder. The calculation starts
> with the endpoints local position and its position in its parents port.
> It traverses towards the root decoder and examines both position and
> ways in order to allow the position to be refined all the way to the
> root decoder.
>
> This calculation, applied iteratively, yields the correct position:
>
> position = position * parent_ways + parent_pos;
>
> ...with these rules:
>
> Rule #1 - When (parent_ways == region_ways), Stop!
> position = parent_position;
> This rule is applied in calc_interleave_pos()
>
> Rule #2 - Skip over siblings that come before this memdev in
> the decoder list when searching for the parent position.
> This rule is applied in the helper find_pos_and_ways().
I feel like calc_interleave_pos() is missing some kernel-doc about these
rules. It seems fundamental to maintaining this code going forward. This
short summary here is sufficient for the changelog, but
calc_interleave_pos() wants a few sentences on the theory of operation.
>
> Include a selftest that exercises this new position calculation against
> every successfully configured user-defined region.
>
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
Again, this change is not a fix, it's a new diagnostic. It is a
dependency for a fix, but that discussion will come out around
backporting patch3.
> Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
> drivers/cxl/core/region.c | 102 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 102 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 64206fc4d99b..b451d215c3c5 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1500,6 +1500,93 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> return range_contains(r1, r2);
> }
>
> +/* Find the position of a port in it's parent and the parents ways */
> +static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> + int *pos, int *ways)
> +{
> + struct cxl_switch_decoder *cxlsd;
> + struct cxl_port *parent;
> + int child_ways = *ways;
> + int child_pos = *pos;
> + struct device *dev;
> + int skip = 0;
> + int rc = -1;
On the risk of this error code leaking higher, I think it should be
initialized to -ENXIO directly, and not translated by the caller.
> +
> + parent = next_port(port);
> + if (!parent)
> + return rc;
> +
> + dev = device_find_child(&parent->dev, range,
> + match_switch_decoder_by_range);
> + if (!dev) {
> + dev_err(port->uport_dev,
> + "failed to find decoder mapping %#llx-%#llx\n",
> + range->start, range->end);
> + return rc;
> + }
> + cxlsd = to_cxl_switch_decoder(dev);
> + *ways = cxlsd->cxld.interleave_ways;
> +
> + /* Skip over this many siblings in the target list */
> + if (*ways > child_ways)
> + skip = child_pos;
Maybe a clarification that "Since the switch target list is by
definition sorted in region position order, siblings to skip are always
at lower indices."
> +
> + for (int i = 0; i < *ways; i++) {
> + if (cxlsd->target[i] == port->parent_dport) {
> + if (skip--)
> + continue;
> + *pos = i;
> + rc = 0;
> + break;
> + }
> + }
> + put_device(dev);
> +
> + return rc;
> +}
> +
> +static int calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
> + int region_ways)
> +{
> + struct cxl_port *iter, *port = cxled_to_port(cxled);
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> + struct range *range = &cxled->cxld.hpa_range;
> + int parent_ways = 0;
> + int parent_pos = 0;
> + int rc, pos;
> +
> + /* Initialize pos to its local position */
> + rc = find_pos_and_ways(port, range, &parent_pos, &parent_ways);
> + if (rc)
> + return -ENXIO;
per above, this can become "return rc;".
I had wondered if this code becomes easier to read to have separate
inputs and outputs versus dual purpose the @pos and @ways paramters, but
I can't come up with anything simpler. I think a bit of kernel-doc would
help with the next casual reader that comes along and wonders about the
theory of operation.
> +
> + pos = parent_pos;
> +
> + if (parent_ways == region_ways)
> + goto out;
> +
> + /* Iterate up the ancestral tree refining the position */
> + for (iter = next_port(port); iter; iter = next_port(iter)) {
> + if (is_cxl_root(iter))
> + break;
> +
> + rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
> + if (rc)
> + return -ENXIO;
> +
> + if (parent_ways == region_ways) {
> + pos = parent_pos;
> + break;
> + }
> + pos = pos * parent_ways + parent_pos;
Nice simplification of the current mess!
> + }
> +out:
> + dev_dbg(&cxlmd->dev,
> + "decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
> + dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
> + dev_name(&port->dev), range->start, range->end, pos);
> +
> + return pos;
> }
>
> static void find_positions(const struct cxl_switch_decoder *cxlsd,
> @@ -1765,6 +1852,21 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> .end = p->res->end,
> };
>
> + if (p->nr_targets != p->interleave_ways)
> + return 0;
> +
> + /* Exercise position calculator on user-defined regions */
> + for (int i = 0; i < p->nr_targets; i++) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> + int test_pos;
> +
> + test_pos = calc_interleave_pos(cxled, p->interleave_ways);
> + dev_dbg(&cxled->cxld.dev,
> + "Interleave calc match %s test_pos:%d cxled->pos:%d\n",
> + (test_pos == cxled->pos) ? "Success" : "Fail",
> + test_pos, cxled->pos);
Part of me wondered if the failure case should be louder here, but in
the case of autodiscovery there is no position to compare against.
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions
2023-10-16 6:02 ` [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions alison.schofield
2023-10-17 17:40 ` Jim Harris
@ 2023-10-23 21:58 ` Dan Williams
2023-10-24 0:42 ` Alison Schofield
1 sibling, 1 reply; 19+ messages in thread
From: Dan Williams @ 2023-10-23 21:58 UTC (permalink / raw)
To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl, Dmytro Adamenko
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> For auto-discovered regions, the driver must assign each target to
> the correct position in the region interleave set.
I would say:
"For auto-discovered regions, the driver must assign each target to
a valid position in the region interleave set, based on the decoder
topology."
...because there are lots of valid configurations that may not be
"correct". I.e. outside of pmem where there are labels to validate the
intended position, label-less autodiscovery can only interpret the
current state of the decode topology.
>
> cxl_region_sort_targets() uses the kernel sort() function to put the
> targets in relative order.
This can more agressively throw the current implementation under the
bus.
"The current implementation fails to parse valid decode topologies, as
it does not consider the child offset into a parent port. The sort puts
all targets of one port ahead of another port when an interleave was
expected, causing the region assembly to fail."
[..]
> Replace the relative sort, with calc_interleave_pos() on each target
> in the region target list. That will find the exact position for each
> target based on a walk up the ancestral tree from endpoint to root
> decoder.
>
> calc_interleave_pos() was introduced in a prior patch, so the work
> here is to use in cxl_region_sort_targets().
>
> Cleanup the obsolete helper functions from the prior sort().
Maybe a comment on the testing this passes, or the plan for keeping this
algorithm regression tested?
>
> Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
[..]
Code looks good.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-17 17:24 ` Jim Harris
@ 2023-10-23 23:22 ` Alison Schofield
0 siblings, 0 replies; 19+ messages in thread
From: Alison Schofield @ 2023-10-23 23:22 UTC (permalink / raw)
To: Jim Harris
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, linux-cxl@vger.kernel.org,
Dmytro Adamenko
On Tue, Oct 17, 2023 at 05:24:45PM +0000, Jim Harris wrote:
>
>
> > On Oct 17, 2023, at 9:20 AM, Jim Harris <jim.harris@samsung.com> wrote:
> >
> >
> >
> >> On Oct 15, 2023, at 11:02 PM, alison.schofield@intel.com wrote:
> >>
> >> From: Alison Schofield <alison.schofield@intel.com>
> >>
> >>
> >> -static int decoder_match_range(struct device *dev, void *data)
> >> +static int match_switch_decoder_by_range(struct device *dev, void *data)
> >> {
> >> - struct cxl_endpoint_decoder *cxled = data;
> >> + struct range *r1, *r2 = data;
> >> struct cxl_switch_decoder *cxlsd;
> >>
> >> if (!is_switch_decoder(dev))
> >> return 0;
> >>
> >> cxlsd = to_cxl_switch_decoder(dev);
> >> - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
> >> + r1 = &cxlsd->cxld.hpa_range;
> >> + return range_contains(r1, r2);
> >> +}
> >
> > Hi Alison,
> >
> > This stray closing brace needs to be removed from this patch.
> >
> > Rest of the patch looks good, I agree the naming is much better with these
> > changes.
> >
> > Reviewed-by: Jim Harris <jim.harris@samsung.com>
> >
>
> Looking at this again, I think this also needs to check if the switch
> decoder is active. Non-active decoders will have range 0 to UINT64_MAX
> which would pass the range_contains() check.
Combining this and Dan's input, changing this to check for an exact
match on the switch decoder, so not worry about the uncommitteds.
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse
2023-10-23 20:54 ` Dan Williams
@ 2023-10-23 23:30 ` Alison Schofield
0 siblings, 0 replies; 19+ messages in thread
From: Alison Schofield @ 2023-10-23 23:30 UTC (permalink / raw)
To: Dan Williams
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, linux-cxl, Dmytro Adamenko
On Mon, Oct 23, 2023 at 01:54:26PM -0700, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > match_decoder_by_range() and decoder_match_range() both determine
> > if an HPA range matches a decoder. The first does it for root
> > decoders and the second one operates on switch decoders.
> >
> > Tidy these up with clear naming and make the switch helper more
> > like the root decoder helper in style and functionality. Make it
> > take the actual range, rather than an endpoint decoder from which
> > it extracts the range.
> >
> > Aside from aesthetics and maintainability, this is in preparation
> > for reuse.
>
> Some minor changelog comments:
>
> >
> > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
>
> Save "Fixes:" for commits that need to be backported to address an
> issue. This will be a dependency for backporting patch2, but on its own
> it does not fix anything.
>
> > Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
>
> Along the same lines this probably moves to the patch that fixes the
> report.
>
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > drivers/cxl/core/region.c | 16 ++++++++++------
> > 1 file changed, 10 insertions(+), 6 deletions(-)
snip
> >
> > +static int match_switch_decoder_by_range(struct device *dev, void *data)
> > {
> > - struct cxl_endpoint_decoder *cxled = data;
> > + struct range *r1, *r2 = data;
> > struct cxl_switch_decoder *cxlsd;
> >
> > if (!is_switch_decoder(dev))
> > return 0;
> >
> > cxlsd = to_cxl_switch_decoder(dev);
> > - return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range);
> > + r1 = &cxlsd->cxld.hpa_range;
> > + return range_contains(r1, r2);
>
> I know this is just maintaining the status quo, but thinking about it a
> bit deeper I think range_contains() is only correct for root decoders.
> For a given region the root decoder will contain the region, but all the
> switch decoders must be a 1:1 mapping. Linux only supports scenarios
> where all endpoint decoders map the same HPA range and identify only 1
> decoder per-switch port to map that region.
>
> So "range_contains(r1, r2)" becomes "return *r1 == *r2".
Thanks Dan - picked up your changelog suggestions and replace this
range_contains w an 1:1 comparison.
Alison
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions
2023-10-23 21:58 ` Dan Williams
@ 2023-10-24 0:42 ` Alison Schofield
0 siblings, 0 replies; 19+ messages in thread
From: Alison Schofield @ 2023-10-24 0:42 UTC (permalink / raw)
To: Dan Williams
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, linux-cxl, Dmytro Adamenko
On Mon, Oct 23, 2023 at 02:58:08PM -0700, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> >
Dan, Picked up your changelog suggestions
snip...
>
> Maybe a comment on the testing this passes, or the plan for keeping this
> algorithm regression tested?
Will add more info to v3 changelog:
Short answer:
Passed cxl-test for the following configs:
1, 1+1, 1+1+1, 2, 2+2, 2+2+2, 2+2+2+2, 4, 4+4
where each number represents the count of endpoints per host bridges.
ie. 2+2 is 2 HBs with 2 endpoints each.
I'm using a new unit test for this. The posting of which is lagging
this patchset a bit as it depends on kernel cxl/test changes to
support configurable topologies (so we can have all those flavors!)
>
> >
> > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery")
> > Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>
> [..]
>
> Code looks good.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2023-10-24 0:42 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-16 6:02 [PATCH v2 0/3] cxl/region: Autodiscovery position repair alison.schofield
2023-10-16 6:02 ` [PATCH v2 1/3] cxl/region: Prepare the decoder match range helper for reuse alison.schofield
2023-10-17 16:21 ` Jim Harris
2023-10-17 17:24 ` Jim Harris
2023-10-23 23:22 ` Alison Schofield
2023-10-17 20:43 ` Alison Schofield
2023-10-17 22:59 ` Jim Harris
2023-10-23 17:51 ` Alison Schofield
2023-10-23 20:54 ` Dan Williams
2023-10-23 23:30 ` Alison Schofield
2023-10-16 6:02 ` [PATCH v2 2/3] cxl/region: Calculate a target position in a region interleave alison.schofield
2023-10-17 17:33 ` Jim Harris
2023-10-23 18:10 ` Alison Schofield
2023-10-23 18:34 ` Jim Harris
2023-10-23 21:47 ` Dan Williams
2023-10-16 6:02 ` [PATCH v2 3/3] cxl/region: Use calc_interleave_pos() with autodiscovered regions alison.schofield
2023-10-17 17:40 ` Jim Harris
2023-10-23 21:58 ` Dan Williams
2023-10-24 0:42 ` Alison Schofield
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox