From: Dave Jiang <dave.jiang@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: linux-cxl@vger.kernel.org, dan.j.williams@intel.com,
ira.weiny@intel.com, vishal.l.verma@intel.com,
alison.schofield@intel.com, dave@stgolabs.net
Subject: Re: [PATCH 2/2] cxl: Add checks to access_coordinate calculation to fail missing data
Date: Tue, 5 Mar 2024 15:36:52 -0700
Message-ID: <b6b8ba4f-0911-4c69-84ab-32027fffbdf4@intel.com>
In-Reply-To: <20240229174427.00000e9e@Huawei.com>
On 2/29/24 10:44 AM, Jonathan Cameron wrote:
> On Wed, 28 Feb 2024 17:25:42 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
>
>> Jonathan noted that when the coordinates for host bridge and switches
>> can be 0s if no actual data are retrieved and the calculation continues.
>> The resulting number would be inaccurate. Add checks to ensure that the
>> calculation would complete only if the numbers are valid.
>>
>> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>
> Hi Dave,
>
> Whilst I think the fix is right, it is getting hard to read. Maybe
> a rethink is needed on how that iteration works?
>
>> ---
>> drivers/cxl/core/port.c | 31 +++++++++++++++++++++++++++----
>> 1 file changed, 27 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index e1d30a885700..2c82fe24b789 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -2110,6 +2110,20 @@ static void combine_coordinates(struct access_coordinate *c1,
>>  	c1->read_latency += c2->read_latency;
>>  }
>>
>> +static bool coordinates_invalid(struct access_coordinate *c)
>> +{
>> +	if (!c->read_bandwidth && !c->write_bandwidth &&
>> +	    !c->read_latency && !c->write_latency)
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
>> +static bool parent_port_is_cxl_root(struct cxl_port *port)
>> +{
>> +	return is_cxl_root(to_cxl_port(port->dev.parent));
>> +}
>> +
>>  /**
>>   * cxl_endpoint_get_perf_coordinates - Retrieve performance numbers stored in dports
>>   * of CXL path
>> @@ -2142,16 +2156,25 @@ int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
>>  	 * port each iteration. If the parent is cxl root then there is
>>  	 * nothing to gather.
>>  	 */
>> -	while (!is_cxl_root(to_cxl_port(iter->dev.parent))) {
>> -		combine_coordinates(&c, &dport->sw_coord);
>> +	while (!parent_port_is_cxl_root(iter)) {
>> +		iter = to_cxl_port(iter->dev.parent);
>> +
>> +		/* There's no CDAT for the host bridge, so skip if so. */
>
> Comment refers to skipping whereas code is 'doing more' for the other case
> so this is confusing to me.
>
> The inverse of this only occurs on the last iteration I think.
>
> Possibly a do / while instead of a while will do it.
> I'm far from confident though as all the levels of look up have
> me too confused.

So this is somewhat tricky. For example:

  devices/pci0000:35/0000:35:01.0/0000:37:00.0/mem5

In this case the endpoint is attached to the host bridge without any
switches. The endpoint is 0000:37:00.0 and the host bridge downstream
port is 0000:35:01.0. There is no switch and therefore no switch CDAT,
but there is a valid dport, so we skip dport->sw_coord. However, we
still need to pick up the link_latency between the endpoint and the
downstream port. So we spend one iteration in the loop and skip
dport->sw_coord.

  devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:00.0/0000:c2:00.0/mem8

Now in this case there's a CXL switch in between. In the first
iteration we pick up dport->sw_coord, and in the second iteration we
skip it. However, we add two link_latency values: one for the link
between 0000:c2:00.0 (endpoint) and 0000:c1:00.0 (switch downstream
port), and one for the link between 0000:c0:00.0 (switch upstream port)
and 0000:bf:00.0 (host bridge downstream port). Therefore we can't move
the link_latency sum outside of the loop.

Not sure how much better this is:

	do {
		struct cxl_port *parent_port = to_cxl_port(iter->dev.parent);
		dport = iter->parent_dport;
		if (!parent_port_is_cxl_root(parent_port)) {
			if (coordinates_invalid(&dport->sw_coord))
				return -EINVAL;
			combine_coordinates(&c, &dport->sw_coord);
		}
		c.write_latency += dport->link_latency;
		c.read_latency += dport->link_latency;
		iter = to_cxl_port(iter->dev.parent);
	} while (!parent_port_is_cxl_root(iter));

or:

	do {
		struct cxl_port *parent_port = to_cxl_port(iter->dev.parent);
		dport = iter->parent_dport;
		c.write_latency += dport->link_latency;
		c.read_latency += dport->link_latency;
		if (parent_port_is_cxl_root(parent_port))
			break;
		if (coordinates_invalid(&dport->sw_coord))
			return -EINVAL;
		combine_coordinates(&c, &dport->sw_coord);
		iter = to_cxl_port(iter->dev.parent);
	} while (!parent_port_is_cxl_root(iter));

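
As a sanity check of the first variant against the two topologies above,
here is a minimal userspace sketch. Everything in it is an assumption
made for illustration: the coord/mock_dport/mock_port structs are
trimmed stand-ins and not the kernel definitions, the latency and
bandwidth numbers are made up, and combine() assumes that
combine_coordinates() takes the minimum bandwidth and sums the
latencies.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, trimmed for this sketch. */
struct coord {
	unsigned int read_bandwidth, write_bandwidth;
	unsigned int read_latency, write_latency;
};

struct mock_dport {
	unsigned int link_latency;	/* link between this dport and the port below */
	struct coord sw_coord;		/* switch CDAT data; all zeros if none */
};

struct mock_port {
	struct mock_port *parent;		/* NULL marks the CXL root */
	struct mock_dport *parent_dport;	/* dport of the parent we hang off */
};

static bool parent_port_is_root(const struct mock_port *port)
{
	return port->parent->parent == NULL;
}

static bool coordinates_invalid(const struct coord *c)
{
	return !c->read_bandwidth && !c->write_bandwidth &&
	       !c->read_latency && !c->write_latency;
}

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Assumed behaviour of combine_coordinates(): min bandwidth, summed latency. */
static void combine(struct coord *c1, const struct coord *c2)
{
	c1->read_bandwidth = min_u(c1->read_bandwidth, c2->read_bandwidth);
	c1->write_bandwidth = min_u(c1->write_bandwidth, c2->write_bandwidth);
	c1->read_latency += c2->read_latency;
	c1->write_latency += c2->write_latency;
}

/* The first do/while variant from this mail, transcribed onto the mock types. */
static int walk(struct mock_port *iter, struct coord *c)
{
	struct mock_dport *dport;

	assert(iter->parent != NULL);	/* start from an endpoint, never the root */
	do {
		struct mock_port *parent_port = iter->parent;

		dport = iter->parent_dport;
		if (!parent_port_is_root(parent_port)) {
			if (coordinates_invalid(&dport->sw_coord))
				return -EINVAL;
			combine(c, &dport->sw_coord);
		}
		c->write_latency += dport->link_latency;
		c->read_latency += dport->link_latency;
		iter = parent_port;
	} while (!parent_port_is_root(iter));

	return 0;
}

/* root <- host bridge <- endpoint; returns read latency or a negative errno. */
long direct_attach_read_latency(void)
{
	struct mock_port root = { NULL, NULL };
	struct mock_dport hb_dport = { .link_latency = 10 };
	struct mock_port hb = { &root, NULL };
	struct mock_port ep = { &hb, &hb_dport };
	struct coord c = { UINT_MAX, UINT_MAX, 0, 0 };
	int rc = walk(&ep, &c);

	return rc ? rc : (long)c.read_latency;
}

/* root <- host bridge <- switch <- endpoint, with or without switch CDAT data. */
long switched_read_latency(bool have_switch_cdat)
{
	struct mock_port root = { NULL, NULL };
	struct mock_dport hb_dport = { .link_latency = 20 };
	struct mock_dport sw_dport = { .link_latency = 10 };
	struct mock_port hb = { &root, NULL };
	struct mock_port sw = { &hb, &hb_dport };
	struct mock_port ep = { &sw, &sw_dport };
	struct coord c = { UINT_MAX, UINT_MAX, 0, 0 };
	int rc;

	if (have_switch_cdat)
		sw_dport.sw_coord = (struct coord){ 100, 100, 5, 5 };
	rc = walk(&ep, &c);

	return rc ? rc : (long)c.read_latency;
}
```

With these made-up numbers, calling the two helpers from a small main()
should give 10 for the direct-attach case (one link, no sw_coord), 35
for the switched case (5 + 10 + 20), and -EINVAL for the switched case
when the switch data is all zeros.
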
>
>
> 	do {
> 		if (coordinates_invalid(&dport->sw_coord))
> 			return -EINVAL;
>
> 		combine_coordinates(&c, &dport->sw_coord);
> 		iter = to_cxl_port(iter->dev.parent);
> 		dport = iter->parent_dport;
> 	} while (!parent_port_is_cxl_root(iter));
> 	/* Do final link updates */
> 	c.write_latency += dport->link_latency;
> 	c.read_latency += dport->link_latency;
>
>> +		if (!parent_port_is_cxl_root(iter)) {
>> +			if (coordinates_invalid(&dport->sw_coord))
>> +				return -EINVAL;
>> +
>> +			combine_coordinates(&c, &dport->sw_coord);
>> +		}
>> +
>>  		c.write_latency += dport->link_latency;
>>  		c.read_latency += dport->link_latency;
>> -
>> -		iter = to_cxl_port(iter->dev.parent);
>>  		dport = iter->parent_dport;
>>  	}
>>
>>  	/* Augment with the generic port (host bridge) perf data */
>> +	if (coordinates_invalid(&dport->hb_coord))
>> +		return -EINVAL;
>>  	combine_coordinates(&c, &dport->hb_coord);
>>
>>  	/* Get the calculated PCI paths bandwidth */
>