From: Dan Williams <dan.j.williams@intel.com>
To: Dave Jiang <dave.jiang@intel.com>, <linux-cxl@vger.kernel.org>
Cc: <dan.j.williams@intel.com>, <ira.weiny@intel.com>,
<vishal.l.verma@intel.com>, <alison.schofield@intel.com>,
<Jonathan.Cameron@huawei.com>, <dave@stgolabs.net>
Subject: Re: [PATCH v5 3/4] cxl: Fix incorrect region perf data calculation
Date: Mon, 1 Apr 2024 12:33:56 -0700
Message-ID: <660b0c2473df6_15786294da@dwillia2-mobl3.amr.corp.intel.com.notmuch>
In-Reply-To: <20240325230234.1847525-4-dave.jiang@intel.com>
Dave Jiang wrote:
> Current math in cxl_region_perf_data_calculate divides the latency by 1000
> every time the function gets called. This causes the region latency to be
> divided by 1000 per memory device and the math is incorrect. This is user
> visible as the latency access_coordinate exposed via sysfs will show
> incorrect latency data.
>
> Move the latency adjustment to where dpa_perf is set and this should
> provide the appropriate latency for the region for each endpoint.
>
> Fixes: 3d9f4a197230 ("cxl/region: Calculate performance data for a region")
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/core/cdat.c | 23 ++++++++++++-----------
> 1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index 5b75d2d56099..a1b204d451d3 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -214,8 +214,19 @@ static int cxl_port_perf_data_calculate(struct cxl_port *port,
> static void update_perf_entry(struct device *dev, struct dsmas_entry *dent,
> struct cxl_dpa_perf *dpa_perf)
> {
> - for (int i = 0; i < ACCESS_COORDINATE_MAX; i++)
> + for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
> dpa_perf->coord[i] = dent->coord[i];
> + /*
> + * Convert latency to nanosec from picosec to be consistent
> + * with the resulting latency coordinates computed by the
> + * HMAT_REPORTING code.
> + */
> + dpa_perf->coord[i].read_latency =
> + DIV_ROUND_UP(dpa_perf->coord[i].read_latency, 1000);
> + dpa_perf->coord[i].write_latency =
> + DIV_ROUND_UP(dpa_perf->coord[i].write_latency, 1000);
It feels like a latent bug that dpa_perf is temporarily counted in
picoseconds. I.e. every place that assigns into a 'struct
access_coordinate' should do so in nanoseconds. I see how this change
fixes the "over-division" problem, but dpa_perf->coord should never have
stored picosecond values in the first place.
Something like the following, to match / reuse hmat_normalize()
(untested!):
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index af5cb818f84d..90a7a90ea811 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -530,17 +530,7 @@ static int get_genport_coordinates(struct device *dev, struct cxl_dport *dport)
if (kstrtou32(acpi_device_uid(hb), 0, &uid))
return -EINVAL;
- rc = acpi_get_genport_coordinates(uid, dport->hb_coord);
- if (rc < 0)
- return rc;
-
- /* Adjust back to picoseconds from nanoseconds */
- for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
- dport->hb_coord[i].read_latency *= 1000;
- dport->hb_coord[i].write_latency *= 1000;
- }
-
- return 0;
+ return acpi_get_genport_coordinates(uid, dport->hb_coord);
}
static int add_host_bridge_dport(struct device *match, void *arg)
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index eddbbe21450c..d5ba4de97c08 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -124,10 +124,8 @@ static int cdat_dslbis_handler(union acpi_subtable_headers *header, void *arg,
le_base = (__force __le64)dslbis->entry_base_unit;
le_val = (__force __le16)dslbis->entry[0];
- rc = check_mul_overflow(le64_to_cpu(le_base),
- le16_to_cpu(le_val), &val);
- if (rc)
- pr_warn("DSLBIS value overflowed.\n");
+ val = hmat_normalize(le16_to_cpu(le_val), le64_to_cpu(le_base),
+ dslbis->data_type);
cxl_access_coordinate_set(&dent->coord, dslbis->data_type, val);
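
For illustration only, here is a simplified userspace model of the unit
problem discussed above. It is not the kernel code: the function names are
hypothetical, the flat array of per-device latencies stands in for the
per-endpoint perf data, and DIV_ROUND_UP is redefined locally. The first
helper converts on every accumulation step, so the running value mixes
nanoseconds with picoseconds. The second normalizes each value as it is
taken in, which is the "store nanoseconds from the start" idea.

```c
#include <stddef.h>

/* Local stand-in for the kernel's rounding helper. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical model of the problem: raw picosecond values are folded
 * into the running worst case and the ps -> ns conversion runs once per
 * device, so the accumulator ends up comparing ns against ps.
 */
static unsigned int worst_latency_convert_per_call(const unsigned int *dev_ps,
						   size_t n)
{
	unsigned int region = 0;

	for (size_t i = 0; i < n; i++) {
		if (dev_ps[i] > region)
			region = dev_ps[i];
		region = DIV_ROUND_UP(region, 1000);
	}
	return region;
}

/*
 * Hypothetical model of the suggested direction: convert each value to
 * nanoseconds at the point it is assigned, then only ever compare and
 * store nanoseconds.
 */
static unsigned int worst_latency_normalized(const unsigned int *dev_ps,
					     size_t n)
{
	unsigned int region = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int ns = DIV_ROUND_UP(dev_ps[i], 1000);

		if (ns > region)
			region = ns;
	}
	return region;
}
```

With two devices reporting 150000 ps and 120000 ps, the first helper
returns 120 while the second returns 150, the actual worst case in
nanoseconds.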