From: Dave Jiang <dave.jiang@intel.com>
To: linux-cxl@vger.kernel.org
Cc: dan.j.williams@intel.com, ira.weiny@intel.com,
vishal.l.verma@intel.com, alison.schofield@intel.com,
Jonathan.Cameron@huawei.com, dave@stgolabs.net
Subject: [PATCH v8 3/3] cxl: Add documentation to explain the shared link bandwidth calculation
Date: Tue, 3 Sep 2024 17:11:52 -0700 [thread overview]
Message-ID: <20240904001316.1688225-4-dave.jiang@intel.com> (raw)
In-Reply-To: <20240904001316.1688225-1-dave.jiang@intel.com>
Create a kernel documentation to describe how the CXL shared upstream
link bandwidth is calculated.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v8:
- spelling fixups (Jonathan)
- remove duplications (Jonathan)
- clarify definition of symmetry (Jonathan)
---
.../driver-api/cxl/access-coordinates.rst | 92 +++++++++++++++++++
Documentation/driver-api/cxl/index.rst | 1 +
2 files changed, 93 insertions(+)
create mode 100644 Documentation/driver-api/cxl/access-coordinates.rst
diff --git a/Documentation/driver-api/cxl/access-coordinates.rst b/Documentation/driver-api/cxl/access-coordinates.rst
new file mode 100644
index 000000000000..319a820f37a0
--- /dev/null
+++ b/Documentation/driver-api/cxl/access-coordinates.rst
@@ -0,0 +1,92 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+==================================
+CXL Access Coordinates Computation
+==================================
+
+Shared Upstream Link Calculation
+================================
+For certain CXL region construction with endpoints behind CXL switches (SW) or
+Root Ports (RP), there is the possibility of the total bandwidth for all
+the endpoints behind a switch being more than the switch upstream link.
+A similar situation can occur within the host, upstream of the root ports.
+The CXL driver performs an additional pass after all the targets have
+arrived for a region in order to recalculate the bandwidths with possible
+upstream link being a limiting factor in mind.
+
+The algorithm assumes the configuration is a symmetric topology as that
+maximizes performance. When asymmetric topology is detected, the calculation
+is aborted. An asymmetric topology is detected during topology walk where the
+number of RPs detected as a grandparent is not equal to the number of devices
+iterated in the same iteration loop. The assumption is made that subtle
+asymmetry in properties does not happen and all paths to EPs are equal.
+
+There can be multiple switches under an RP. There can be multiple RPs under
+a CXL Host Bridge (HB). There can be multiple HBs under a CXL Fixed Memory
+Window Structure (CFMWS).
+
+An example hierarchy:
+
+ CFMWS 0
+ |
+ _________|_________
+ | |
+ ACPI0017-0 ACPI0017-1
+ GP0/HB0/ACPI0016-0 GP1/HB1/ACPI0016-1
+ | | | |
+ RP0 RP1 RP2 RP3
+ | | | |
+ SW 0 SW 1 SW 2 SW 3
+ | | | | | | | |
+ EP0 EP1 EP2 EP3 EP4 EP5 EP6 EP7
+
+Computation for the example hierarchy:
+
+Min (GP0 to CPU BW,
+ Min(SW 0 Upstream Link to RP0 BW,
+ Min(SW0SSLBIS for SW0DSP0 (EP0), EP0 DSLBIS, EP0 Upstream Link) +
+ Min(SW0SSLBIS for SW0DSP1 (EP1), EP1 DSLBIS, EP1 Upstream link)) +
+ Min(SW 1 Upstream Link to RP1 BW,
+ Min(SW1SSLBIS for SW1DSP0 (EP2), EP2 DSLBIS, EP2 Upstream Link) +
+ Min(SW1SSLBIS for SW1DSP1 (EP3), EP3 DSLBIS, EP3 Upstream link))) +
+Min (GP1 to CPU BW,
+ Min(SW 2 Upstream Link to RP2 BW,
+ Min(SW2SSLBIS for SW2DSP0 (EP4), EP4 DSLBIS, EP4 Upstream Link) +
+ Min(SW2SSLBIS for SW2DSP1 (EP5), EP5 DSLBIS, EP5 Upstream link)) +
+ Min(SW 3 Upstream Link to RP3 BW,
+ Min(SW3SSLBIS for SW3DSP0 (EP6), EP6 DSLBIS, EP6 Upstream Link) +
+ Min(SW3SSLBIS for SW3DSP1 (EP7), EP7 DSLBIS, EP7 Upstream link))))
+
+
+The calculation starts at cxl_region_shared_upstream_perf_update(). A xarray
+is created to collect all the endpoint bandwidths via the
+cxl_endpoint_gather_bandwidth() function. The min() of bandwidth from the
+endpoint CDAT and the upstream link bandwidth is calculated. If the endpoint
+has a CXL switch as a parent, then min() of calculated bandwidth and the
+bandwidth from the SSLBIS for the switch downstream port that is associated
+with the endpoint is calculated. The final bandwidth is stored in a
+'struct cxl_perf_ctx' in the xarray indexed by a device pointer. If the
+endpoint is direct attached to a root port (RP), the device pointer would be an
+RP device. If the endpoint is behind a switch, the device pointer would be the
+upstream device of the parent switch.
+
+At the next stage, the code walks through one or more switches if they exist
+in the topology. For endpoints directly attached to RPs, this step is skipped.
+If there is another switch upstream, the code takes the min() of the current
+gathered bandwidth and the upstream link bandwidth. If there's a switch
+upstream, then the SSLBIS of the upstream switch.
+
+Once the topology walk reaches the RP, whether it's direct attached endpoints
+or walking through the switch(es), cxl_rp_gather_bandwidth() is called. At
+this point all the bandwidths are aggregated per each host bridge, which is
+also the index for the resulting xarray.
+
+The next step is to take the min() of the per host bridge bandwidth and the
+bandwidth from the Generic Port (GP). The bandwidths for the GP is retrieved
+via ACPI tables SRAT/HMAT. The min bandwidth are aggregated under the same
+ACPI0017 device to form a new xarray.
+
+Finally, the cxl_region_update_bandwidth() is called and the aggregated
+bandwidth from all the members of the last xarray is updated for the
+access coordinates residing in the cxl region (cxlr) context.
diff --git a/Documentation/driver-api/cxl/index.rst b/Documentation/driver-api/cxl/index.rst
index 12b82725d322..965ba90e8fb7 100644
--- a/Documentation/driver-api/cxl/index.rst
+++ b/Documentation/driver-api/cxl/index.rst
@@ -8,6 +8,7 @@ Compute Express Link
:maxdepth: 1
memory-devices
+ access-coordinates
maturity-map
--
2.46.0
next prev parent reply other threads:[~2024-09-04 0:13 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-04 0:11 [PATCH v8 0/3] cxl: Region bandwidth calculation for targets with shared upstream link Dave Jiang
2024-09-04 0:11 ` [PATCH v8 1/3] cxl: Preserve the CDAT access_coordinate for an endpoint Dave Jiang
2024-09-04 0:11 ` [PATCH v8 2/3] cxl: Calculate region bandwidth of targets with shared upstream link Dave Jiang
2024-09-12 1:06 ` Alison Schofield
2024-09-04 0:11 ` Dave Jiang [this message]
2024-09-12 1:07 ` [PATCH v8 3/3] cxl: Add documentation to explain the shared link bandwidth calculation Alison Schofield
2024-09-21 10:26 ` [PATCH v8 0/3] cxl: Region bandwidth calculation for targets with shared upstream link Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240904001316.1688225-4-dave.jiang@intel.com \
--to=dave.jiang@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox