From: Gregory Price <gourry@gourry.net>
To: Dave Jiang <dave.jiang@intel.com>
Cc: linux-cxl@vger.kernel.org, dave@stgolabs.net,
jonathan.cameron@huawei.com, alison.schofield@intel.com,
vishal.l.verma@intel.com, ira.weiny@intel.com,
dan.j.williams@intel.com
Subject: Re: [PATCH 4/4] cxl: doc/linux/access-coordinates Update access coordinates calculation methods
Date: Wed, 14 May 2025 12:30:05 -0400
Message-ID: <aCTFDcFNSq38Zn0k@gourry-fedora-PF4VCD3F>
In-Reply-To: <20250514003133.584401-5-dave.jiang@intel.com>
On Tue, May 13, 2025 at 05:31:33PM -0700, Dave Jiang wrote:
> Add documentation on how to calculate the access coordinates for a given
> CXL region in detail.
>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
This is awesome, thanks Dave! Minor nits inline.
Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
> .../cxl/linux/access-coordinates.rst | 87 +++++++++++++++++++
> 1 file changed, 87 insertions(+)
>
> diff --git a/Documentation/driver-api/cxl/linux/access-coordinates.rst b/Documentation/driver-api/cxl/linux/access-coordinates.rst
> index 71024fa0f561..cf86920f083a 100644
> --- a/Documentation/driver-api/cxl/linux/access-coordinates.rst
> +++ b/Documentation/driver-api/cxl/linux/access-coordinates.rst
> @@ -5,6 +5,84 @@
> CXL Access Coordinates Computation
> ==================================
>
> +Latency and Bandwidth Calculation
> +=================================
> +A memory region's performance coordinates (latency and bandwidth) are typically
> +provided via ACPI tables :doc:`SRAT <../platform/acpi/srat>` and
> +:doc:`HMAT <../platform/acpi/hmat>`. However, the platform firmware (BIOS) is
> +not able to annotate those for CXL devices that are hot-plugged since they do
> +not exist during platform firmware initialization. The CXL driver can compute
> +the performance coordinates by retrieving data from several components.
> +
> +The :doc:`SRAT <../platform/acpi/srat>` provides a Generic Port Affinity
> +subtable that ties a proximity domain to a device handle, which in this case
> +would be the CXL hostbridge. Using this association, the performance
> +coordinates for the Generic Port can be retrieved from the
> +:doc:`HMAT <../platform/acpi/hmat>` subtable. This piece represents the
> +performance coordinates between a CPU to the Generic Port (CXL hostbridge).
"from a CPU to the Generic Port", or
"between a CPU and a Generic Port"
either works
> +
> +The :doc:`CDAT <../platform/cdat>` provides the performance coordinates for
> +the CXL device itself. That is the bandwidth and latency to access that device's
> +memory region. The :doc:`DSMAS <../platform/cdat/dsmas>` subtable provides a
> +DSMAD handle that is tied to a Device Physical Address (DPA) range. The
As in the other doc, DSMAD needs to be defined here.
> +:doc:`DSLBIS <../platform/cdat/dslbis>` subtable provides the performance
> +coordinates that are tied to a DSMAD handle, and this ties the two table
> +entries together to provide the performance coordinates for each DPA
> +region. For example, if a device exports a DRAM region and a PMEM region,
> +then there would be different performance characteristics for each of those
> +regions.
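
For illustration, the DSMAS/DSLBIS pairing described above amounts to a join on the DSMAD handle. The following is a minimal Python sketch, not kernel code: the record layout, field names, and numbers are all hypothetical and only show how the handle ties a DPA range to its performance data.

```python
# Hypothetical, simplified CDAT records; field names and values are
# illustrative only and do not mirror the real table layouts.
DSMAS = [
    {"dsmad_handle": 0, "dpa_base": 0x00000000, "dpa_length": 0x40000000},
    {"dsmad_handle": 1, "dpa_base": 0x40000000, "dpa_length": 0x40000000},
]
DSLBIS = [
    {"handle": 0, "metric": "read_latency", "value": 10},
    {"handle": 0, "metric": "read_bandwidth", "value": 200},
    {"handle": 1, "metric": "read_latency", "value": 30},
    {"handle": 1, "metric": "read_bandwidth", "value": 100},
]


def coordinates_per_dpa_range(dsmas, dslbis):
    """Map each (dpa_base, dpa_length) range to its DSLBIS metrics."""
    # Group the DSLBIS metrics by the handle they refer to.
    by_handle = {}
    for entry in dslbis:
        by_handle.setdefault(entry["handle"], {})[entry["metric"]] = entry["value"]

    # Attach the grouped metrics to the DPA range that owns the same handle.
    result = {}
    for entry in dsmas:
        dpa_range = (entry["dpa_base"], entry["dpa_length"])
        result[dpa_range] = by_handle.get(entry["dsmad_handle"], {})
    return result
```

With the sample data, each of the two DPA ranges ends up with its own latency and bandwidth values, matching the DRAM/PMEM example in the text.
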
> +
> +If there's a CXL switch in the topology, then the performance coordinates for the
> +switch are provided by the :doc:`SSLBIS <../platform/cdat/sslbis>` subtable. This
> +provides the bandwidth and latency for traversing the switch between the switch
> +upstream port and the switch downstream port that points to the endpoint device.
> +
> +Simple topology example::
> +
> + GP0/HB0/ACPI0016-0
> + RP0
> + |
> + | L0
> + |
> + SW 0 / USP0
> + SW 0 / DSP0
> + |
> + | L1
> + |
> + EP0
> +
> +In this example, there is a CXL switch between an endpoint and a root port.
> +Latency in this example is calculated as such:
> +L(EP0) - Latency from EP0 CDAT DSMAS+DSLBIS
> +L(L1) - Link latency between EP0 and SW0DSP0
> +L(SW0) - Latency for the switch from SW0 CDAT SSLBIS.
> +L(L0) - Link latency between SW0 and RP0
> +L(RP0) - Latency from root port to CPU via SRAT and HMAT (Generic Port).
> +Total read and write latencies are the sum of all these parts.
> +
> +Bandwidth in this example is calculated as such:
> +B(EP0) - Bandwidth from EP0 CDAT DSMAS+DSLBIS
> +B(L1) - Link bandwidth between EP0 and SW0DSP0
> +B(SW0) - Bandwidth for the switch from SW0 CDAT SSLBIS.
> +B(L0) - Link bandwidth between SW0 and RP0
> +B(RP0) - Bandwidth from root port to CPU via SRAT and HMAT (Generic Port).
> +The total read and write bandwidth is the min() of all these parts.
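
For illustration, the two composition rules above (sum the latencies, take the minimum of the bandwidths) can be sketched as follows. This is a minimal Python sketch under the assumption that every segment of the path (EP0, L1, SW0, L0, RP0) has already been reduced to a (latency, bandwidth) pair in consistent units; the function name and the numbers in the usage note are placeholders, not measured values.

```python
def path_coordinates(segments):
    """Combine per-segment (latency, bandwidth) pairs for one direction.

    Latencies add up along the path, while the end-to-end bandwidth is
    limited by the slowest segment.
    """
    total_latency = sum(latency for latency, _ in segments)
    total_bandwidth = min(bandwidth for _, bandwidth in segments)
    return total_latency, total_bandwidth
```

For example, `path_coordinates([(10, 100), (5, 50), (20, 80)])` returns `(35, 50)`. The same function would be applied once for the read values and once for the write values.
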
> +
> +To calculate the link bandwidth:
> +LinkOperatingFrequency (GT/s) is the current negotiated link speed.
> +DataRatePerLink (MB/s) = LinkOperatingFrequency / 8
> +Bandwidth (MB/s) = PCIeCurrentLinkWidth * DataRatePerLink
> +Where PCIeCurrentLinkWidth is the number of lanes in the link.
> +
> +To calculate the link latency:
> +LinkLatency (picoseconds) = FlitSize / LinkBandwidth (MB/s)
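
For illustration, here is the link arithmetic above as a minimal Python sketch. It rests on assumptions that the text does not state: the link speed is given in MT/s so that dividing by 8 yields MB/s per lane, encoding overhead is ignored, and a factor of 1,000,000 converts bytes / (MB/s), which is microseconds, into picoseconds. The function names are hypothetical.

```python
def link_bandwidth_mbps(speed_mtps, lanes):
    """Link bandwidth in MB/s from speed (MT/s, assumed unit) and lane count."""
    data_rate_per_lane = speed_mtps / 8  # bits -> bytes, no encoding overhead
    return lanes * data_rate_per_lane


def link_latency_ps(flit_bytes, bandwidth_mbps):
    """Time to move one flit, in picoseconds.

    flit_bytes / bandwidth_mbps is in microseconds, so scale by 1e6 to
    express the result in picoseconds.
    """
    return flit_bytes * 1_000_000 / bandwidth_mbps
```

For example, `link_bandwidth_mbps(32000, 8)` gives `32000.0`, and `link_latency_ps(68, 4000)` gives `17000.0` for a 68-byte flit at 4000 MB/s. Which bandwidth figure (per lane or whole link) belongs in the latency formula is left to the caller here, since the text only says LinkBandwidth.
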
> +
> +See `CXL Memory Device SW Guide r1.0 <https://www.intel.com/content/www/us/en/content-details/643805/cxl-memory-device-software-guide.html>`_,
> +section 2.11.3 and 2.11.4 for details.
> +
> +In the end, the access coordinates for a constructed memory region are calculated from one
> +or more memory partitions from each of the CXL device(s).
> +
> Shared Upstream Link Calculation
> ================================
> For certain CXL region construction with endpoints behind CXL switches (SW) or
> @@ -90,3 +168,12 @@ under the same ACPI0017 device to form a new xarray.
> Finally, the cxl_region_update_bandwidth() is called and the aggregated
> bandwidth from all the members of the last xarray is updated for the
> access coordinates residing in the cxl region (cxlr) context.
> +
> +QTG ID
> +======
> +Each :doc:`CEDT <../platform/acpi/cedt>` has a QTG ID field. This field provides
> +the ID that associates with a QoS Throttling Group (QTG) for the CFMWS window.
> +Once the access coordinates are calculated, an ACPI Device Specific Method can
> +be issued to the ACPI0016 device to retrieve the QTG ID based on the access
> +coordinates provided. The QTG ID for the device can be used as guidance to match
> +to the CFMWS to set up the best Linux root decoder for the device performance.
> --
> 2.49.0
>