From: Alison Schofield <alison.schofield@intel.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: linux-cxl@vger.kernel.org, dan.j.williams@intel.com,
ira.weiny@intel.com, vishal.l.verma@intel.com,
Jonathan.Cameron@huawei.com, dave@stgolabs.net
Subject: Re: [PATCH v7 2/3] cxl: Calculate region bandwidth of targets with shared upstream link
Date: Wed, 10 Jul 2024 18:39:35 -0700 [thread overview]
Message-ID: <Zo8317DPRkLOsOlH@aschofie-mobl2> (raw)
In-Reply-To: <20240710222716.797267-3-dave.jiang@intel.com>
On Wed, Jul 10, 2024 at 03:24:01PM -0700, Dave Jiang wrote:
> The current bandwidth calculation aggregates all the targets. This simple
> method does not take into account where multiple targets sharing under
> a switch or a root port where the aggregated bandwidth can be greater than
> the upstream link of the switch.
>
> To accurately account for the shared upstream uplink cases, a new update
> function is introduced by walking from the leaves to the root of the
> hierarchy and clamp the bandwidth in the process as needed. This process
> is done when all the targets for a region are present but before the
> final values are send to the HMAT handling code cached access_coordinate
> targets.
>
> The original perf calculation path was kept to calculate the latency
> performance data that does not require the shared link consideration.
> The shared upstream link calculation is done as a second pass when all
> the endpoints have arrived.
>
> Testing is done via qemu with CXL hierachy. run_qemu[1] is modified to
> support several CXL hierachy layouts. The following layouts are tested:
>
> HB: Host Bridge
> RP: Root Port
> SW: Switch
> EP: End Point
>
> 2 HB 2 RP 2 EP: resulting bandwidth: 624
> 1 HB 2 RP 2 EP: resulting bandwidth: 624
> 2 HB 2 RP 2 SW 4 EP: resulting bandwidth: 624
>
> Current testing, perf number from SRAT/HMAT is hacked into the kernel
> code. However with new QEMU support of Generic Target Port that's
> incoming, the perf data injection is no longer needed.
>
> [1]: https://github.com/pmem/run_qemu
>
> Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Link: https://lore.kernel.org/linux-cxl/20240501152503.00002e60@Huawei.com/
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>
> ---
> v7:
> - Add test notes in commit log. (Dan)
> - Move cxl_memdev_get_dpa_perf() to cxled_memdev_get_dpa_perf(). (Dan)
> - Add a DEFINE_FREE(free_perf_xa). (Dan)
> - Address 2hb2rp2ep issue Jonathan reported. (Jonathan)
> - Added more kdoc comment headers. (Dan)
Looks like potential kdocs are not annotated w /**
> - Rename helper functions to be more clear on what they do. (Dan)
> - Move activiation point to after cxl_region_setup_targets(). (Dan)
> ---
snip
> +/*
> + * cxl_region_shared_upstream_perf_update - Recalculate the bandwidth for the region
> + * @cxl_region: the cxl region to recalculate
> + *
> + * The function walks the topology from bottom up and calculates the bandwidth. It
> + * starts at the endpoints, processes at the switches if any, processes at the rootport
> + * level, at the host bridge level, and finally aggregates at the region.
> + */
> +void cxl_region_shared_upstream_bandwidth_update(struct cxl_region *cxlr)
> +{
> + struct xarray *usp_xa, *working_xa;
> + int root_count = 0;
> + bool is_root;
> + int rc;
> +
> + lockdep_assert_held(&cxl_dpa_rwsem);
> +
> + usp_xa = kzalloc(sizeof(*usp_xa), GFP_KERNEL);
> + if (!usp_xa)
> return;
> +
> + xa_init(usp_xa);
> +
> + /* Collect bandwidth data from all the endpoints. */
> + for (int i = 0; i < cxlr->params.nr_targets; i++) {
> + struct cxl_endpoint_decoder *cxled = cxlr->params.targets[i];
> +
> + is_root = false;
> + rc = cxl_endpoint_gather_bandwidth(cxlr, cxled, usp_xa, &is_root);
> + if (rc) {
> + free_perf_xa(usp_xa);
> + return;
> + }
> + root_count += is_root;
> }
>
> + /* Detect asymmetric hierachy with some direct attached endpoints. */
> + if (root_count && root_count != cxlr->params.nr_targets) {
> + dev_dbg(&cxlr->dev,
> + "Asymmetric hierachy detected, bandwidth not updated\n");
> + return;
> + }
> +
> + /*
> + * Walk up one or more switches to deal with the bandwidth of the
> + * switches if they exist. Endpoints directly attached to RPs skip
> + * over this part.
> + */
> + if (!root_count) {
> + do {
> + working_xa = cxl_switch_gather_bandwidth(cxlr, usp_xa,
> + &is_root);
> + if (IS_ERR(working_xa))
> + goto out;
> + free_perf_xa(usp_xa);
> + usp_xa = working_xa;
> + } while (!is_root);
> + }
> +
> + /* Handle the bandwidth at the root port of the hierachy */
> + working_xa = cxl_rp_gather_bandwidth(usp_xa);
> + if (rc)
> + goto out;
> + free_perf_xa(usp_xa);
I was going to say something about getting rid of the goto's
and just freeing and returning in line like was done ~36 lines
above..but then realize something went astray here in the code
movement. Here and below, rc isn't set.
> + usp_xa = working_xa;
> +
> + /* Handle the bandwidth at the host bridge of the hierachy */
> + working_xa = cxl_hb_gather_bandwidth(usp_xa);
> + if (rc)
> + goto out;
> + free_perf_xa(usp_xa);
> + usp_xa = working_xa;
> +
> + /*
> + * Aggregate all the bandwidth collected per CFMWS (ACPI0017) and
> + * update the region bandwidth with the final calculated values.
> + */
> + cxl_region_update_bandwidth(cxlr, usp_xa);
> +
> +out:
> + free_perf_xa(usp_xa);
> +}
snip
>
>
next prev parent reply other threads:[~2024-07-11 1:39 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-10 22:23 [PATCH v7 0/3] cxl: Region bandwidth calculation for targets with shared upstream link Dave Jiang
2024-07-10 22:24 ` [PATCH v7 1/3] cxl: Preserve the CDAT access_coordinate for an endpoint Dave Jiang
2024-07-10 22:24 ` [PATCH v7 2/3] cxl: Calculate region bandwidth of targets with shared upstream link Dave Jiang
2024-07-11 1:39 ` Alison Schofield [this message]
2024-07-11 16:00 ` Dave Jiang
2024-07-10 22:24 ` [PATCH v7 3/3] cxl: Add documentation to explain the shared link bandwidth calculation Dave Jiang
2024-08-27 16:06 ` Jonathan Cameron
2024-08-27 22:38 ` Dave Jiang
2024-08-28 9:00 ` Jonathan Cameron
2024-08-28 15:35 ` Dave Jiang
2024-08-28 16:12 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zo8317DPRkLOsOlH@aschofie-mobl2 \
--to=alison.schofield@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox