public inbox for linux-cxl@vger.kernel.org
 help / color / mirror / Atom feed
From: Alison Schofield <alison.schofield@intel.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: linux-cxl@vger.kernel.org, dan.j.williams@intel.com,
	ira.weiny@intel.com, vishal.l.verma@intel.com,
	Jonathan.Cameron@huawei.com, dave@stgolabs.net
Subject: Re: [PATCH v7 2/3] cxl: Calculate region bandwidth of targets with shared upstream link
Date: Wed, 10 Jul 2024 18:39:35 -0700	[thread overview]
Message-ID: <Zo8317DPRkLOsOlH@aschofie-mobl2> (raw)
In-Reply-To: <20240710222716.797267-3-dave.jiang@intel.com>

On Wed, Jul 10, 2024 at 03:24:01PM -0700, Dave Jiang wrote:
> The current bandwidth calculation aggregates all the targets. This simple
> method does not take into account where multiple targets sharing under
> a switch or a root port where the aggregated bandwidth can be greater than
> the upstream link of the switch.
> 
> To accurately account for the shared upstream uplink cases, a new update
> function is introduced by walking from the leaves to the root of the
> hierarchy and clamp the bandwidth in the process as needed. This process
> is done when all the targets for a region are present but before the
> final values are send to the HMAT handling code cached access_coordinate
> targets.
> 
> The original perf calculation path was kept to calculate the latency
> performance data that does not require the shared link consideration.
> The shared upstream link calculation is done as a second pass when all
> the endpoints have arrived.
> 
> Testing is done via qemu with CXL hierachy. run_qemu[1] is modified to
> support several CXL hierachy layouts. The following layouts are tested:
> 
> HB: Host Bridge
> RP: Root Port
> SW: Switch
> EP: End Point
> 
> 2 HB 2 RP 2 EP: resulting bandwidth: 624
> 1 HB 2 RP 2 EP: resulting bandwidth: 624
> 2 HB 2 RP 2 SW 4 EP: resulting bandwidth: 624
> 
> Current testing, perf number from SRAT/HMAT is hacked into the kernel
> code. However with new QEMU support of Generic Target Port that's
> incoming, the perf data injection is no longer needed.
> 
> [1]: https://github.com/pmem/run_qemu
> 
> Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Link: https://lore.kernel.org/linux-cxl/20240501152503.00002e60@Huawei.com/
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v7:
> - Add test notes in commit log. (Dan)
> - Move cxl_memdev_get_dpa_perf() to cxled_memdev_get_dpa_perf(). (Dan)
> - Add a DEFINE_FREE(free_perf_xa). (Dan)
> - Address 2hb2rp2ep issue Jonathan reported. (Jonathan)
> - Added more kdoc comment headers. (Dan)

Looks like potential kdocs are not annotated w /** 


> - Rename helper functions to be more clear on what they do. (Dan)
> - Move activiation point to after cxl_region_setup_targets(). (Dan)
> ---

snip

> +/*
> + * cxl_region_shared_upstream_perf_update - Recalculate the bandwidth for the region
> + * @cxl_region: the cxl region to recalculate
> + *
> + * The function walks the topology from bottom up and calculates the bandwidth. It
> + * starts at the endpoints, processes at the switches if any, processes at the rootport
> + * level, at the host bridge level, and finally aggregates at the region.
> + */
> +void cxl_region_shared_upstream_bandwidth_update(struct cxl_region *cxlr)
> +{
> +	struct xarray *usp_xa, *working_xa;
> +	int root_count = 0;
> +	bool is_root;
> +	int rc;
> +
> +	lockdep_assert_held(&cxl_dpa_rwsem);
> +
> +	usp_xa = kzalloc(sizeof(*usp_xa), GFP_KERNEL);
> +	if (!usp_xa)
>  		return;
> +
> +	xa_init(usp_xa);
> +
> +	/* Collect bandwidth data from all the endpoints. */
> +	for (int i = 0; i < cxlr->params.nr_targets; i++) {
> +		struct cxl_endpoint_decoder *cxled = cxlr->params.targets[i];
> +
> +		is_root = false;
> +		rc = cxl_endpoint_gather_bandwidth(cxlr, cxled, usp_xa, &is_root);
> +		if (rc) {
> +			free_perf_xa(usp_xa);
> +			return;
> +		}
> +		root_count += is_root;
>  	}
>  
> +	/* Detect asymmetric hierachy with some direct attached endpoints. */
> +	if (root_count && root_count != cxlr->params.nr_targets) {
> +		dev_dbg(&cxlr->dev,
> +			"Asymmetric hierachy detected, bandwidth not updated\n");
> +		return;
> +	}
> +
> +	/*
> +	 * Walk up one or more switches to deal with the bandwidth of the
> +	 * switches if they exist. Endpoints directly attached to RPs skip
> +	 * over this part.
> +	 */
> +	if (!root_count) {
> +		do {
> +			working_xa = cxl_switch_gather_bandwidth(cxlr, usp_xa,
> +								 &is_root);
> +			if (IS_ERR(working_xa))
> +				goto out;
> +			free_perf_xa(usp_xa);
> +			usp_xa = working_xa;
> +		} while (!is_root);
> +	}
> +
> +	/* Handle the bandwidth at the root port of the hierachy */
> +	working_xa = cxl_rp_gather_bandwidth(usp_xa);
> +	if (rc)
> +		goto out;
> +	free_perf_xa(usp_xa);

I was going to say something about getting rid of the goto's
and just freeing and returning in line like was done ~36 lines
above..but then realize something went astray here in the code
movement.  Here and below, rc isn't set.


> +	usp_xa = working_xa;
> +
> +	/* Handle the bandwidth at the host bridge of the hierachy */
> +	working_xa = cxl_hb_gather_bandwidth(usp_xa);
> +	if (rc)
> +		goto out;
> +	free_perf_xa(usp_xa);
> +	usp_xa = working_xa;
> +
> +	/*
> +	 * Aggregate all the bandwidth collected per CFMWS (ACPI0017) and
> +	 * update the region bandwidth with the final calculated values.
> +	 */
> +	cxl_region_update_bandwidth(cxlr, usp_xa);
> +
> +out:
> +	free_perf_xa(usp_xa);
> +}

snip

> 
> 

  reply	other threads:[~2024-07-11  1:39 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-10 22:23 [PATCH v7 0/3] cxl: Region bandwidth calculation for targets with shared upstream link Dave Jiang
2024-07-10 22:24 ` [PATCH v7 1/3] cxl: Preserve the CDAT access_coordinate for an endpoint Dave Jiang
2024-07-10 22:24 ` [PATCH v7 2/3] cxl: Calculate region bandwidth of targets with shared upstream link Dave Jiang
2024-07-11  1:39   ` Alison Schofield [this message]
2024-07-11 16:00     ` Dave Jiang
2024-07-10 22:24 ` [PATCH v7 3/3] cxl: Add documentation to explain the shared link bandwidth calculation Dave Jiang
2024-08-27 16:06   ` Jonathan Cameron
2024-08-27 22:38     ` Dave Jiang
2024-08-28  9:00       ` Jonathan Cameron
2024-08-28 15:35         ` Dave Jiang
2024-08-28 16:12           ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zo8317DPRkLOsOlH@aschofie-mobl2 \
    --to=alison.schofield@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox