From: Ira Weiny <ira.weiny@intel.com>
To: Dave Jiang <dave.jiang@intel.com>, <linux-cxl@vger.kernel.org>
Cc: <dan.j.williams@intel.com>, <ira.weiny@intel.com>,
	<vishal.l.verma@intel.com>, <alison.schofield@intel.com>,
	<Jonathan.Cameron@huawei.com>, <dave@stgolabs.net>
Subject: Re: [PATCH v2] cxl: Calculate region bandwidth of targets with shared upstream link
Date: Tue, 23 Apr 2024 16:37:50 -0700
Message-ID: <6628464ee73bb_d567d2943c@iweiny-mobl.notmuch>
In-Reply-To: <20240411181641.514075-1-dave.jiang@intel.com>

Dave Jiang wrote:

[snip]

> +/*
> + * Calculate the bandwidth for the cxl region based on the number of targets
> + * that share an upstream switch. The function is called while targets are
> + * being attached for a region. If the number of targets is 1, then
> + * the target either does not have a upstream switch or it's the first target
> + * of the shared link. In this case, the bandwidth is the sum of the target
> + * bandwidth and the collected region bandwidth. If the targets from cxl_rr is
> + * greater than 1, then the bandwidth is the minimum of the switch upstream
> + * port bandwidth or the region plus the target bandwidth.
> + */
> +static unsigned int calculate_region_bw(int targets, unsigned int usp_bw,
> +					unsigned int ep_bw,
> +					unsigned int region_bw)
> +{
> +	if (targets == 1)
> +		return region_bw + ep_bw;
> +
> +	return min_t(unsigned int, usp_bw, region_bw + ep_bw);
> +}
> +
>  void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
>  				    struct cxl_endpoint_decoder *cxled)
>  {
> @@ -551,7 +581,9 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
>  			.start = cxled->dpa_res->start,
>  			.end = cxled->dpa_res->end,
>  	};
> +	struct cxl_port *port = cxlmd->endpoint;
>  	struct cxl_dpa_perf *perf;
> +	int usp_bw, targets;
>  
>  	switch (cxlr->mode) {
>  	case CXL_DECODER_RAM:
> @@ -569,6 +601,12 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
>  	if (!range_contains(&perf->dpa_range, &dpa))
>  		return;
>  
> +	usp_bw = cxl_get_switch_uport_bandwidth(port->uport_dev);
> +	if (usp_bw > 0)
> +		targets = cxl_port_shared_region_targets(port, cxlr);

I'm not quite following how this handles a x4 situation with 2 switches,
each with its own upstream port and 2 devices below it.

     +------+            +------+
     | HB 0 |            | HB 1 |
     +------+            +------+
        | (link 0)          | (link 1)
     +------+            +------+
     | SW 0 |            | SW 1 |
     +------+            +------+
      /    \              /     \
 +------+ +------+   +------+ +------+
 | EP 0 | | EP 1 |   | EP 2 | | EP 3 |
 +------+ +------+   +------+ +------+

In this case the region BW should be:

	min ( sum(link0, link1), sum(EP 0-3) )

How is the sum of EP 0-1 limited to the link0 BW, and EP 2-3 to link1?
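
To make the concern concrete, here is a minimal user-space sketch, not kernel
code. It copies the logic of the quoted calculate_region_bw() (with min_t()
replaced by a plain helper) and replays it in attach order EP 0..3. The
per-EP targets values 1, 2, 1, 2 are an assumption based on my reading of the
patch comment ("first target of the shared link" gets 1), and
per_switch_region_bw() and replay_patch_order() are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Same logic as the quoted calculate_region_bw(), min_t() replaced. */
static unsigned int calculate_region_bw(int targets, unsigned int usp_bw,
					unsigned int ep_bw,
					unsigned int region_bw)
{
	if (targets == 1)
		return region_bw + ep_bw;

	return min_uint(usp_bw, region_bw + ep_bw);
}

/* Hypothetical model: clamp each switch to its own link, then sum. */
static unsigned int per_switch_region_bw(const unsigned int *usp_bw,
					 const unsigned int *ep_bw,
					 size_t nr_switches,
					 size_t eps_per_switch)
{
	unsigned int total = 0;

	assert(usp_bw && ep_bw);
	for (size_t sw = 0; sw < nr_switches; sw++) {
		unsigned int sum = 0;

		for (size_t i = 0; i < eps_per_switch; i++)
			sum += ep_bw[sw * eps_per_switch + i];
		total += min_uint(usp_bw[sw], sum);
	}

	return total;
}

/* Replays attach order EP 0..3 for the 2 switch, 2 EP per switch case. */
static unsigned int replay_patch_order(const unsigned int *usp_bw,
				       const unsigned int *ep_bw)
{
	/* Assumption: what cxl_port_shared_region_targets() would report. */
	static const int targets[] = { 1, 2, 1, 2 };
	unsigned int region_bw = 0;

	for (size_t i = 0; i < 4; i++)
		region_bw = calculate_region_bw(targets[i], usp_bw[i / 2],
						ep_bw[i], region_bw);

	return region_bw;
}
```

With made-up numbers (both links at 24, every endpoint at 16, units
irrelevant), per_switch_region_bw() gives 48, which matches the min()
expression above. The replay gives 24, because the clamp for EP 3 applies
link 1 to the whole running total rather than to EP 2-3 only.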

> +	else
> +		targets = 1;

Maybe add a comment here to indicate that targets == 1 means we failed to
read the usp link speed, so we default back to the sum of the endpoints.
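
Something along these lines, as a rough user-space sketch. region_targets()
is a hypothetical helper that does not exist in the patch; it only wraps the
if/else so the comment has somewhere to live, and the comment wording is just
a suggestion.

```c
#include <assert.h>

/*
 * Hypothetical helper, not in the patch: picks the 'targets' value that
 * is later handed to calculate_region_bw().
 */
static int region_targets(int usp_bw, int shared_targets)
{
	assert(shared_targets >= 1);

	if (usp_bw > 0)
		return shared_targets;

	/*
	 * The switch upstream port link bandwidth could not be read.
	 * Use targets == 1 so the region bandwidth falls back to the
	 * plain sum of the endpoint bandwidths.
	 */
	return 1;
}
```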

> +
>  	for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
>  		/* Get total bandwidth and the worst latency for the cxl region */
>  		cxlr->coord[i].read_latency = max_t(unsigned int,
> @@ -577,8 +615,14 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
>  		cxlr->coord[i].write_latency = max_t(unsigned int,
>  						     cxlr->coord[i].write_latency,
>  						     perf->coord[i].write_latency);
> -		cxlr->coord[i].read_bandwidth += perf->coord[i].read_bandwidth;
> -		cxlr->coord[i].write_bandwidth += perf->coord[i].write_bandwidth;
> +		cxlr->coord[i].read_bandwidth =
> +			calculate_region_bw(targets, usp_bw,
> +					    perf->coord[i].read_bandwidth,
> +					    cxlr->coord[i].read_bandwidth);
> +		cxlr->coord[i].write_bandwidth =
> +			calculate_region_bw(targets, usp_bw,
> +					    perf->coord[i].write_bandwidth,
> +					    cxlr->coord[i].write_bandwidth);
>  	}
>  }
>  

[snip]

> +
> +int cxl_pci_get_switch_usp_bandwidth(struct pci_dev *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct pci_dev *iter = pdev;
> +
> +	do {
> +		if (pci_pcie_type(iter) == PCI_EXP_TYPE_UPSTREAM)
> +			break;

Why is it not ok to do:

	while (pci_pcie_type(iter) != PCI_EXP_TYPE_UPSTREAM) {
		...
	}
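
For comparison, a toy model of the two loop shapes. This is a sketch under
assumptions: the rest of the loop is snipped above, so I am assuming the body
only steps to the parent and the loop ends when there is no parent left.
struct node, enum node_type and both helpers are hypothetical stand-ins, not
PCI core types or functions.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_ENDPOINT, NODE_DOWNSTREAM, NODE_UPSTREAM, NODE_ROOT };

/* Hypothetical stand-in for a device with a parent link. */
struct node {
	enum node_type type;
	struct node *parent;
};

/* do/while with an early break, in the shape of the quoted hunk. */
static struct node *find_usp_do_while(struct node *iter)
{
	assert(iter);

	do {
		if (iter->type == NODE_UPSTREAM)
			break;
		iter = iter->parent;	/* assumed loop body */
	} while (iter);

	return iter;
}

/* Plain while loop; the NULL check carries the assumed end condition. */
static struct node *find_usp_while(struct node *iter)
{
	assert(iter);

	while (iter && iter->type != NODE_UPSTREAM)
		iter = iter->parent;

	return iter;
}
```

Under those assumptions both helpers return the same node, or NULL when no
upstream port is found on the way up.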

Ira

[snip]

Thread overview: 4+ messages
2024-04-11 18:16 [PATCH v2] cxl: Calculate region bandwidth of targets with shared upstream link Dave Jiang
2024-04-23 23:37 ` Ira Weiny [this message]
2024-05-01 14:25   ` Jonathan Cameron
2024-05-01 13:46 ` Jonathan Cameron
