public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	Drew Fustini <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<patches@lists.linux.dev>
Subject: Re: [PATCH v18 14/17] x86/resctrl: Sum monitor data acrss Sub-NUMA (SNC) nodes when needed
Date: Wed, 22 May 2024 14:24:12 -0700	[thread overview]
Message-ID: <ccd5df99-4d7e-4224-a07e-3897e370b53e@intel.com> (raw)
In-Reply-To: <20240515222326.74166-15-tony.luck@intel.com>

Hi Tony,

shortlog: acrss -> across?

On 5/15/2024 3:23 PM, Tony Luck wrote:
> When the sumdomains fields is set in the rmid_read structure, walk
> the list of domains in this resource to find all that share an L3
> cache id (rr->display_id).

"L3 cache id" vs "monitor display scope"?

> 
> Adjust the RMID value based on which SNC domain is being accessed.

Thinking generously this changelog contains two brief descriptions of code
snippets. There is no context or explanation of what the goal of this change
is and why it chooses to implement things the way it does ... implementations
that definitely needs some explanation.

> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 20 +++++++++++---
>  arch/x86/kernel/cpu/resctrl/monitor.c     | 33 ++++++++++++++++++++++-
>  2 files changed, 48 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 3b9383612c35..7ab788d47ad3 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -575,15 +575,27 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  	resid = md.u.rid;
>  	domid = md.u.domid;
>  	evtid = md.u.evtid;
> -
> +	rr.sumdomains = md.u.sum;
>  	r = &rdt_resources_all[resid].r_resctrl;
> -	hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
> -	if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
> +
> +	if (rr.sumdomains) {
> +		rr.display_id = domid;
> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +			if (d->display_id == domid)
> +				goto got_domain;
> +		}
>  		ret = -ENOENT;
>  		goto out;
> +	} else {
> +		hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
> +		if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
> +			ret = -ENOENT;
> +			goto out;
> +		}
> +		d = container_of(hdr, struct rdt_mon_domain, hdr);
>  	}
> -	d = container_of(hdr, struct rdt_mon_domain, hdr);
>  
> +got_domain:
>  	mon_event_read(&rr, r, d, rdtgrp, evtid, false);

This looks more like "wedging things until it works". In the "sumdomains" case
it just picks the first matching domain without any explanation why it would
be ok. The only reason why mon_event_read() needs the domain is so that
it can use it to determine which CPUs it should read the counters from and it looks
like this code is written to take advantage of just that. It is a hack based on
knowledge of internals of mon_event_read() to just get the reader to run on a
CPU that works for SNC where another quirk awaits to override the domain
when the counter is _actually_ read so that it can get the correct architectural
state.

I was expecting here to at least see some documentation/comments
to explain why the code behaves the way it is. Optimistically it can
be documented as an "optimization" for the sumdomains case that is only
used by SNC where it is ok to read counters from any domain in the
"monitor display scope".

Apart from the documentation I do not think this code should wedge itself
in like this. 

To make it obvious what this code does I think the non-SNC case should
set rr.d and mon_event_read() should take a CPU mask as parameter
instead of a struct rdt_domain. This means that in SNC case rr.d will be
NULL and make the code flow more obvious.

>  
>  	if (rr.err == -EIO)
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 0f66825a1ac9..668d2fdf58cd 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -546,7 +546,7 @@ static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 closid,
>  	}
>  }
>  
> -static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
> +static int ___mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>  {
>  	struct mbm_state *m;
>  	u64 tval = 0;
> @@ -569,6 +569,37 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>  	return 0;
>  }
>  
> +static u32 get_node_rmid(struct rdt_resource *r, struct rdt_mon_domain *d, u32 rmid)
> +{
> +	int cpu = cpumask_any(&d->hdr.cpu_mask);
> +
> +	return rmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid;

Is this missing a "if (snc_nodes_per_l3_cache > 1)" ?

I do not think this belongs in resctrl fs code though. Should this algorithm be forced
on all architectures? It seems more appropriate for the arch specific code.

> +}
> +
> +static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
> +{
> +	struct rdt_mon_domain *d;
> +	u32 node_rmid;
> +	int ret = 0;
> +
> +	if (!rr->sumdomains) {
> +		node_rmid = get_node_rmid(rr->r, rr->d, rmid);
> +		return ___mon_event_count(closid, node_rmid, rr);
> +	}
> +

/*
 * rr->sumdomains is only set by SNC mode where the event
 * counters of a monitoring domain can be read from a CPU belonging
 * to a different monitoring domain that shares the same monitoring
 * display domain. Optimize counter reads when needing to sum the
 * values by reading the counter for several domains from
 * the same CPU instead of sending IPI to one CPU of each monitoring
 * domain.
 */

Please feel free to improve, please do help folks trying to understand
what this code does.

> +	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
> +		if (d->display_id != rr->display_id)
> +			continue;
> +		rr->d = d;
> +		node_rmid = get_node_rmid(rr->r, d, rmid);
> +		ret = ___mon_event_count(closid, node_rmid, rr);
> +		if (ret)
> +			break;
> +	}
> +
> +	return ret;
> +}
> +
>  /*
>   * mbm_bw_count() - Update bw count from values previously read by
>   *		    __mon_event_count().

Reinette

  reply	other threads:[~2024-05-22 21:24 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15 22:23 [PATCH v18 00/17] Add support for Sub-NUMA cluster (SNC) systems Tony Luck
2024-05-15 22:23 ` [PATCH v18 01/17] x86/resctrl: Prepare for new domain scope Tony Luck
2024-05-15 22:23 ` [PATCH v18 02/17] x86/resctrl: Prepare to split rdt_domain structure Tony Luck
2024-05-15 22:23 ` [PATCH v18 03/17] x86/resctrl: Prepare for different scope for control/monitor operations Tony Luck
2024-05-15 22:23 ` [PATCH v18 04/17] x86/resctrl: Split the rdt_domain and rdt_hw_domain structures Tony Luck
2024-05-15 22:23 ` [PATCH v18 05/17] x86/resctrl: Add node-scope to the options for feature scope Tony Luck
2024-05-15 22:23 ` [PATCH v18 06/17] x86/resctrl: Introduce snc_nodes_per_l3_cache Tony Luck
2024-05-22 21:08   ` Reinette Chatre
2024-05-23 19:04     ` Luck, Tony
2024-05-23 20:56       ` Reinette Chatre
2024-05-23 21:25         ` Luck, Tony
2024-05-23 22:31           ` Reinette Chatre
2024-05-23 23:18             ` Luck, Tony
2024-05-23 23:48               ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 07/17] x86/resctrl: Prepare for new Sub-NUMA (SNC) cluster monitor files Tony Luck
2024-05-22 21:12   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 08/17] x86/resctrl: Add and initialize display_id field to struct rdt_mon_domain Tony Luck
2024-05-22 21:13   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 09/17] x86/resctrl: Add new fields to struct rmid_read for summation of domains Tony Luck
2024-05-22 21:14   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 10/17] x86/resctrl: Refactor mkdir_mondata_subdir() with a helper function Tony Luck
2024-05-22 21:15   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 11/17] x86/resctrl: Allocate a new bit in union mon_data_bits Tony Luck
2024-05-22 21:16   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 12/17] x86/resctrl: Create Sub-NUMA (SNC) monitor files Tony Luck
2024-05-22 21:19   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 13/17] x86/resctrl: Handle removing directories in Sub-NUMA (SNC) mode Tony Luck
2024-05-22 21:20   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 14/17] x86/resctrl: Sum monitor data acrss Sub-NUMA (SNC) nodes when needed Tony Luck
2024-05-22 21:24   ` Reinette Chatre [this message]
2024-05-15 22:23 ` [PATCH v18 15/17] x86/resctrl: Fix RMID reading sanity check for Sub-NUMA (SNC) mode Tony Luck
2024-05-16  8:59   ` Maciej Wieczor-Retman
2024-05-22 21:25   ` Reinette Chatre
2024-05-22 23:47     ` Tony Luck
2024-05-23 17:03       ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 16/17] x86/resctrl: Sub NUMA Cluster detection and enable Tony Luck
2024-05-22 21:26   ` Reinette Chatre
2024-05-15 22:23 ` [PATCH v18 17/17] x86/resctrl: Update documentation with Sub-NUMA cluster changes Tony Luck
2024-05-16  8:57   ` Maciej Wieczor-Retman
2024-05-22 21:27   ` Reinette Chatre
2024-05-16  9:28 ` [PATCH v18 00/17] Add support for Sub-NUMA cluster (SNC) systems Maciej Wieczor-Retman
2024-05-16 15:52   ` Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ccd5df99-4d7e-4224-a07e-3897e370b53e@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghua.yu@intel.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox