patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	Drew Fustini <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<patches@lists.linux.dev>
Subject: Re: [PATCH v20 15/18] x86/resctrl: Make __mon_event_count() handle sum domains
Date: Thu, 20 Jun 2024 14:31:51 -0700	[thread overview]
Message-ID: <ae43b9c3-15f9-417d-aad3-ee48987c9b90@intel.com> (raw)
In-Reply-To: <20240610183528.349198-16-tony.luck@intel.com>

Hi Tony,

On 6/10/24 11:35 AM, Tony Luck wrote:
> Legacy resctrl monitor files must provide the sum of event values across
> all Sub-NUMA Cluster (SNC) domains that share an L3 cache instance.
> 
> There are now two cases:
> 1) A specific domain is provided in struct rmid_read
>     This is either a non-SNC system, or the request is to read data
>     from just one SNC node.
> 2) Domain pointer is NULL. In this case the cacheinfo field in struct
>     rmid_read indicates that all SNC nodes that share that L3 cache
>     instance should have the event read and return the sum of all
>     values.
> 
> Update the CPU sanity check. The existing check that an event is read
> from a CPU in the requested domain still applies when reading a single
> domain. But when summing across domains a more relaxed check that the
> current CPU is in the scope of the L3 cache instance is appropriate
> since the MSRs to read events are scoped at L3 cache level.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>   arch/x86/kernel/cpu/resctrl/monitor.c | 40 +++++++++++++++++++++------
>   1 file changed, 32 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index f2fd35d294f2..c4d9a8df8d2d 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -324,9 +324,6 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>   
>   	resctrl_arch_rmid_read_context_check();
>   
> -	if (!cpumask_test_cpu(smp_processor_id(), &d->hdr.cpu_mask))
> -		return -EINVAL;
> -
>   	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>   	ret = __rmid_read_phys(prmid, eventid, &msr_val);
>   	if (ret)
> @@ -592,6 +589,8 @@ static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 closid,
>   
>   static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>   {
> +	int cpu = smp_processor_id();
> +	struct rdt_mon_domain *d;
>   	struct mbm_state *m;
>   	u64 tval = 0;
>   
> @@ -603,12 +602,37 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>   		return 0;
>   	}
>   
> -	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid,
> -					 &tval, rr->arch_mon_ctx);
> -	if (rr->err)
> -		return rr->err;
> +	if (rr->d) {
> +		/* Reading a single domain, must be on a CPU in that domain */

(nit: Please let this sentence as well as the later comment end with period.)

> +		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
> +			return -EINVAL;
> +		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
> +						 rr->evtid, &tval, rr->arch_mon_ctx);
> +		if (rr->err)
> +			return rr->err;
> +
> +		rr->val += tval;
> +
> +		return 0;
> +	}
>   
> -	rr->val += tval;
> +	/* Summing domains that share a cache, must be on a CPU for that cache */
> +	if (!cpumask_test_cpu(cpu, &rr->ci->shared_cpu_map))
> +		return -EINVAL;
> +
> +	/*
> +	 * Legacy files must report the sum of an event across all
> +	 * domains that share the same L3 cache instance.
> +	 */
> +	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
> +		if (d->ci->id != rr->ci->id)
> +			continue;
> +		rr->err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
> +						 rr->evtid, &tval, rr->arch_mon_ctx);
> +		if (rr->err)
> +			return rr->err;

An error here means that the hardware does not have data available to return. Should
the unavailability of data for one domain be considered an error for all? Note how
this is handled in mon_event_count() where the error is discarded if any of the monitor
event reads succeeded ... should this be done here also?

> +		rr->val += tval;
> +	}
>   
>   	return 0;
>   }


Reinette

  reply	other threads:[~2024-06-20 21:32 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-10 18:35 [PATCH v20 00/18] Add support for Sub-NUMA cluster (SNC) systems Tony Luck
2024-06-10 18:35 ` [PATCH v20 01/18] x86/resctrl: Prepare for new domain scope Tony Luck
2024-06-20 21:12   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 02/18] x86/resctrl: Prepare to split rdt_domain structure Tony Luck
2024-06-20 21:13   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 03/18] x86/resctrl: Prepare for different scope for control/monitor operations Tony Luck
2024-06-20 21:13   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 04/18] x86/resctrl: Split the rdt_domain and rdt_hw_domain structures Tony Luck
2024-06-20 21:14   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 05/18] x86/resctrl: Add node-scope to the options for feature scope Tony Luck
2024-06-20 21:15   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 06/18] x86/resctrl: Introduce snc_nodes_per_l3_cache Tony Luck
2024-06-17 22:36   ` Moger, Babu
2024-06-18 22:58     ` Reinette Chatre
2024-06-19 14:43       ` Moger, Babu
2024-06-20 21:19   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 07/18] x86/resctrl: Block use of mba_MBps mount option on Sub-NUMA Cluster (SNC) systems Tony Luck
2024-06-20 21:21   ` Reinette Chatre
2024-06-20 22:07     ` Luck, Tony
2024-06-20 22:12       ` Luck, Tony
2024-06-21  1:56       ` Reinette Chatre
2024-06-21 15:24         ` Tony Luck
2024-06-21 17:10           ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 08/18] x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files Tony Luck
2024-06-20 21:22   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 09/18] x86/resctrl: Add a new field to struct rmid_read for summation of domains Tony Luck
2024-06-20 21:22   ` Reinette Chatre
2024-06-20 22:42     ` Luck, Tony
2024-06-21  1:59       ` Reinette Chatre
2024-06-21 16:07         ` Luck, Tony
2024-06-21 17:10           ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 10/18] x86/resctrl: Refactor mkdir_mondata_subdir() with a helper function Tony Luck
2024-06-20 21:23   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 11/18] x86/resctrl: Allocate a new field in union mon_data_bits Tony Luck
2024-06-20 21:28   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 12/18] x86/resctrl: Create Sub-NUMA Cluster (SNC) monitor files Tony Luck
2024-06-20 21:30   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 13/18] x86/resctrl: Handle removing directories in Sub-NUMA Cluster (SNC) mode Tony Luck
2024-06-20 21:30   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 14/18] x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter Tony Luck
2024-06-20 21:31   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 15/18] x86/resctrl: Make __mon_event_count() handle sum domains Tony Luck
2024-06-20 21:31   ` Reinette Chatre [this message]
2024-06-10 18:35 ` [PATCH v20 16/18] x86/resctrl: Enable RMID shared RMID mode on Sub-NUMA Cluster (SNC) systems Tony Luck
2024-06-20 21:32   ` Reinette Chatre
2024-06-10 18:35 ` [PATCH v20 17/18] x86/resctrl: Sub-NUMA Cluster (SNC) detection Tony Luck
2024-06-20 21:34   ` Reinette Chatre
2024-06-21 17:05   ` Markus Elfring
2024-06-21 17:14     ` Luck, Tony
2024-06-10 18:35 ` [PATCH v20 18/18] x86/resctrl: Update documentation with Sub-NUMA cluster changes Tony Luck
2024-06-20 21:35   ` Reinette Chatre
2024-06-13 19:17 ` [PATCH v20 00/18] Add support for Sub-NUMA cluster (SNC) systems Moger, Babu
2024-06-13 20:32   ` Reinette Chatre
2024-06-13 21:02     ` Luck, Tony
2024-06-14 16:27     ` Moger, Babu
2024-06-14 16:46       ` Reinette Chatre
2024-06-14 21:29         ` Moger, Babu
2024-06-14 21:40           ` Luck, Tony
2024-06-14 22:31             ` Moger, Babu
2024-06-14 23:11           ` Reinette Chatre
2024-06-17 14:06             ` Moger, Babu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ae43b9c3-15f9-417d-aad3-ee48987c9b90@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghua.yu@intel.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).