patches.lists.linux.dev archive mirror
From: Reinette Chatre <reinette.chatre@intel.com>
To: Tony Luck <tony.luck@intel.com>, Fenghua Yu <fenghuay@nvidia.com>,
	"Maciej Wieczor-Retman" <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	Drew Fustini <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Anil Keshavamurthy <anil.s.keshavamurthy@intel.com>,
	Chen Yu <yu.c.chen@intel.com>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<patches@lists.linux.dev>
Subject: Re: [PATCH v4 22/31] x86/resctrl: Read core telemetry events
Date: Thu, 8 May 2025 08:57:52 -0700
Message-ID: <5cad8510-cbde-493a-8e73-96da685256fd@intel.com>
In-Reply-To: <20250429003359.375508-23-tony.luck@intel.com>

Hi Tony,

On 4/28/25 5:33 PM, Tony Luck wrote:
> The resctrl file system passes requests to read event monitor files to
> the architecture resctrl_arch_rmid_read() function to collect values

nit: no need to say "function" when using ().

> from hardware counters.
> 
> Use the resctrl resource to differentiate between calls to read legacy
> L3 events from the new telemetry events (which are attached to
> RDT_RESOURCE_PERF_PKG).
> 
> There may be multiple devices tracking each package, so scan all of them

"devices" seems to be used alongside similar terms such as "aggregator" and
"telemetry region". Having multiple terms for the same/similar thing is confusing.

> and add up all counters.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/resctrl/internal.h  |  5 ++++
>  arch/x86/kernel/cpu/resctrl/intel_aet.c | 34 +++++++++++++++++++++++++
>  arch/x86/kernel/cpu/resctrl/monitor.c   |  3 +++
>  3 files changed, 42 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 571db665eca6..dd5fe8a98304 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -170,9 +170,14 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>  #ifdef CONFIG_INTEL_AET_RESCTRL
>  bool intel_aet_get_events(void);
>  void __exit intel_aet_exit(void);
> +int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id evtid, u64 *val);
>  #else
>  static inline bool intel_aet_get_events(void) { return false; }
>  static inline void intel_aet_exit(void) { };
> +static inline int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id evtid, u64 *val)
> +{
> +	return -EINVAL;
> +}
>  #endif
>  
>  #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> index e1cb6bd4788d..0bbf991da981 100644
> --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -13,6 +13,7 @@
>  
>  #include <linux/cleanup.h>
>  #include <linux/cpu.h>
> +#include <linux/io.h>
>  #include <linux/resctrl.h>
>  
>  /* Temporary - delete from final version */
> @@ -246,3 +247,36 @@ void __exit intel_aet_exit(void)
>  		free_mmio_info((*peg)->pkginfo);
>  	}
>  }
> +
> +#define VALID_BIT	BIT_ULL(63)
> +#define DATA_BITS	GENMASK_ULL(62, 0)
> +
> +/*
> + * Read counter for an event on a domain (summing all aggregators
> + * on the domain).
> + */
> +int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id evtid, u64 *val)
> +{
> +	struct evtinfo *info = &evtinfo[evtid];
> +	struct mmio_info *mmi;
> +	u64 evtcount;
> +	int idx;
> +
> +	idx = rmid * info->event_group->num_events;
> +	idx += info->idx;
> +	mmi = info->event_group->pkginfo[domid];
> +
> +	if (idx * sizeof(u64) > info->event_group->mmio_size) {

Reading at offset "idx * sizeof(u64)" when
"idx * sizeof(u64) == info->event_group->mmio_size" already reads past the
end, no? The 8-byte read needs offset + sizeof(u64) to fit.
How about (please check):
	if (idx * sizeof(u64) + sizeof(u64) > info->event_group->mmio_size)
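
To make the boundary case concrete, here is a minimal userspace sketch (not
kernel code). The helper name aet_offset_in_range() and its parameters are
hypothetical; it only models the arithmetic of an 8-byte read at byte offset
idx * sizeof(u64) inside a region of mmio_size bytes, written so that the
comparison itself cannot wrap:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true when an 8-byte counter read at
 * index idx fits entirely inside a region of mmio_size bytes.
 * The read touches bytes [idx * 8, idx * 8 + 8).
 */
static bool aet_offset_in_range(int idx, size_t mmio_size)
{
	if (idx < 0 || mmio_size < sizeof(uint64_t))
		return false;

	/* Largest valid index is (mmio_size - 8) / 8; no multiply needed. */
	return (size_t)idx <= (mmio_size - sizeof(uint64_t)) / sizeof(uint64_t);
}
```

With mmio_size == 8 only index 0 passes, and index 1 (offset equal to
mmio_size) is rejected, which is the case the ">" comparison lets through.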
	

> +		pr_warn_once("MMIO index %d out of range\n", idx);
> +		return -EINVAL;

The function's return percolates up to rdtgroup_mondata_show() where
the return code is translated into text: -EINVAL becomes "Unavailable"
and -EIO becomes "Error". Seems like this should be -EIO instead?
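
For reference, the translation described above can be sketched in userspace
as follows. mon_err_to_text() is a hypothetical name used only for this
illustration, and what happens for any other value is deliberately not
modelled:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the translation done when the event file
 * is shown: -EIO is displayed as "Error", -EINVAL as "Unavailable".
 */
static const char *mon_err_to_text(int err)
{
	switch (err) {
	case -EIO:
		return "Error";
	case -EINVAL:
		return "Unavailable";
	default:
		return NULL;	/* no fixed text in this sketch */
	}
}
```

An out-of-range MMIO index is a driver-side problem rather than a counter
that is temporarily unreadable, which is why "Error" looks like the better
fit for user space.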

> +	}
> +
> +	for (int i = 0; i < mmi->count; i++) {
> +		evtcount = readq(mmi->addrs[i] + idx * sizeof(u64));
> +		if (!(evtcount & VALID_BIT))
> +			return -EINVAL;

What does "VALID_BIT" being set mean? That it is a valid counter or
that the data within is valid?
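
The answer affects how a reader interprets the loop. As a userspace sketch
only, and assuming bit 63 means "the low 63 bits hold usable data", the
accumulation could look like this. All names (SKETCH_VALID_BIT,
SKETCH_DATA_BITS, sum_valid_counters()) are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: bit 63 is the valid flag, bits 62:0 carry the count. */
#define SKETCH_VALID_BIT	(UINT64_C(1) << 63)
#define SKETCH_DATA_BITS	(SKETCH_VALID_BIT - 1)

/*
 * Sum the data bits of n raw counter words. Returns 0 on success, or
 * -1 if any word has the valid bit clear, in which case *sum is left
 * untouched so that a partial total is never reported.
 */
static int sum_valid_counters(const uint64_t *raw, size_t n, uint64_t *sum)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (!(raw[i] & SKETCH_VALID_BIT))
			return -1;
		total += raw[i] & SKETCH_DATA_BITS;
	}

	*sum = total;
	return 0;
}
```

The sketch accumulates into a local and writes the result once, so an
early return cannot leave a half-summed value behind.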

> +		*val += evtcount & DATA_BITS;
> +	}


> +
> +	return 0;
> +}
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 8d8ec86929fa..04214585824b 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -237,6 +237,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
>  
>  	resctrl_arch_rmid_read_context_check();
>  
> +	if (r->rid == RDT_RESOURCE_PERF_PKG)
> +		return intel_aet_read_event(d->hdr.id, rmid, eventid, val);
> +

Please add a comment or a check that the code that follows is for the L3 resource.
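
One possible shape for such a check, as a userspace sketch only: dispatch on
the resource id explicitly and reject anything that is neither resource. The
enum values and the read_l3()/read_perf_pkg() helpers are hypothetical
placeholders, not the kernel's definitions:

```c
#include <errno.h>

/* Hypothetical resource ids standing in for the real enum. */
enum sketch_rid { SKETCH_RID_L3, SKETCH_RID_PERF_PKG, SKETCH_RID_OTHER };

/* Placeholder readers; the return values only identify the path taken. */
static int read_l3(void) { return 1; }
static int read_perf_pkg(void) { return 2; }

static int sketch_rmid_read(enum sketch_rid rid)
{
	switch (rid) {
	case SKETCH_RID_PERF_PKG:
		return read_perf_pkg();
	case SKETCH_RID_L3:
		/* Everything on this path is specific to L3 events. */
		return read_l3();
	default:
		return -EINVAL;
	}
}
```

A plain comment above the L3 code would also do; the switch just makes the
assumption visible to the compiler and to the next resource that gets added.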

>  	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>  	ret = __rmid_read_phys(prmid, eventid, &msr_val);
>  	if (ret)

Reinette

