public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Tony Luck <tony.luck@intel.com>, Fenghua Yu <fenghuay@nvidia.com>,
	"Maciej Wieczor-Retman" <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	Drew Fustini <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<patches@lists.linux.dev>
Subject: Re: [PATCH v12 13/31] x86,fs/resctrl: Add and initialize rdt_resource for package scope monitor
Date: Wed, 22 Oct 2025 21:33:59 -0700	[thread overview]
Message-ID: <a10e3334-3616-43ee-84bd-f26469f52051@intel.com> (raw)
In-Reply-To: <20251013223348.103390-14-tony.luck@intel.com>

Hi Tony,

On 10/13/25 3:33 PM, Tony Luck wrote:
> Add a new PERF_PKG resource and introduce package level scope for
> monitoring telemetry events so that CPU hot plug notifiers can build
> domains at the package granularity.
> 
> Use the physical package ID available via topology_physical_package_id()
> to identify the monitoring domains with package level scope. This enables
> user space to use:
>  /sys/devices/system/cpu/cpuX/topology/physical_package_id
> to identify the monitoring domain a CPU is associated with.
> 
> Now there is a new monitoring resource, add a WARN to places that
> implicitly assume RDT_RESOURCE_L3.
> 
> Update some kerneldoc to point out L3 dependencies.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

I do not believe that the changes added since my RB tag fit with this patch.

As mentioned in patch #5 I find that (all but one of) the new changes added
to this patch can be merged with those stray hunks found in patch #5 to form one
logical change. 

Below is a high level of what I have in mind presented as a draft changelog
for that new patch:

	The feature to sum event data across multiple domains supports systems
	with Sub-NUMA Cluster (SNC) mode enabled. The top-level monitoring files
	in each "mon_L3_XX" directory provide the sum of data across all SNC
	nodes sharing an L3 cache instance while the "mon_sub_L3_YY" sub-directories
	provide the event data of the individual nodes.

	SNC is only associated with the L3 resource and domains and as a result
	the flow handling the sum of event data implicitly assumes it is
	working with the L3 resource and domains.

	Reading of telemetry events do not require to sum event	data so this
	feature can remain dedicated to SNC and keep the implicit assumption
	of working with the L3 resource and domains.

	Add a WARN to where the implicit assumption of working with the	L3 resource
	is made and add comments on how the structure controlling the event sum
	feature is used.

> ---
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index f5189b6771a0..39bdaf45fa2a 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -92,8 +92,8 @@ extern struct mon_evt mon_event_all[QOS_NUM_EVENTS];
>   * @list:            Member of the global @mon_data_kn_priv_list list.
>   * @rid:             Resource id associated with the event file.
>   * @evt:             Event structure associated with the event file.
> - * @sum:             Set when event must be summed across multiple
> - *                   domains.
> + * @sum:             Set for RDT_RESOURCE_L3 when event must be summed
> + *                   across multiple domains.
>   * @domid:           When @sum is zero this is the domain to which
>   *                   the event file belongs. When @sum is one this
>   *                   is the id of the L3 cache that all domains to be

above can move to the new patch

...

> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index ae43e09fa5e5..d681b71e6eca 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -712,6 +712,9 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  	if (md->sum) {
>  		struct rdt_l3_mon_domain *d;
>  
> +		if (WARN_ON_ONCE(resid != RDT_RESOURCE_L3))
> +			return -EINVAL;
> +
>  		/*
>  		 * This file requires summing across all domains that share
>  		 * the L3 cache id that was provided in the "domid" field of the

above can move to the new patch ... it can form a new hunk with the hunk from
patch #5 like:


  	if (md->sum) {
 + 		struct rdt_l3_mon_domain *d;
 + 
 +		if (WARN_ON_ONCE(resid != RDT_RESOURCE_L3))
 +			FIXME
 +
  		/*
  		 * This file requires summing across all domains that share
  		 * the L3 cache id that was provided in the "domid" field of the

I added the FIXME because this should not do a direct return here while holding
CPU hotplug lock as well as rdtgroup mutex.

> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 255d4bad24cb..4c984dc6784e 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -501,6 +501,9 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>  	 * all domains fail for any reason.
>  	 */
>  	ret = -EINVAL;
> +	if (WARN_ON_ONCE(rr->r->rid != RDT_RESOURCE_L3))
> +		return ret;
> +

This looks redundant to me when considering the WARN added to rdtgroup_mondata_show()
... and it is removed in patch #18 anyway.

>  	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
>  		if (d->ci_id != rr->ci->id)
>  			continue;

...

> @@ -3028,7 +3030,8 @@ static void rmdir_all_sub(void)
>   * @rid:    The resource id for the event file being created.
>   * @domid:  The domain id for the event file being created.
>   * @mevt:   The type of event file being created.
> - * @do_sum: Whether SNC summing monitors are being created.
> + * @do_sum: Whether SNC summing monitors are being created. Only set
> + *	    when @rid == RDT_RESOURCE_L3.
>   */
>  static struct mon_data *mon_get_kn_priv(enum resctrl_res_level rid, int domid,
>  					struct mon_evt *mevt,

above can move to new patch

Reinette

  reply	other threads:[~2025-10-23  4:34 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 22:33 [PATCH v12 00/31] x86,fs/resctrl telemetry monitoring Tony Luck
2025-10-13 22:33 ` [PATCH v12 01/31] x86,fs/resctrl: Improve domain type checking Tony Luck
2025-10-23  4:07   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 02/31] x86/resctrl: Move L3 initialization into new helper function Tony Luck
2025-10-13 22:33 ` [PATCH v12 03/31] x86/resctrl: Refactor domain_remove_cpu_mon() ready for new domain types Tony Luck
2025-10-23  4:08   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 04/31] x86/resctrl: Clean up domain_remove_cpu_ctrl() Tony Luck
2025-10-13 22:33 ` [PATCH v12 05/31] x86,fs/resctrl: Refactor domain create/remove using struct rdt_domain_hdr Tony Luck
2025-10-23  4:15   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 06/31] x86,fs/resctrl: Use struct rdt_domain_hdr when reading counters Tony Luck
2025-10-23  4:17   ` Reinette Chatre
2025-10-23 20:27     ` Luck, Tony
2025-10-13 22:33 ` [PATCH v12 07/31] x86,fs/resctrl: Rename struct rdt_mon_domain and rdt_hw_mon_domain Tony Luck
2025-10-23  4:18   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 08/31] x86,fs/resctrl: Rename some L3 specific functions Tony Luck
2025-10-23  4:21   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 09/31] fs/resctrl: Make event details accessible to functions when reading events Tony Luck
2025-10-13 22:33 ` [PATCH v12 10/31] x86,fs/resctrl: Handle events that can be read from any CPU Tony Luck
2025-10-23  4:22   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 11/31] x86,fs/resctrl: Support binary fixed point event counters Tony Luck
2025-10-13 22:33 ` [PATCH v12 12/31] x86,fs/resctrl: Add an architectural hook called for each mount Tony Luck
2025-10-13 22:33 ` [PATCH v12 13/31] x86,fs/resctrl: Add and initialize rdt_resource for package scope monitor Tony Luck
2025-10-23  4:33   ` Reinette Chatre [this message]
2025-10-13 22:33 ` [PATCH v12 14/31] x86/resctrl: Discover hardware telemetry events Tony Luck
2025-10-23  4:28   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 15/31] x86,fs/resctrl: Fill in details of events for guid 0x26696143 and 0x26557651 Tony Luck
2025-10-23  4:28   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 16/31] x86,fs/resctrl: Add architectural event pointer Tony Luck
2025-10-23  4:34   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 17/31] x86/resctrl: Find and enable usable telemetry events Tony Luck
2025-10-23  4:35   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 18/31] fs/resctrl: Split L3 dependent parts out of __mon_event_count() Tony Luck
2025-10-23  4:37   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 19/31] x86/resctrl: Read telemetry events Tony Luck
2025-10-23  4:47   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 20/31] fs/resctrl: Refactor mkdir_mondata_subdir() Tony Luck
2025-10-23 17:45   ` Reinette Chatre
2025-10-27 23:00     ` Luck, Tony
2025-10-28 16:00       ` Reinette Chatre
2025-10-28 17:14         ` Luck, Tony
2025-10-28 17:40           ` Reinette Chatre
2025-10-28 18:40             ` Luck, Tony
2025-10-28 23:55               ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 21/31] fs/resctrl: Refactor rmdir_mondata_subdir_allrdtgrp() Tony Luck
2025-10-23 17:45   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 22/31] x86,fs/resctrl: Handle domain creation/deletion for RDT_RESOURCE_PERF_PKG Tony Luck
2025-10-23 17:46   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 23/31] x86/resctrl: Add energy/perf choices to rdt boot option Tony Luck
2025-10-23 17:45   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 24/31] x86/resctrl: Handle number of RMIDs supported by RDT_RESOURCE_PERF_PKG Tony Luck
2025-10-23 17:48   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 25/31] fs/resctrl: Move allocation/free of closid_num_dirty_rmid[] Tony Luck
2025-10-23 17:49   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 26/31] x86,fs/resctrl: Compute number of RMIDs as minimum across resources Tony Luck
2025-10-13 22:33 ` [PATCH v12 27/31] fs/resctrl: Move RMID initialization to first mount Tony Luck
2025-10-23 17:49   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 28/31] x86/resctrl: Enable RDT_RESOURCE_PERF_PKG Tony Luck
2025-10-23 17:50   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 29/31] fs/resctrl: Provide interface to create architecture specific debugfs area Tony Luck
2025-10-23 17:50   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 30/31] x86/resctrl: Add debugfs files to show telemetry aggregator status Tony Luck
2025-10-23 17:50   ` Reinette Chatre
2025-10-13 22:33 ` [PATCH v12 31/31] x86,fs/resctrl: Update documentation for telemetry events Tony Luck
2025-10-23 17:52   ` Reinette Chatre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a10e3334-3616-43ee-84bd-f26469f52051@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox