Archive-only list for patches
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Tony Luck <tony.luck@intel.com>, Fenghua Yu <fenghuay@nvidia.com>,
	"Maciej Wieczor-Retman" <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	Drew Fustini <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<patches@lists.linux.dev>
Subject: Re: [PATCH v10 05/28] x86,fs/resctrl: Use struct rdt_domain_hdr instead of struct rdt_mon_domain
Date: Thu, 18 Sep 2025 14:54:54 -0700	[thread overview]
Message-ID: <69ec7c09-2c4b-4289-9f98-143a9618a7cc@intel.com> (raw)
In-Reply-To: <20250912221053.11349-6-tony.luck@intel.com>

Hi Tony,

On 9/12/25 3:10 PM, Tony Luck wrote:
> All monitoring events are associated with the L3 resource and it made

(copy&pasted intro)

> sense to use the L3 specific "struct rdt_mon_domain *" arguments to
> functions manipulating domains.
> 
> Telemetry events will be tied to another resource. This requires the
> functions that manipulate domains to not be L3 specific.
> 
> Change the calling sequence to use the generic struct rdt_domain_hdr for
> domain addition and deletion to preserve as much common code as possible.
> This will simplify enabling of enumeration of domains for events in
> other resources.
> 
> Update kerneldoc for mon_data::sum to note that it is only used for
> RDT_RESOURCE_L3.

Above seems unrelated to the change described in rest of changelog.

> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> Note: To maintain bisectability this patch adds:
> 
> 	d = container_of(hdr, struct rdt_mon_domain, hdr);
> 
> to mon_event_read() so that the chnages to the event reading code
> can be cleanly deferred to the following patch. This doesn't need
> a domain_header_is_valid() because RDT_RESOURCE_L3 is still the only
> resource available at this point in the series. This call to
> container_of() is removed in the next patch with the completion
> of the conversion to using struct rdt_domain_hdr.

Why not just add domain_header_is_valid() to make this clear and remove it
in the next patch? That will take less effort than to justify not doing so. Not to
mention the confusion introduced by this patch.

I believe this is a consequence of this patch mixing three logical changes. As I mentioned
in https://lore.kernel.org/lkml/40461810-95e8-4efa-bff4-540ef01051f4@intel.com/ the
work is very hard to review when changes to support package scope resources are sneaked in
under the guise of L3 refactoring.

The changelog already hints at the confusion, it starts by stating that there is a
requirement: "This requires the functions that manipulate domains to not be L3 specific."
Then the changelog presents the fix as "Change the calling sequence to use the generic
struct rdt_domain_hdr ..."

Just providing "struct rdt_domain_hdr" will not suddenly make the code not be L3 specific,
right? Yet, the patch self makes many more changes than just "passing struct rdt_domain_hdr"
but instead also starts adding support for package scoped resources by pulling out
the L3 specific code not relevant to telemetry events. 

Please let each patch stick to one logical change. Only do the switch to
struct rdt_domain_hdr in this patch. At this point all monitoring code is L3 specific
and it is not necessary to exctract L3 resource code. The switch to
struct rdt_domain_hdr instead makes doing so as a next step easier since it
makes clear what code is L3 resource specific. The code flow changes in support of package
events should also be separate ... but I already mentioned this in response to v9 :(

> ---
>  include/linux/resctrl.h            |   4 +-
>  fs/resctrl/internal.h              |   6 +-
>  arch/x86/kernel/cpu/resctrl/core.c |   4 +-
>  fs/resctrl/ctrlmondata.c           |  11 ++--
>  fs/resctrl/rdtgroup.c              | 102 ++++++++++++++++++++---------
>  5 files changed, 84 insertions(+), 43 deletions(-)
> 
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index dfc91c5e8483..0b55809af5d7 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -504,9 +504,9 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
>  u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
>  			    u32 closid, enum resctrl_conf_type type);
>  int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d);
> -int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d);
> +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain_hdr *hdr);
>  void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d);
> -void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d);
> +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain_hdr *hdr);
>  void resctrl_online_cpu(unsigned int cpu);
>  void resctrl_offline_cpu(unsigned int cpu);
>  
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 10b46cd9e394..6c78b1a4ca3a 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -82,8 +82,8 @@ extern struct mon_evt mon_event_all[QOS_NUM_EVENTS];
>   * @list:            Member of the global @mon_data_kn_priv_list list.
>   * @rid:             Resource id associated with the event file.
>   * @evtid:           Event id associated with the event file.
> - * @sum:             Set when event must be summed across multiple
> - *                   domains.
> + * @sum:             Set for RDT_RESOURCE_L3 when event must be summed
> + *                   across multiple domains.

After reading changelog this change is unexpected.

>   * @domid:           When @sum is zero this is the domain to which
>   *                   the event file belongs. When @sum is one this
>   *                   is the id of the L3 cache that all domains to be
> @@ -362,7 +362,7 @@ void mon_event_count(void *info);
>  int rdtgroup_mondata_show(struct seq_file *m, void *arg);
>  
>  void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> -		    struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
> +		    struct rdt_domain_hdr *hdr, struct rdtgroup *rdtgrp,
>  		    cpumask_t *cpumask, int evtid, int first);
>  
>  int resctrl_mon_resource_init(void);
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index c6ce72cba543..ffc154189abd 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -533,7 +533,7 @@ static void l3_mon_domain_setup(int cpu, int id, struct rdt_resource *r, struct
>  
>  	list_add_tail_rcu(&d->hdr.list, add_pos);
>  
> -	err = resctrl_online_mon_domain(r, d);
> +	err = resctrl_online_mon_domain(r, &d->hdr);
>  	if (err) {
>  		list_del_rcu(&d->hdr.list);
>  		synchronize_rcu();
> @@ -658,7 +658,7 @@ static void domain_remove_cpu_mon(int cpu, struct rdt_resource *r)
>  
>  		d = container_of(hdr, struct rdt_mon_domain, hdr);
>  		hw_dom = resctrl_to_arch_mon_dom(d);
> -		resctrl_offline_mon_domain(r, d);
> +		resctrl_offline_mon_domain(r, hdr);
>  		list_del_rcu(&hdr->list);
>  		synchronize_rcu();
>  		mon_domain_free(hw_dom);
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index f248eaf50d3c..3bbfb5398e6f 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -547,11 +547,14 @@ struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h, int id,
>  }
>  
>  void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> -		    struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
> +		    struct rdt_domain_hdr *hdr, struct rdtgroup *rdtgrp,
>  		    cpumask_t *cpumask, int evtid, int first)
>  {
> +	struct rdt_mon_domain *d;
>  	int cpu;
>  
> +	d = container_of(hdr, struct rdt_mon_domain, hdr);
> +
>  	/* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */
>  	lockdep_assert_cpus_held();
>  
> @@ -598,7 +601,6 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  	enum resctrl_event_id evtid;
>  	struct rdt_domain_hdr *hdr;
>  	struct rmid_read rr = {0};
> -	struct rdt_mon_domain *d;
>  	struct rdtgroup *rdtgrp;
>  	int domid, cpu, ret = 0;
>  	struct rdt_resource *r;
> @@ -623,6 +625,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  	r = resctrl_arch_get_resource(resid);
>  
>  	if (md->sum) {
> +		struct rdt_mon_domain *d;
> +
>  		/*
>  		 * This file requires summing across all domains that share
>  		 * the L3 cache id that was provided in the "domid" field of the
> @@ -653,8 +657,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  			ret = -ENOENT;
>  			goto out;
>  		}
> -		d = container_of(hdr, struct rdt_mon_domain, hdr);

Ignored yet again. :(

> -		mon_event_read(&rr, r, d, rdtgrp, &d->hdr.cpu_mask, evtid, false);
> +		mon_event_read(&rr, r, hdr, rdtgrp, &hdr->cpu_mask, evtid, false);
>  	}
>  
>  checkresult:
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index ce4e716e6404..8f45763ff515 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -3029,7 +3029,8 @@ static void rmdir_all_sub(void)
>   * @rid:    The resource id for the event file being created.
>   * @domid:  The domain id for the event file being created.
>   * @mevt:   The type of event file being created.
> - * @do_sum: Whether SNC summing monitors are being created.
> + * @do_sum: Whether SNC summing monitors are being created. Only set
> + *          when @rid == RDT_RESOURCE_L3.
>   */
>  static struct mon_data *mon_get_kn_priv(enum resctrl_res_level rid, int domid,
>  					struct mon_evt *mevt,
> @@ -3167,17 +3168,27 @@ static void mon_rmdir_one_subdir(struct kernfs_node *pkn, char *name, char *subn
>   * when last domain being summed is removed.
>   */
>  static void rmdir_mondata_subdir_allrdtgrp(struct rdt_resource *r,
> -					   struct rdt_mon_domain *d)
> +					   struct rdt_domain_hdr *hdr)
>  {
>  	struct rdtgroup *prgrp, *crgrp;
> +	int domid = hdr->id;
>  	char subname[32];
> -	bool snc_mode;
>  	char name[32];
>  
> -	snc_mode = r->mon_scope == RESCTRL_L3_NODE;
> -	sprintf(name, "mon_%s_%02d", r->name, snc_mode ? d->ci_id : d->hdr.id);
> -	if (snc_mode)
> -		sprintf(subname, "mon_sub_%s_%02d", r->name, d->hdr.id);
> +	if (r->rid == RDT_RESOURCE_L3) {

This is starting to add package scope enabling ... much more than just providing
struct rdt_domain_hdr as parameter. Providing struct rdt_domain_hdr 
(especially when combined with the struct rdt_mon_domain name change) helps to
highlight which code is L3 specific. With that done a separate change can build
on it to separate L3 specific code from generic code in preparation for package
scope resources.

If you want to keep code like this then the changelog needs to do a better job
of explaining what this patch does.

> +		struct rdt_mon_domain *d;
> +
> +		if (!domain_header_is_valid(hdr, RESCTRL_MON_DOMAIN, RDT_RESOURCE_L3))
> +			return;
> +		d = container_of(hdr, struct rdt_mon_domain, hdr);
> +
> +		/* SNC mode? */
> +		if (r->mon_scope == RESCTRL_L3_NODE) {
> +			domid = d->ci_id;
> +			sprintf(subname, "mon_sub_%s_%02d", r->name, hdr->id);
> +		}
> +	}
> +	sprintf(name, "mon_%s_%02d", r->name, domid);
>  
>  	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
>  		mon_rmdir_one_subdir(prgrp->mon.mon_data_kn, name, subname);
> @@ -3187,19 +3198,18 @@ static void rmdir_mondata_subdir_allrdtgrp(struct rdt_resource *r,
>  	}
>  }
>  
> -static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
> +static int mon_add_all_files(struct kernfs_node *kn, struct rdt_domain_hdr *hdr,
>  			     struct rdt_resource *r, struct rdtgroup *prgrp,
> -			     bool do_sum)
> +			     int domid, bool do_sum)
>  {
>  	struct rmid_read rr = {0};
>  	struct mon_data *priv;
>  	struct mon_evt *mevt;
> -	int ret, domid;
> +	int ret;
>  
>  	for_each_mon_event(mevt) {
>  		if (mevt->rid != r->rid || !mevt->enabled)
>  			continue;
> -		domid = do_sum ? d->ci_id : d->hdr.id;
>  		priv = mon_get_kn_priv(r->rid, domid, mevt, do_sum);
>  		if (WARN_ON_ONCE(!priv))
>  			return -EINVAL;
> @@ -3208,26 +3218,38 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
>  		if (ret)
>  			return ret;
>  
> -		if (!do_sum && resctrl_is_mbm_event(mevt->evtid))
> -			mon_event_read(&rr, r, d, prgrp, &d->hdr.cpu_mask, mevt->evtid, true);
> +		if (r->rid == RDT_RESOURCE_L3 && !do_sum && resctrl_is_mbm_event(mevt->evtid))
> +			mon_event_read(&rr, r, hdr, prgrp, &hdr->cpu_mask, mevt->evtid, true);
>  	}
>  
>  	return 0;
>  }
>  
>  static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
> -				struct rdt_mon_domain *d,
> +				struct rdt_domain_hdr *hdr,
>  				struct rdt_resource *r, struct rdtgroup *prgrp)
>  {
>  	struct kernfs_node *kn, *ckn;
> +	bool snc_mode = false;
> +	int domid = hdr->id;
>  	char name[32];
> -	bool snc_mode;
>  	int ret = 0;
>  
>  	lockdep_assert_held(&rdtgroup_mutex);
>  
> -	snc_mode = r->mon_scope == RESCTRL_L3_NODE;
> -	sprintf(name, "mon_%s_%02d", r->name, snc_mode ? d->ci_id : d->hdr.id);
> +	if (r->rid == RDT_RESOURCE_L3) {

Same here.

> +		snc_mode = r->mon_scope == RESCTRL_L3_NODE;
> +		if (snc_mode) {
> +			struct rdt_mon_domain *d;
> +
> +			if (!domain_header_is_valid(hdr, RESCTRL_MON_DOMAIN, RDT_RESOURCE_L3))
> +				return -EINVAL;
> +			d = container_of(hdr, struct rdt_mon_domain, hdr);
> +			domid = d->ci_id;
> +		}
> +	}
> +	sprintf(name, "mon_%s_%02d", r->name, domid);
> +
>  	kn = kernfs_find_and_get(parent_kn, name);
>  	if (kn) {
>  		/*

...

  
> @@ -4290,12 +4318,20 @@ int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d
>  	return err;
>  }
>  
> -int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d)
> +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain_hdr *hdr)
>  {
> -	int err;
> +	struct rdt_mon_domain *d;
> +	int err = -EINVAL;
>  
>  	mutex_lock(&rdtgroup_mutex);
>  
> +	if (r->rid != RDT_RESOURCE_L3)
> +		goto mkdir;

Ignored again. This is not working. So far (and it is just at patch #5 :() the rebase error
and changelogs not reflecting patch changes let me think that this was rushed worked. On top of
this there is half addressed feedback to this patch. Again without any discussion. Also, just in
this patch two places where feedback was just simply ignored. You submit this and expect a
thorough review? This is not reasonable. I stop this review here.

Reinette


  reply	other threads:[~2025-09-18 21:55 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-12 22:10 [PATCH v10 00/28] x86,fs/resctrl telemetry monitoring Tony Luck
2025-09-12 22:10 ` [PATCH v10 01/28] x86,fs/resctrl: Improve domain type checking Tony Luck
2025-09-12 22:10 ` [PATCH v10 02/28] x86/resctrl: Move L3 initialization into new helper function Tony Luck
2025-09-18 21:49   ` Reinette Chatre
2025-09-19 18:09     ` Luck, Tony
2025-09-22 23:29       ` Reinette Chatre
2025-09-23 10:09       ` Borislav Petkov
2025-09-12 22:10 ` [PATCH v10 03/28] x86,fs/resctrl: Refactor domain_remove_cpu_mon() ready for new domain types Tony Luck
2025-09-18 21:49   ` Reinette Chatre
2025-09-12 22:10 ` [PATCH v10 04/28] x86/resctrl: Clean up domain_remove_cpu_ctrl() Tony Luck
2025-09-12 22:10 ` [PATCH v10 05/28] x86,fs/resctrl: Use struct rdt_domain_hdr instead of struct rdt_mon_domain Tony Luck
2025-09-18 21:54   ` Reinette Chatre [this message]
2025-09-12 22:10 ` [PATCH v10 06/28] x86,fs/resctrl: Use struct rdt_domain_hdr when reading counters Tony Luck
2025-09-12 22:10 ` [PATCH v10 07/28] x86,fs/resctrl: Rename struct rdt_mon_domain and rdt_hw_mon_domain Tony Luck
2025-09-12 22:10 ` [PATCH v10 08/28] x86,fs/resctrl: Rename some L3 specific functions Tony Luck
2025-09-12 22:10 ` [PATCH v10 09/28] fs/resctrl: Make event details accessible to functions when reading events Tony Luck
2025-09-12 22:10 ` [PATCH v10 10/28] x86,fs/resctrl: Handle events that can be read from any CPU Tony Luck
2025-09-12 22:10 ` [PATCH v10 11/28] x86,fs/resctrl: Support binary fixed point event counters Tony Luck
2025-09-12 22:10 ` [PATCH v10 12/28] x86,fs/resctrl: Add an architectural hook called for each mount Tony Luck
2025-09-12 22:10 ` [PATCH v10 13/28] x86,fs/resctrl: Add and initialize rdt_resource for package scope monitor Tony Luck
2025-09-12 22:10 ` [PATCH v10 14/28] x86/resctrl: Discover hardware telemetry events Tony Luck
2025-09-12 22:10 ` [PATCH v10 15/28] x86,fs/resctrl: Fill in details of events for guid 0x26696143 and 0x26557651 Tony Luck
2025-09-12 22:10 ` [PATCH v10 16/28] x86,fs/resctrl: Add architectural event pointer Tony Luck
2025-09-12 22:10 ` [PATCH v10 17/28] x86/resctrl: Find and enable usable telemetry events Tony Luck
2025-09-12 22:10 ` [PATCH v10 18/28] x86/resctrl: Read " Tony Luck
2025-09-12 22:10 ` [PATCH v10 19/28] x86/resctrl: Handle domain creation/deletion for RDT_RESOURCE_PERF_PKG Tony Luck
2025-09-12 22:10 ` [PATCH v10 20/28] x86/resctrl: Add energy/perf choices to rdt boot option Tony Luck
2025-09-12 22:10 ` [PATCH v10 21/28] x86/resctrl: Handle number of RMIDs supported by telemetry resources Tony Luck
2025-09-12 22:10 ` [PATCH v10 22/28] fs/resctrl: Move allocation/free of closid_num_dirty_rmid[] Tony Luck
2025-09-12 22:10 ` [PATCH v10 23/28] fs,x86/resctrl: Compute number of RMIDs as minimum across resources Tony Luck
2025-09-12 22:10 ` [PATCH v10 24/28] fs/resctrl: Move RMID initialization to first mount Tony Luck
2025-09-12 22:10 ` [PATCH v10 25/28] x86/resctrl: Enable RDT_RESOURCE_PERF_PKG Tony Luck
2025-09-12 22:10 ` [PATCH v10 26/28] fs/resctrl: Provide interface to create architecture specific debugfs area Tony Luck
2025-09-12 22:10 ` [PATCH v10 27/28] x86/resctrl: Add debugfs files to show telemetry aggregator status Tony Luck
2025-09-12 22:10 ` [PATCH v10 28/28] x86,fs/resctrl: Update Documentation for package events Tony Luck
2025-09-15 15:21   ` Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=69ec7c09-2c4b-4289-9f98-143a9618a7cc@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox