From: "Luck, Tony" <tony.luck@intel.com>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: Fenghua Yu <fenghuay@nvidia.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	"Drew Fustini" <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>,
	<x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<patches@lists.linux.dev>
Subject: Re: [PATCH v14 07/32] x86,fs/resctrl: Use struct rdt_domain_hdr when reading counters
Date: Tue, 2 Dec 2025 12:33:20 -0800
Message-ID: <aS9NEJAnx0MWoyaT@agluck-desk3>
In-Reply-To: <895cee86-ac6e-43e7-aece-e283200384ef@intel.com>

On Tue, Dec 02, 2025 at 08:06:47AM -0800, Reinette Chatre wrote:
> Hi Tony,
> > +static int __l3_mon_event_count_sum(struct rdtgroup *rdtgrp, struct rmid_read *rr)
> > +{
> > +	int cpu = smp_processor_id();
> > +	u32 closid = rdtgrp->closid;
> > +	u32 rmid = rdtgrp->mon.rmid;
> > +	struct rdt_mon_domain *d;
> > +	int cntr_id = -ENOENT;
> > +	u64 tval = 0;
> > +	int err, ret;
> >  
> >  	/* Summing domains that share a cache, must be on a CPU for that cache. */
> >  	if (!cpumask_test_cpu(cpu, &rr->ci->shared_cpu_map))
> > @@ -480,7 +494,7 @@ static int __l3_mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
> >  			err = resctrl_arch_cntr_read(rr->r, d, closid, rmid, cntr_id,
> >  						     rr->evtid, &tval);
> 
> This is not safe. The current __mon_event_count() implementation being refactored by this series
> ensures that if rr->is_mbm_cntr is true then cntr_id is valid. This patch places the code doing so
> in __l3_mon_event_count() without an equivalent in the new __l3_mon_event_count_sum(). From what I
> can tell, since __l3_mon_event_count_sum() sets cntr_id to -ENOENT and never initializes it correctly, 
> resctrl_arch_cntr_read() will be called with an invalid cntr_id that it is not able to handle.
> 
> There is no overlap in support for SNC and assignable counters. Do you expect that this is something that
> should be supported? Even if it is, SNC is model specific so it may be reasonable to expect that when/if
> a system supporting both features arrives it would need enabling anyway. I thus propose for simplicity
> that the handling of assignable counters by __l3_mon_event_count_sum() be dropped, albeit with a loud
> complaint if it is ever called with rr->is_mbm_cntr set.
> 

Reinette,

Agreed. I see little likelihood that SNC and assignable counters will
meet on a system.

How does this look for the "loud complaint"?


static int __l3_mon_event_count_sum(struct rdtgroup *rdtgrp, struct rmid_read *rr)
{
	int cpu = smp_processor_id();
	u32 closid = rdtgrp->closid;
	u32 rmid = rdtgrp->mon.rmid;
	struct rdt_mon_domain *d;
	u64 tval = 0;
	int err, ret;

	/*
	 * Summing across domains is only done for systems that implement
	 * Sub-NUMA Cluster. There is no overlap with systems that support
	 * assignable counters.
	 */
	if (rr->is_mbm_cntr) {
		pr_warn_once("Assignable counter on SNC system!\n");
		rr->err = -EINVAL;
		return -EINVAL;
	}

	/* Summing domains that share a cache, must be on a CPU for that cache. */
	if (!cpumask_test_cpu(cpu, &rr->ci->shared_cpu_map))
		return -EINVAL;

	/*
	 * Legacy files must report the sum of an event across all
	 * domains that share the same L3 cache instance.
	 * Report success if a read from any domain succeeds, -EINVAL
	 * (translated to "Unavailable" for user space) if reads from
	 * all domains fail for any reason.
	 */
	ret = -EINVAL;
	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
		if (d->ci_id != rr->ci->id)
			continue;
		err = resctrl_arch_rmid_read(rr->r, &d->hdr, closid, rmid,
					     rr->evtid, &tval, rr->arch_mon_ctx);
		if (!err) {
			rr->val += tval;
			ret = 0;
		}
	}

	if (ret)
		rr->err = ret;

	return ret;
}

-Tony
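
For readers following along outside the kernel tree, the summing rule in
the proposed function above can be modelled in user space. This is only a
rough sketch under stated assumptions: struct demo_domain, struct demo_read,
demo_count_sum() and demo_warn_count are hypothetical stand-ins, not resctrl
types or APIs, and the simulated read_err field takes the place of the
return value of resctrl_arch_rmid_read(). The CPU affinity check is left
out because it has no user-space equivalent.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one monitoring domain (not the kernel struct). */
struct demo_domain {
	unsigned int ci_id;	/* id of the L3 cache instance the domain shares */
	int read_err;		/* 0 if the simulated counter read succeeds */
	uint64_t count;		/* value the simulated read returns */
};

/* Hypothetical stand-in for the struct rmid_read fields used here. */
struct demo_read {
	unsigned int ci_id;	/* cache instance whose domains are summed */
	bool is_mbm_cntr;	/* set for assignable counter reads */
	uint64_t val;		/* accumulated sum */
	int err;		/* error reported back to the caller */
};

/* Number of warnings printed so far; models the "once" in pr_warn_once(). */
static int demo_warn_count;

static int demo_count_sum(const struct demo_domain *doms, size_t n,
			  struct demo_read *rr)
{
	int ret = -EINVAL;
	size_t i;

	assert(rr != NULL);

	/* Assignable counters are not expected here: complain once, fail always. */
	if (rr->is_mbm_cntr) {
		if (!demo_warn_count) {
			fprintf(stderr, "Assignable counter on SNC system!\n");
			demo_warn_count++;
		}
		rr->err = -EINVAL;
		return -EINVAL;
	}

	/* Success if any matching domain reads cleanly; failed reads are skipped. */
	for (i = 0; i < n; i++) {
		if (doms[i].ci_id != rr->ci_id)
			continue;
		if (!doms[i].read_err) {
			rr->val += doms[i].count;
			ret = 0;
		}
	}

	if (ret)
		rr->err = ret;

	return ret;
}
```

With this model, one failed read among several matching domains still gives
a successful result holding the sum of the reads that worked. The call fails
with -EINVAL only when no matching domain can be read, or when is_mbm_cntr
is set.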

