Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount

public inbox for patches@lists.linux.dev
 help / color / mirror / Atom feed

From: Reinette Chatre <reinette.chatre@intel.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Fenghua Yu <fenghuay@nvidia.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	"Drew Fustini" <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>,
	David E Box <david.e.box@intel.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>, <patches@lists.linux.dev>
Subject: Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
Date: Wed, 22 Apr 2026 14:28:18 -0700	[thread overview]
Message-ID: <1633a669-ac1f-4ef4-b733-e12bb1c6a5eb@intel.com> (raw)
In-Reply-To: <aefdJG4y8JS_aTse@agluck-desk3>

Hi Tony,

On 4/21/26 1:25 PM, Luck, Tony wrote:
> On Fri, Apr 10, 2026 at 02:21:35PM -0700, Reinette Chatre wrote:

>> Here is how I understand the relationship:
>>
>> - arch enumerates a new feature at any time and calls resctrl_capable_mon_event()
>>   to mark any event discovered during enumeration as "capable".
> 
> This is possible.
> 
>> - arch is responsible for domain creation and is trusted to do so *after*
>>   marking any events as "capable", any needed per-domain state is created for
>>   the "capable" events.
> 
> This creates a new problem. Domains for monitor resources are created
> for resources where rdt_resource::mon_capable is true. But this is

Right, and both (specifically, setting of rdt_resource::mon_capable and
domain creation) are done by architecture.

> visible by file system code. Maybe the same "capable" -> "enabled"

Right, this is required to be visible to file system because the filesystem
needs access to the domains as well as rdt_resource::mon_capable.

> trick can be used to separate architecture creation of domains from
> file system use?

Architecture creation of domains is already separated from file system use of domains
via cpus_read_lock(). This is already required for synchronization between architecture
and resctrl fs and used by AET implementation that creates domains with
cpus_read_lock() held. The domain list is an RCU list and can thus also be accessed
from RCU read-side critical section but no such usage exists for x86 at this time.
All accesses of domain list from file system is done with cpus_read_lock() held.

arch thus already controls when the domains it creates are visible to resctrl fs.

>> - when resctrl fs is mounted, before creating any files, resctrl fs automatically
>>   sets all "capable" events to "enabled". resctrl fs will create needed files only for
>>   "enabled" events. 
> 
> This has a new race. Instead of a binary problem of AET either being
> fully enabled or completely missing. User may see some subset of events
> if architecture was in the middle of looping over events marking them
> capable when file system is mounted.

AET currently marks events as enabled without any locks held. It does so before
setting rdt_resource::mon_capable though and will only set rdt_resource::mon_capable
with cpus_read_lock() held, which rdt_get_tree requires to do anything.

One scenario to consider is if it takes a few attempts to enumerate the event groups
which means that rdt_resource::mon_capable may already be set when a new event is
enabled/marked as capable. This may be a problem, but would moving the
call to mark event as capable to be with all the other changes to state used
by resctrl fs address this? Specifically, by moving the call to mark event as
capable to be with the snippet that sets rdt_resource::mon_capable and creates
the domains (again).

With this the behavior of "fully enabled" or "completely missing" can be maintained?
Although, there seems to be some other state of "only some event group(s)" enabled
as opposed to "fully enabled", but I believe there will be no partial enabling of
an event group.

> 
>> - if an arch discovers new events after resctrl is mounted then it can still
>>   enumerate the events and mark them as "capable" - resctrl fs will pick that up
>>   on remount.
>>
>> - arch can only disable an event as part of the unmount handler, this will clear
>>   "capable" as well as "enabled". This can be enforced with a check in the
>>   callback where only rdtgroup_mutex should be needed to access resctrl_mounted.
> 
> Seems OK. But to make sure that events are accessible, architecture will
> now have to "hold" the pmt_telemetry module regardless of whether
> resctrl file system is mounted.

Could you please elaborate why this is required? if I understand correctly the
"hold" on the pmt_telemetry module will be done by itself between the
intel_pmt_get_regions_by_feature() and intel_pmt_put_feature_group() calls.

> 
> Since architecture doesn't have a reliable indication of when the file
> system is mounted (or if it ever will be mounted) the hold prevents
> pmt_telemtry from ever being unloaded.

Why does architecture need an indication of when the file system is mounted?

> 
>>>
>>> This also runs into problems if INTEL_PMT_TELEMETRY is unloaded between
>>> telling the file system that some set of events are "capable" and the
>>> file system asking to enable them.
>>
>> This is existing problem and sounds like you are solving this with the module_get()
>> and module_put(). 
> 
> My solution only works if architecture is called in initial part of file
> system mount, and trailing part of file system unmount. As detailed
> before these calls must be without rdtgroup_mutex held because of the
> need to hold cpus_read_lock() and domain_list_lock around domain
> creation and removal.

Understood.

... 
> "Double/split locking" is used at various places in the Linux kernel for data
> structures that need to be accessed read-only in separate contexts. Both
> locks must be held when updating the data structure. AI (Gemini) listed
> these high profile instances:
> 
> struct task_struct: protected by task->pi_lock and the rq->lock
> 
> VFS / dcache (Rename Operations): must hold i_rwsem for source and
> target of the rename
> 
> Netfilter / Connection Tracking: Modifying connection state requires
> a specific bucket lock and the status-change lock.

AI did not need to look outside resctrl where domain list update is
protected by both cpus_read_lock() and domain_list_lock but reads just by
cpus_read_lock() or RCU. I do not see why adding such additional complexity
is needed here.

Reinette

next prev parent reply	other threads:[~2026-04-22 21:28 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
2026-04-04  0:00   ` Reinette Chatre
2026-04-06 18:07     ` David Box
2026-04-08  5:07   ` Christoph Hellwig
2026-04-08 17:01     ` Luck, Tony
2026-04-09  5:41       ` Christoph Hellwig
2026-03-30 21:43 ` [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs Tony Luck
2026-04-04  0:01   ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event Tony Luck
2026-04-04  0:03   ` Reinette Chatre
2026-04-06 18:35     ` Luck, Tony
2026-04-06 21:13       ` Reinette Chatre
2026-04-07 18:40         ` Luck, Tony
2026-04-07 23:10           ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount Tony Luck
2026-04-04  0:52   ` Reinette Chatre
2026-04-06 20:35     ` Luck, Tony
2026-04-06 21:16       ` Reinette Chatre
2026-04-09 20:35         ` Luck, Tony
2026-04-10 15:16           ` Reinette Chatre
2026-04-10 18:59             ` Luck, Tony
2026-04-10 21:21               ` Reinette Chatre
2026-04-10 23:03                 ` Luck, Tony
2026-04-21 20:25                 ` Luck, Tony
2026-04-22 21:28                   ` Reinette Chatre [this message]
2026-04-22 21:59                     ` Luck, Tony
2026-04-22 22:10                       ` Reinette Chatre
2026-04-22 22:44                         ` Luck, Tony
2026-04-22 23:17                           ` Reinette Chatre
2026-04-23 22:29                         ` Luck, Tony
2026-04-23 23:54                           ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime Tony Luck
2026-04-04  0:56   ` Reinette Chatre
2026-04-07 18:13     ` Luck, Tony
2026-04-07 18:40       ` Reinette Chatre
2026-04-07 20:33         ` Luck, Tony
2026-03-30 21:43 ` [PATCH v4 6/7] x86/resctrl: Delete intel_aet_exit() Tony Luck
2026-03-30 21:43 ` [PATCH v4 7/7] x86/resctrl: Downgrade dependency of AET on INTEL_PMT Tony Luck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1633a669-ac1f-4ef4-b733-e12bb1c6a5eb@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=david.e.box@intel.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox