Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount

public inbox for patches@lists.linux.dev
 help / color / mirror / Atom feed

From: Reinette Chatre <reinette.chatre@intel.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Fenghua Yu <fenghuay@nvidia.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	Peter Newman <peternewman@google.com>,
	James Morse <james.morse@arm.com>,
	Babu Moger <babu.moger@amd.com>,
	"Drew Fustini" <dfustini@baylibre.com>,
	Dave Martin <Dave.Martin@arm.com>, Chen Yu <yu.c.chen@intel.com>,
	David E Box <david.e.box@intel.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>, <patches@lists.linux.dev>
Subject: Re: [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount
Date: Fri, 10 Apr 2026 08:16:50 -0700	[thread overview]
Message-ID: <c948c358-1dcb-4896-9c0d-239e99d89e72@intel.com> (raw)
In-Reply-To: <adgNfJSeHVyBXumt@agluck-desk3>

Hi Tony,

On 4/9/26 1:35 PM, Luck, Tony wrote:
> On Mon, Apr 06, 2026 at 02:16:46PM -0700, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 4/6/26 1:35 PM, Luck, Tony wrote:
>>> On Fri, Apr 03, 2026 at 05:52:30PM -0700, Reinette Chatre wrote:
>>>> On 3/30/26 2:43 PM, Tony Luck wrote:

>>>>> @@ -2900,6 +2893,30 @@ static int rdt_get_tree(struct fs_context *fc)
>>>>>  	return ret;
>>>>>  }
>>>>>  
>>>>> +static int rdt_get_tree_wrapper(struct fs_context *fc)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	mutex_lock(&resctrl_mount_lock);
>>>>> +
>>>>> +	/*
>>>>> +	 * resctrl file system can only be mounted once.
>>>>> +	 */
>>>>> +	if (resctrl_mounted) {
>>>>> +		mutex_unlock(&resctrl_mount_lock);
>>>>> +		return -EBUSY;
>>>>> +	}
>>>>> +
>>>>
>>>> This does not look right. Here too is resctrl_mounted accessed without rdtgroup_mutex
>>>> held. This change implies that resctrl_mounted is now protected by resctrl_mount_lock
>>>> but resctrl is not changed to respect this throughout resulting in unsafe access of
>>>> resctrl_mounted.
>>>>
>>>> Does this new resctrl_mount_lock need to be in resctrl fs? It really seems as though the
>>>> needed synchronization belongs in the architecture. Could this instead be accomplished
>>>> with a private mutex within the AET code?
>>>
>>> If you dig in lore for the v3 of this patch, you'll see I had the mutex in the
>>> AET code. But there were some complications.
>>>
>>> 1) Need to acquire in intel_aet_pre_mount() and release in intel_aet_mount_result()
>>> which is legal, but makes code more complex when call chains need to be compared to
>>> check that the mutex is being released correctly.
>>
>> Why was it needed to hold mutex for so long? I cannot find explanation here or in changelog
>> of v3. I did not remember correctly and considered the AET code to be doing the domain
>> addition. Even so, I do think a mutex internal to the arch code can be used to manage
>> the synchronization. Could you please elaborate why this cannot be done?
> 
> I tried to move the locks into architecture code. But main problem is still
> handling when a user tries to mount an already mounted resctrl file system
> and gets -EBUSY.
> 
> In that case file system calls resctrl_arch_pre_mount() with the file system
> mounted. You suggested that the AET code could detect and ignore a repeat
> enumeration by noting that the event_group "(*peg)->pfg" is non-NULL, set by

It would be great if resctrl only needs to enumerate once but I do not see how that
is possible since there is no clear indication from PMT driver whether (all) its 
data is ready or not.

> the original enumeration. But that fails in this scenario:
> 
> 	# rmmod pmt_telemetry
> 	# mount -t resctrl resctrl /sys/fs/resctrl
> 		... succeeds, but without AET present
> 	# modprobe pmt_telemetry
> 	# mount -t resctrl resctrl /sys/fs/resctrl
> 		... enumeration success, but now calls resctrl_enable_mon_event()
> 		... with the file system mounted

Thank you for catching this.

> 
> I think the bast solution for this is to change definition of resctrl_arch_pre_mount()
> from "called on every mount attempt" to "called only when resctrl is NOT mounted".
> This is because architecture code cannot tell whether the file system is mounted.

Does this mean the addition of the extra locking in resctrl fs? I am not comfortable with
that asymmetrical locking.

I think a problem here is that resctrl_enable_mon_event() gives the architecture code the
capability of manipulating internal resctrl fs state directly. This is a consequence
of the original design before the arch/fs split. I see the additional resctrl fs locking
as an attempt to make it easier for the arch to manipulate fs state but I do not think we
should go in that direction, instead I think it is better to improve the arch/fs boundaries
here.

To that end, what if resctrl uses its familiar "capable" vs "enable" separation here?
That is, the architecture informs resctrl fs whether it is *capable* of a feature and
resctrl fs, based on interactions with user space, determines whether the feature is
*enabled*?

For a simpler addition resctrl could obtain a new helper, for example,
resctrl_capable_mon_event() that only marks an event as "capable". resctrl_enable_mon_event()
could remain as the "early" call that marks events as capable
and enabled but may be deprecated. Both of these *have* to be called before any domains
are created since per-domain state would now depend on whether the system is "capable"
of an event or not. resctrl fs can assume that all "capable" events needs to be enabled
and it can mark them so at the beginning of each mount, resctrl will only expose
"enabled" events.

There are likely some simplifications on top of current implementation to not make
this too invasive.

What do you think?

>>> 2) The "only mounted once" case meant extra state (AET_PRESENT, which you note
>>> in next patch may be redundant) because intel_aet_pre_mount() is called, but
>>> needs to do nothing.
>>
>> Right, I do not see need for extra state. In fact, since it is not clear to me that
>> PMT enumeration will be complete when intel_pmt_get_regions_by_feature() is called it
>> seemed worthwhile to only rely on event_group::pfg - if PMT enumeration was not complete
>> during mount N it may be complete on mount N+1? This creates a poor user interface
>> though since user would need an alternate way to know if AET is supported and then
>> a "remount until it works" approach.
> 
> The race remains, and is lost if resctrl is auto-mounted at boot from /etc/fstab.
> 
> The user can tell if AET is supported with:
> 
> 	$ grep ^ /sys/class/intel_pmt/*/guid
> 
> and checking if any of the RMID based event guids are present on the system.
> 
> Delta T for the race is small enough that delaying the mount to some other
> startup script should be sufficient. Users are likely to have such a script
> to create the CTRL_MON directories and configure schemata for their workload.
> So annoying, but easily solved.

I do not have a clear understanding of how the new implementation with registration
function will look to understand the races involved.

>>> Adding resctrl_mount_lock to the file system code made things simpler. The
>>
>> Adding complications to resctrl fs to make things simpler for x86?
> 
> I believe it is necessary, since architecture cannot tell if the file system
> is mounted.
> 
>>> pre-mount code can't be called with rdtgroup_mutex held because it needs to
>>> build the domains. That needs cpus_read_lock() + mutex_lock(&domain_list_lock);
>>
>> ack. Can an arch-specific mutex be used instead?
> 
> See above.
> 
>>> I need to add more comments on locking. resctrl_mounted is only modified when both
>>> resctrl_mount_lock AND rdtgroup_mutex are held. I believe that makes it safe to
>>> read the value of resctrl_mounted with just rdtgroup_mutex held.
>>
>> ...but not to read it with only resctrl_mount_lock held as in snippet above.
> 
> Holding either of resctrl_mount_lock or rdtgroup_mutex makes it safe to
> read the value of resctrl_mounted as it can only be modified when both
> mutexes are held.

I am concerned about this approach due to it not being symmetrical and how resctrl fs
now adds additional locking to accommodate drivers.

Reinette

next prev parent reply	other threads:[~2026-04-10 15:16 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-30 21:43 [PATCH v4 0/7] Allow AET to use PMT/TPMI as loadable modules Tony Luck
2026-03-30 21:43 ` [PATCH v4 1/7] platform/x86/intel/pmt: Export PMT enumeration functions as GPL Tony Luck
2026-04-04  0:00   ` Reinette Chatre
2026-04-06 18:07     ` David Box
2026-04-08  5:07   ` Christoph Hellwig
2026-04-08 17:01     ` Luck, Tony
2026-04-09  5:41       ` Christoph Hellwig
2026-03-30 21:43 ` [PATCH v4 2/7] x86/resctrl: Drop setting of event_group::force_off when insufficient RMIDs Tony Luck
2026-04-04  0:01   ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 3/7] fs/resctrl: Add interface to disable a monitor event Tony Luck
2026-04-04  0:03   ` Reinette Chatre
2026-04-06 18:35     ` Luck, Tony
2026-04-06 21:13       ` Reinette Chatre
2026-04-07 18:40         ` Luck, Tony
2026-04-07 23:10           ` Reinette Chatre
2026-03-30 21:43 ` [PATCH v4 4/7] fs,x86/resctrl: Add architecture hooks for every mount/unmount Tony Luck
2026-04-04  0:52   ` Reinette Chatre
2026-04-06 20:35     ` Luck, Tony
2026-04-06 21:16       ` Reinette Chatre
2026-04-09 20:35         ` Luck, Tony
2026-04-10 15:16           ` Reinette Chatre [this message]
2026-04-10 18:59             ` Luck, Tony
2026-04-10 21:21               ` Reinette Chatre
2026-04-10 23:03                 ` Luck, Tony
2026-03-30 21:43 ` [PATCH v4 5/7] x86/resctrl: Resolve PMT and TPMI symbols at runtime Tony Luck
2026-04-04  0:56   ` Reinette Chatre
2026-04-07 18:13     ` Luck, Tony
2026-04-07 18:40       ` Reinette Chatre
2026-04-07 20:33         ` Luck, Tony
2026-03-30 21:43 ` [PATCH v4 6/7] x86/resctrl: Delete intel_aet_exit() Tony Luck
2026-03-30 21:43 ` [PATCH v4 7/7] x86/resctrl: Downgrade dependency of AET on INTEL_PMT Tony Luck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c948c358-1dcb-4896-9c0d-239e99d89e72@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=david.e.box@intel.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=james.morse@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=peternewman@google.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox