From: James Morse <james.morse@arm.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
shameerali.kolothum.thodi@huawei.com,
D Scott Phillips OS <scott@os.amperecomputing.com>,
carl@os.amperecomputing.com, lcherian@marvell.com,
bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
baolin.wang@linux.alibaba.com,
Jamie Iles <quic_jiles@quicinc.com>,
Xin Hao <xhao@linux.alibaba.com>,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand <david@redhat.com>,
Rex Nie <rex.nie@jaguarmicro.com>,
Dave Martin <dave.martin@arm.com>, Koba Ko <kobak@nvidia.com>,
Shanker Donthineni <sdonthineni@nvidia.com>,
fenghuay@nvidia.com, Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>,
Tony Luck <tony.luck@intel.com>
Subject: Re: [PATCH v8 04/21] x86/resctrl: resctrl_exit() teardown resctrl but leave the mount point
Date: Thu, 24 Apr 2025 10:15:17 +0100
Message-ID: <dfb42daf-74ac-461f-bc56-f1d9ec805e9d@arm.com>
In-Reply-To: <81a54d21-40af-438e-8139-322597a7506e@intel.com>
Hi Reinette,
On 16/04/2025 01:25, Reinette Chatre wrote:
> On 4/11/25 9:42 AM, James Morse wrote:
>> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
>> resctrl can't be built as a module, and the kernfs helpers are not exported
>> so this is unlikely to change. MPAM has an error interrupt which indicates
>> the MPAM driver has gone haywire. Should this occur tasks could run with
>> the wrong control values, leading to bad performance for important tasks.
>> In this scenario the MPAM driver will reset the hardware, but it needs
>> a way to tell resctrl that no further configuration should be attempted.
>>
>> In particular, moving tasks between control or monitor groups does not
>> interact with the architecture code, so there is no opportunity for the
>> arch code to indicate that the hardware is no-longer functioning.
>>
>> Using resctrl_exit() for this leaves the system in a funny state as
>> resctrl is still mounted, but cannot be un-mounted because the sysfs
>> directory that is typically used has been removed. Dave Martin suggests
>> this may cause systemd trouble in the future as not all filesystems
>> can be unmounted.
>>
>> Add calls to remove all the files and directories in resctrl, and
>> remove the sysfs_remove_mount_point() call that leaves the system
>> in a funny state. When triggered, this causes all the resctrl files
>> to disappear. resctrl can be unmounted, but not mounted again.
> The caveat here is that resctrl pretends to be mounted (resctrl_mounted == true)
> but there is nothing there. The undocumented part of this is that for this
> to work resctrl fs depends (a lot) on the architecture's callbacks to know
> if they are being called after a resctrl_exit() call so that they return data
> that will direct resctrl fs behavior to safest exit for those
> resctrl fs flows that are still possible after a resctrl_exit(). Not ideal
> layering.
It was the arch code that called resctrl_exit() - there is no other path into it.
I don't think it's a problem for the arch code to also know to return an error.
I haven't found any case where the particular error returned actually matters - so
there is no 'direction', only errors.
I agree the documentation can be improved.
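To illustrate what "know to return an error" could look like, here is a minimal
userspace sketch. Everything in it is hypothetical (the flag, the callback name and
the choice of -EIO are assumptions for illustration, not the MPAM driver's code): the
arch side latches a flag before tearing resctrl down, and every later callback fails
without touching hardware.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: latched by the arch driver when its error interrupt fires. */
static bool arch_resctrl_dead;

/* Counts how many times the (modelled) hardware was programmed. */
static unsigned int hw_writes;

/*
 * Hypothetical stand-in for an architecture callback that would normally
 * program a control value. After a fatal error it only reports failure.
 */
static int arch_update_one(unsigned int closid, unsigned int value)
{
	(void)closid;
	(void)value;

	if (arch_resctrl_dead)
		return -EIO;	/* which error is returned is not significant */

	hw_writes++;		/* model of the hardware write */
	return 0;
}

/* Hypothetical fatal-error path: latch the flag, then tear resctrl down. */
static void arch_fatal_error(void)
{
	arch_resctrl_dead = true;
	/* The real driver would reset the hardware and call resctrl_exit() here. */
}
```

Because the flag is only ever set by the same code that calls resctrl_exit(), the
filesystem code does not need to interpret the error value.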
> I understand from a previous comment [1] that one of the Arm "tricks" is to
> offline all domains. This seems to be a good "catch all" to ensure that at least
> current flows of concern are not running anymore.
Yup, that is necessary to stop the limbo and overflow workers from trying to read the
counters - which is a waste of time.
> Considering this,
> what if there is a new resctrl_error_exit() that does something like below?
>
> void resctrl_error_exit(void)
> {
> 	mutex_lock(&rdtgroup_mutex);
> 	WARN_ON_ONCE(resctrl_new_function_returns_true_if_any_resource_has_a_control_or_monitor_domain());
> 	resctrl_fs_teardown();
> 	mutex_unlock(&rdtgroup_mutex);
> 	resctrl_exit();
> }
Makes sense - the alternative would be to dig around to cancel the limbo/overflow
work, and a subsequent CPU-online might start them again.
> I do not see this as requiring anything new from architecture but instead
> making what Arm already does a requirement and keeping existing behavior?
I agree.
> This leaves proc_resctrl_show() that relies on resctrl_mounted but as I see
> the resctrl_fs_cleanup() will remove all resource groups that should result
> in the output being as it will be if resctrl is not mounted. No dependence
> on architecture callbacks returning resctrl_exit() aware data here.
Great - I'd missed that one.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index fdf2616c7ca0..3f9c37637d7e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -4416,11 +4429,26 @@ int __init resctrl_init(void)
>> return ret;
>> }
>>
>> +/**
>> + * resctrl_exit() - Remove the resctrl filesystem and free resources.
>> + *
>> + * Called by the architecture code in response to a fatal error.
>> + * Resctrl files and structures are removed from kernfs to prevent further
>> + * configuration.
>
> Please write with imperative tone. For example, "Remove resctrl files and structures ..."
>
>> + */
>> void __exit resctrl_exit(void)
>> {
>> + mutex_lock(&rdtgroup_mutex);
>> + resctrl_fs_teardown();
>> + mutex_unlock(&rdtgroup_mutex);
>> +
>> debugfs_remove_recursive(debugfs_resctrl);
>
> Is it possible for the fatal error handling to trigger multiple calls here?
> To protect against multiple calls causing issues debugfs_resctrl can be set to NULL here.
It's not, the driver keeps track of whether resctrl_init() had been called, and only calls
resctrl_exit() once. But I agree it would be better to make it robust to this.
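A minimal userspace model of that hardening, assuming (as I believe is the case) that
the debugfs removal helper ignores a NULL dentry. The model_* names, the struct and
the counter are stand-ins invented for this sketch, not kernel code:

```c
#include <stddef.h>

/* Stand-in for the kernel's struct dentry. */
struct dentry { int unused; };

static struct dentry fake_root;
static struct dentry *debugfs_resctrl = &fake_root;

/* Counts how many times the directory was really torn down. */
static int removals;

/* Model of debugfs_remove_recursive(): a NULL dentry is silently ignored. */
static void model_debugfs_remove_recursive(struct dentry *d)
{
	if (!d)
		return;
	removals++;
}

/* Model of the debugfs part of resctrl_exit(), made safe to call twice. */
static void model_resctrl_exit(void)
{
	model_debugfs_remove_recursive(debugfs_resctrl);
	/* Clear the pointer so a repeated call becomes a no-op. */
	debugfs_resctrl = NULL;
}
```

With the pointer cleared, a second call falls through the NULL check instead of
removing a stale dentry.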
>> unregister_filesystem(&rdt_fs_type);
>
> unregister_filesystem() seems to handle an already-unregistered filesystem.
>
>> - sysfs_remove_mount_point(fs_kobj, "resctrl");
>> +
>> + /*
>> + * The sysfs mount point added by resctrl_init() is not removed so that
>> + * it can be used to umount resctrl.
>> + */
>
> (needs imperative)
>
>>
>> resctrl_mon_resource_exit();
>> }
Thanks,
James