From: Reinette Chatre <reinette.chatre@intel.com>
To: James Morse <james.morse@arm.com>, <x86@kernel.org>,
<linux-kernel@vger.kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
<shameerali.kolothum.thodi@huawei.com>,
D Scott Phillips OS <scott@os.amperecomputing.com>,
<carl@os.amperecomputing.com>, <lcherian@marvell.com>,
<bobo.shaobowang@huawei.com>, <tan.shaopeng@fujitsu.com>,
<baolin.wang@linux.alibaba.com>,
Jamie Iles <quic_jiles@quicinc.com>,
Xin Hao <xhao@linux.alibaba.com>, <peternewman@google.com>,
<dfustini@baylibre.com>, <amitsinght@marvell.com>
Subject: Re: [PATCH v7 02/24] x86/resctrl: kfree() rmid_ptrs from rdtgroup_exit()
Date: Thu, 14 Dec 2023 11:06:04 -0800 [thread overview]
Message-ID: <c53872e2-1d2e-44b3-80f1-e39fb0a7330d@intel.com> (raw)
In-Reply-To: <a5ef6b40-a9b3-5338-a12a-6a4540cda861@arm.com>
Hi James,
On 12/14/2023 10:28 AM, James Morse wrote:
> Hi Reinette,
>
> On 13/12/2023 23:27, Reinette Chatre wrote:
>> Hi James,
>>
>> On 12/13/2023 10:03 AM, James Morse wrote:
>>> On 09/11/2023 17:39, Reinette Chatre wrote:
>>>> On 10/25/2023 11:03 AM, James Morse wrote:
>>
>> ...
>>
>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>>>> index 19e0681f0435..0056c9962a44 100644
>>>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>>>> @@ -992,7 +992,13 @@ late_initcall(resctrl_late_init);
>>>>>
>>>>> static void __exit resctrl_exit(void)
>>>>> {
>>>>> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>>>> +
>>>>> cpuhp_remove_state(rdt_online);
>>>>> +
>>>>> + if (r->mon_capable)
>>>>> + rdt_put_mon_l3_config(r);
>>>>> +
>>>>> rdtgroup_exit();
>>>>> }
>>>>
>>>> I expect cleanup to do the inverse of init. I do not know what was the
>>>> motivation for the rdtgroup_exit() to follow cpuhp_remove_state()
>>>
>>> This will invoke the hotplug callbacks, making it look to resctrl like all CPUs are
>>> offline. This means it is then impossible for rdtgroup_exit() to race with the hotplug
>>> notifiers. (if you could run this code...)
>
>> hmmm ... if there is a risk of such a race would the init code not also be
>> vulnerable to that with the notifiers up before rdtgroup_init()?
>
> Nope, because this array is allocated behind rdt_get_mon_l3_config(), which ultimately
> comes from get_rdt_resources() in resctrl_late_init() - which calls cpuhp_setup_state()
> after all this init work has been done.
>
> (cpu hp always gives me a headache1)
Right. My comment was actually and specifically about rdtgroup_init() and attempting to
understand your view of its races with the hotplug notifiers in response to your comment about
its (the hotplug notifiers) races with rdtgroup_exit().
The current order of state initialization you mention and hotplug notifiers needing the
state is sane and implies to expect an inverse order of teardown.
>> The races you mention
>> are not obvious to me. I see the filesystem and hotplug code protected against races via
>> the mutex and static keys. Could you please elaborate on the flows of concern?
>
> Functions like __check_limbo() (calling __rmid_entry()) are called under the
> rdtgroup_mutex, but they don't consider that rmid_ptrs[] may be NULL.
>
> But this could only happen if the limbo work ran after cpuhp_remove_state() - this can't
> happen because the hotplug callbacks cancel the limbo work, and won't reschedule it if the
> domain is going offline.
>
>
> The only other path is via free_rmid(), I've not thought too much about this as
> resctrl_exit() can't actually be invoked - this code is discarded by the linker.
>
> It could be run on MPAM, but only in response to an 'error interrupt' (which is optional)
> - and all the MPAM error interrupts indicate a software bug.
This still just considers the resctrl state and hotplug notifiers.
I clearly am missing something. It is still not clear to me how this connects to your earlier
comment about races with the rdtgroup_exit() code ... how the hotplug notifiers races with the
filesystem register/unregister code.
>
> I've only invoked this path once, and rdtgroup_exit()s unregister_filesystem() didn't
> remove all the files. I anticipate digging into this teardown code more once the bulk of
> the MPAM driver is upstream.
>
>
>> I am not advocating for cpuhp_remove_state() to be called later. I understand that
>> it simplifies the flows to consider.
>>
>>>> but I
>>>> was expecting this new cleanup to be done after rdtgroup_exit() to be inverse
>>>> of init. This cleanup is inserted in middle of two existing cleanup - could
>>>> you please elaborate how this location was chosen?
>>>
>>> rdtgroup_exit() does nothing with the resctrl structures, it removes sysfs and debugfs
>>> entries, and unregisters the filesystem.
>>>
>>> Hypothetically, you can't observe any effect of the rmid_ptrs array being freed as
>>> all the CPUs are offline and the overflow/limbo threads should have been cancelled.
>>> Once cpuhp_remove_state() has been called, this really doesn't matter.
>
>> Sounds like nothing prevents this code from following the custom of cleanup to be
>> inverse of init (yet keep cpuhp_remove_state() first).
>
> I'll put the the rdt_put_mon_l3_config() call after rdtgroup_exit()...
thank you
Reinette
next prev parent reply other threads:[~2023-12-14 19:06 UTC|newest]
Thread overview: 103+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-25 18:03 [PATCH v7 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking James Morse
2023-10-25 18:03 ` [PATCH v7 01/24] tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef James Morse
2023-10-25 18:03 ` [PATCH v7 02/24] x86/resctrl: kfree() rmid_ptrs from rdtgroup_exit() James Morse
2023-11-09 17:39 ` Reinette Chatre
2023-12-13 18:03 ` James Morse
2023-12-13 23:27 ` Reinette Chatre
2023-12-14 18:28 ` James Morse
2023-12-14 19:06 ` Reinette Chatre [this message]
2023-12-15 17:40 ` James Morse
2023-11-09 20:28 ` Moger, Babu
2023-12-13 18:03 ` James Morse
2023-10-25 18:03 ` [PATCH v7 03/24] x86/resctrl: Create helper for RMID allocation and mondata dir creation James Morse
2023-11-09 17:40 ` Reinette Chatre
2023-11-09 20:28 ` Moger, Babu
2023-12-13 18:03 ` James Morse
2023-10-25 18:03 ` [PATCH v7 04/24] x86/resctrl: Move rmid allocation out of mkdir_rdt_prepare() James Morse
2023-11-09 20:29 ` Moger, Babu
2023-12-13 18:03 ` James Morse
2023-10-25 18:03 ` [PATCH v7 05/24] x86/resctrl: Track the closid with the rmid James Morse
2023-11-09 17:41 ` Reinette Chatre
2023-12-13 18:03 ` James Morse
2023-11-09 20:31 ` Moger, Babu
2023-12-13 18:04 ` James Morse
2023-10-25 18:03 ` [PATCH v7 06/24] x86/resctrl: Access per-rmid structures by index James Morse
2023-10-31 7:43 ` [EXT] " Amit Singh Tomar
2023-12-11 14:33 ` James Morse
2024-01-21 10:27 ` Amit Singh Tomar
2024-01-22 18:07 ` James Morse
2023-11-09 17:42 ` Reinette Chatre
2023-12-13 18:04 ` James Morse
2023-11-09 20:32 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 07/24] x86/resctrl: Allow RMID allocation to be scoped by CLOSID James Morse
2023-11-09 17:42 ` Reinette Chatre
2023-11-09 20:32 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 08/24] x86/resctrl: Track the number of dirty RMID a CLOSID has James Morse
2023-11-09 17:43 ` Reinette Chatre
2023-12-13 18:04 ` James Morse
2023-11-09 20:38 ` Moger, Babu
2023-12-13 18:04 ` James Morse
2023-10-25 18:03 ` [PATCH v7 09/24] x86/resctrl: Use __set_bit()/__clear_bit() instead of open coding James Morse
2023-11-09 17:44 ` Reinette Chatre
2023-12-13 18:05 ` James Morse
2023-11-09 20:38 ` Moger, Babu
2023-12-13 18:05 ` James Morse
2023-10-25 18:03 ` [PATCH v7 10/24] x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid James Morse
2023-11-09 17:46 ` Reinette Chatre
2023-12-14 11:36 ` James Morse
2023-11-09 20:39 ` Moger, Babu
2023-12-14 11:37 ` James Morse
2023-10-25 18:03 ` [PATCH v7 11/24] x86/resctrl: Move CLOSID/RMID matching and setting to use helpers James Morse
2023-11-09 20:39 ` Moger, Babu
2023-12-14 11:37 ` James Morse
2023-11-09 20:39 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 12/24] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow James Morse
2023-11-09 17:46 ` Reinette Chatre
2023-11-09 20:40 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 13/24] x86/resctrl: Queue mon_event_read() instead of sending an IPI James Morse
2023-11-09 17:46 ` Reinette Chatre
2023-11-09 20:40 ` Moger, Babu
2023-12-14 11:37 ` James Morse
2023-10-25 18:03 ` [PATCH v7 14/24] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep James Morse
2023-11-09 17:47 ` Reinette Chatre
2023-12-14 11:37 ` James Morse
2023-12-14 18:52 ` Reinette Chatre
2023-11-09 20:42 ` Moger, Babu
2023-12-14 11:37 ` James Morse
2023-10-25 18:03 ` [PATCH v7 15/24] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read() James Morse
2023-11-09 20:47 ` Moger, Babu
2023-12-14 11:38 ` James Morse
2023-10-25 18:03 ` [PATCH v7 16/24] x86/resctrl: Make resctrl_mounted checks explicit James Morse
2023-11-09 20:47 ` Moger, Babu
2023-12-14 11:38 ` James Morse
2023-10-25 18:03 ` [PATCH v7 17/24] x86/resctrl: Move alloc/mon static keys into helpers James Morse
2023-11-09 20:48 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 18/24] x86/resctrl: Make rdt_enable_key the arch's decision to switch James Morse
2023-11-09 20:48 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 19/24] x86/resctrl: Add helpers for system wide mon/alloc capable James Morse
2023-11-09 20:51 ` Moger, Babu
2023-12-14 11:38 ` James Morse
2023-10-25 18:03 ` [PATCH v7 20/24] x86/resctrl: Add CPU online callback for resctrl work James Morse
2023-11-09 20:51 ` Moger, Babu
2023-12-14 11:38 ` James Morse
2023-10-25 18:03 ` [PATCH v7 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu James Morse
2023-11-09 17:48 ` Reinette Chatre
2023-12-14 11:38 ` James Morse
2023-12-14 18:53 ` Reinette Chatre
2023-12-15 17:41 ` James Morse
2023-11-09 20:51 ` Moger, Babu
2023-12-14 11:38 ` James Morse
2023-10-25 18:03 ` [PATCH v7 22/24] x86/resctrl: Add CPU offline callback for resctrl work James Morse
2023-11-09 20:52 ` Moger, Babu
2023-12-14 11:39 ` James Morse
2023-10-25 18:03 ` [PATCH v7 23/24] x86/resctrl: Move domain helper migration into resctrl_offline_cpu() James Morse
2023-11-09 20:52 ` Moger, Babu
2023-10-25 18:03 ` [PATCH v7 24/24] x86/resctrl: Separate arch and fs resctrl locks James Morse
2023-11-09 17:48 ` Reinette Chatre
2023-12-14 11:39 ` James Morse
2023-11-09 20:52 ` Moger, Babu
2023-12-14 11:39 ` James Morse
2023-11-09 14:05 ` [PATCH v7 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking Moger, Babu
2023-12-14 11:39 ` James Morse
2023-11-13 1:54 ` Shaopeng Tan (Fujitsu)
2023-12-14 18:28 ` James Morse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c53872e2-1d2e-44b3-80f1-e39fb0a7330d@intel.com \
--to=reinette.chatre@intel.com \
--cc=Babu.Moger@amd.com \
--cc=amitsinght@marvell.com \
--cc=baolin.wang@linux.alibaba.com \
--cc=bobo.shaobowang@huawei.com \
--cc=bp@alien8.de \
--cc=carl@os.amperecomputing.com \
--cc=dfustini@baylibre.com \
--cc=fenghua.yu@intel.com \
--cc=hpa@zytor.com \
--cc=james.morse@arm.com \
--cc=lcherian@marvell.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peternewman@google.com \
--cc=quic_jiles@quicinc.com \
--cc=scott@os.amperecomputing.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=tan.shaopeng@fujitsu.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=xhao@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox