From: Dave Martin <Dave.Martin@arm.com>
To: Ben Horgan <ben.horgan@arm.com>
Cc: James Morse <james.morse@arm.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
Rob Herring <robh@kernel.org>,
Rohit Mathew <rohit.mathew@arm.com>,
Shanker Donthineni <sdonthineni@nvidia.com>,
Zeng Heng <zengheng4@huawei.com>,
Lecopzer Chen <lecopzerc@nvidia.com>,
Carl Worth <carl@os.amperecomputing.com>,
shameerali.kolothum.thodi@huawei.com,
D Scott Phillips OS <scott@os.amperecomputing.com>,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
tan.shaopeng@fujitsu.com, baolin.wang@linux.alibaba.com,
Jamie Iles <quic_jiles@quicinc.com>,
Xin Hao <xhao@linux.alibaba.com>,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand <david@redhat.com>,
Rex Nie <rex.nie@jaguarmicro.com>, Koba Ko <kobak@nvidia.com>
Subject: Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
Date: Mon, 28 Jul 2025 16:34:27 +0100 [thread overview]
Message-ID: <aIeYgxJf9EASA5Zs@e133380.arm.com> (raw)
In-Reply-To: <7ab40c09-3922-4e0c-85dd-00ff05be4ce6@arm.com>
Hi,
On Mon, Jul 28, 2025 at 12:59:12PM +0100, Ben Horgan wrote:
> Hi James,
>
> On 7/11/25 19:36, James Morse wrote:
> > When CPUs come online the original configuration should be restored.
> > Once the maximum partid is known, allocate an configuration array for
> > each component, and reprogram each RIS configuration from this.
> >
> > The MPAM spec describes how multiple controls can interact. To prevent
> > this happening by accident, always reset controls that don't have a
> > valid configuration. This allows the same helper to be used for
> > configuration and reset.
> >
> > CC: Dave Martin <Dave.Martin@arm.com>
> > Signed-off-by: James Morse <james.morse@arm.com>
> > ---
> > drivers/platform/arm64/mpam/mpam_devices.c | 236 ++++++++++++++++++--
> > drivers/platform/arm64/mpam/mpam_internal.h | 26 ++-
> > 2 files changed, 234 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> > index bb3695eb84e9..f3ecfda265d2 100644
> > --- a/drivers/platform/arm64/mpam/mpam_devices.c
> > +++ b/drivers/platform/arm64/mpam/mpam_devices.c
[...]
> > @@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
[...]
> > +static void mpam_reprogram_msc(struct mpam_msc *msc)
> > +{
> > + int idx;
> > + u16 partid;
> > + bool reset;
> > + struct mpam_config *cfg;
> > + struct mpam_msc_ris *ris;
> > +
> > + idx = srcu_read_lock(&mpam_srcu);
> > + list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
> > + if (!mpam_is_enabled() && !ris->in_reset_state) {
> > + mpam_touch_msc(msc, &mpam_reset_ris, ris);
> > + ris->in_reset_state = true;
> > + continue;
> > + }
> > +
> > + reset = true;
> > + for (partid = 0; partid <= mpam_partid_max; partid++) {
> Do we need to consider 'partid_max_lock' here?
Just throwing in my 2¢, since I'd dug into this a bit previously:
Here, we are resetting an MSC or re-onlining a CPU. Either way, I
think that this only happens after the initial probing phase is
complete.
mpam_enable_once() is ordered with respect to the task that did the
final unlock of partid_max_lock during probing, by means of the
schedule_work() call. (See <linux/workqueue.h>.)
Taking the hotplug lock and installing mpam_cpu_online() for CPU
hotplug probably brings a sufficient guarantee also (though I've not
dug into it).
This function doesn't seem to be called during the probing phase (via
mpam_discovery_cpu_online()), so there shouldn't be any racing updates
to the global variables here.
> > + cfg = &ris->vmsc->comp->cfg[partid];
> > + if (cfg->features)
> > + reset = false;
> > +
> > + mpam_reprogram_ris_partid(ris, partid, cfg);
> > + }
> > + ris->in_reset_state = reset;
> > + }
> > + srcu_read_unlock(&mpam_srcu, idx);
> > +}
[...]
> > @@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)
[...]
> > +static int __allocate_component_cfg(struct mpam_component *comp)
> > +{
> > + if (comp->cfg)
> > + return 0;
> > +
> > + comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg), GFP_KERNEL);
> And here?
Similarly, this runs only in the mpam_enable_once() call.
[...]
> > @@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct mpam_component *comp)
> > might_sleep();
> > lockdep_assert_cpus_held();
> > + memset(comp->cfg, 0, (mpam_partid_max * sizeof(*comp->cfg)));
> And here?
Similarly to mpam_reset_msc(), I think this probably only runs from
mpam_enable_once() or mpam_cpu_online().
I think most or all of the existing reads of the affected globals from
within mpam_resctrl.c are also callbacks from resctrl_init(), which
again exceutes during mpam_enable_once() (though I won't promise I
haven't missed one or two).
Once resctrl has fired up, I believe that the MPAM driver basically
trusts the IDs coming in from resctrl, and doesn't need to range-check
them against the global parameters again.
[...]
> Thanks,
>
> Ben
I consciously haven't done all the homework on this.
Although it may look like the globals are read all over the place after
probing, I think this actually only happens during resctrl initialision
(which is basically single-threaded).
The only place where they are read after probing and without mediation
via resctrl is on the CPU hotplug path.
Adding locking would ensure that an unstable value is never read, but
this is not sufficient by itself to sure that the _final_ value of a
variable is read (for some definition of "final"). And, if there is a
well-defined notion of final value and there is sufficient
synchronisation to ensure that this is the value read by a particular
read, then by construction an unstable value cannot be read.
I think that this kind of pattern is not that uncommon in the kernel,
though it is a bit painful to reason about.
Cheers
---Dave
next prev parent reply other threads:[~2025-07-28 16:06 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
2025-07-11 18:36 ` [RFC PATCH 01/36] cacheinfo: Set cache 'id' based on DT data James Morse
2025-07-11 18:36 ` [RFC PATCH 02/36] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
2025-07-11 18:36 ` [RFC PATCH 03/36] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
2025-07-11 18:36 ` [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node James Morse
2025-07-14 11:40 ` Ben Horgan
2025-07-25 17:08 ` James Morse
2025-07-28 8:37 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
2025-07-25 17:06 ` James Morse
2025-07-22 14:28 ` Jonathan Cameron
2025-07-25 17:05 ` James Morse
2025-07-23 14:42 ` Ben Horgan
2025-07-25 17:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels James Morse
2025-07-16 15:51 ` Jonathan Cameron
2025-07-25 17:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
2025-07-14 11:42 ` Ben Horgan
2025-08-05 17:06 ` James Morse
2025-07-16 16:21 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-idUIRE Jonathan Cameron
2025-08-05 17:06 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
2025-07-11 18:36 ` [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id James Morse
2025-07-16 16:24 ` Jonathan Cameron
2025-08-05 17:06 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM James Morse
2025-07-16 16:26 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table James Morse
2025-07-16 17:07 ` Jonathan Cameron
2025-07-23 16:39 ` Ben Horgan
2025-08-05 17:07 ` James Morse
2025-08-15 9:33 ` Ben Horgan
2025-07-28 10:08 ` Jonathan Cameron
2025-08-05 17:08 ` James Morse
2025-08-05 17:07 ` James Morse
2025-07-24 10:50 ` Ben Horgan
2025-08-05 17:08 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding James Morse
2025-07-11 21:43 ` Rob Herring
2025-08-05 17:08 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory James Morse
2025-07-21 16:32 ` Jonathan Cameron
2025-08-06 18:03 ` James Morse
2025-07-24 10:56 ` Ben Horgan
2025-08-06 18:03 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
2025-07-24 11:02 ` Ben Horgan
2025-08-06 18:03 ` James Morse
2025-07-24 12:09 ` Catalin Marinas
2025-08-06 18:04 ` James Morse
2025-08-07 17:50 ` Drew Fustini
2025-07-11 18:36 ` [RFC PATCH 14/36] arm_mpam: Add support for memory controller MSC on DT platforms James Morse
2025-07-11 18:36 ` [RFC PATCH 15/36] arm_mpam: Add the class and component structures for ris firmware described James Morse
2025-07-11 18:36 ` [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions James Morse
2025-07-17 1:04 ` Shaopeng Tan (Fujitsu)
2025-08-06 18:04 ` James Morse
2025-07-24 14:02 ` Ben Horgan
2025-08-06 18:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
2025-07-24 14:13 ` Ben Horgan
2025-08-06 18:07 ` James Morse
2025-07-29 6:11 ` Baisheng Gao
2025-08-06 18:07 ` James Morse
2025-08-05 8:46 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 18/36] arm_mpam: Probe MSCs to find the supported partid/pmg values James Morse
2025-07-11 18:36 ` [RFC PATCH 19/36] arm_mpam: Add helpers for managing the locking around the mon_sel registers James Morse
2025-07-11 18:36 ` [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports James Morse
2025-07-24 15:08 ` Ben Horgan
2025-07-28 16:16 ` Jonathan Cameron
2025-08-07 18:26 ` James Morse
2025-07-28 8:56 ` Ben Horgan
2025-08-08 7:20 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class James Morse
2025-07-28 9:15 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks James Morse
2025-07-28 9:49 ` Ben Horgan
2025-08-08 7:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 23/36] arm_mpam: Add a helper to touch an MSC from any CPU James Morse
2025-07-11 18:36 ` [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time James Morse
2025-07-28 10:22 ` Ben Horgan
2025-08-08 7:07 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:08 ` James Morse
2025-07-17 1:08 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:07 ` James Morse
2025-07-22 15:06 ` Jonathan Cameron
2025-08-08 7:11 ` James Morse
2025-07-28 10:49 ` Ben Horgan
2025-08-08 7:11 ` James Morse
2025-08-04 16:53 ` Fenghua Yu
2025-08-08 7:12 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 26/36] arm_mpam: Use a static key to indicate when mpam is enabled James Morse
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:13 ` James Morse
2025-07-28 11:59 ` Ben Horgan
2025-07-28 15:34 ` Dave Martin [this message]
2025-08-08 7:16 ` James Morse
2025-08-08 7:14 ` James Morse
2025-08-04 16:39 ` Fenghua Yu
2025-08-08 7:17 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 28/36] arm_mpam: Probe and reset the rest of the features James Morse
2025-07-11 18:36 ` [RFC PATCH 29/36] arm_mpam: Add helpers to allocate monitors James Morse
2025-07-11 18:36 ` [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value James Morse
2025-07-28 13:02 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 31/36] arm_mpam: Track bandwidth counter state for overflow and power management James Morse
2025-07-11 18:36 ` [RFC PATCH 32/36] arm_mpam: Probe for long/lwd mbwu counters James Morse
2025-07-11 18:36 ` [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported James Morse
2025-07-28 13:46 ` Ben Horgan
2025-08-08 7:19 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 34/36] arm_mpam: Add helper to reset saved mbwu state James Morse
2025-07-11 18:36 ` [RFC PATCH 35/36] arm_mpam: Add kunit test for bitmap reset James Morse
2025-07-11 18:36 ` [RFC PATCH 36/36] arm_mpam: Add kunit tests for props_mismatch() James Morse
2025-08-01 16:09 ` [RFC PATCH 00/36] arm_mpam: Add basic mpam driver Jonathan Cameron
2025-08-08 7:23 ` James Morse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aIeYgxJf9EASA5Zs@e133380.arm.com \
--to=dave.martin@arm.com \
--cc=amitsinght@marvell.com \
--cc=baolin.wang@linux.alibaba.com \
--cc=ben.horgan@arm.com \
--cc=bobo.shaobowang@huawei.com \
--cc=carl@os.amperecomputing.com \
--cc=david@redhat.com \
--cc=dfustini@baylibre.com \
--cc=james.morse@arm.com \
--cc=kobak@nvidia.com \
--cc=lcherian@marvell.com \
--cc=lecopzerc@nvidia.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peternewman@google.com \
--cc=quic_jiles@quicinc.com \
--cc=rex.nie@jaguarmicro.com \
--cc=robh@kernel.org \
--cc=rohit.mathew@arm.com \
--cc=scott@os.amperecomputing.com \
--cc=sdonthineni@nvidia.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=tan.shaopeng@fujitsu.com \
--cc=xhao@linux.alibaba.com \
--cc=zengheng4@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).