linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Ben Horgan <ben.horgan@arm.com>
To: Fenghua Yu <fenghuay@nvidia.com>,
	James Morse <james.morse@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Cc: D Scott Phillips OS <scott@os.amperecomputing.com>,
	carl@os.amperecomputing.com, lcherian@marvell.com,
	bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
	baolin.wang@linux.alibaba.com,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>,
	peternewman@google.com, dfustini@baylibre.com,
	amitsinght@marvell.com, David Hildenbrand <david@kernel.org>,
	Dave Martin <dave.martin@arm.com>, Koba Ko <kobak@nvidia.com>,
	Shanker Donthineni <sdonthineni@nvidia.com>,
	baisheng.gao@unisoc.com,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	Gavin Shan <gshan@redhat.com>,
	rohit.mathew@arm.com, reinette.chatre@intel.com,
	Punit Agrawal <punit.agrawal@oss.qualcomm.com>
Subject: Re: [RFC PATCH 31/38] arm_mpam: resctrl: Update the rmid reallocation limit
Date: Tue, 9 Dec 2025 16:36:13 +0000	[thread overview]
Message-ID: <dc5355c6-e7a6-466e-9183-9753281961a8@arm.com> (raw)
In-Reply-To: <1cc804d5-9d14-4d71-9da1-a660a2569e4e@nvidia.com>

Hi Fenghua,

On 12/6/25 00:06, Fenghua Yu wrote:
> Hi, James,
> 
> On 12/5/25 13:58, James Morse wrote:
>> resctrl's limbo code needs to be told when the data left in a cache is
>> small enough for the partid+pmg value to be re-allocated.
>>
>> x86 uses the cache size divided by the number of rmid users the cache
>> may have. Do the same, but for the smallest cache, and with the
>> number of partid-and-pmg users.
>>
>> Querying the cache size can't happen until after cacheinfo_sysfs_init()
>> has run, so mpam_resctrl_setup() must wait until then.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> ---
>>   drivers/resctrl/mpam_resctrl.c | 54 ++++++++++++++++++++++++++++++++++
>>   1 file changed, 54 insertions(+)
>>
>> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/
>> mpam_resctrl.c
>> index 506063bd3348..ccdf8db742c9 100644
>> --- a/drivers/resctrl/mpam_resctrl.c
>> +++ b/drivers/resctrl/mpam_resctrl.c
>> @@ -16,6 +16,7 @@
>>   #include <linux/resctrl.h>
>>   #include <linux/slab.h>
>>   #include <linux/types.h>
>> +#include <linux/wait.h>
>>     #include <asm/mpam.h>
>>   @@ -58,6 +59,13 @@ static bool cdp_enabled;
>>    */
>>   static bool resctrl_enabled;
>>   +/*
>> + * mpam_resctrl_pick_caches() needs to know the size of the caches.
>> cacheinfo
>> + * populates this from a device_initcall(). mpam_resctrl_setup() must
>> wait.
>> + */
>> +static bool cacheinfo_ready;
>> +static DECLARE_WAIT_QUEUE_HEAD(wait_cacheinfo_ready);
>> +
>>   /*
>>    * L3 local/total may come from different classes - what is the
>> number of MBWU
>>    * 'on L3'?
>> @@ -584,6 +592,38 @@ void resctrl_arch_reset_cntr(struct rdt_resource
>> *r, struct rdt_mon_domain *d,
>>       reset_mon_cdp_safe(mon, mon_comp, USE_PRE_ALLOCATED, closid, rmid);
>>   }
>>   +/*
>> + * The rmid realloc threshold should be for the smallest cache
>> exposed to
>> + * resctrl.
>> + */
>> +static int update_rmid_limits(struct mpam_class *class)
>> +{
>> +    u32 num_unique_pmg = resctrl_arch_system_num_rmid_idx();
>> +    struct mpam_props *cprops = &class->props;
>> +    struct cacheinfo *ci;
>> +
>> +    lockdep_assert_cpus_held();
>> +
>> +    /* Assume cache levels are the same size for all CPUs... */
>> +    ci = get_cpu_cacheinfo_level(smp_processor_id(), class->level);
>> +    if (!ci || ci->size == 0) {
>> +        pr_debug("Could not read cache size for class %u\n",
>> +             class->level);
>> +        return -EINVAL;
>> +    }
>> +
>> +    if (!mpam_has_feature(mpam_feat_msmon_csu, cprops))
>> +        return 0;
>> +
>> +    if (!resctrl_rmid_realloc_limit ||
>> +        ci->size < resctrl_rmid_realloc_limit) {
>> +        resctrl_rmid_realloc_limit = ci->size;
>> +        resctrl_rmid_realloc_threshold = ci->size / num_unique_pmg;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static bool cache_has_usable_cpor(struct mpam_class *class)
>>   {
>>       struct mpam_props *cprops = &class->props;
>> @@ -1025,6 +1065,9 @@ static void mpam_resctrl_pick_counters(void)
>>               /* CSU counters only make sense on a cache. */
>>               switch (class->type) {
>>               case MPAM_CLASS_CACHE:
>> +                if (update_rmid_limits(class))
>> +                    continue;
>> +
>>                   counter_update_class(QOS_L3_OCCUP_EVENT_ID, class);
>>                   return;
>>               default:
>> @@ -1731,6 +1774,8 @@ int mpam_resctrl_setup(void)
>>       struct mpam_resctrl_res *res;
>>       struct mpam_resctrl_mon *mon;
>>   +    wait_event(wait_cacheinfo_ready, cacheinfo_ready);
> 
> This may cause system hang for any hw/fw issue that causes cacheinfo
> failure. Instead of hang, is it better to have a timeout wait here? Like
>     errowait_event_timeout(wait_cache_info_read, cacheinfo_ready, 5 * HZ);
> and report failure when cacheinfo is not ready.

This is just waiting on everything in device_initcall(). I think we've
got bigger problems if that doesn't finish.

> 
> Thanks.
> 
> -Fenghua


Thanks,

Ben



  reply	other threads:[~2025-12-09 16:36 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-05 21:58 [RFC PATCH 00/38] arm_mpam: Add KVM/arm64 and resctrl glue code James Morse
2025-12-05 21:58 ` [RFC PATCH 01/38] arm64: mpam: Context switch the MPAM registers James Morse
2025-12-05 23:53   ` Fenghua Yu
2025-12-09 15:08     ` Ben Horgan
2025-12-09 14:49   ` Ben Horgan
2025-12-12 12:30   ` Ben Horgan
2025-12-18 10:35   ` Jonathan Cameron
2025-12-18 14:52     ` Ben Horgan
2025-12-18 14:55       ` Ben Horgan
2025-12-18 15:38         ` Jonathan Cameron
2025-12-18 15:54           ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 02/38] arm64: mpam: Re-initialise MPAM regs when CPU comes online James Morse
2025-12-09 15:13   ` Ben Horgan
2025-12-11 11:23     ` Ben Horgan
2025-12-11 11:32       ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 03/38] arm64: mpam: Advertise the CPUs MPAM limits to the driver James Morse
2025-12-18 10:38   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 04/38] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs James Morse
2025-12-11 13:41   ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 05/38] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values James Morse
2025-12-18 10:44   ` Jonathan Cameron
2025-12-19 11:56     ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 06/38] KVM: arm64: Force guest EL1 to use user-space's partid configuration James Morse
2025-12-09 15:32   ` Ben Horgan
2025-12-12 11:31   ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 07/38] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation James Morse
2025-12-09 15:43   ` Ben Horgan
2025-12-18 11:30   ` Jonathan Cameron
2025-12-19 12:02     ` Ben Horgan
2025-12-22 11:48       ` Jonathan Cameron
2026-01-02 11:07         ` Ben Horgan
2025-12-19 12:17     ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 08/38] arm_mpam: resctrl: Pick the caches we will use as resctrl resources James Morse
2025-12-09 15:57   ` Ben Horgan
2025-12-16 10:14     ` Ben Horgan
2025-12-18 11:38   ` Jonathan Cameron
2025-12-19 12:04     ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 09/38] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls() James Morse
2025-12-05 21:58 ` [RFC PATCH 10/38] arm_mpam: resctrl: Add resctrl_arch_get_config() James Morse
2025-12-05 21:58 ` [RFC PATCH 11/38] arm_mpam: resctrl: Implement helpers to update configuration James Morse
2025-12-18 11:47   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 12/38] arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks James Morse
2025-12-05 21:58 ` [RFC PATCH 13/38] arm_mpam: resctrl: Add CDP emulation James Morse
2025-12-16 13:49   ` Ben Horgan
2025-12-16 14:24   ` Ben Horgan
2025-12-18 11:58   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 14/38] arm_mpam: resctrl: Add rmid index helpers James Morse
2025-12-05 21:58 ` [RFC PATCH 15/38] arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats James Morse
2025-12-05 21:58 ` [RFC PATCH 16/38] arm_mpam: resctrl: Add support for 'MB' resource James Morse
2025-12-12  4:27   ` Gavin Shan
2025-12-16 15:56     ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 17/38] arm_mpam: resctrl: Add kunit test for control format conversions James Morse
2025-12-05 21:58 ` [RFC PATCH 18/38] arm_mpam: resctrl: Add support for csu counters James Morse
2025-12-16 13:55   ` Ben Horgan
2025-12-18 13:20   ` Jonathan Cameron
2025-12-19 12:06     ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 19/38] arm_mpam: resctrl: pick classes for use as mbm counters James Morse
2025-12-18 13:36   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 20/38] arm_mpam: resctrl: Pre-allocate free running monitors James Morse
2025-12-05 21:58 ` [RFC PATCH 21/38] arm_mpam: resctrl: Pre-allocate assignable monitors James Morse
2025-12-18 13:42   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 22/38] arm_mpam: resctrl: Add kunit test for ABMC/CDP interactions James Morse
2025-12-05 21:58 ` [RFC PATCH 23/38] arm_mpam: resctrl: Add resctrl_arch_config_cntr() for ABMC use James Morse
2025-12-05 21:58 ` [RFC PATCH 24/38] arm_mpam: resctrl: Allow resctrl to allocate monitors James Morse
2025-12-16 16:58   ` Ben Horgan
2025-12-18 13:49   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 25/38] arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid() James Morse
2025-12-18 13:53   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 26/38] arm_mpam: resctrl: Add resctrl_arch_cntr_read() & resctrl_arch_reset_cntr() James Morse
2025-12-05 21:58 ` [RFC PATCH 27/38] arm_mpam: resctrl: Add empty definitions for assorted resctrl functions James Morse
2025-12-09 16:31   ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 28/38] arm64: mpam: Select ARCH_HAS_CPU_RESCTRL James Morse
2025-12-09 16:33   ` Ben Horgan
2025-12-18 13:55   ` Jonathan Cameron
2025-12-05 21:58 ` [RFC PATCH 29/38] arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl James Morse
2025-12-05 21:58 ` [RFC PATCH 30/38] arm_mpam: resctrl: Call resctrl_exit() in the event of errors James Morse
2025-12-05 21:58 ` [RFC PATCH 31/38] arm_mpam: resctrl: Update the rmid reallocation limit James Morse
2025-12-06  0:06   ` Fenghua Yu
2025-12-09 16:36     ` Ben Horgan [this message]
2025-12-05 21:58 ` [RFC PATCH 32/38] arm_mpam: resctrl: Sort the order of the domain lists James Morse
2025-12-05 21:58 ` [RFC PATCH 33/38] arm_mpam: Generate a configuration for min controls James Morse
2025-12-09 16:45   ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 34/38] arm_mpam: Add quirk framework James Morse
2025-12-18 14:04   ` Jonathan Cameron
2025-12-19 12:19     ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 35/38] arm_mpam: Add workaround for T241-MPAM-1 James Morse
2025-12-10 12:20   ` Ben Horgan
2025-12-05 21:58 ` [RFC PATCH 36/38] arm_mpam: Add workaround for T241-MPAM-4 James Morse
2025-12-09 16:58   ` Ben Horgan
2025-12-05 21:59 ` [RFC PATCH 37/38] arm_mpam: Add workaround for T241-MPAM-6 James Morse
2025-12-09 17:06   ` Ben Horgan
2025-12-05 21:59 ` [RFC PATCH 38/38] arm_mpam: Quirk CMN-650's CSU NRDY behaviour James Morse
2025-12-09 14:40 ` [RFC PATCH 00/38] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
2025-12-09 15:53   ` Peter Newman
2025-12-09 16:14     ` Ben Horgan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc5355c6-e7a6-466e-9183-9753281961a8@arm.com \
    --to=ben.horgan@arm.com \
    --cc=amitsinght@marvell.com \
    --cc=baisheng.gao@unisoc.com \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bobo.shaobowang@huawei.com \
    --cc=carl@os.amperecomputing.com \
    --cc=dave.martin@arm.com \
    --cc=david@kernel.org \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=gshan@redhat.com \
    --cc=james.morse@arm.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=kobak@nvidia.com \
    --cc=lcherian@marvell.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peternewman@google.com \
    --cc=punit.agrawal@oss.qualcomm.com \
    --cc=quic_jiles@quicinc.com \
    --cc=reinette.chatre@intel.com \
    --cc=rohit.mathew@arm.com \
    --cc=scott@os.amperecomputing.com \
    --cc=sdonthineni@nvidia.com \
    --cc=tan.shaopeng@fujitsu.com \
    --cc=xhao@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).