public inbox for linux-acpi@vger.kernel.org
 help / color / mirror / Atom feed
From: James Morse <james.morse@arm.com>
To: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>
Cc: D Scott Phillips OS <scott@os.amperecomputing.com>,
	"carl@os.amperecomputing.com" <carl@os.amperecomputing.com>,
	"lcherian@marvell.com" <lcherian@marvell.com>,
	"bobo.shaobowang@huawei.com" <bobo.shaobowang@huawei.com>,
	"baolin.wang@linux.alibaba.com" <baolin.wang@linux.alibaba.com>,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>,
	"peternewman@google.com" <peternewman@google.com>,
	"dfustini@baylibre.com" <dfustini@baylibre.com>,
	"amitsinght@marvell.com" <amitsinght@marvell.com>,
	David Hildenbrand <david@redhat.com>,
	Dave Martin <dave.martin@arm.com>, Koba Ko <kobak@nvidia.com>,
	Shanker Donthineni <sdonthineni@nvidia.com>,
	"fenghuay@nvidia.com" <fenghuay@nvidia.com>,
	"baisheng.gao@unisoc.com" <baisheng.gao@unisoc.com>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	Rob Herring <robh@kernel.org>,
	Rohit Mathew <rohit.mathew@arm.com>,
	Rafael Wysocki <rafael@kernel.org>, Len Brown <lenb@kernel.org>,
	Lorenzo Pieralisi <lpieralisi@kernel.org>,
	Hanjun Guo <guohanjun@huawei.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Danilo Krummrich <dakr@kernel.org>
Subject: Re: [PATCH v2 27/29] arm_mpam: Add helper to reset saved mbwu state
Date: Fri, 10 Oct 2025 17:53:25 +0100	[thread overview]
Message-ID: <960a8bf9-63ad-4748-a669-622b2570700a@arm.com> (raw)
In-Reply-To: <OSZPR01MB87982A954238E8122FF3ABD58B16A@OSZPR01MB8798.jpnprd01.prod.outlook.com>

Hi Shaopeng,

On 18/09/2025 03:35, Shaopeng Tan (Fujitsu) wrote:
>> resctrl expects to reset the bandwidth counters when the filesystem is
>> mounted.
>>
>> To allow this, add a helper that clears the saved mbwu state. Instead of cross
>> calling to each CPU that can access the component MSC to write to the counter,
>> set a flag that causes it to be zero'd on the the next read. This is easily done by
>> forcing a configuration update.

>> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
>> index 3080a81f0845..8254d6190ca2 100644
>> --- a/drivers/resctrl/mpam_devices.c
>> +++ b/drivers/resctrl/mpam_devices.c
>> @@ -1112,7 +1122,10 @@ static void __ris_msmon_read(void *arg)
>>  	read_msmon_ctl_flt_vals(m, &cur_ctl, &cur_flt);
>>  	clean_msmon_ctl_val(&cur_ctl);
>>  	gen_msmon_ctl_flt_vals(m, &ctl_val, &flt_val);
>> -	if (cur_flt != flt_val || cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN))
>> +	config_mismatch = cur_flt != flt_val ||
>> +			  cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN);
>> +
>> +	if (config_mismatch || reset_on_next_read)
>>  		write_msmon_ctl_flt_vals(m, ctl_val, flt_val);

I don't have a platform that implements any of the bandwidth counters, so may need a hand
to debug this ...


> mbm_handle_overflow() calls __ris_msmon_read() every second. 
> If there are multiple monitor groups, the config_mismatch will "true" every second. 

It shouldn't be - I think you've forced it into a pathalogical case that the resctrl glue
code tries very hard to avoid.

The pattern of allocating a montior, detecing a mismatch and reconfiguring it is needed
for CSU. That stuff is re-usable for MBWU, but you never want it to happen outside
control/monitor group creation because it means you're losing data.

For those reading along at home:
resctrl expects there to be as many hardware monitors as PARTID*PMG - because every
control and monitor group has 'mbm_total_bytes' or equivalent files. User-space can read
these at any time, and the deal is they start at 0 from boot, and reset when the control
or monitor group is created.

This means the MPAM driver needs to have enough, and it needs to pre-configure them on
startup.

The resctrl glue code calls this 'free running'. It means when you call
resctrl_arch_mon_ctx_alloc() for a bandwidth monitor - it doesn't allocate a context, but
returns a magic out of range value 'USE_RMID_IDX' so that subsequent calls use the
pre-allocated monitor.


If you don't have PARTID*PMG's worth of monitors - you can't have resctrl's
mbm_total_bytes interface. People regularly complain about this - but the alternative is
counters that randomly reset, meaning you could never trust the value.
I have no intention of supporting that mode, (its already available in /dev/urandom!)


If you're seeing this mismatch happen from the overflow thread - I think you've forced
the mbwu counters on when you don't have enough monitors.
Even if the resctrl overflow 'thread' used the same mon_ctx - USE_RMID_IDX means it will
access a different hardware monitor each time.
Another option is clean_msmon_ctl_val() is missing a bit that is set by hardware, causing
the values to mismatch when they shouldn't.


Could you check mon_ctx is USE_RMID_IDX, and check which bits are mismatching?


> Then "mbwu_state->prev_val = 0;" in function write_msmon_ctl_flt_vals() will be always run.
> This means that for multiple monitoring groups, the MemoryBandwidth monitoring value is cleared every second.

Yes - this should never happen because the overflow thread should never cause a mismatch,
and the montiros should only be reconfigured when control/monitor groups are allocated.



Thanks,

James

  reply	other threads:[~2025-10-10 16:53 UTC|newest]

Thread overview: 200+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-10 20:42 [PATCH v2 00/29] arm_mpam: Add basic mpam driver James Morse
2025-09-10 20:42 ` [PATCH v2 01/29] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
2025-09-11 10:43   ` Jonathan Cameron
2025-09-11 10:48     ` Jonathan Cameron
2025-09-19 16:10     ` James Morse
2025-09-25  9:32   ` Stanimir Varbanov
2025-10-10 16:54     ` James Morse
2025-10-02  3:35   ` Fenghua Yu
2025-10-10 16:54     ` James Morse
2025-10-03  0:15   ` Gavin Shan
2025-10-10 16:55     ` James Morse
2025-09-10 20:42 ` [PATCH v2 02/29] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels James Morse
2025-09-11 10:46   ` Jonathan Cameron
2025-09-19 16:10     ` James Morse
2025-09-11 14:08   ` Ben Horgan
2025-09-19 16:10     ` James Morse
2025-10-02  3:55   ` Fenghua Yu
2025-10-10 16:55     ` James Morse
2025-10-03  0:17   ` Gavin Shan
2025-09-10 20:42 ` [PATCH v2 03/29] ACPI / PPTT: Find cache level by cache-id James Morse
2025-09-11 10:59   ` Jonathan Cameron
2025-09-19 16:10     ` James Morse
2025-09-11 15:27   ` Lorenzo Pieralisi
2025-09-19 16:10     ` James Morse
2025-10-02  4:30   ` Fenghua Yu
2025-10-10 16:55     ` James Morse
2025-10-03  0:23   ` Gavin Shan
2025-09-10 20:42 ` [PATCH v2 04/29] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id James Morse
2025-09-11 11:06   ` Jonathan Cameron
2025-09-19 16:10     ` James Morse
2025-10-02  5:03   ` Fenghua Yu
2025-10-10 16:55     ` James Morse
2025-09-10 20:42 ` [PATCH v2 05/29] arm64: kconfig: Add Kconfig entry for MPAM James Morse
2025-09-12 10:14   ` Ben Horgan
2025-10-02  5:06   ` Fenghua Yu
2025-10-10 16:55     ` James Morse
2025-10-03  0:32   ` Gavin Shan
2025-10-10 16:55     ` James Morse
2025-09-10 20:42 ` [PATCH v2 06/29] ACPI / MPAM: Parse the MPAM table James Morse
2025-09-11 13:17   ` Jonathan Cameron
2025-09-19 16:11     ` James Morse
2025-09-26 14:48       ` Jonathan Cameron
2025-10-17 18:50         ` James Morse
2025-09-11 14:56   ` Lorenzo Pieralisi
2025-09-19 16:11     ` James Morse
2025-09-16 13:17   ` [PATCH] arm_mpam: Try reading again if MPAM instance returns not ready Zeng Heng
2025-09-19 16:11     ` James Morse
2025-09-20 10:14       ` Zeng Heng
2025-10-02  3:21   ` [PATCH v2 06/29] ACPI / MPAM: Parse the MPAM table Fenghua Yu
2025-10-17 18:50     ` James Morse
2025-10-03  0:58   ` Gavin Shan
2025-10-17 18:51     ` James Morse
2025-09-10 20:42 ` [PATCH v2 07/29] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
2025-09-11 13:35   ` Jonathan Cameron
2025-09-23 16:41     ` James Morse
2025-09-26 14:55       ` Jonathan Cameron
2025-10-17 18:51         ` James Morse
2025-09-17 11:03   ` Ben Horgan
2025-09-29 17:44     ` James Morse
2025-10-03  3:53   ` Gavin Shan
2025-10-17 18:51     ` James Morse
2025-09-10 20:42 ` [PATCH v2 08/29] arm_mpam: Add the class and component structures for firmware described ris James Morse
2025-09-11 14:22   ` Jonathan Cameron
2025-09-26 17:52     ` James Morse
2025-09-11 16:30   ` Markus Elfring
2025-09-26 17:52     ` James Morse
2025-09-26 18:15       ` Markus Elfring
2025-10-17 18:51         ` James Morse
2025-10-03 16:54   ` Fenghua Yu
2025-10-17 18:51     ` James Morse
2025-10-06 23:13   ` Gavin Shan
2025-10-17 18:51     ` James Morse
2025-09-10 20:42 ` [PATCH v2 09/29] arm_mpam: Add MPAM MSC register layout definitions James Morse
2025-09-11 15:00   ` Jonathan Cameron
2025-10-17 18:53     ` James Morse
2025-09-12  7:33   ` Markus Elfring
2025-10-06 23:25   ` Gavin Shan
2025-09-10 20:42 ` [PATCH v2 10/29] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
2025-09-11 15:07   ` Jonathan Cameron
2025-09-29 17:44     ` James Morse
2025-09-12 10:42   ` Ben Horgan
2025-09-29 17:44     ` James Morse
2025-10-03 17:56   ` Fenghua Yu
2025-10-06 23:42   ` Gavin Shan
2025-09-10 20:42 ` [PATCH v2 11/29] arm_mpam: Probe hardware to find the supported partid/pmg values James Morse
2025-09-11 15:18   ` Jonathan Cameron
2025-09-29 17:44     ` James Morse
2025-09-12 11:11   ` Ben Horgan
2025-09-29 17:44     ` James Morse
2025-10-03 18:58   ` Fenghua Yu
2025-09-10 20:42 ` [PATCH v2 12/29] arm_mpam: Add helpers for managing the locking around the mon_sel registers James Morse
2025-09-11 15:24   ` Jonathan Cameron
2025-09-29 17:44     ` James Morse
2025-09-11 15:31   ` Ben Horgan
2025-09-29 17:44     ` James Morse
2025-10-05  0:09   ` Fenghua Yu
2025-09-10 20:42 ` [PATCH v2 13/29] arm_mpam: Probe the hardware features resctrl supports James Morse
2025-09-11 15:29   ` Jonathan Cameron
2025-09-29 17:45     ` James Morse
2025-09-11 15:37   ` Ben Horgan
2025-09-29 17:45     ` James Morse
2025-09-30 13:32       ` Ben Horgan
2025-10-05  0:53   ` Fenghua Yu
2025-09-10 20:42 ` [PATCH v2 14/29] arm_mpam: Merge supported features during mpam_enable() into mpam_class James Morse
2025-09-12 11:49   ` Jonathan Cameron
2025-09-29 17:45     ` James Morse
2025-10-05  1:28   ` Fenghua Yu
2025-09-10 20:42 ` [PATCH v2 15/29] arm_mpam: Reset MSC controls from cpu hp callbacks James Morse
2025-09-12 11:25   ` Ben Horgan
2025-09-12 14:52     ` Ben Horgan
2025-09-30 17:06       ` James Morse
2025-09-30 17:06     ` James Morse
2025-09-12 11:55   ` Jonathan Cameron
2025-09-30 17:06     ` James Morse
2025-09-30  2:51   ` Shaopeng Tan (Fujitsu)
2025-10-01  9:51     ` James Morse
     [not found]   ` <1f084a23-7211-4291-99b6-7f5192fb9096@nvidia.com>
2025-10-17 18:50     ` James Morse
2025-09-10 20:42 ` [PATCH v2 16/29] arm_mpam: Add a helper to touch an MSC from any CPU James Morse
2025-09-12 11:57   ` Jonathan Cameron
2025-10-01  9:50     ` James Morse
2025-10-05 21:08   ` Fenghua Yu
2025-09-10 20:42 ` [PATCH v2 17/29] arm_mpam: Extend reset logic to allow devices to be reset any time James Morse
2025-09-12 11:42   ` Ben Horgan
2025-10-02 18:02     ` James Morse
2025-09-12 12:02   ` Jonathan Cameron
2025-09-30 17:06     ` James Morse
2025-09-25  7:16   ` Fenghua Yu
2025-10-02 18:02     ` James Morse
2025-09-10 20:42 ` [PATCH v2 18/29] arm_mpam: Register and enable IRQs James Morse
2025-09-12 12:12   ` Jonathan Cameron
2025-10-02 18:02     ` James Morse
2025-09-12 14:40   ` Ben Horgan
2025-10-02 18:03     ` James Morse
2025-09-12 15:22   ` Dave Martin
2025-10-03 18:02     ` James Morse
2025-09-25  6:33   ` Fenghua Yu
2025-10-03 18:03     ` James Morse
2025-09-10 20:42 ` [PATCH v2 19/29] arm_mpam: Use a static key to indicate when mpam is enabled James Morse
2025-09-12 12:13   ` Jonathan Cameron
2025-10-03 18:03     ` James Morse
2025-09-12 14:42   ` Ben Horgan
2025-10-03 18:03     ` James Morse
2025-09-26  2:31   ` Fenghua Yu
2025-10-03 18:04     ` James Morse
2025-09-10 20:43 ` [PATCH v2 20/29] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
2025-09-12 12:22   ` Jonathan Cameron
2025-10-07 11:11     ` James Morse
2025-09-12 15:00   ` Ben Horgan
2025-09-25  6:53   ` Fenghua Yu
2025-10-03 18:04     ` James Morse
2025-09-10 20:43 ` [PATCH v2 21/29] arm_mpam: Probe and reset the rest of the features James Morse
2025-09-12 13:07   ` Jonathan Cameron
2025-10-03 18:05     ` James Morse
2025-09-10 20:43 ` [PATCH v2 22/29] arm_mpam: Add helpers to allocate monitors James Morse
2025-09-12 13:11   ` Jonathan Cameron
2025-10-06 14:57     ` James Morse
2025-10-06 15:56     ` James Morse
2025-09-10 20:43 ` [PATCH v2 23/29] arm_mpam: Add mpam_msmon_read() to read monitor value James Morse
2025-09-11 15:46   ` Ben Horgan
2025-09-12 15:08     ` Ben Horgan
2025-10-06 16:00       ` James Morse
2025-10-06 15:59     ` James Morse
2025-09-12 13:21   ` Jonathan Cameron
2025-10-09 17:48     ` James Morse
2025-09-25  2:30   ` Fenghua Yu
2025-10-09 17:48     ` James Morse
2025-09-10 20:43 ` [PATCH v2 24/29] arm_mpam: Track bandwidth counter state for overflow and power management James Morse
2025-09-12 13:24   ` Jonathan Cameron
2025-10-09 17:48     ` James Morse
2025-09-12 15:55   ` Ben Horgan
2025-10-13 16:29     ` James Morse
2025-09-10 20:43 ` [PATCH v2 25/29] arm_mpam: Probe for long/lwd mbwu counters James Morse
2025-09-12 13:27   ` Jonathan Cameron
2025-10-09 17:48     ` James Morse
2025-09-10 20:43 ` [PATCH v2 26/29] arm_mpam: Use long MBWU counters if supported James Morse
2025-09-12 13:29   ` Jonathan Cameron
2025-10-10 16:53     ` James Morse
2025-09-26  4:51   ` Fenghua Yu
2025-09-10 20:43 ` [PATCH v2 27/29] arm_mpam: Add helper to reset saved mbwu state James Morse
2025-09-12 13:33   ` Jonathan Cameron
2025-10-10 16:53     ` James Morse
2025-09-18  2:35   ` Shaopeng Tan (Fujitsu)
2025-10-10 16:53     ` James Morse [this message]
2025-09-26  4:11   ` Fenghua Yu
2025-10-10 16:53     ` James Morse
2025-09-10 20:43 ` [PATCH v2 28/29] arm_mpam: Add kunit test for bitmap reset James Morse
2025-09-12 13:37   ` Jonathan Cameron
2025-10-10 16:53     ` James Morse
2025-09-12 16:06   ` Ben Horgan
2025-10-10 16:53     ` James Morse
2025-09-26  2:35   ` Fenghua Yu
2025-10-10 16:53     ` James Morse
2025-09-10 20:43 ` [PATCH v2 29/29] arm_mpam: Add kunit tests for props_mismatch() James Morse
2025-09-12 13:41   ` Jonathan Cameron
2025-10-10 16:54     ` James Morse
2025-09-12 16:01   ` Ben Horgan
2025-10-10 16:54     ` James Morse
2025-09-26  2:36   ` Fenghua Yu
2025-10-10 16:54     ` James Morse
2025-09-25  7:18 ` [PATCH v2 00/29] arm_mpam: Add basic mpam driver Fenghua Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=960a8bf9-63ad-4748-a669-622b2570700a@arm.com \
    --to=james.morse@arm.com \
    --cc=amitsinght@marvell.com \
    --cc=baisheng.gao@unisoc.com \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bobo.shaobowang@huawei.com \
    --cc=carl@os.amperecomputing.com \
    --cc=catalin.marinas@arm.com \
    --cc=dakr@kernel.org \
    --cc=dave.martin@arm.com \
    --cc=david@redhat.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghuay@nvidia.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=guohanjun@huawei.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=kobak@nvidia.com \
    --cc=lcherian@marvell.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=peternewman@google.com \
    --cc=quic_jiles@quicinc.com \
    --cc=rafael@kernel.org \
    --cc=robh@kernel.org \
    --cc=rohit.mathew@arm.com \
    --cc=scott@os.amperecomputing.com \
    --cc=sdonthineni@nvidia.com \
    --cc=sudeep.holla@arm.com \
    --cc=tan.shaopeng@fujitsu.com \
    --cc=will@kernel.org \
    --cc=xhao@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox