public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Babu Moger <bmoger@amd.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
	babu.moger@amd.com, tony.luck@intel.com, Dave.Martin@arm.com,
	james.morse@arm.com, dave.hansen@linux.intel.com, bp@alien8.de
Cc: kas@kernel.org, rick.p.edgecombe@intel.com,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-coco@lists.linux.dev, kvm@vger.kernel.org
Subject: Re: [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled in mbm_event mode
Date: Tue, 7 Oct 2025 12:36:18 -0500	[thread overview]
Message-ID: <0cd2c8ac-8dee-4280-b726-af0119baa4a1@amd.com> (raw)
In-Reply-To: <2cdc5b52-a00c-4772-8221-8d98b787722a@intel.com>

Hi Reinette,

On 10/6/25 20:23, Reinette Chatre wrote:
> Hi Babu,
> 
> On 10/6/25 1:38 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 10/6/25 12:56, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 9/30/25 1:26 PM, Babu Moger wrote:
>>>> resctrl features can be enabled or disabled using boot-time kernel
>>>> parameters. To turn off the memory bandwidth events (mbmtotal and
>>>> mbmlocal), users need to pass the following parameter to the kernel:
>>>> "rdt=!mbmtotal,!mbmlocal".
>>>
>>> ah, indeed ... although, the intention behind the mbmtotal and mbmlocal kernel
>>> parameters was to connect them to the actual hardware features identified
>>> by X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL respectively.
>>>
>>>
>>>> Found that memory bandwidth events (mbmtotal and mbmlocal) cannot be
>>>> disabled when mbm_event mode is enabled. resctrl_mon_resource_init()
>>>> unconditionally enables these events without checking if the underlying
>>>> hardware supports them.
>>>
>>> Technically this is correct since if hardware supports ABMC then the
>>> hardware is no longer required to support X86_FEATURE_CQM_MBM_TOTAL and
>>> X86_FEATURE_CQM_MBM_LOCAL in order to provide mbm_total_bytes
>>> and mbm_local_bytes.
>>>
>>> I can see how this may be confusing to user space though ...
>>>
>>>>
>>>> Remove the unconditional enablement of MBM features in
>>>> resctrl_mon_resource_init() to fix the problem. The hardware support
>>>> verification is already done in get_rdt_mon_resources().
>>>
>>> I believe by "hardware support" you mean hardware support for
>>> X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL. Wouldn't a fix like
>>> this then require any system that supports ABMC to also support
>>> X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL to be able to
>>> support mbm_total_bytes and mbm_local_bytes?
>>
>> Yes. That is correct. Right now, ABMC and X86_FEATURE_CQM_MBM_TOTAL/
>> X86_FEATURE_CQM_MBM_LOCAL are kind of tightly coupled. We have not clearly
>> separated the that.
> 
> Are you speaking from resctrl side since from what I understand these are
> independent features from the hardware side?

It is independent from hardware side. I meant we still use legacy events 
from "default" mode.

> 
>>> This problem seems to be similar to the one solved by [1] since
>>> by supporting ABMC there is no "hardware does not support mbmtotal/mbmlocal"
>>> but instead there only needs to be a check if the feature has been disabled
>>> by command line. That is, add a rdt_is_feature_enabled() check to the
>>> existing "!resctrl_is_mon_event_enabled()" check?
>>
>> Enable or disable needs to be done at get_rdt_mon_resources(). It needs to
>> be done early in  the initialization before calling domain_add_cpu() where
>> event data structures (mbm_states aarch_mbm_states) are allocated.
> 
> Good point. My mistake to suggest the event should be enabled by
> resctrl fs.


How about adding another check in get_rdt_mon_resources()?

if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)
     || rdt_is_feature_enabled(mbmtotal)) {
                 resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
                 ret = true;
         }

I need to take Tony's patch for this.

> 
>>
>>>
>>> But wait ... I think there may be a bigger problem when considering systems
>>> that support ABMC but not X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL.
>>> Shouldn't resctrl prevent such a system from switching to "default"
>>> mbm_assign_mode? Otherwise resctrl will happily let such a system switch
>>> to default mode and when user attempts to read an event file resctrl will
>>> attempt to read it via MSRs that are not supported.
>>> Looks like ABMC may need something similar to CONFIG_RESCTRL_ASSIGN_FIXED
>>> to handle this case in show() while preventing user space from switching to
>>> "default" mode on write()?
>>
>> This may not be an issue right now. When X86_FEATURE_CQM_MBM_TOTAL and
>> X86_FEATURE_CQM_MBM_LOCAL are not supported then mon_data files of these
>> events are not created.
> 
> By "right now" I assume you mean the current implementation? I think your statement
> assumes that no CPUs come or go after resctrl_mon_resource_init() enables the MBM events?
> Current implementation will enable MBM events if ABMC is supported. When the
> first CPU of a domain comes online after that then resctrl will create the mon_data
> files. These files will remain if a user then switches to default mode and if
> the user then attempts to read one of these counters then I expect problems.

Yes. It will be a problem in the that case.

I am not clear on using config option you mentioned above.

What about using the check resctrl_is_mon_event_enabled() in

resctrl_mbm_assign_mode_show() and resctrl_mbm_assign_mode_write() ?

Thanks
Babu


  reply	other threads:[~2025-10-07 17:36 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-30 20:26 [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled in mbm_event mode Babu Moger
2025-10-06 17:56 ` Reinette Chatre
2025-10-06 20:38   ` Moger, Babu
2025-10-07  1:23     ` Reinette Chatre
2025-10-07 17:36       ` Babu Moger [this message]
2025-10-08  2:38         ` Reinette Chatre
2025-10-14 16:24           ` Reinette Chatre
2025-10-14 17:38             ` Babu Moger
2025-10-14 17:43               ` Babu Moger
2025-10-14 20:57                 ` Reinette Chatre
2025-10-14 22:45                   ` Moger, Babu
2025-10-14 23:09                     ` Reinette Chatre
2025-10-15 14:55                       ` Moger, Babu
2025-10-15 19:56                         ` Reinette Chatre
2025-10-15 20:37                           ` Moger, Babu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0cd2c8ac-8dee-4280-b726-af0119baa4a1@amd.com \
    --to=bmoger@amd.com \
    --cc=Dave.Martin@arm.com \
    --cc=babu.moger@amd.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=james.morse@arm.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=reinette.chatre@intel.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox