linux-kernel.vger.kernel.org archive mirror
* [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
@ 2025-07-25 18:29 Babu Moger
  2025-07-25 18:29 ` [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions Babu Moger
                   ` (34 more replies)
  0 siblings, 35 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian


This series adds support for Assignable Bandwidth Monitoring Counters
(ABMC). The feature is also called QoS RMID Pinning.

The series is written so that it is easy to support other assignable
features from different vendors.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC). The documentation is available at
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

The patches are based on top of commit (6.16.0-rc7)
commit 34481698fd9c ("Merge branch into tip/master: 'x86/sev'")

# Introduction

Users can create as many monitor groups as RMIDs supported by the hardware.
However, the bandwidth monitoring feature on AMD systems only guarantees
that RMIDs currently assigned to a processor will be tracked by hardware.
The counters of any other RMIDs which are no longer being tracked will be
reset to zero. The MBM event counters return "Unavailable" for the RMIDs
that are not tracked by hardware. So, only a limited number of groups
can give guaranteed monitoring numbers. With ever-changing
configurations there is no way to definitively know which of these groups
are being tracked during a particular time. Users do not have the option
to monitor a group or set of groups for a certain period of time without
worrying about counters being reset in between.
    
The ABMC feature allows users to assign a hardware counter ID to an RMID,
event pair and monitor bandwidth usage as long as it is assigned. The
hardware continues to track the assigned counter until it is explicitly
unassigned by the user. Additionally, the user can specify the type of
memory transactions (e.g., reads, writes) to be tracked by the counter
for the assigned RMID.

Without ABMC enabled, monitoring works in the current 'default' mode, without
the assignment option.

# History

An earlier implementation of ABMC had a dependency on BMEC (Bandwidth Monitoring
Event Configuration). Peter had concerns with that implementation because
it may not be compatible with ARM's MPAM.

Here are the threads discussing the concerns and the new interface that addresses them.
https://lore.kernel.org/lkml/CALPaoCg97cLVVAcacnarp+880xjsedEWGJPXhYpy4P7=ky4MZw@mail.gmail.com/
https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/

Here are the finalized requirements based on the discussion:

*   BMEC and ABMC are incompatible with each other. They need to be mutually exclusive.

*   Eliminate global assignment listing. The interface
    /sys/fs/resctrl/info/L3_MON/mbm_assign_control is no longer required.

*   Create the configuration directories at /sys/fs/resctrl/info/L3_MON/counter_configs/.
    The configuration file names should be free-form, allowing users to create them as needed.

*   Perform assignment listing at the group level by introducing mbm_L3_assignments
    in each monitoring group level. The listing should provide the following details:

    Event Configuration: Specifies the event configuration applied. This will be crucial
    when "mkdir" on event configuration is added in the future, leading to the creation
    of mon_data/mon_l3_*/<event configuration>.

    Domains: Identifies the domains where the configuration is applied, supporting multi-domain setups.

    Assignment Type: Indicates whether the assignment is Exclusive (e or d), Shared (s), or Unassigned (_).

    Exclusive assignment: Assign the counter ID to the RMID, event pair exclusively.
    
    Shared assignment: A shared assignment applies to both soft-ABMC and ABMC. A user can designate a
                       "counter" (could be hardware counter or "active" RMID) as shared and that means
                       the counter within that domain is shared between different monitor groups and actual
                       assignment is scheduled by resctrl.  

    Unassigned: No longer assigned.

*   Provide an option to enable or disable auto assignment when a new group is created.

*   Keep the flexibility to support future assignment options like Soft-ABMC etc.
    https://lore.kernel.org/lkml/7f10fa69-d1fe-4748-b10c-fa0c9b60bd66@intel.com/
    

This series addresses the requirements listed above and keeps the options open for future
enhancements.

# Implementation details

Create a generic interface to support user space assignment of scarce
counters used for monitoring. The first user of the interface is ABMC, with the
option to expand usage to "soft-ABMC" and MPAM counters in the future.

The feature adds the following interface files:

/sys/fs/resctrl/info/L3_MON/mbm_assign_mode: Reports the list of assignable
monitoring features supported. The enclosed brackets indicate which
feature is enabled.

/sys/fs/resctrl/info/L3_MON/num_mbm_cntrs: The maximum number of monitoring counters
(total of available and assigned counters) in each domain when the system supports
mbm_event mode.

/sys/fs/resctrl/info/L3_MON/available_mbm_cntrs: The number of monitoring counters
available for assignment in each domain when mbm_event mode is enabled on the system.

/sys/fs/resctrl/info/L3_MON/event_configs: Contains a sub-directory for each MBM event
					   that can be assigned to a counter.

/sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter: The type of
			memory transactions tracked by the event mbm_total_bytes.

/sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter: The type of
			memory transactions tracked by the event mbm_local_bytes.

/sys/fs/resctrl/mbm_L3_assignments: Per monitor group interface to list or modify
				    counters assigned to the group.

# Examples

a. Check if MBM assign support is available
	# mount -t resctrl resctrl /sys/fs/resctrl/

	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
	[mbm_event]
	default

	mbm_event feature is detected and it is enabled.
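
	A script can find the enabled mode by looking for the bracketed entry.
	The following is a minimal Python sketch, not part of this series; the
	helper name parse_assign_mode() is hypothetical, and it assumes the file
	format is exactly as shown above (one mode per line, with the enabled
	mode in square brackets).

```python
def parse_assign_mode(text):
    """Return (enabled_mode, supported_modes) from mbm_assign_mode contents.

    Assumes one mode name per line, with the enabled mode wrapped in [].
    """
    enabled = None
    supported = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            enabled = name
        supported.append(name)
    return enabled, supported


if __name__ == "__main__":
    # Sample text copied from the example output above.
    print(parse_assign_mode("[mbm_event]\ndefault\n"))
```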

b. Check how many assignable counters are supported. 

	# cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs 
	0=32;1=32

c. Check how many assignable counters are available for assignment in each domain.

	# cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs 
	0=30;1=30
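
	Since num_mbm_cntrs is the total of available and assigned counters, the
	number of counters currently in use per domain follows from the two files
	above. A small Python sketch, assuming the "<domain>=<count>;..." format
	shown in the examples; parse_domain_counts() is a hypothetical helper.

```python
def parse_domain_counts(text):
    """Parse '0=32;1=32' into {0: 32, 1: 32}."""
    counts = {}
    for entry in text.strip().split(";"):
        if not entry:
            continue
        domain, _, value = entry.partition("=")
        counts[int(domain)] = int(value)
    return counts


if __name__ == "__main__":
    # Sample values copied from examples b and c above.
    total = parse_domain_counts("0=32;1=32")
    free = parse_domain_counts("0=30;1=30")
    in_use = {dom: total[dom] - free[dom] for dom in total}
    print(in_use)  # {0: 2, 1: 2}
```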

d. Check the default event configuration.

	# cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter 
	local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
        local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all

	# cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter 
	local_reads,local_non_temporal_writes,local_reads_slow_memory

e. Series adds a new interface file "mbm_L3_assignments" in each monitoring group
   to list and modify that group's monitoring states.

	The list is displayed in the following format:

	<Event>:<Domain id>=<Assignment state>;<Domain id>=<Assignment state>

        Event: A valid MBM event listed in the
        /sys/fs/resctrl/info/L3_MON/event_configs directory.

        Domain ID: A valid domain ID.

        Assignment types:

        _ : No counter assigned.

        e : Counter assigned exclusively.

	To list the default group states:
	# cat /sys/fs/resctrl/mbm_L3_assignments
	mbm_total_bytes:0=e;1=e
	mbm_local_bytes:0=e;1=e

	To unassign the counter associated with the mbm_total_bytes event on domain 0:
	# echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
	# cat /sys/fs/resctrl/mbm_L3_assignments
	mbm_total_bytes:0=_;1=e
	mbm_local_bytes:0=e;1=e

	To unassign the counter associated with the mbm_total_bytes event on all domains:
	# echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
	# cat /sys/fs/resctrl/mbm_L3_assignments
	mbm_total_bytes:0=_;1=_
	mbm_local_bytes:0=e;1=e

	To assign a counter associated with the mbm_total_bytes event on all domains in exclusive mode:
    	# echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
	# cat /sys/fs/resctrl/mbm_L3_assignments
	mbm_total_bytes:0=e;1=e
	mbm_local_bytes:0=e;1=e
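
	The listing format above is easy to handle from a script. The Python
	sketch below is illustrative only: parse_assignments() and
	format_request() are hypothetical helpers, and only the two states
	shown in this example ("_" and "e") are accepted.

```python
VALID_STATES = ("_", "e")  # states listed in the example above


def parse_assignments(text):
    """Parse mbm_L3_assignments contents into {event: {domain_id: state}}."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        event, _, domains = line.partition(":")
        states = {}
        for entry in domains.split(";"):
            domain, _, state = entry.partition("=")
            states[int(domain)] = state
        result[event] = states
    return result


def format_request(event, state, domain="*"):
    """Build the string to write, e.g. 'mbm_total_bytes:*=_'."""
    if state not in VALID_STATES:
        raise ValueError("unknown assignment state: %r" % (state,))
    return "%s:%s=%s" % (event, domain, state)


if __name__ == "__main__":
    listing = "mbm_total_bytes:0=_;1=e\nmbm_local_bytes:0=e;1=e\n"
    print(parse_assignments(listing))
    print(format_request("mbm_total_bytes", "e"))
```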

g. Read the events mbm_total_bytes and mbm_local_bytes of the default group.
   There is no change in reading the events with the assignment.  If the event is unassigned
   when reading, then the read will come back as "Unassigned".
	
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
	779247936
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes 
	765207488
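
	A monitoring script has to cope with reads that return a status string
	instead of a number. A minimal Python sketch, assuming the only
	non-numeric results are the "Unassigned" and "Unavailable" strings
	mentioned in this letter; parse_counter() is a hypothetical helper.

```python
STATUS_STRINGS = ("Unassigned", "Unavailable")  # taken from this cover letter


def parse_counter(text):
    """Return (value, status): an int and None, or None and a status string."""
    value = text.strip()
    if value in STATUS_STRINGS:
        return None, value
    return int(value), None


if __name__ == "__main__":
    print(parse_counter("779247936\n"))  # (779247936, None)
    print(parse_counter("Unassigned\n"))  # (None, 'Unassigned')
```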
	
h. Check the event configurations.

	# cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
	local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
	local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all

	# cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
	local_reads,local_non_temporal_writes,local_reads_slow_memory

i. Change the event configuration for mbm_local_bytes.

	# echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" > \
	/sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter

	# cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
	local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
	
	This will update all (across all domains of all monitor groups) counter assignments 
        associated with the mbm_local_bytes event.
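
	A tool that generates the filter string may want to validate it first.
	The Python sketch below is an illustration, not the kernel's parser:
	build_event_filter() is a hypothetical helper, and KNOWN_TRANSACTIONS is
	assumed to be just the seven names that appear in the default filters
	shown above.

```python
# Names copied from the default event_filter output in this letter.
KNOWN_TRANSACTIONS = (
    "local_reads",
    "remote_reads",
    "local_non_temporal_writes",
    "remote_non_temporal_writes",
    "local_reads_slow_memory",
    "remote_reads_slow_memory",
    "dirty_victim_writes_all",
)


def build_event_filter(names):
    """Return a comma-separated filter string, rejecting unknown names."""
    selected = []
    for raw in names:
        name = raw.strip()
        if name not in KNOWN_TRANSACTIONS:
            raise ValueError("unknown memory transaction: %r" % (name,))
        if name not in selected:  # drop duplicates, keep first-seen order
            selected.append(name)
    return ",".join(selected)


if __name__ == "__main__":
    print(build_event_filter(["local_reads", "remote_reads"]))
```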

j. Now read the local event again. The first read may come back with "Unavailable"
   status. The subsequent read of mbm_local_bytes will display the current value.
	
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
	Unavailable
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
	314101
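
	Because the first read after a configuration change may be
	"Unavailable", a script can simply read again. A Python sketch under
	that assumption; read_with_retry() is hypothetical and takes any
	callable that returns the file contents, so it can be tried without
	resctrl hardware.

```python
def read_with_retry(read, attempts=2):
    """Call read() up to `attempts` times until it is not 'Unavailable'."""
    last = None
    for _ in range(attempts):
        last = read().strip()
        if last != "Unavailable":
            break
    return last


if __name__ == "__main__":
    # Simulated replies matching the two reads shown above.
    replies = iter(["Unavailable\n", "314101\n"])
    print(read_with_retry(lambda: next(replies)))  # 314101
```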

k. Users have the option to go back to 'default' mbm_assign_mode if required.
   This can be done using the following command. Note that switching the
   mbm_assign_mode will reset all the MBM counters (and thus all MBM events) of all
   the resctrl groups.

	# echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
	mbm_event
	[default]
	
l. Unmount the resctrl filesystem.
	 
	# umount /sys/fs/resctrl/
---
v16:
    Picked up first four patches from (Tony):
    https://lore.kernel.org/lkml/20250711235341.113933-1-tony.luck@intel.com/
    These patches have already been reviewed.

    Updated Reviewed-by: tag for few patches.

    Fixed the conflicts with latest cpufeatures.h and scattered.c files.

    Added a new check in get_rdt_mon_resources().
    Added check in resctrl_is_mon_event_enabled() before enabling.

    Resetting the architectural state in resctrl_arch_config_cntr() in both
    assign and unassign cases now.

    Function renames:
    resctrl_config_cntr() -> rdtgroup_assign_cntr()
    rdtgroup_alloc_config_cntr() -> rdtgroup_alloc_assign_cntr()

    Passed struct mevt to rdtgroup_alloc_assign_cntr so it can print event name on failure.

    Function rename:
      rdtgroup_free_config_cntr() -> rdtgroup_free_unassign_cntr().

    Updated rdtgroup_free_unassign_cntr() to pass struct mon_evt to match
    rdtgroup_alloc_assign_cntr() prototype.

    Removed lots of copied and unnecessary text from resctrl.h.
    Also removed references to LLC occupancy.
    Removed arch_mon_ctx from resctrl_arch_cntr_read().

    Renamed get_corrected_val() -> get_corrected_val().
     
    Removed the call resctrl_arch_rmid_read_context_check();
    Added the text about RMID_VAL_UNAVAIL error.

    Squashed two patches into one.
     https://lore.kernel.org/lkml/df215f02db88cad714755cd5275f20cf0ee4ae26.1752013061.git.babu.moger@amd.com/
     https://lore.kernel.org/lkml/296c435e9bf63fc5031114cced00fbb4837ad327.1752013061.git.babu.moger@amd.com/

    Changed is_cntr field in struct rmid_read to is_mbm_cntr.
    Fixed the memory leak with arch_mon_ctx.
    Updated the resctrl.rst user doc.

    Report Unassigned only if none of the events in CTRL_MON and MON are assigned.
      
    Moved event_filter_show() to fs/resctrl/monitor.c

    Added rdtgroup_mutex in event_filter_show().
    Removed extern for mbm_transactions. Not required.
          
    Moved resctrl_process_configs() and event_filter_write() to fs/resctrl/monitor.c.

    Renamed resctrl_process_configs() -> resctrl_parse_mem_transactions().

    Fixed the return in resctrl_mbm_assign_on_mkdir_write().

    Moved r->mon.mbm_assign_on_mkdir initialization to resctrl_mon_resource_init().

    Updated resctrl.rst with a few corrections for consistency.

    Fixed a few references of counter_configs -> event_configs.
    Renamed resctrl_process_assign() to resctrl_parse_mbm_assignment().
    Moved resctrl_parse_mbm_assignment() and rdtgroup_modify_assign_state() to monitor.c.

    Added new comment in resctrl_bmec_files_show() about kernfs_find_and_get failure.
    Added the parameter to resctrl_bmec_files_show() to pass the kernfs_node.
    Updated resctrl_bmec_files_show() to pass NULL for kn_fs_node.

    Added a patch to add me as a reviewer on Reinette's suggestion.

v15:
  1-4  Picked up Tony's tree. This will be base for both the series.
  rdt-aet-v5.5 branch of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
  After Reinette's comment on previous version.
  https://lore.kernel.org/lkml/e9eb906f-d463-4c1e-9e15-5ed795fe5366@intel.com/
  https://lore.kernel.org/lkml/b761e6ec-a874-4d06-8437-a3a717a91abb@intel.com/

  Improved changelog in most of the patches. Thanks to Reinette.
  Improved the code comment in few places.

  Fixed the enumeration code by adding check in resctrl_cpu_detect() during the init.
  Moved the fs related enumeration to resctrl_mon_resource_init().

  Removed evt_cfg from struct mbm_cntr_cfg based on the discussion.
  https://lore.kernel.org/lkml/887bad33-7f4a-4b6d-95a7-fdfe0451f42b@intel.com/

  Removed resctrl_set_mon_evt_cfg().
  Moved the event initialization to resctrl_mon_resource_init().

  Changed few goto labels for consistency.

  Added extra check !r->mon.mbm_cntr_assignable in mbm_cntr_get() to return error.

  Added two new arch calls resctrl_arch_cntr_read() and resctrl_arch_reset_cntr() to implement
  mbm_event mode. This is a fairly major change in this series.
  https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/

  Added is_cntr in rmid_read to implement resctrl_arch_cntr_read() and resctrl_arch_reset_cntr().

  Removed the error setting in rdtgroup_mondata_show(). It is already done in mon_event_read()
  based on the discussion.
  https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/

  Changed the function name resctrl_mkdir_counter_configs() to resctrl_mkdir_event_configs().
  Called resctrl_mkdir_event_configs from rdtgroup_mkdir_info_resdir().
  It avoids the call kernfs_find_and_get() to get the node for info directory.
  Used for_each_mon_event() where applicable.

  Fixed the partial initialization of val in resctrl_process_configs().
  Passed mon_evt where applicable. The struct rdt_resource can be obtained from mon_evt::rid.

  Fixed the static checker warning in resctrl_mbm_assign_on_mkdir_write() reported in
  https://lore.kernel.org/lkml/dd4a1021-b996-438e-941c-69dfcea5f22a@intel.com/

  Moved resctrl_bmec_files_show() inside rdtgroup_mkdir_info_resdir().

v14:
   Patch #1 has already been reviewed. No need to review it again.

   Patches # 2-5:
   This is Tony's work. This is part of Tony's telemetry series.
   https://lore.kernel.org/lkml/20250521225049.132551-1-tony.luck@intel.com/

   Tony made special update for me to include in this series.
   https://lore.kernel.org/lkml/20250609162139.91651-1-tony.luck@intel.com/.
   We both are going to carry these multiple events support patches.

   Patches #6-31 are changes related to mbm_assign_mode. 

   Took time to check all the text comments. Took care of most of the comments.
   Anything missing is not intentional. ):

   Removed the dependency on X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL
   as discussed in https://lore.kernel.org/lkml/5f8b21c6-5166-46a6-be14-0c7c9bfb7cde@intel.com/
   Reworked on ABMC enumeration during the init.

   Updated the code comment in resctrl.h on all the prototypes.

   Added lockdep_assert_cpus_held() in _resctrl_abmc_enable().
   Removed inline for resctrl_arch_mbm_cntr_assign_enabled().
   Added prototype descriptions for resctrl_arch_mbm_cntr_assign_enabled()
   and resctrl_arch_mbm_cntr_assign_set() in include/linux/resctrl.h.
   
   Changed the name of the monitor mode to mbm_event_assign based on the discussion.
   https://lore.kernel.org/lkml/7628cec8-5914-4895-8289-027e7821777e@amd.com/
   Updated resctrl.rst for mbm_event mode.
   Changed subject line to fs/resctrl in few patches.

   Removed BMEC reference internal.h. 

   Removed mbm_mode in mon_evt data structure as it is not required anymore.
   Added resctrl_get_mon_evt_cfg() and resctrl_set_mon_evt_cfg().

   Removed evt_cfg parameter in resctrl_arch_config_cntr(). Get evt_cfg only
   when assign is required.

   Updated the code documentation for mbm_cntr_alloc() and  mbm_cntr_get().
   Passed struct mon_evt to resctrl_assign_cntr_event() that way to avoid
   back and forth calls to get event details.

   Passing the struct mon_evt to resctrl_free_config_cntr() and removed
   the need for mbm_get_mon_event() call.
   Corrected the code documentation for mbm_cntr_free().

   Added WARN_ON_ONCE() when cntr_id < 0.
   Improved code documentation in include/linux/resctrl.h.
   Added the check in mbm_update() to skip overflow handler when counter is unassigned.

   Changed the term memory events to memory transactions to be consistent.

   Changed the name of directory to event_configs from counter_config.
   Updated user doc about the memory transactions supported by assignment.

   Renamed few functions resctrl_group_assign() -> rdtgroup_assign_cntr()
   resctrl_update_assign() -> resctrl_assign_cntr_allrdtgrp()

   Added rdtgroup_mutex in resctrl_mbm_assign_on_mkdir_show().

   Fixed the problem reported by Peter.
   https://lore.kernel.org/lkml/CALPaoCjvUSKLKOXzF85j8mHT=eZYM-7R0=gJ3PRgOk4yuF5ZhQ@mail.gmail.com/
   Updated the changelog.
   
   Added check in rdt_mon_features_show to hide bmec related feature.

   Added the call resctrl_bmec_files_show() to enable/disable files
   related to BMEC when the monitor mode is changed.

   Added resctrl_set_mon_evt_cfg() to reset event configuration values
   when mode is changed.

   Changed the name of the mbm_assign_mode's supported to mbm_event or default.
   https://lore.kernel.org/lkml/9b08ab86-22d2-40c1-be20-fcc73ee98b3d@amd.com/

   Added example section in user doc (resctrl.rst) on how to use mbm_assign_modes.

v13:
   Removed BMEC related 2 patches which were in the previous series.
   It was related to optimization which can be done later.

   Patches are created on top of FS/ARCH restructure. So, major changes
   are due to FS/ARCH restructure. The files are split between
   arch/x86/kernel/cpu/resctrl/ and fs/resctrl/. So, functions
   are moved between these files accordingly.

   Added fflag RFTYPE_RES_CACHE for mbm_assign_mode, num_mbm_cntrs, available_mbm_cntrs.

   Removed the references to "mbm_assign_control".
  
   Moved resctrl_arch_config_cntr() prototype to include/linux/resctrl.h.
   Changed resctrl_arch_config_cntr() to return void from int to simplify few call
   sequences.

   Added the event configuration details inside the evt_list in monitor domains.
   This avoids the need for new structure mbm_assign_config.

   Passed evtid to functions resctrl_alloc_config_cntr() and resctrl_assign_cntr_event().
   Event configuration value can be easily obtained from mon_evt list.

   Added new patch to pass the entire struct rdtgroup to __mon_event_count(),
   mbm_update(), and related functions. We can easily get RMID,CLOSID etc from rdtgroup.

   Added new function __cntr_id_read_phys() to handle ABMC event reading.

   Added a new patch to hide BMEC related files when mbm_cntr_assign mode is enabled.
  
   Added the call resctrl_init_evt_configuration() to setup the event configuration during init.

   And few other commit message updates and user doc updates.

   Removed Reviewed-by from few patches as patches have changed due to FS/ARCH restructure.

   Let me know if I missed something.

v12:
   This version is kind of RFC series with a new interface.
   
   Removed Reviewed-by tag on few patches when the patch has changed.

   Moved BMEC related patches (1 and 2) to beginning of the series.
   Removed the dependency on BMEC to ABMC feature.

   Removed the un-necessary initialization of mon_config_info structure.
   Changed wrmsrl instead of wrmsr to address the below comment.
   https://lore.kernel.org/lkml/0fc8dbd4-07d8-40bd-8eec-402b48762807@zytor.com/

   Fixed the conflicts due to recent changes in rdt_resource data structure.
   Added new mbm_cfg_mask field to resctrl_mon.
   
   Added the code to reset arch state inside _resctrl_abmc_enable().

   Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
   This will be defined only in arm and not in x86.

   Changed the code to display the max supported monitoring counters in each domain.
   
   Fixed the struct mbm_cntr_cfg code documentation.
   Moved the struct mbm_cntr_cfg definition to resctrl/internal.h as suggested by James.

   Replaced seq_puts(s, ";") with seq_putc(s, ';');
   Added missing rdt_last_cmd_clear() in resctrl_available_mbm_cntrs_show().

   Added the check to reset the architecture-specific state only when assign is requested.

   Added evt_cfg as the parameter to resctrl_arch_config_cntr() as the user will
   be passing the event configuration from /info/L3_MON/event_configs/.

   Changed the check in resctrl_alloc_config_cntr() to reduce the indentation.
   Fixed the handling error on first failure while assigning.
   Added new parameter event configuration (evt_cfg) to get the event configuration from user space.

   Added the support for reading ABMC counters. This is a bit involved change and affects lots of code.

   New patch to support event configurations via new counter_configs method.

   Removed mbm_cntr_reset() as it is not required while removing the group.

   Added new patch to handle auto assign on group creation ("mbm_assign_on_mkdir")

   Added couple of patches add interface for "mbm_L3_assignments" on each mon group.

   Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear counters and
   non-architectural states when monitor mode is changed.
   https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/

   Moved the resctrl_arch_mbm_cntr_assign_set_one to domain_add_cpu_mon().

   Patches 17, 18, 19, 20, 21, 23, 24 are completely new to address the new interface requirement.

v11:
   The commit 2937f9c361f7a ("x86/resctrl: Introduce resctrl_file_fflags_init() to initialize fflags")
   is already merged. Removed from the series.
   
   Resolved minor conflicts due to code displacement in latest code.
 
   Moved the monitoring related calls to monitor.c file when possible.
   Moved some of the changes from include/linux/resctrl.h to arch/x86/kernel/cpu/resctrl/internal.h
   as requested by Reinette. These changes will be moved back when arch and non-arch code is separated.
   
   Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
   Renamed rdtgroup_num_mbm_cntrs_show() to resctrl_num_mbm_cntrs_show().

   Moved the mon_config_info structure definition to internal.h.
   Moved resctrl_arch_mon_event_config_get() and resctrl_arch_mon_event_config_set()
   to monitor.c file.

   Moved resctrl_arch_assign_cntr() and resctrl_abmc_config_one_amd() to monitor.c.
   Added the code to reset the arch state in resctrl_arch_assign_cntr().
   Also removed resctrl_arch_reset_rmid() inside IPI as the counters are reset from the callers.

   Renamed rdtgroup_assign_cntr_event() to resctrl_assign_cntr_event().
   Refactored the resctrl_assign_cntr_event().
   Added functionality to exit on the first error during assignment.
   Simplified mbm_cntr_free().
   Removed the function mbm_cntr_assigned(). Will be using mbm_cntr_get() to
   figure out if the counter is assigned or not.
   
   Renamed rdtgroup_unassign_cntr_event() to resctrl_unassign_cntr_event().
   Refactored the resctrl_unassign_cntr_event().

   Moved mbm_cntr_reset() to monitor.c.
   Added code reset non-architectural state in mbm_cntr_reset().
   Added missing rdtgroup_unassign_cntrs() calls on failure path.

   Domain can be NULL with SNC support so moved the unassign check in rdtgroup_mondata_show().

   Renamed rdtgroup_mbm_assign_mode_write() to resctrl_mbm_assign_mode_write().
   Added more details in resctrl.rst about mbm_cntr_assign mode.
   Re-arranged the text in resctrl.rst file in section mbm_cntr_assign.

   Moved resctrl_arch_mbm_cntr_assign_set_one() to monitor.c

   Added non-arch RMID reset in mbm_config_write_domain().
   Removed resctrl_arch_reset_rmid() call in resctrl_abmc_config_one_amd(). Not required
   as reset of arch and non-arch rmid counters is done from the callers. It simplifies the IPI code.

   Fixed printing the separator after each domain while listing the group assignments.
   Renamed rdtgroup_mbm_assign_control_show to resctrl_mbm_assign_control_show().

   Fixed the static check warning with initializing dom_id in resctrl_process_flags()

   Added change log in each patch for specific changes.

v10:
   Major change is related to domain specific assignment.
   Added struct mbm_cntr_cfg inside mon domains. This will handle
   the domain specific assignments as discussed in below.
   https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
   I did not see the need to add cntr_id in mbm_state structure. Not used in the code.
   Following patches take care of these changes.
   Patch 12, 13, 15, 16, 17, 18.
   
   Added __init attribute to cache_alloc_hsw_probe(). Followed function
   prototype rules (preferred order is storage class before return type).
   
   Moved the mon_config_info structure definition to resctrl.h
   
   Added call resctrl_arch_reset_rmid() to reset the RMID in the domain inside IPI call
   resctrl_abmc_config_one_amd.
   
   SMP and non-SMP call support is not required in resctrl_arch_config_cntr with new
   domain specific assign approach/data structure.
   
   Assigned the counter before exposing the event files.
   Moved the call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
   This is called for both CTRL_MON and MON group creation.
   
   Call mbm_cntr_reset() when unmounted to clear all the assignments.
   
   Fixed the issue with finding the domain in multiple iterations in rdtgroup_process_flags().
   
   Printed full error message with domain information when assign fails.
   
   Taken care of other text comments in all the patches. Patch specific changes are in each patch.
   
   If I missed something, please point it out; it is not intentional.

v9:
   Patch 14 is a new addition. 
   Major change in patch 24.
   Moved the fix patch to address __init attribute to beginning of the series.
   Fixed all the call sequences. Added additional Fixed tags.

   Added Reviewed-by where applicable.

   Took care of couple of minor merge conflicts with latest code.
   Re-ordered the MSR in couple of instances.
   Added available_mbm_cntrs (patch 14) to print the number of counter in a domain.

   Used MBM_EVENT_ARRAY_INDEX macro to get the event index.
   Introduced rdtgroup_cntr_id_init() to initialize the cntr_id

   Introduced new function resctrl_config_cntr to assign the counter, update
   the bitmap and reset the architectural state.
   Taken care of error handling(freeing the counter) when assignment fails.
  
   Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
   Updated couple of rdtgroup_unassign_cntrs() calls properly.

   Fixed problem changing the mode to mbm_cntr_assign mode when it is
   not supported. Added extra checks to detect if systems supports it.
   
   https://lore.kernel.org/lkml/03b278b5-6c15-4d09-9ab7-3317e84a409e@intel.com/
   As discussed in the above comment, introduced resctrl_mon_event_config_set to
   handle IPI. But sending another IPI inside IPI causes problem. Kernel
   reports SMP warning. So, introduced resctrl_arch_update_cntr() to send the
   command directly.

   Fixed handling special case '//0=' and '//'.
   Removed extra strstr() call in rdtgroup_mbm_assign_control_write().
   Added generic failure text when assignment operation fails.
   Corrected user documentation format texts.

v8:
  Patches are getting into final stages. 
  Couple of changes Patch 8, Patch 19 and Patch 23.
  Most of the other changes are related to rename and text message updates.

  Details are in each patch. Here is the summary.

  Added __init attribute to dom_data_init() in patch 8/25.
  Moved the mbm_cntrs_init() and mbm_cntrs_exit() functionality inside
  dom_data_init() and dom_data_exit() respectively.

  Renamed resctrl_mbm_evt_config_init() to arch_mbm_evt_config_init()
  Renamed resctrl_arch_event_config_get() to resctrl_arch_mon_event_config_get().
          resctrl_arch_event_config_set() to resctrl_arch_mon_event_config_set().

  Rename resctrl_arch_assign_cntr to resctrl_arch_config_cntr.
  Renamed rdtgroup_assign_cntr() to rdtgroup_assign_cntr_event().
  Added the code to return the error if rdtgroup_assign_cntr_event fails.
  Moved definition of MBM_EVENT_ARRAY_INDEX to resctrl/internal.h.
  Renamed rdtgroup_mbm_cntr_is_assigned to mbm_cntr_assigned_to_domain
  Added return error handling in resctrl_arch_config_cntr().
  Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
  Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
  Fixed the problem with unassigning the child MON groups of CTRL_MON group.
  Reset the internal counters after mbm_cntr_assign mode is changed.
  Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset()
  Renamed resctrl_arch_mbm_cntr_assign_configure to
            resctrl_arch_mbm_cntr_assign_set_one.

  Used the same IPI as event update to modify the assignment.
  Could not do the way we discussed in the thread.
  https://lore.kernel.org/lkml/f77737ac-d3f6-3e4b-3565-564f79c86ca8@amd.com/
  Needed to figure out event type to update the configuration.

  Moved unassign first and assign during the assign modification.
  Assign none "_" takes priority. Cannot be mixed with other flags.
  Updated the documentation and .rst file format. htmldoc looks ok.

v7:
   Major changes are related to FS and arch codes separation.
   Changed few interface names based on feedback.
   Here is the summary; each patch contains the changes specific to that patch.

   Removed WARN_ON for num_mbm_cntrs. Decided to dynamically allocate the bitmap.
   WARN_ON is not required anymore.
 
   Renamed the function resctrl_arch_get_abmc_enabled() to resctrl_arch_mbm_cntr_assign_enabled().

   Merged resctrl_arch_mbm_cntr_assign_enable, resctrl_arch_mbm_cntr_assign_disable
   and renamed to resctrl_arch_mbm_cntr_assign_set(). Passed the struct rdt_resource
   to these functions.

   Removed resctrl_arch_reset_rmid_all() from arch code. This will be done by the caller in FS code.

   Updated the descriptions/commit log in resctrl.rst to generic text. Removed ABMC references.
   Renamed mbm_mode to mbm_assign_mode.
   Renamed mbm_control to mbm_assign_control.
   Introduced mutex lock in rdtgroup_mbm_mode_show().
 
   The 'legacy' mode is called 'default' mode. 

   Removed the static allocation and now allocating bitmap mbm_cntr_free_map dynamically.

   Merged rdtgroup_assign_cntr(), rdtgroup_alloc_cntr() into one.
   Merged rdtgroup_unassign_cntr(), rdtgroup_free_cntr() into one.
   
  Added struct rdt_resource to the interface functions resctrl_arch_assign_cntr()
  and resctrl_arch_unassign_cntr().
  Renamed rdtgroup_abmc_cfg() to resctrl_abmc_config_one_amd().
   
  Added a new patch to fix counter assignment on event config changes.

  Removed the references of ABMC from user interfaces.

  Simplified the parsing (strsep(&token, "//")) in rdtgroup_mbm_assign_control_write().
  Added mutex lock in rdtgroup_mbm_assign_control_write() while processing.

  Thomas Gleixner asked us to update https://gitlab.com/x86-cpuid.org/x86-cpuid-db.
  It needs internal approval. We are working on it.

v6:
  We still need to finalize a few interface details on mbm_assign_mode and mbm_assign_control
  in case of ABMC and Soft-ABMC. We can continue the discussion with this series.

  Added support for domain-id '*' to update all the domains at once.
  Fixed assign interface to allocate the counter if counter is
  not assigned.   
  Fixed unassign interface to free the counter if the counter is not
  assigned in any of the domains.

  Renamed abmc_capable to mbm_cntr_assignable.

  Renamed abmc_enabled to mbm_cntr_assign_enabled.
  Used msr_set_bit and msr_clear_bit for msr updates.
  Renamed resctrl_arch_abmc_enable() to resctrl_arch_mbm_cntr_assign_enable().
  Renamed resctrl_arch_abmc_disable() to resctrl_arch_mbm_cntr_assign_disable().

  Changed the display name from num_cntrs to num_mbm_cntrs.

  Removed the variable mbm_cntrs_free_map_len. This is not required.
  Removed the call mbm_cntrs_init() in arch code. This needs to be done at higher level.
  Used DECLARE_BITMAP to initialize mbm_cntrs_free_map.
  Removed unused config value definitions.

  Introduced mbm_cntr_map to track counters at domain level. With this
  we don't need to send an MSR read to read the counter configuration.

  Separated all the counter id management to upper level in FS code.

  Added checks to detect "Unassigned" before reading the RMID.

  More details in each patch.

v5:
  Rebase changes (because of SNC support)

  Interface changes.
   /sys/fs/resctrl/mbm_assign to /sys/fs/resctrl/mbm_assign_mode.
   /sys/fs/resctrl/mbm_assign_control to /sys/fs/resctrl/mbm_assign_control.

  Added a few arch-specific routines:
  resctrl_arch_get_abmc_enabled.
  resctrl_arch_abmc_enable.
  resctrl_arch_abmc_disable.

  A few renames:
   num_cntrs_free_map -> mbm_cntrs_free_map
   num_cntrs_init -> mbm_cntrs_init
   arch_domain_mbm_evt_config -> resctrl_arch_mbm_evt_config

  Introduced resctrl_arch_event_config_get and
    resctrl_arch_event_config_set() to update event configuration.

  Removed the mon_state field from mongroup. Added MON_CNTR_UNSET to initialize counters.

  Renamed ctr_id to cntr_id for the hardware counter.
 
  Report "Unassigned" in case the user attempts to read the events without assigning the counter.
  
  ABMC is enabled during the boot up. Can be enabled or disabled later.

  Fixed opcode and flags combination.
    "=_" is valid.
    "-_" and "+_" are not valid.

  Addressed all the comments as far as I know. If I missed something, it is not intentional.

v4: 
  Main change is domain specific event assignment.
  Kept the ABMC feature as a default.
  Dynamic switching between ABMC and mbm_legacy is still allowed.
  We are still not clear about the mount option.
  Moved the monitoring related data from rdt_resource into the resctrl_mon structure.
  Fixed the display of legacy and ABMC mode.
  Used bitmap APIs when possible.
  Removed event configuration read from MSRs. We can use the
  internally saved data (patch 12).
  Added more comments about L3_QOS_ABMC_CFG MSR.
  Added IPIs to read the assignment status for each domain (patch 18 and 19)
  More details in each patch.

v3:
   This series adds the support for global assignment mode discussed in
   the thread. https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
   Removed the individual assignment mode and included the global assignment interface.
   Added following interface files.
   a. /sys/fs/resctrl/info/L3_MON/mbm_assign
      Used for displaying the current assignment mode and switching between
      ABMC and legacy mode.
   b. /sys/fs/resctrl/info/L3_MON/mbm_assign_control
      Used for listing the groups' assignment mode and modifying the assignment states.
   c. Most of the changes are related to the new interface.
   d. Addressed the comments from Reinette, James and Peter.
   e. Hope I have addressed most of the major feedback discussed. If I missed
      something then it is not intentional. Please feel free to comment.
   f. Sending this as an RFC as per Reinette's comment. So, this is still open
      for discussion.

v2:
   a. Major change is the way ABMC is enabled. Earlier, the user needed to remount
      with -o abmc to enable the ABMC feature. Removed that option now.
      Now users can enable ABMC with "echo 1 > /sys/fs/resctrl/info/L3_MON/mbm_assign_enable".
     
   b. Added new word 21 to x86/cpufeatures.h.

   c. Display unsupported if user attempts to read the events when ABMC is enabled
      and event is not assigned.

   d. Display monitor_state as "Unsupported" when ABMC is disabled.
  
   e. Text updates and rebase to latest tip tree (as of Jan 18).
 
   f. This series is still work in progress. I am yet to hear from ARM developers. 

--------------------------------------------------------------------------------------

Previous revisions:
v15: https://lore.kernel.org/lkml/cover.1752013061.git.babu.moger@amd.com/
v14: https://lore.kernel.org/lkml/cover.1749848714.git.babu.moger@amd.com/
v13: https://lore.kernel.org/lkml/cover.1747349530.git.babu.moger@amd.com/
v12: https://lore.kernel.org/lkml/cover.1743725907.git.babu.moger@amd.com/
v11: https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@amd.com/
v10: https://lore.kernel.org/lkml/cover.1734034524.git.babu.moger@amd.com/
v9: https://lore.kernel.org/lkml/cover.1730244116.git.babu.moger@amd.com/
v8: https://lore.kernel.org/lkml/cover.1728495588.git.babu.moger@amd.com/
v7: https://lore.kernel.org/lkml/cover.1725488488.git.babu.moger@amd.com/
v6: https://lore.kernel.org/lkml/cover.1722981659.git.babu.moger@amd.com/
v5: https://lore.kernel.org/lkml/cover.1720043311.git.babu.moger@amd.com/
v4: https://lore.kernel.org/lkml/cover.1716552602.git.babu.moger@amd.com/
v3: https://lore.kernel.org/lkml/cover.1711674410.git.babu.moger@amd.com/  
v2: https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
v1: https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
----------------------------------------------------------------------------


Babu Moger (30):
  x86/cpufeatures: Add support for Assignable Bandwidth Monitoring
    Counters (ABMC)
  x86/resctrl: Add ABMC feature in the command line options
  x86,fs/resctrl: Consolidate monitoring related data from rdt_resource
  x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details
  x86/resctrl: Add support to enable/disable AMD ABMC feature
  fs/resctrl: Introduce the interface to display monitoring modes
  fs/resctrl: Add resctrl file to display number of assignable counters
  fs/resctrl: Introduce mbm_cntr_cfg to track assignable counters per
    domain
  fs/resctrl: Introduce interface to display number of free MBM counters
  x86/resctrl: Add data structures and definitions for ABMC assignment
  fs/resctrl: Introduce event configuration field in struct mon_evt
  x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a
    counter with ABMC
  fs/resctrl: Add the functionality to assign MBM events
  fs/resctrl: Add the functionality to unassign MBM events
  fs/resctrl: Pass struct rdtgroup instead of individual members
  fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode
  x86/resctrl: Refactor resctrl_arch_rmid_read()
  x86/resctrl: Implement resctrl_arch_reset_cntr() and
    resctrl_arch_cntr_read()
  fs/resctrl: Support counter read/reset with mbm_event assignment mode
  fs/resctrl: Add definitions for MBM event configuration
  fs/resctrl: Add event configuration directory under info/L3_MON/
  fs/resctrl: Provide interface to update the event configurations
  fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on
    mkdir
  fs/resctrl: Auto assign counters on mkdir and clean up on group
    removal
  fs/resctrl: Introduce mbm_L3_assignments to list assignments in a
    group
  fs/resctrl: Introduce the interface to modify assignments in a group
  fs/resctrl: Disable BMEC event configuration when mbm_event mode is
    enabled
  fs/resctrl: Introduce the interface to switch between monitor modes
  x86/resctrl: Configure mbm_event mode if supported
  MAINTAINERS: resctrl: add myself as reviewer

Tony Luck (4):
  x86,fs/resctrl: Consolidate monitor event descriptions
  x86,fs/resctrl: Replace architecture event enabled checks
  x86/resctrl: Remove 'rdt_mon_features' global variable
  x86,fs/resctrl: Prepare for more monitor events

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/filesystems/resctrl.rst         | 312 +++++++++
 MAINTAINERS                                   |   1 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   2 +
 arch/x86/include/asm/resctrl.h                |  16 -
 arch/x86/kernel/cpu/resctrl/core.c            |  79 ++-
 arch/x86/kernel/cpu/resctrl/internal.h        |  56 +-
 arch/x86/kernel/cpu/resctrl/monitor.c         | 247 +++++--
 arch/x86/kernel/cpu/scattered.c               |   1 +
 fs/resctrl/ctrlmondata.c                      |  26 +-
 fs/resctrl/internal.h                         |  54 +-
 fs/resctrl/monitor.c                          | 637 ++++++++++++++++--
 fs/resctrl/rdtgroup.c                         | 588 ++++++++++++++--
 include/linux/resctrl.h                       | 148 +++-
 include/linux/resctrl_types.h                 |  18 +-
 16 files changed, 1962 insertions(+), 226 deletions(-)

-- 
2.34.1



* [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:47   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 02/34] x86,fs/resctrl: Replace architecture event enabled checks Babu Moger
                   ` (33 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

From: Tony Luck <tony.luck@intel.com>

There are currently only three monitor events, all associated with
the RDT_RESOURCE_L3 resource. Growing support for additional events
will be easier with some restructuring to have a single point in
file system code where all attributes of all events are defined.

Place all event descriptions into an array mon_event_all[]. Doing
this has the beneficial side effect of removing the need for
rdt_resource::evt_list.

Add resctrl_event_id::QOS_FIRST_EVENT for a lower bound on range
checks for event ids and as the starting index to scan mon_event_all[].

Drop the code that builds evt_list and change the two places where
the list is scanned to scan mon_event_all[] instead using a new
helper macro for_each_mon_event().

Architecture code now informs file system code which events are
available with resctrl_enable_mon_event().
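
As a rough illustration of the layout described above, the standalone
userspace sketch below mirrors the table-driven approach. It is not the
kernel code. The event ids and the names mon_event_all[],
for_each_mon_event() and resctrl_enable_mon_event() follow this patch;
QOS_NUM_EVENTS as the enum terminator, the RDT_RESOURCE_* values,
count_enabled_events() and the silent return in place of
WARN_ON_ONCE()/pr_warn() are assumptions made only so the example
compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Event ids as in this patch; QOS_NUM_EVENTS as terminator is assumed. */
enum resctrl_event_id {
	QOS_FIRST_EVENT			= 0x01,
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
	QOS_NUM_EVENTS,
};

/* Simplified stand-in for the kernel's resource ids (assumed values). */
enum resctrl_res_level {
	RDT_RESOURCE_L3,
	RDT_RESOURCE_L2,
};

struct mon_evt {
	enum resctrl_event_id	evtid;
	enum resctrl_res_level	rid;
	const char		*name;
	bool			configurable;
	bool			enabled;
};

/* Single table holding every event; slot 0 is unused. */
struct mon_evt mon_event_all[QOS_NUM_EVENTS] = {
	[QOS_L3_OCCUP_EVENT_ID] = {
		.name	= "llc_occupancy",
		.evtid	= QOS_L3_OCCUP_EVENT_ID,
		.rid	= RDT_RESOURCE_L3,
	},
	[QOS_L3_MBM_TOTAL_EVENT_ID] = {
		.name	= "mbm_total_bytes",
		.evtid	= QOS_L3_MBM_TOTAL_EVENT_ID,
		.rid	= RDT_RESOURCE_L3,
	},
	[QOS_L3_MBM_LOCAL_EVENT_ID] = {
		.name	= "mbm_local_bytes",
		.evtid	= QOS_L3_MBM_LOCAL_EVENT_ID,
		.rid	= RDT_RESOURCE_L3,
	},
};

#define for_each_mon_event(mevt)				\
	for (mevt = &mon_event_all[QOS_FIRST_EVENT];		\
	     mevt < &mon_event_all[QOS_NUM_EVENTS]; mevt++)

/* Called by "architecture" code to mark an event as available. */
void resctrl_enable_mon_event(enum resctrl_event_id eventid)
{
	/* The kernel version warns here; the sketch just ignores bad ids. */
	if (eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS)
		return;

	mon_event_all[eventid].enabled = true;
}

/* Hypothetical helper: count the enabled events that belong to @rid. */
size_t count_enabled_events(enum resctrl_res_level rid)
{
	struct mon_evt *mevt;
	size_t n = 0;

	for_each_mon_event(mevt) {
		if (mevt->rid != rid || !mevt->enabled)
			continue;
		n++;
	}

	return n;
}
```

The filter inside the loop is the same "rid matches and enabled" test
that replaces the evt_list walks in rdtgroup.c.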

Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Picked up first four patches from:
https://lore.kernel.org/lkml/20250711235341.113933-1-tony.luck@intel.com/
These patches have already been reviewed.
---
 arch/x86/kernel/cpu/resctrl/core.c | 12 ++++--
 fs/resctrl/internal.h              | 13 ++++--
 fs/resctrl/monitor.c               | 63 +++++++++++++++---------------
 fs/resctrl/rdtgroup.c              | 11 +++---
 include/linux/resctrl.h            |  4 +-
 include/linux/resctrl_types.h      | 12 ++++--
 6 files changed, 66 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 187d527ef73b..7fcae25874fe 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -864,12 +864,18 @@ static __init bool get_rdt_mon_resources(void)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
-	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
+	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC)) {
+		resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID);
 		rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID);
-	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL))
+	}
+	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
+		resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
 		rdt_mon_features |= (1 << QOS_L3_MBM_TOTAL_EVENT_ID);
-	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL))
+	}
+	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
+		resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
 		rdt_mon_features |= (1 << QOS_L3_MBM_LOCAL_EVENT_ID);
+	}
 
 	if (!rdt_mon_features)
 		return false;
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 0a1eedba2b03..4f315b7e9ec0 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -52,19 +52,26 @@ static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
 }
 
 /**
- * struct mon_evt - Entry in the event list of a resource
+ * struct mon_evt - Properties of a monitor event
  * @evtid:		event id
+ * @rid:		resource id for this event
  * @name:		name of the event
  * @configurable:	true if the event is configurable
- * @list:		entry in &rdt_resource->evt_list
+ * @enabled:		true if the event is enabled
  */
 struct mon_evt {
 	enum resctrl_event_id	evtid;
+	enum resctrl_res_level	rid;
 	char			*name;
 	bool			configurable;
-	struct list_head	list;
+	bool			enabled;
 };
 
+extern struct mon_evt mon_event_all[QOS_NUM_EVENTS];
+
+#define for_each_mon_event(mevt) for (mevt = &mon_event_all[QOS_FIRST_EVENT];	\
+				      mevt < &mon_event_all[QOS_NUM_EVENTS]; mevt++)
+
 /**
  * struct mon_data - Monitoring details for each event file.
  * @list:            Member of the global @mon_data_kn_priv_list list.
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index f5637855c3ac..2313e48de55f 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -844,38 +844,39 @@ static void dom_data_exit(struct rdt_resource *r)
 	mutex_unlock(&rdtgroup_mutex);
 }
 
-static struct mon_evt llc_occupancy_event = {
-	.name		= "llc_occupancy",
-	.evtid		= QOS_L3_OCCUP_EVENT_ID,
-};
-
-static struct mon_evt mbm_total_event = {
-	.name		= "mbm_total_bytes",
-	.evtid		= QOS_L3_MBM_TOTAL_EVENT_ID,
-};
-
-static struct mon_evt mbm_local_event = {
-	.name		= "mbm_local_bytes",
-	.evtid		= QOS_L3_MBM_LOCAL_EVENT_ID,
-};
-
 /*
- * Initialize the event list for the resource.
- *
- * Note that MBM events are also part of RDT_RESOURCE_L3 resource
- * because as per the SDM the total and local memory bandwidth
- * are enumerated as part of L3 monitoring.
+ * All available events. Architecture code marks the ones that
+ * are supported by a system using resctrl_enable_mon_event()
+ * to set .enabled.
  */
-static void l3_mon_evt_init(struct rdt_resource *r)
+struct mon_evt mon_event_all[QOS_NUM_EVENTS] = {
+	[QOS_L3_OCCUP_EVENT_ID] = {
+		.name	= "llc_occupancy",
+		.evtid	= QOS_L3_OCCUP_EVENT_ID,
+		.rid	= RDT_RESOURCE_L3,
+	},
+	[QOS_L3_MBM_TOTAL_EVENT_ID] = {
+		.name	= "mbm_total_bytes",
+		.evtid	= QOS_L3_MBM_TOTAL_EVENT_ID,
+		.rid	= RDT_RESOURCE_L3,
+	},
+	[QOS_L3_MBM_LOCAL_EVENT_ID] = {
+		.name	= "mbm_local_bytes",
+		.evtid	= QOS_L3_MBM_LOCAL_EVENT_ID,
+		.rid	= RDT_RESOURCE_L3,
+	},
+};
+
+void resctrl_enable_mon_event(enum resctrl_event_id eventid)
 {
-	INIT_LIST_HEAD(&r->evt_list);
+	if (WARN_ON_ONCE(eventid < QOS_FIRST_EVENT || eventid >= QOS_NUM_EVENTS))
+		return;
+	if (mon_event_all[eventid].enabled) {
+		pr_warn("Duplicate enable for event %d\n", eventid);
+		return;
+	}
 
-	if (resctrl_arch_is_llc_occupancy_enabled())
-		list_add_tail(&llc_occupancy_event.list, &r->evt_list);
-	if (resctrl_arch_is_mbm_total_enabled())
-		list_add_tail(&mbm_total_event.list, &r->evt_list);
-	if (resctrl_arch_is_mbm_local_enabled())
-		list_add_tail(&mbm_local_event.list, &r->evt_list);
+	mon_event_all[eventid].enabled = true;
 }
 
 /**
@@ -902,15 +903,13 @@ int resctrl_mon_resource_init(void)
 	if (ret)
 		return ret;
 
-	l3_mon_evt_init(r);
-
 	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
-		mbm_total_event.configurable = true;
+		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].configurable = true;
 		resctrl_file_fflags_init("mbm_total_bytes_config",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
 	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_LOCAL_EVENT_ID)) {
-		mbm_local_event.configurable = true;
+		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].configurable = true;
 		resctrl_file_fflags_init("mbm_local_bytes_config",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 77d08229d855..b95501d4b5de 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1152,7 +1152,9 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
 	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
 	struct mon_evt *mevt;
 
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled)
+			continue;
 		seq_printf(seq, "%s\n", mevt->name);
 		if (mevt->configurable)
 			seq_printf(seq, "%s_config\n", mevt->name);
@@ -3057,10 +3059,9 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
 	struct mon_evt *mevt;
 	int ret, domid;
 
-	if (WARN_ON(list_empty(&r->evt_list)))
-		return -EPERM;
-
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled)
+			continue;
 		domid = do_sum ? d->ci_id : d->hdr.id;
 		priv = mon_get_kn_priv(r->rid, domid, mevt, do_sum);
 		if (WARN_ON_ONCE(!priv))
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 6fb4894b8cfd..2944042bd84c 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -269,7 +269,6 @@ enum resctrl_schema_fmt {
  * @mon_domains:	RCU list of all monitor domains for this resource
  * @name:		Name to use in "schemata" file.
  * @schema_fmt:		Which format string and parser is used for this schema.
- * @evt_list:		List of monitoring events
  * @mbm_cfg_mask:	Bandwidth sources that can be tracked when bandwidth
  *			monitoring events can be configured.
  * @cdp_capable:	Is the CDP feature available on this resource
@@ -287,7 +286,6 @@ struct rdt_resource {
 	struct list_head	mon_domains;
 	char			*name;
 	enum resctrl_schema_fmt	schema_fmt;
-	struct list_head	evt_list;
 	unsigned int		mbm_cfg_mask;
 	bool			cdp_capable;
 };
@@ -372,6 +370,8 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
 u32 resctrl_arch_system_num_rmid_idx(void);
 int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
+void resctrl_enable_mon_event(enum resctrl_event_id eventid);
+
 bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
 
 /**
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index a25fb9c4070d..2dadbc54e4b3 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -34,11 +34,15 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
 
-/*
- * Event IDs, the values match those used to program IA32_QM_EVTSEL before
- * reading IA32_QM_CTR on RDT systems.
- */
+/* Event IDs */
 enum resctrl_event_id {
+	/* Must match value of first event below */
+	QOS_FIRST_EVENT			= 0x01,
+
+	/*
+	 * These values match those used to program IA32_QM_EVTSEL before
+	 * reading IA32_QM_CTR on RDT systems.
+	 */
 	QOS_L3_OCCUP_EVENT_ID		= 0x01,
 	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
 	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
-- 
2.34.1



* [PATCH v16 02/34] x86,fs/resctrl: Replace architecture event enabled checks
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
  2025-07-25 18:29 ` [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 03/34] x86/resctrl: Remove 'rdt_mon_features' global variable Babu Moger
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

From: Tony Luck <tony.luck@intel.com>

The resctrl file system now has complete knowledge of the status
of every event. So there is no need for per-event function calls
to check.

Replace each of the resctrl_arch_is_{event}_enabled() calls with
resctrl_is_mon_event_enabled(QOS_{EVENT}).

No functional change.
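
To illustrate why no functional change is expected, the standalone
sketch below compares an old-style bitmask check against the new lookup.
It is only a userspace approximation: mon_event_enabled[] stands in for
the .enabled field of mon_event_all[], old_is_enabled() stands in for
the removed per-event helpers, and enable_event() is a hypothetical
function that updates both records together, the way
get_rdt_mon_resources() does at this point in the series.

```c
#include <stdbool.h>

/* Event ids as used by resctrl; QOS_NUM_EVENTS as terminator is assumed. */
enum resctrl_event_id {
	QOS_FIRST_EVENT			= 0x01,
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
	QOS_NUM_EVENTS,
};

/* Old bookkeeping: one bit per event id. */
static unsigned int rdt_mon_features;

/* New bookkeeping: one flag per event (stand-in for mon_evt::enabled). */
static bool mon_event_enabled[QOS_NUM_EVENTS];

/* Hypothetical: record an event as enabled in both representations. */
void enable_event(enum resctrl_event_id id)
{
	if (id < QOS_FIRST_EVENT || id >= QOS_NUM_EVENTS)
		return;

	rdt_mon_features |= (1U << id);
	mon_event_enabled[id] = true;
}

/* Old-style check; only meaningful for the known event ids. */
bool old_is_enabled(enum resctrl_event_id id)
{
	return (rdt_mon_features & (1U << id)) != 0;
}

/* New-style check: range-checked lookup of the per-event flag. */
bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid)
{
	return eventid >= QOS_FIRST_EVENT && eventid < QOS_NUM_EVENTS &&
	       mon_event_enabled[eventid];
}
```

For every known event id the two checks return the same answer; the
new helper additionally returns false for ids outside the table.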

Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Picked up first four patches from:
https://lore.kernel.org/lkml/20250711235341.113933-1-tony.luck@intel.com/
These patches have already been reviewed.
---
 arch/x86/include/asm/resctrl.h        | 15 ---------------
 arch/x86/kernel/cpu/resctrl/core.c    |  4 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c |  4 ++--
 fs/resctrl/ctrlmondata.c              |  4 ++--
 fs/resctrl/monitor.c                  | 16 +++++++++++-----
 fs/resctrl/rdtgroup.c                 | 18 +++++++++---------
 include/linux/resctrl.h               |  2 ++
 7 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index feb93b50e990..b1dd5d6b87db 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -84,21 +84,6 @@ static inline void resctrl_arch_disable_mon(void)
 	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
-static inline bool resctrl_arch_is_llc_occupancy_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
-}
-
-static inline bool resctrl_arch_is_mbm_total_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID));
-}
-
-static inline bool resctrl_arch_is_mbm_local_enabled(void)
-{
-	return (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID));
-}
-
 /*
  * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
  *
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7fcae25874fe..1a319ce9328c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -402,13 +402,13 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *hw_dom)
 {
 	size_t tsize;
 
-	if (resctrl_arch_is_mbm_total_enabled()) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) {
 		tsize = sizeof(*hw_dom->arch_mbm_total);
 		hw_dom->arch_mbm_total = kcalloc(num_rmid, tsize, GFP_KERNEL);
 		if (!hw_dom->arch_mbm_total)
 			return -ENOMEM;
 	}
-	if (resctrl_arch_is_mbm_local_enabled()) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) {
 		tsize = sizeof(*hw_dom->arch_mbm_local);
 		hw_dom->arch_mbm_local = kcalloc(num_rmid, tsize, GFP_KERNEL);
 		if (!hw_dom->arch_mbm_local) {
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c261558276cd..61d38517e2bf 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -207,11 +207,11 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
 
-	if (resctrl_arch_is_mbm_total_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
 		memset(hw_dom->arch_mbm_total, 0,
 		       sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
 
-	if (resctrl_arch_is_mbm_local_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 		memset(hw_dom->arch_mbm_local, 0,
 		       sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
 }
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index d98e0d2de09f..ad7ffc6acf13 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -473,12 +473,12 @@ ssize_t rdtgroup_mba_mbps_event_write(struct kernfs_open_file *of,
 	rdt_last_cmd_clear();
 
 	if (!strcmp(buf, "mbm_local_bytes")) {
-		if (resctrl_arch_is_mbm_local_enabled())
+		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 			rdtgrp->mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
 		else
 			ret = -EINVAL;
 	} else if (!strcmp(buf, "mbm_total_bytes")) {
-		if (resctrl_arch_is_mbm_total_enabled())
+		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
 			rdtgrp->mba_mbps_event = QOS_L3_MBM_TOTAL_EVENT_ID;
 		else
 			ret = -EINVAL;
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 2313e48de55f..9e988b2c1a22 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -336,7 +336,7 @@ void free_rmid(u32 closid, u32 rmid)
 
 	entry = __rmid_entry(idx);
 
-	if (resctrl_arch_is_llc_occupancy_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
 		add_rmid_to_limbo(entry);
 	else
 		list_add_tail(&entry->list, &rmid_free_lru);
@@ -637,10 +637,10 @@ static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
 	 * This is protected from concurrent reads from user as both
 	 * the user and overflow handler hold the global mutex.
 	 */
-	if (resctrl_arch_is_mbm_total_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
 		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
 
-	if (resctrl_arch_is_mbm_local_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
 }
 
@@ -879,6 +879,12 @@ void resctrl_enable_mon_event(enum resctrl_event_id eventid)
 	mon_event_all[eventid].enabled = true;
 }
 
+bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid)
+{
+	return eventid >= QOS_FIRST_EVENT && eventid < QOS_NUM_EVENTS &&
+	       mon_event_all[eventid].enabled;
+}
+
 /**
  * resctrl_mon_resource_init() - Initialise global monitoring structures.
  *
@@ -914,9 +920,9 @@ int resctrl_mon_resource_init(void)
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
 
-	if (resctrl_arch_is_mbm_local_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 		mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
-	else if (resctrl_arch_is_mbm_total_enabled())
+	else if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
 		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
 
 	return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index b95501d4b5de..a7eeb33501da 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -123,8 +123,8 @@ void rdt_staged_configs_clear(void)
 
 static bool resctrl_is_mbm_enabled(void)
 {
-	return (resctrl_arch_is_mbm_total_enabled() ||
-		resctrl_arch_is_mbm_local_enabled());
+	return (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID) ||
+		resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID));
 }
 
 static bool resctrl_is_mbm_event(int e)
@@ -196,7 +196,7 @@ static int closid_alloc(void)
 	lockdep_assert_held(&rdtgroup_mutex);
 
 	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID) &&
-	    resctrl_arch_is_llc_occupancy_enabled()) {
+	    resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID)) {
 		cleanest_closid = resctrl_find_cleanest_closid();
 		if (cleanest_closid < 0)
 			return cleanest_closid;
@@ -4051,7 +4051,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
 
 	if (resctrl_is_mbm_enabled())
 		cancel_delayed_work(&d->mbm_over);
-	if (resctrl_arch_is_llc_occupancy_enabled() && has_busy_rmid(d)) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID) && has_busy_rmid(d)) {
 		/*
 		 * When a package is going down, forcefully
 		 * decrement rmid->ebusy. There is no way to know
@@ -4087,12 +4087,12 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	size_t tsize;
 
-	if (resctrl_arch_is_llc_occupancy_enabled()) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID)) {
 		d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
 		if (!d->rmid_busy_llc)
 			return -ENOMEM;
 	}
-	if (resctrl_arch_is_mbm_total_enabled()) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) {
 		tsize = sizeof(*d->mbm_total);
 		d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_total) {
@@ -4100,7 +4100,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
 			return -ENOMEM;
 		}
 	}
-	if (resctrl_arch_is_mbm_local_enabled()) {
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) {
 		tsize = sizeof(*d->mbm_local);
 		d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_local) {
@@ -4145,7 +4145,7 @@ int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d)
 					   RESCTRL_PICK_ANY_CPU);
 	}
 
-	if (resctrl_arch_is_llc_occupancy_enabled())
+	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
 		INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);
 
 	/*
@@ -4220,7 +4220,7 @@ void resctrl_offline_cpu(unsigned int cpu)
 			cancel_delayed_work(&d->mbm_over);
 			mbm_setup_overflow_handler(d, 0, cpu);
 		}
-		if (resctrl_arch_is_llc_occupancy_enabled() &&
+		if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID) &&
 		    cpu == d->cqm_work_cpu && has_busy_rmid(d)) {
 			cancel_delayed_work(&d->cqm_limbo);
 			cqm_setup_limbo_handler(d, 0, cpu);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 2944042bd84c..40aba6b5d4f0 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -372,6 +372,8 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 
 void resctrl_enable_mon_event(enum resctrl_event_id eventid);
 
+bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid);
+
 bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
 
 /**
-- 
2.34.1



* [PATCH v16 03/34] x86/resctrl: Remove 'rdt_mon_features' global variable
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
  2025-07-25 18:29 ` [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions Babu Moger
  2025-07-25 18:29 ` [PATCH v16 02/34] x86,fs/resctrl: Replace architecture event enabled checks Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 04/34] x86,fs/resctrl: Prepare for more monitor events Babu Moger
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

From: Tony Luck <tony.luck@intel.com>

rdt_mon_features is used as a bitmask of enabled monitor events. A monitor
event's status is now maintained in mon_evt::enabled with all monitor
events' mon_evt structures found in the filesystem's mon_event_all[] array.

Remove the remaining uses of rdt_mon_features.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Picked up first four patches from:
https://lore.kernel.org/lkml/20250711235341.113933-1-tony.luck@intel.com/
These patches have already been reviewed.
---
 arch/x86/include/asm/resctrl.h        | 1 -
 arch/x86/kernel/cpu/resctrl/core.c    | 9 +++++----
 arch/x86/kernel/cpu/resctrl/monitor.c | 5 -----
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index b1dd5d6b87db..575f8408a9e7 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -44,7 +44,6 @@ DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
 extern bool rdt_alloc_capable;
 extern bool rdt_mon_capable;
-extern unsigned int rdt_mon_features;
 
 DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 1a319ce9328c..5d14f9a14eda 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -863,21 +863,22 @@ static __init bool get_rdt_alloc_resources(void)
 static __init bool get_rdt_mon_resources(void)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	bool ret = false;
 
 	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC)) {
 		resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID);
-		rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID);
+		ret = true;
 	}
 	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
 		resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
-		rdt_mon_features |= (1 << QOS_L3_MBM_TOTAL_EVENT_ID);
+		ret = true;
 	}
 	if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
 		resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
-		rdt_mon_features |= (1 << QOS_L3_MBM_LOCAL_EVENT_ID);
+		ret = true;
 	}
 
-	if (!rdt_mon_features)
+	if (!ret)
 		return false;
 
 	return !rdt_get_mon_l3_config(r);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 61d38517e2bf..07f8ab097cbe 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -31,11 +31,6 @@
  */
 bool rdt_mon_capable;
 
-/*
- * Global to indicate which monitoring events are enabled.
- */
-unsigned int rdt_mon_features;
-
 #define CF(cf)	((unsigned long)(1048576 * (cf) + 0.5))
 
 static int snc_nodes_per_l3_cache = 1;
-- 
2.34.1



* [PATCH v16 04/34] x86,fs/resctrl: Prepare for more monitor events
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (2 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 03/34] x86/resctrl: Remove 'rdt_mon_features' global variable Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 05/34] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

From: Tony Luck <tony.luck@intel.com>

There's a rule in computer programming that objects appear zero,
once, or many times. So code accordingly.

There are two MBM events and resctrl is coded with a lot of

        if (local)
                do one thing
        if (total)
                do a different thing

Change the rdt_mon_domain and rdt_hw_mon_domain structures to hold arrays
of pointers to per event data instead of explicit fields for total and
local bandwidth.

Simplify by coding for many events, using loops over those that are enabled.

Move resctrl_is_mbm_event() to <linux/resctrl.h> so it can be used more
widely. Also provide a for_each_mbm_event_id() helper macro.

Clean up variable names in functions touched to consistently use
"eventid" for those with type enum resctrl_event_id.
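
As a rough userspace illustration of the array-based layout (a sketch, not
kernel code): the enum values below are assumed to match the kernel's
ordering, with the total event immediately before the local event. The
names struct demo_state, struct demo_domain, demo_enabled[], demo_alloc(),
demo_free() and demo_get_state() are hypothetical stand-ins for the real
structures and for resctrl_is_mon_event_enabled().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Assumed values: the MBM events are contiguous, total before local. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
	QOS_NUM_EVENTS,
};

#define QOS_NUM_L3_MBM_EVENTS	(QOS_L3_MBM_LOCAL_EVENT_ID - QOS_L3_MBM_TOTAL_EVENT_ID + 1)
#define MBM_STATE_IDX(evt)	((evt) - QOS_L3_MBM_TOTAL_EVENT_ID)

#define for_each_mbm_event_id(eventid)				\
	for (eventid = QOS_L3_MBM_TOTAL_EVENT_ID;		\
	     eventid <= QOS_L3_MBM_LOCAL_EVENT_ID; eventid++)

#define for_each_mbm_idx(idx)					\
	for (idx = 0; idx < QOS_NUM_L3_MBM_EVENTS; idx++)

/* Hypothetical stand-ins for struct mbm_state and struct rdt_mon_domain. */
struct demo_state {
	unsigned long long prev_bytes;
};

struct demo_domain {
	struct demo_state *mbm_states[QOS_NUM_L3_MBM_EVENTS];
	size_t num_rmid;
};

/* Hypothetical replacement for resctrl_is_mon_event_enabled(). */
static bool demo_enabled[QOS_NUM_EVENTS];

/* Free every per-event array; free(NULL) is a no-op for unused slots. */
static void demo_free(struct demo_domain *d)
{
	int idx;

	for_each_mbm_idx(idx) {
		free(d->mbm_states[idx]);
		d->mbm_states[idx] = NULL;
	}
}

/* Allocate one zeroed state array per enabled MBM event. */
static int demo_alloc(struct demo_domain *d, size_t num_rmid)
{
	enum resctrl_event_id eventid;
	int idx;

	d->num_rmid = num_rmid;
	for_each_mbm_event_id(eventid) {
		if (!demo_enabled[eventid])
			continue;
		idx = MBM_STATE_IDX(eventid);
		d->mbm_states[idx] = calloc(num_rmid, sizeof(*d->mbm_states[0]));
		if (!d->mbm_states[idx]) {
			demo_free(d);
			return -1;
		}
	}

	return 0;
}

/* Look up the state for (rmid, eventid); NULL if not an enabled MBM event. */
static struct demo_state *demo_get_state(struct demo_domain *d, size_t rmid,
					 enum resctrl_event_id eventid)
{
	struct demo_state *state;

	if (eventid < QOS_L3_MBM_TOTAL_EVENT_ID ||
	    eventid > QOS_L3_MBM_LOCAL_EVENT_ID)
		return NULL;

	assert(rmid < d->num_rmid);
	state = d->mbm_states[MBM_STATE_IDX(eventid)];

	return state ? &state[rmid] : NULL;
}
```

With this shape, supporting a further contiguous MBM event would only need
the enum and the two bounds to change; the allocation, free and lookup
paths stay as they are.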

Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Picked up first four patches from:
https://lore.kernel.org/lkml/20250711235341.113933-1-tony.luck@intel.com/
These patches have already been reviewed.
---
 arch/x86/kernel/cpu/resctrl/core.c     | 40 +++++++++++----------
 arch/x86/kernel/cpu/resctrl/internal.h |  8 ++---
 arch/x86/kernel/cpu/resctrl/monitor.c  | 36 +++++++++----------
 fs/resctrl/monitor.c                   | 13 ++++---
 fs/resctrl/rdtgroup.c                  | 50 +++++++++++++-------------
 include/linux/resctrl.h                | 23 +++++++++---
 include/linux/resctrl_types.h          |  3 ++
 7 files changed, 96 insertions(+), 77 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 5d14f9a14eda..fbf019c1ff11 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -365,8 +365,10 @@ static void ctrl_domain_free(struct rdt_hw_ctrl_domain *hw_dom)
 
 static void mon_domain_free(struct rdt_hw_mon_domain *hw_dom)
 {
-	kfree(hw_dom->arch_mbm_total);
-	kfree(hw_dom->arch_mbm_local);
+	int idx;
+
+	for_each_mbm_idx(idx)
+		kfree(hw_dom->arch_mbm_states[idx]);
 	kfree(hw_dom);
 }
 
@@ -400,25 +402,27 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_ctrl_domain *
  */
 static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *hw_dom)
 {
-	size_t tsize;
-
-	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) {
-		tsize = sizeof(*hw_dom->arch_mbm_total);
-		hw_dom->arch_mbm_total = kcalloc(num_rmid, tsize, GFP_KERNEL);
-		if (!hw_dom->arch_mbm_total)
-			return -ENOMEM;
-	}
-	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) {
-		tsize = sizeof(*hw_dom->arch_mbm_local);
-		hw_dom->arch_mbm_local = kcalloc(num_rmid, tsize, GFP_KERNEL);
-		if (!hw_dom->arch_mbm_local) {
-			kfree(hw_dom->arch_mbm_total);
-			hw_dom->arch_mbm_total = NULL;
-			return -ENOMEM;
-		}
+	size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]);
+	enum resctrl_event_id eventid;
+	int idx;
+
+	for_each_mbm_event_id(eventid) {
+		if (!resctrl_is_mon_event_enabled(eventid))
+			continue;
+		idx = MBM_STATE_IDX(eventid);
+		hw_dom->arch_mbm_states[idx] = kcalloc(num_rmid, tsize, GFP_KERNEL);
+		if (!hw_dom->arch_mbm_states[idx])
+			goto cleanup;
 	}
 
 	return 0;
+cleanup:
+	for_each_mbm_idx(idx) {
+		kfree(hw_dom->arch_mbm_states[idx]);
+		hw_dom->arch_mbm_states[idx] = NULL;
+	}
+
+	return -ENOMEM;
 }
 
 static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 5e3c41b36437..58dca892a5df 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -54,15 +54,15 @@ struct rdt_hw_ctrl_domain {
  * struct rdt_hw_mon_domain - Arch private attributes of a set of CPUs that share
  *			      a resource for a monitor function
  * @d_resctrl:	Properties exposed to the resctrl file system
- * @arch_mbm_total:	arch private state for MBM total bandwidth
- * @arch_mbm_local:	arch private state for MBM local bandwidth
+ * @arch_mbm_states:	Per-event pointer to the MBM event's saved state.
+ *			An MBM event's state is an array of struct arch_mbm_state
+ *			indexed by RMID on x86.
  *
  * Members of this structure are accessed via helpers that provide abstraction.
  */
 struct rdt_hw_mon_domain {
 	struct rdt_mon_domain		d_resctrl;
-	struct arch_mbm_state		*arch_mbm_total;
-	struct arch_mbm_state		*arch_mbm_local;
+	struct arch_mbm_state		*arch_mbm_states[QOS_NUM_L3_MBM_EVENTS];
 };
 
 static inline struct rdt_hw_ctrl_domain *resctrl_to_arch_ctrl_dom(struct rdt_ctrl_domain *r)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 07f8ab097cbe..f01db2034d08 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -161,18 +161,14 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_mon_domain *hw_do
 						 u32 rmid,
 						 enum resctrl_event_id eventid)
 {
-	switch (eventid) {
-	case QOS_L3_OCCUP_EVENT_ID:
-		return NULL;
-	case QOS_L3_MBM_TOTAL_EVENT_ID:
-		return &hw_dom->arch_mbm_total[rmid];
-	case QOS_L3_MBM_LOCAL_EVENT_ID:
-		return &hw_dom->arch_mbm_local[rmid];
-	default:
-		/* Never expect to get here */
-		WARN_ON_ONCE(1);
+	struct arch_mbm_state *state;
+
+	if (!resctrl_is_mbm_event(eventid))
 		return NULL;
-	}
+
+	state = hw_dom->arch_mbm_states[MBM_STATE_IDX(eventid)];
+
+	return state ? &state[rmid] : NULL;
 }
 
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
@@ -201,14 +197,16 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
 void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d)
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
-
-	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
-		memset(hw_dom->arch_mbm_total, 0,
-		       sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
-
-	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
-		memset(hw_dom->arch_mbm_local, 0,
-		       sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
+	enum resctrl_event_id eventid;
+	int idx;
+
+	for_each_mbm_event_id(eventid) {
+		if (!resctrl_is_mon_event_enabled(eventid))
+			continue;
+		idx = MBM_STATE_IDX(eventid);
+		memset(hw_dom->arch_mbm_states[idx], 0,
+		       sizeof(*hw_dom->arch_mbm_states[0]) * r->num_rmid);
+	}
 }
 
 static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 9e988b2c1a22..dcc6c00eb362 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -346,15 +346,14 @@ static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 closid,
 				       u32 rmid, enum resctrl_event_id evtid)
 {
 	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+	struct mbm_state *state;
 
-	switch (evtid) {
-	case QOS_L3_MBM_TOTAL_EVENT_ID:
-		return &d->mbm_total[idx];
-	case QOS_L3_MBM_LOCAL_EVENT_ID:
-		return &d->mbm_local[idx];
-	default:
+	if (!resctrl_is_mbm_event(evtid))
 		return NULL;
-	}
+
+	state = d->mbm_states[MBM_STATE_IDX(evtid)];
+
+	return state ? &state[idx] : NULL;
 }
 
 static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index a7eeb33501da..77336d5e4915 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -127,12 +127,6 @@ static bool resctrl_is_mbm_enabled(void)
 		resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID));
 }
 
-static bool resctrl_is_mbm_event(int e)
-{
-	return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
-		e <= QOS_L3_MBM_LOCAL_EVENT_ID);
-}
-
 /*
  * Trivial allocator for CLOSIDs. Use BITMAP APIs to manipulate a bitmap
  * of free CLOSIDs.
@@ -4023,9 +4017,13 @@ static void rdtgroup_setup_default(void)
 
 static void domain_destroy_mon_state(struct rdt_mon_domain *d)
 {
+	int idx;
+
 	bitmap_free(d->rmid_busy_llc);
-	kfree(d->mbm_total);
-	kfree(d->mbm_local);
+	for_each_mbm_idx(idx) {
+		kfree(d->mbm_states[idx]);
+		d->mbm_states[idx] = NULL;
+	}
 }
 
 void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d)
@@ -4085,32 +4083,34 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain *d)
 {
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
-	size_t tsize;
+	size_t tsize = sizeof(*d->mbm_states[0]);
+	enum resctrl_event_id eventid;
+	int idx;
 
 	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID)) {
 		d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
 		if (!d->rmid_busy_llc)
 			return -ENOMEM;
 	}
-	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) {
-		tsize = sizeof(*d->mbm_total);
-		d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
-		if (!d->mbm_total) {
-			bitmap_free(d->rmid_busy_llc);
-			return -ENOMEM;
-		}
-	}
-	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) {
-		tsize = sizeof(*d->mbm_local);
-		d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
-		if (!d->mbm_local) {
-			bitmap_free(d->rmid_busy_llc);
-			kfree(d->mbm_total);
-			return -ENOMEM;
-		}
+
+	for_each_mbm_event_id(eventid) {
+		if (!resctrl_is_mon_event_enabled(eventid))
+			continue;
+		idx = MBM_STATE_IDX(eventid);
+		d->mbm_states[idx] = kcalloc(idx_limit, tsize, GFP_KERNEL);
+		if (!d->mbm_states[idx])
+			goto cleanup;
 	}
 
 	return 0;
+cleanup:
+	bitmap_free(d->rmid_busy_llc);
+	for_each_mbm_idx(idx) {
+		kfree(d->mbm_states[idx]);
+		d->mbm_states[idx] = NULL;
+	}
+
+	return -ENOMEM;
 }
 
 int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_domain *d)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 40aba6b5d4f0..478d7a935ca3 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -161,8 +161,9 @@ struct rdt_ctrl_domain {
  * @hdr:		common header for different domain types
  * @ci_id:		cache info id for this domain
  * @rmid_busy_llc:	bitmap of which limbo RMIDs are above threshold
- * @mbm_total:		saved state for MBM total bandwidth
- * @mbm_local:		saved state for MBM local bandwidth
+ * @mbm_states:		Per-event pointer to the MBM event's saved state.
+ *			An MBM event's state is an array of struct mbm_state
+ *			indexed by RMID on x86 or combined CLOSID, RMID on Arm.
  * @mbm_over:		worker to periodically read MBM h/w counters
  * @cqm_limbo:		worker to periodically read CQM h/w counters
  * @mbm_work_cpu:	worker CPU for MBM h/w counters
@@ -172,8 +173,7 @@ struct rdt_mon_domain {
 	struct rdt_domain_hdr		hdr;
 	unsigned int			ci_id;
 	unsigned long			*rmid_busy_llc;
-	struct mbm_state		*mbm_total;
-	struct mbm_state		*mbm_local;
+	struct mbm_state		*mbm_states[QOS_NUM_L3_MBM_EVENTS];
 	struct delayed_work		mbm_over;
 	struct delayed_work		cqm_limbo;
 	int				mbm_work_cpu;
@@ -376,6 +376,21 @@ bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid);
 
 bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
 
+static inline bool resctrl_is_mbm_event(enum resctrl_event_id eventid)
+{
+	return (eventid >= QOS_L3_MBM_TOTAL_EVENT_ID &&
+		eventid <= QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
+/* Iterate over all memory bandwidth events */
+#define for_each_mbm_event_id(eventid)				\
+	for (eventid = QOS_L3_MBM_TOTAL_EVENT_ID;		\
+	     eventid <= QOS_L3_MBM_LOCAL_EVENT_ID; eventid++)
+
+/* Iterate over memory bandwidth arrays in domain structures */
+#define for_each_mbm_idx(idx)					\
+	for (idx = 0; idx < QOS_NUM_L3_MBM_EVENTS; idx++)
+
 /**
  * resctrl_arch_mon_event_config_write() - Write the config for an event.
  * @config_info: struct resctrl_mon_config_info describing the resource, domain
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index 2dadbc54e4b3..d98351663c2c 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -51,4 +51,7 @@ enum resctrl_event_id {
 	QOS_NUM_EVENTS,
 };
 
+#define QOS_NUM_L3_MBM_EVENTS	(QOS_L3_MBM_LOCAL_EVENT_ID - QOS_L3_MBM_TOTAL_EVENT_ID + 1)
+#define MBM_STATE_IDX(evt)	((evt) - QOS_L3_MBM_TOTAL_EVENT_ID)
+
 #endif /* __LINUX_RESCTRL_TYPES_H */
-- 
2.34.1



* [PATCH v16 05/34] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (3 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 04/34] x86,fs/resctrl: Prepare for more monitor events Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:47   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 06/34] x86/resctrl: Add ABMC feature in the command line options Babu Moger
                   ` (29 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Users can create as many monitor groups as RMIDs supported by the hardware.
However, the bandwidth monitoring feature on AMD systems only guarantees
that RMIDs currently assigned to a processor will be tracked by hardware.
The counters of any other RMIDs which are no longer being tracked will be
reset to zero. The MBM event counters return "Unavailable" for the RMIDs
that are not tracked by hardware. So, only a limited number of groups can
give guaranteed monitoring numbers. With ever-changing configurations
there is no way to know definitively which of these groups are being
tracked at a particular time. Users do not have the option to monitor a
group or set of groups for a certain period of time without worrying about
the RMID being reset in between.

The ABMC feature allows users to assign a hardware counter to an RMID,
event pair and monitor bandwidth usage as long as it is assigned. The
hardware continues to track the assigned counter until it is explicitly
unassigned by the user. There is no need to worry about counters being
reset during this period. Additionally, the user can specify the type of
memory transactions (e.g., reads, writes) for the counter to track.

Without ABMC enabled, monitoring works in the current mode, without the
assignment option.

The Linux resctrl subsystem provides an interface that allows monitoring of
up to two memory bandwidth events per group, selected from a combination of
available total and local events. When ABMC is enabled, two events will be
assigned to each group by default, in line with the current interface
design. Users will also have the option to configure which types of memory
transactions are counted by these events.

Due to the limited number of available counters (32), users may quickly
exhaust the available counters. If the system runs out of assignable ABMC
counters, the kernel will report an error. In such cases, users will need
to unassign one or more active counters to free up counters for new
assignments. resctrl will provide options to assign or unassign events
through the group-specific interface file.

The feature is detected via CPUID_Fn80000020_EBX_x00 bit 5.
Bits Description
5    ABMC (Assignable Bandwidth Monitoring Counters)
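
For illustration only, the same bit can be read from userspace. This is a
minimal sketch, assuming a GCC or Clang toolchain that ships <cpuid.h>;
the demo_* names and macros are hypothetical and are not part of this
patch. It reports what the hardware enumerates, not whether the kernel
has exposed the feature to resctrl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID_Fn80000020_EBX_x00 bit 5, as stated in the changelog above. */
#define DEMO_QOS_LEAF		0x80000020u
#define DEMO_ABMC_EBX_BIT	5

static_assert(DEMO_ABMC_EBX_BIT < 32, "bit index must fit in EBX");

/* Pure helper: does this EBX value have the ABMC bit set? */
static bool demo_ebx_has_abmc(uint32_t ebx)
{
	return (ebx >> DEMO_ABMC_EBX_BIT) & 1u;
}

/* Query the running CPU; returns false on non-x86 or if the leaf is absent. */
static bool demo_cpu_has_abmc(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 when the leaf is not supported. */
	if (!__get_cpuid_count(DEMO_QOS_LEAF, 0, &eax, &ebx, &ecx, &edx))
		return false;

	return demo_ebx_has_abmc(ebx);
#else
	return false;
#endif
}
```

In the kernel itself no such open-coded check is needed: the scattered.c
entry added by this patch lets code test X86_FEATURE_ABMC instead.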

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
Note: Checkpatch checks/warnings are ignored to maintain coding style.

v16: Fixed the conflicts with latest cpufeatures.h and scattered.c files.

v15: Minor changelog update.

v14: Removed the dependency on X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL
     as discussed in https://lore.kernel.org/lkml/5f8b21c6-5166-46a6-be14-0c7c9bfb7cde@intel.com/
     Need to rework ABMC enumeration during init.
     Updated changelog with a few text updates.

v13: Updated the commit log with Linux interface details.

v12: Removed the dependency on X86_FEATURE_BMEC.
     Removed the Reviewed-by tag as patch has changed.

v11: No changes.

v10: No changes.

v9: Took care of a couple of minor merge conflicts. No other changes.

v8: No changes.

v7: Removed "" from feature flags. Not required anymore.
    https://lore.kernel.org/lkml/20240817145058.GCZsC40neU4wkPXeVR@fat_crate.local/

v6: Added Reinette's Reviewed-by. Moved the Checkpatch note below ---.

v5: Minor rebase change and subject line update.

v4: Changes because of rebase. Feature word 21 has a few more additions now.
    Changed the text to "tracked by hardware" instead of active.

v3: Change because of rebase. Actual patch did not change.

v2: Added dependency on X86_FEATURE_BMEC.
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/scattered.c    | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 602957dd2609..43cba78a50e5 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -494,6 +494,7 @@
 #define X86_FEATURE_TSA_SQ_NO		(21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
 #define X86_FEATURE_TSA_L1_NO		(21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
 #define X86_FEATURE_CLEAR_CPU_BUF_VM	(21*32+13) /* Clear CPU buffers using VERW before VMRUN */
+#define X86_FEATURE_ABMC		(21*32+14) /* Assignable Bandwidth Monitoring Counters */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index b4a1f6732a3a..cdd3c3e2d4c6 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -50,6 +50,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_MBA,			CPUID_EBX,  6, 0x80000008, 0 },
 	{ X86_FEATURE_SMBA,			CPUID_EBX,  2, 0x80000020, 0 },
 	{ X86_FEATURE_BMEC,			CPUID_EBX,  3, 0x80000020, 0 },
+	{ X86_FEATURE_ABMC,			CPUID_EBX,  5, 0x80000020, 0 },
 	{ X86_FEATURE_TSA_SQ_NO,		CPUID_ECX,  1, 0x80000021, 0 },
 	{ X86_FEATURE_TSA_L1_NO,		CPUID_ECX,  2, 0x80000021, 0 },
 	{ X86_FEATURE_AMD_WORKLOAD_CLASS,	CPUID_EAX, 22, 0x80000021, 0 },
-- 
2.34.1



* [PATCH v16 06/34] x86/resctrl: Add ABMC feature in the command line options
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (4 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 05/34] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 07/34] x86,fs/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Add a kernel command-line parameter to enable or disable the exposure of
the ABMC (Assignable Bandwidth Monitoring Counters) hardware feature to
resctrl.
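
As an informal sketch of the option syntax documented for rdt= (a
comma-separated list where a leading '!' turns a feature off), the
userspace snippet below parses a string such as "cmt,!mba,abmc". The
struct demo_opt, demo_opts[] and demo_parse_rdt() names are hypothetical
and only approximate what the kernel's own parser does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-option record, loosely modelled on struct rdt_options. */
struct demo_opt {
	const char *name;
	bool force_on;
	bool force_off;
};

static struct demo_opt demo_opts[] = {
	{ "cmt",  false, false },
	{ "mba",  false, false },
	{ "bmec", false, false },
	{ "abmc", false, false },
};

#define DEMO_NUM_OPTS (sizeof(demo_opts) / sizeof(demo_opts[0]))

/*
 * Split the argument on ',' and record each known option as forced on,
 * or forced off when it is prefixed with '!'. Unknown names are ignored.
 * The input buffer is modified in place by strtok().
 */
static void demo_parse_rdt(char *str)
{
	char *tok;
	size_t i;

	assert(str != NULL);

	for (tok = strtok(str, ","); tok; tok = strtok(NULL, ",")) {
		bool off = (*tok == '!');

		if (off)
			tok++;
		for (i = 0; i < DEMO_NUM_OPTS; i++) {
			if (strcmp(tok, demo_opts[i].name) != 0)
				continue;
			if (off)
				demo_opts[i].force_off = true;
			else
				demo_opts[i].force_on = true;
			break;
		}
	}
}
```

With the option added by this patch, booting with rdt=!abmc would hide
ABMC from resctrl even on hardware that enumerates it.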

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: No changes.

v14: Slight changelog modification.

v13: Removed the Reviewed-by as the file resctrl.rst is moved to
     Documentation/filesystems/resctrl.rst. In that sense patch has changed.

v12: No changes.

v11: No changes.

v10: No changes.

v9: No code changes. Added Reviewed-by.

v8: Commit message update.

v7: No changes

v6: No changes

v5: No changes

v4: No changes

v3: No changes

v2: No changes
---
 Documentation/admin-guide/kernel-parameters.txt | 2 +-
 Documentation/filesystems/resctrl.rst           | 1 +
 arch/x86/kernel/cpu/resctrl/core.c              | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b29c7a8dc7e6..6a74a32c9416 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6077,7 +6077,7 @@
 	rdt=		[HW,X86,RDT]
 			Turn on/off individual RDT features. List is:
 			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
-			mba, smba, bmec.
+			mba, smba, bmec, abmc.
 			E.g. to turn on cmt and turn off mba use:
 				rdt=cmt,!mba
 
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index c7949dd44f2f..c97fd77a107d 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -26,6 +26,7 @@ MBM (Memory Bandwidth Monitoring)		"cqm_mbm_total", "cqm_mbm_local"
 MBA (Memory Bandwidth Allocation)		"mba"
 SMBA (Slow Memory Bandwidth Allocation)         ""
 BMEC (Bandwidth Monitoring Event Configuration) ""
+ABMC (Assignable Bandwidth Monitoring Counters) ""
 ===============================================	================================
 
 Historically, new features were made visible by default in /proc/cpuinfo. This
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index fbf019c1ff11..b07b12a05886 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -711,6 +711,7 @@ enum {
 	RDT_FLAG_MBA,
 	RDT_FLAG_SMBA,
 	RDT_FLAG_BMEC,
+	RDT_FLAG_ABMC,
 };
 
 #define RDT_OPT(idx, n, f)	\
@@ -736,6 +737,7 @@ static struct rdt_options rdt_options[]  __ro_after_init = {
 	RDT_OPT(RDT_FLAG_MBA,	    "mba",	X86_FEATURE_MBA),
 	RDT_OPT(RDT_FLAG_SMBA,	    "smba",	X86_FEATURE_SMBA),
 	RDT_OPT(RDT_FLAG_BMEC,	    "bmec",	X86_FEATURE_BMEC),
+	RDT_OPT(RDT_FLAG_ABMC,	    "abmc",	X86_FEATURE_ABMC),
 };
 #define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
 
-- 
2.34.1



* [PATCH v16 07/34] x86,fs/resctrl: Consolidate monitoring related data from rdt_resource
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (5 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 06/34] x86/resctrl: Add ABMC feature in the command line options Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The cache allocation and memory bandwidth allocation feature properties
are consolidated into struct resctrl_cache and struct resctrl_membw
respectively.

In preparation for more monitoring properties that would further clutter
the existing resource struct, reorganize the monitoring-specific
properties into a separate structure as well.

Also switch the term "bandwidth sources" to "memory transactions" so that
resctrl uses a consistent term for related monitoring features.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added the Reviewed-by tag.

v15: Updated changelog.
     Minor update in code comment in resctrl.h.

v14: Updated the code comment in resctrl.h.

v13: Changes due to FS/ARCH restructure.

v12: Fixed the conflicts due to recent changes in rdt_resource data structure.
     Added new mbm_cfg_mask field to resctrl_mon.
     Removed Reviewed-by tag as patch has changed.

v11: No changes.

v10: No changes.

v9: No changes.

v8: Added Reviewed-by from Reinette. No other changes.

v7: Added kernel doc for data structure. Minor text update.

v6: Update commit message and update kernel doc for rdt_resource.

v5: Commit message update.
    Also changes related to data structure updates due to SNC support.

v4: New patch.
---
 arch/x86/kernel/cpu/resctrl/core.c    |  4 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c | 10 +++++-----
 fs/resctrl/rdtgroup.c                 |  6 +++---
 include/linux/resctrl.h               | 18 +++++++++++++-----
 4 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index b07b12a05886..267e9206a999 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -107,7 +107,7 @@ u32 resctrl_arch_system_num_rmid_idx(void)
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
 	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
-	return r->num_rmid;
+	return r->mon.num_rmid;
 }
 
 struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
@@ -541,7 +541,7 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 
 	arch_mon_domain_online(r, d);
 
-	if (arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
+	if (arch_domain_mbm_alloc(r->mon.num_rmid, hw_dom)) {
 		mon_domain_free(hw_dom);
 		return;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f01db2034d08..2558b1bdef8b 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -130,7 +130,7 @@ static int logical_rmid_to_physical_rmid(int cpu, int lrmid)
 	if (snc_nodes_per_l3_cache == 1)
 		return lrmid;
 
-	return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid;
+	return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->mon.num_rmid;
 }
 
 static int __rmid_read_phys(u32 prmid, enum resctrl_event_id eventid, u64 *val)
@@ -205,7 +205,7 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
 			continue;
 		idx = MBM_STATE_IDX(eventid);
 		memset(hw_dom->arch_mbm_states[idx], 0,
-		       sizeof(*hw_dom->arch_mbm_states[0]) * r->num_rmid);
+		       sizeof(*hw_dom->arch_mbm_states[0]) * r->mon.num_rmid);
 	}
 }
 
@@ -344,7 +344,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
 	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l3_cache;
-	r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
+	r->mon.num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
@@ -359,7 +359,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	 *
 	 * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
 	 */
-	threshold = resctrl_rmid_realloc_limit / r->num_rmid;
+	threshold = resctrl_rmid_realloc_limit / r->mon.num_rmid;
 
 	/*
 	 * Because num_rmid may not be a power of two, round the value
@@ -373,7 +373,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
-		r->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
 	}
 
 	r->mon_capable = true;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 77336d5e4915..ca0475b75390 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1135,7 +1135,7 @@ static int rdt_num_rmids_show(struct kernfs_open_file *of,
 {
 	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
 
-	seq_printf(seq, "%d\n", r->num_rmid);
+	seq_printf(seq, "%d\n", r->mon.num_rmid);
 
 	return 0;
 }
@@ -1731,9 +1731,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
 	}
 
 	/* Value from user cannot be more than the supported set of events */
-	if ((val & r->mbm_cfg_mask) != val) {
+	if ((val & r->mon.mbm_cfg_mask) != val) {
 		rdt_last_cmd_printf("Invalid event configuration: max valid mask is 0x%02x\n",
-				    r->mbm_cfg_mask);
+				    r->mon.mbm_cfg_mask);
 		return -EINVAL;
 	}
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 478d7a935ca3..fe2af6cb96d4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -255,38 +255,46 @@ enum resctrl_schema_fmt {
 	RESCTRL_SCHEMA_RANGE,
 };
 
+/**
+ * struct resctrl_mon - Monitoring related data of a resctrl resource.
+ * @num_rmid:		Number of RMIDs available.
+ * @mbm_cfg_mask:	Memory transactions that can be tracked when bandwidth
+ *			monitoring events can be configured.
+ */
+struct resctrl_mon {
+	int			num_rmid;
+	unsigned int		mbm_cfg_mask;
+};
+
 /**
  * struct rdt_resource - attributes of a resctrl resource
  * @rid:		The index of the resource
  * @alloc_capable:	Is allocation available on this machine
  * @mon_capable:	Is monitor feature available on this machine
- * @num_rmid:		Number of RMIDs available
  * @ctrl_scope:		Scope of this resource for control functions
  * @mon_scope:		Scope of this resource for monitor functions
  * @cache:		Cache allocation related data
  * @membw:		If the component has bandwidth controls, their properties.
+ * @mon:		Monitoring related data.
  * @ctrl_domains:	RCU list of all control domains for this resource
  * @mon_domains:	RCU list of all monitor domains for this resource
  * @name:		Name to use in "schemata" file.
  * @schema_fmt:		Which format string and parser is used for this schema.
- * @mbm_cfg_mask:	Bandwidth sources that can be tracked when bandwidth
- *			monitoring events can be configured.
  * @cdp_capable:	Is the CDP feature available on this resource
  */
 struct rdt_resource {
 	int			rid;
 	bool			alloc_capable;
 	bool			mon_capable;
-	int			num_rmid;
 	enum resctrl_scope	ctrl_scope;
 	enum resctrl_scope	mon_scope;
 	struct resctrl_cache	cache;
 	struct resctrl_membw	membw;
+	struct resctrl_mon	mon;
 	struct list_head	ctrl_domains;
 	struct list_head	mon_domains;
 	char			*name;
 	enum resctrl_schema_fmt	schema_fmt;
-	unsigned int		mbm_cfg_mask;
 	bool			cdp_capable;
 };
 
-- 
2.34.1



* [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (6 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 07/34] x86,fs/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:49   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 09/34] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
                   ` (26 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.
Bits Description
15:0 MAX_ABMC Maximum Supported Assignable Bandwidth
     Monitoring Counter ID + 1

The feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Detect the feature and number of assignable counters supported. For
backward compatibility, upon detecting the assignable counter feature,
enable the mbm_total_bytes and mbm_local_bytes events that users are
familiar with as part of original L3 MBM support.
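
As a rough illustration of the enumeration described above, the following
user-space sketch decodes a raw EBX value the same way the
rdt_get_mon_l3_config() hunk in this patch does. The helper name
abmc_num_cntrs() and the macro ABMC_MAX_CNTR_MASK are hypothetical; only the
bit range 15:0 and the "+ 1" are taken from this patch.

```c
#include <stdint.h>

/* Bits 15:0 of CPUID Fn8000_0020_EBX_x5 carry MAX_ABMC (per the changelog). */
#define ABMC_MAX_CNTR_MASK 0xffffu

/*
 * Hypothetical helper: derive the number of assignable counters from a raw
 * EBX value, mirroring "(ebx & GENMASK(15, 0)) + 1" in this patch.
 * Bits above 15 are ignored.
 */
static inline uint32_t abmc_num_cntrs(uint32_t ebx)
{
	return (ebx & ABMC_MAX_CNTR_MASK) + 1;
}
```

With a sample EBX value of 0x1f, the helper reports 32 counters.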

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Added a new check in get_rdt_mon_resources().
     Added a check resctrl_is_mon_event_enabled() before enabling.

v15: Minor update to changelog.
     Added check in resctrl_cpu_detect().
     Moved the resctrl_enable_mon_event() to resctrl_mon_resource_init().

v14: Updated enumeration to support ABMC regardless of MBM total and local support.
     Updated the changelog accordingly.

v13: No changes.

v12: Resolved conflicts because of latest merge.
     Removed Reviewed-by as the patch has changed.

v11: No changes.

v10: No changes.

v9: Added Reviewed-by tag. No code changes

v8: Used GENMASK for the mask.

v7: Removed WARN_ON for num_mbm_cntrs. Decided to dynamically allocate the
    bitmap. WARN_ON is not required anymore.
    Removed redundant comments.

v6: Commit message update.
    Renamed abmc_capable to mbm_cntr_assignable.

v5: Name change num_cntrs to num_mbm_cntrs.
    Moved abmc_capable to resctrl_mon.

v4: Removed resctrl_arch_has_abmc(). Added all the code inline. We don't
    need to separate this as arch code.

v3: Removed changes related to mon_features.
    Moved rdt_cpu_has to core.c and added new function resctrl_arch_has_abmc.
    Also moved the fields mbm_assign_capable and mbm_assign_cntrs to
    rdt_resource. (James)

v2: Changed the field name to mbm_assign_capable from abmc_capable.
---
 arch/x86/kernel/cpu/resctrl/core.c    |  5 ++++-
 arch/x86/kernel/cpu/resctrl/monitor.c | 11 ++++++++---
 fs/resctrl/monitor.c                  |  7 +++++++
 include/linux/resctrl.h               |  4 ++++
 4 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 267e9206a999..09cb5a70b1cb 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -883,6 +883,8 @@ static __init bool get_rdt_mon_resources(void)
 		resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
 		ret = true;
 	}
+	if (rdt_cpu_has(X86_FEATURE_ABMC))
+		ret = true;
 
 	if (!ret)
 		return false;
@@ -990,7 +992,8 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
 
 	if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
 	    cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
-	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
+	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL) ||
+	    cpu_has(c, X86_FEATURE_ABMC)) {
 		u32 eax, ebx, ecx, edx;
 
 		/* QoS sub-leaf, EAX=0Fh, ECX=1 */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 2558b1bdef8b..0a695ce68f46 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -339,6 +339,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	unsigned int threshold;
+	u32 eax, ebx, ecx, edx;
 
 	snc_nodes_per_l3_cache = snc_get_config();
 
@@ -368,14 +369,18 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	 */
 	resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
 
-	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
-		u32 eax, ebx, ecx, edx;
-
+	if (rdt_cpu_has(X86_FEATURE_BMEC) || rdt_cpu_has(X86_FEATURE_ABMC)) {
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
 		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
 	}
 
+	if (rdt_cpu_has(X86_FEATURE_ABMC)) {
+		r->mon.mbm_cntr_assignable = true;
+		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
+		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
+	}
+
 	r->mon_capable = true;
 
 	return 0;
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index dcc6c00eb362..66c8c635f4b3 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -924,6 +924,13 @@ int resctrl_mon_resource_init(void)
 	else if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
 		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
 
+	if (r->mon.mbm_cntr_assignable) {
+		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+			resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
+		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
+	}
+
 	return 0;
 }
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index fe2af6cb96d4..eb80cc233be4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -260,10 +260,14 @@ enum resctrl_schema_fmt {
  * @num_rmid:		Number of RMIDs available.
  * @mbm_cfg_mask:	Memory transactions that can be tracked when bandwidth
  *			monitoring events can be configured.
+ * @num_mbm_cntrs:	Number of assignable counters.
+ * @mbm_cntr_assignable:Is system capable of supporting counter assignment?
  */
 struct resctrl_mon {
 	int			num_rmid;
 	unsigned int		mbm_cfg_mask;
+	int			num_mbm_cntrs;
+	bool			mbm_cntr_assignable;
 };
 
 /**
-- 
2.34.1



* [PATCH v16 09/34] x86/resctrl: Add support to enable/disable AMD ABMC feature
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (7 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes Babu Moger
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Add the functionality to enable/disable the AMD ABMC feature.

The AMD ABMC feature is enabled by setting the enable bit (bit 0) in the
L3_QOS_EXT_CFG MSR. When the state of ABMC is changed, the MSR needs
to be updated on all the logical processors in the QOS domain.

Hardware counters will reset when ABMC state is changed.
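
The enable/disable step can be pictured with a small user-space sketch that
treats each CPU's L3_QOS_EXT_CFG value as a plain 64-bit integer. This is
only an illustration under that assumption: abmc_apply() and
abmc_apply_domain() are hypothetical names, and the real patch uses
msr_set_bit()/msr_clear_bit() through on_each_cpu_mask() instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of L3_QOS_EXT_CFG enables ABMC (from the changelog above). */
#define ABMC_ENABLE_BIT 0

/*
 * Hypothetical stand-in for one CPU's MSR update: return the new register
 * value with the ABMC enable bit set or cleared, leaving other bits intact.
 */
static inline uint64_t abmc_apply(uint64_t msr_val, bool enable)
{
	const uint64_t bit = UINT64_C(1) << ABMC_ENABLE_BIT;

	return enable ? (msr_val | bit) : (msr_val & ~bit);
}

/* Apply the same state to every simulated CPU in a domain. */
static inline void abmc_apply_domain(uint64_t *msr_vals, int ncpus, bool enable)
{
	for (int i = 0; i < ncpus; i++)
		msr_vals[i] = abmc_apply(msr_vals[i], enable);
}
```

The sketch only models the bit update; it does not model the counter reset
that hardware performs when the ABMC state changes.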

The ABMC feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: Minor comment change in resctrl.h.

v14: Added lockdep_assert_cpus_held() in _resctrl_abmc_enable().
     Removed inline for resctrl_arch_mbm_cntr_assign_enabled().
     Added prototype descriptions for resctrl_arch_mbm_cntr_assign_enabled()
     and resctrl_arch_mbm_cntr_assign_set() in include/linux/resctrl.h.

v13: Resolved minor conflicts with recent FS/ARCH restructure.

v12: Clarified the comment on _resctrl_abmc_enable().
     Added the code to reset arch state in _resctrl_abmc_enable().
     Resolved the conflicts with latest merge.

v11: Moved the monitoring related calls to monitor.c file.
     Moved the changes from include/linux/resctrl.h to
     arch/x86/kernel/cpu/resctrl/internal.h.
     Removed the Reviewed-by tag as patch changed.
     Actual code did not change.

v10: No changes.

v9: Re-ordered the MSR and added Reviewed-by tag.

v8: Commit message update and moved around the comments about L3_QOS_EXT_CFG
    to _resctrl_abmc_enable.

v7: Renamed the function
    resctrl_arch_get_abmc_enabled() to resctrl_arch_mbm_cntr_assign_enabled().

    Merged resctrl_arch_mbm_cntr_assign_enable() and
    resctrl_arch_mbm_cntr_assign_disable() and renamed the result to
    resctrl_arch_mbm_cntr_assign_set().

    Moved the function definition to linux/resctrl.h.

    Passed the struct rdt_resource to these functions.
    Removed resctrl_arch_reset_rmid_all() from arch code. This will be done
    from the caller.

v6: Renamed abmc_enabled to mbm_cntr_assign_enabled.
    Used msr_set_bit and msr_clear_bit for msr updates.
    Renamed resctrl_arch_abmc_enable() to resctrl_arch_mbm_cntr_assign_enable().
    Renamed resctrl_arch_abmc_disable() to resctrl_arch_mbm_cntr_assign_disable().
    Made _resctrl_abmc_enable to return void.

v5: Renamed resctrl_abmc_enable to resctrl_arch_abmc_enable.
    Renamed resctrl_abmc_disable to resctrl_arch_abmc_disable.
    Introduced resctrl_arch_get_abmc_enabled to get abmc state from
    non-arch code.
    Renamed resctrl_abmc_set_all to _resctrl_abmc_enable().
    Modified commit log to make it clear about AMD ABMC feature.

v3: No changes.

v2: Few text changes in commit message.
---
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h |  5 +++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 45 ++++++++++++++++++++++++++
 include/linux/resctrl.h                | 20 ++++++++++++
 4 files changed, 71 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 7490bb5c0776..c998cf0e1375 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1222,6 +1222,7 @@
 /* - AMD: */
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
+#define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 
 /* AMD-V MSRs */
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 58dca892a5df..a79a487e639c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -37,6 +37,9 @@ struct arch_mbm_state {
 	u64	prev_msr;
 };
 
+/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
+#define ABMC_ENABLE_BIT			0
+
 /**
  * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
  *			       a resource for a control function
@@ -102,6 +105,7 @@ struct msr_param {
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
  * @mbm_width:		Monitor width, to detect and correct for overflow.
  * @cdp_enabled:	CDP state of this resource
+ * @mbm_cntr_assign_enabled:	ABMC feature is enabled
  *
  * Members of this structure are either private to the architecture
  * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -115,6 +119,7 @@ struct rdt_hw_resource {
 	unsigned int		mon_scale;
 	unsigned int		mbm_width;
 	bool			cdp_enabled;
+	bool			mbm_cntr_assign_enabled;
 };
 
 static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 0a695ce68f46..cce35a0ad455 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -399,3 +399,48 @@ void __init intel_rdt_mbm_apply_quirk(void)
 	mbm_cf_rmidthreshold = mbm_cf_table[cf_index].rmidthreshold;
 	mbm_cf = mbm_cf_table[cf_index].cf;
 }
+
+static void resctrl_abmc_set_one_amd(void *arg)
+{
+	bool *enable = arg;
+
+	if (*enable)
+		msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+	else
+		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+}
+
+/*
+ * ABMC enable/disable requires update of L3_QOS_EXT_CFG MSR on all the CPUs
+ * associated with all monitor domains.
+ */
+static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
+{
+	struct rdt_mon_domain *d;
+
+	lockdep_assert_cpus_held();
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		on_each_cpu_mask(&d->hdr.cpu_mask, resctrl_abmc_set_one_amd,
+				 &enable, 1);
+		resctrl_arch_reset_rmid_all(r, d);
+	}
+}
+
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
+{
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	if (r->mon.mbm_cntr_assignable &&
+	    hw_res->mbm_cntr_assign_enabled != enable) {
+		_resctrl_abmc_enable(r, enable);
+		hw_res->mbm_cntr_assign_enabled = enable;
+	}
+
+	return 0;
+}
+
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
+{
+	return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
+}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index eb80cc233be4..919806122c50 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -445,6 +445,26 @@ static inline u32 resctrl_get_config_index(u32 closid,
 bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l);
 int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
 
+/**
+ * resctrl_arch_mbm_cntr_assign_enabled() - Check if MBM counter assignment
+ *					    mode is enabled.
+ * @r:		Pointer to the resource structure.
+ *
+ * Return:
+ * true if the assignment mode is enabled, false otherwise.
+ */
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
+
+/**
+ * resctrl_arch_mbm_cntr_assign_set() - Configure the MBM counter assignment mode.
+ * @r:		Pointer to the resource structure.
+ * @enable:	Set to true to enable, false to disable the assignment mode.
+ *
+ * Return:
+ * 0 on success, < 0 on error.
+ */
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
+
 /*
  * Update the ctrl_val and apply this config right now.
  * Must be called on one of the domain's CPUs.
-- 
2.34.1



* [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (8 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 09/34] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-08-06 21:02   ` Moger, Babu
  2025-07-25 18:29 ` [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters Babu Moger
                   ` (24 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Introduce the resctrl file "mbm_assign_mode" to list the supported counter
assignment modes.

The "mbm_event" counter assignment mode allows users to assign a hardware
counter to an RMID, event pair and monitor bandwidth usage as long as it is
assigned. The hardware continues to track the assigned counter until it is
explicitly unassigned by the user. Each event within a resctrl group can be
assigned independently in this mode.

On AMD systems "mbm_event" mode is backed by the ABMC (Assignable
Bandwidth Monitoring Counters) hardware feature and is enabled by default.

The "default" mode is the existing mode that works without explicit
counter assignment. It instead relies on dynamic counter assignment by
hardware, which may leave an RMID without a dedicated counter so that
reads of its monitoring data return "Unavailable".

Provide an interface to display the monitor modes on the system.

$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_event]
default

Add an IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check early to ensure the
user interface remains compatible with upcoming Arm64 support. On x86,
CONFIG_RESCTRL_ASSIGN_FIXED is not defined; on Arm64, it will be defined
when the "mbm_event" mode is supported. IS_ENABLED() safely evaluates to
0 when the configuration is not defined.

As a result, for MPAM, the display would be either:
[default]
or
[mbm_event]
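
The display rules above can be summarized in a user-space sketch. It follows
the branches of resctrl_mbm_assign_mode_show() in this patch, but the helper
name mbm_assign_mode_text() and its boolean parameters are hypothetical; in
particular, "fixed" merely stands in for
IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED).

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the "mbm_assign_mode" text into buf.
 * assignable: counter assignment is supported by the resource.
 * enabled:    mbm_event mode is currently active.
 * fixed:      stands in for IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED).
 * Returns the snprintf() result.
 */
static inline int mbm_assign_mode_text(char *buf, size_t len, bool assignable,
				       bool enabled, bool fixed)
{
	if (!assignable)
		return snprintf(buf, len, "[default]\n");

	if (fixed)	/* Only the active mode is listed. */
		return snprintf(buf, len, "%s\n",
				enabled ? "[mbm_event]" : "[default]");

	/* Active mode in brackets first, then the other supported mode. */
	return snprintf(buf, len, "%s\n%s\n",
			enabled ? "[mbm_event]" : "[default]",
			enabled ? "default" : "mbm_event");
}
```

Keeping the active mode on the first line, in brackets, means a reader can
find the current mode without knowing which other modes exist.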

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Update with Reviewed-by tag.

v15: Minor text changes in changelog and resctrl.rst.

v14: Changed the name of the monitor mode to mbm_cntr_evt_assign based on the discussion.
     https://lore.kernel.org/lkml/7628cec8-5914-4895-8289-027e7821777e@amd.com/
     Changed the name of the mbm_assign_mode's.
     Updated resctrl.rst for mbm_event mode.
     Changed subject line to fs/resctrl.

v13: Updated the commit log with motivation for adding CONFIG_RESCTRL_ASSIGN_FIXED.
     Added fflag RFTYPE_RES_CACHE for mbm_assign_mode file.
     Updated user doc. Removed the references to "mbm_assign_control".
     Resolved the conflicts with latest FS/ARCH code restructure.

v12: Minor text update in change log and user documentation.
     Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
     This will be defined only in arm and not in x86.

v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
     Removed some AMD-specific text from resctrl.rst.
     Updated some text.

v10: Added more text to the user documentation to clarify the default mode.

v9: Updated user documentation based on comments.

v8: Commit message update.

v7: Updated the descriptions/commit log in resctrl.rst to generic text.
    Thanks to James and Reinette.
    Rename mbm_mode to mbm_assign_mode.
    Introduced mutex lock in rdtgroup_mbm_mode_show().

v6: Added documentation for mbm_cntr_assign and legacy mode.
    Moved mbm_mode fflags initialization to static initialization.

v5: Changed interface name to mbm_mode.
    It will be always available even if ABMC feature is not supported.
    Added description in resctrl.rst about ABMC mode.
    Fixed the display of abmc and legacy modes to be consistent.

v4: Fixed the checks for legacy and abmc mode. Default is ABMC.

v3: New patch to display ABMC capability.
---
 Documentation/filesystems/resctrl.rst | 31 ++++++++++++++++++++++
 fs/resctrl/rdtgroup.c                 | 37 +++++++++++++++++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index c97fd77a107d..b692829fec5f 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -257,6 +257,37 @@ with the following files:
 	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
 	    0=0x30;1=0x30;3=0x15;4=0x15
 
+"mbm_assign_mode":
+	The supported counter assignment modes. The enclosed brackets indicate which mode
+	is enabled.
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+	  [mbm_event]
+	  default
+
+	"mbm_event":
+
+	mbm_event mode allows users to assign a hardware counter to an RMID, event
+	pair and monitor the bandwidth usage as long as it is assigned. The hardware
+	continues to track the assigned counter until it is explicitly unassigned by
+	the user. Each event within a resctrl group can be assigned independently.
+
+	In this mode, a monitoring event can only accumulate data while it is backed
+	by a hardware counter. Use "mbm_L3_assignments" found in each CTRL_MON and MON
+	group to specify which of the events should have a counter assigned. The number
+	of counters available is described in the "num_mbm_cntrs" file. Changing the
+	mode may cause all counters on the resource to reset.
+
+	"default":
+
+	In default mode, resctrl assumes there is a hardware counter for each
+	event within every CTRL_MON and MON group. On AMD platforms, it is
+	recommended to use the mbm_event mode, if supported, to prevent reset of MBM
+	events between reads resulting from hardware re-allocating counters. This can
+	result in misleading values or display "Unavailable" if no counter is assigned
+	to the event.
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index ca0475b75390..c7ca9113a12a 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1799,6 +1799,36 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
+					struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	bool enabled;
+
+	mutex_lock(&rdtgroup_mutex);
+	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
+
+	if (r->mon.mbm_cntr_assignable) {
+		if (enabled)
+			seq_puts(s, "[mbm_event]\n");
+		else
+			seq_puts(s, "[default]\n");
+
+		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
+			if (enabled)
+				seq_puts(s, "default\n");
+			else
+				seq_puts(s, "mbm_event\n");
+		}
+	} else {
+		seq_puts(s, "[default]\n");
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -1911,6 +1941,13 @@ static struct rftype res_common_files[] = {
 		.seq_show	= mbm_local_bytes_config_show,
 		.write		= mbm_local_bytes_config_write,
 	},
+	{
+		.name		= "mbm_assign_mode",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_mbm_assign_mode_show,
+		.fflags		= RFTYPE_MON_INFO | RFTYPE_RES_CACHE,
+	},
 	{
 		.name		= "cpus",
 		.mode		= 0644,
-- 
2.34.1



* [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (9 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-08-06 21:12   ` Moger, Babu
  2025-07-25 18:29 ` [PATCH v16 12/34] fs/resctrl: Introduce mbm_cntr_cfg to track assignable counters per domain Babu Moger
                   ` (23 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The "mbm_event" counter assignment mode allows users to assign a hardware
counter to an RMID, event pair and monitor bandwidth usage as long as it is
assigned.  The hardware continues to track the assigned counter until it is
explicitly unassigned by the user.

Create 'num_mbm_cntrs' resctrl file that displays the number of counters
supported in each domain. 'num_mbm_cntrs' is only visible to user space
when the system supports "mbm_event" mode.
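
A user-space sketch of the per-domain output format may help when reading
the show function in this patch. The helper name num_mbm_cntrs_text() and its
parameters are hypothetical; the "<domain id>=<count>" pairs separated by ';'
follow resctrl_num_mbm_cntrs_show() below.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format "<domain id>=<counters>" pairs separated by
 * ';' into buf, followed by a newline. Returns the number of characters
 * written (excluding the terminating NUL), or -1 if buf is too small.
 */
static inline int num_mbm_cntrs_text(char *buf, size_t len, const int *dom_ids,
				     int ndoms, int num_cntrs)
{
	size_t off = 0;

	for (int i = 0; i < ndoms; i++) {
		int n = snprintf(buf + off, len - off, "%s%d=%d",
				 i ? ";" : "", dom_ids[i], num_cntrs);

		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	/* Need room for the trailing newline and the NUL terminator. */
	if (len - off < 2)
		return -1;
	buf[off++] = '\n';
	buf[off] = '\0';

	return (int)off;
}
```

The same counter count is printed for every domain because this patch stores
a single num_mbm_cntrs value per resource.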

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: Changed "assign a hardware counter ID" to "assign a hardware counter"
     in a couple of places.

v14: Minor update to changelog and user doc (resctrl.rst).
     Changed subject line to fs/resctrl.

v13: Updated the changelog.
     Added fflags RFTYPE_RES_CACHE to the file num_mbm_cntrs.
     Replaced seq_puts() with seq_putc() where applicable.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The files monitor.c/rdtgroup.c have been split between FS and ARCH directories.

v12: Changed the code to display the max supported monitoring counters in
     each domain. Also updated the documentation.
     Resolved the conflict with the latest code.

v11: Renamed rdtgroup_num_mbm_cntrs_show() to resctrl_num_mbm_cntrs_show().
     A few minor text updates.

v10: No changes.

v9: Updated user document based on the comments.
    Will add a new file available_mbm_cntrs later in the series.

v8: Commit message update and documentation update.

v7: Minor commit log text changes.

v6: No changes.

v5: Changed the display name from num_cntrs to num_mbm_cntrs.
    Updated the commit message.
    Moved the patch after mbm_mode is introduced.

v4: Changed the counter name to num_cntrs. And few text changes.

v3: Changed the field name to mbm_assign_cntrs.

v2: Changed the field name to mbm_assignable_counters from abmc_counter.
---
 Documentation/filesystems/resctrl.rst | 11 ++++++++++
 fs/resctrl/monitor.c                  |  2 ++
 fs/resctrl/rdtgroup.c                 | 30 +++++++++++++++++++++++++++
 3 files changed, 43 insertions(+)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index b692829fec5f..4eb27530be6f 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -288,6 +288,17 @@ with the following files:
 	result in misleading values or display "Unavailable" if no counter is assigned
 	to the event.
 
+"num_mbm_cntrs":
+	The maximum number of counters (total of available and assigned counters) in
+	each domain when the system supports mbm_event mode.
+
+	For example, on a system with maximum of 32 memory bandwidth monitoring
+	counters in each of its L3 domains:
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
+	  0=32;1=32
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 66c8c635f4b3..4539b08db7b9 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -929,6 +929,8 @@ int resctrl_mon_resource_init(void)
 			resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
 		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
+		resctrl_file_fflags_init("num_mbm_cntrs",
+					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
 
 	return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index c7ca9113a12a..acbda73a9b9d 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1829,6 +1829,30 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
+				      struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	struct rdt_mon_domain *dom;
+	bool sep = false;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+		if (sep)
+			seq_putc(s, ';');
+
+		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
+		sep = true;
+	}
+	seq_putc(s, '\n');
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+	return 0;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -1866,6 +1890,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= rdt_default_ctrl_show,
 		.fflags		= RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
 	},
+	{
+		.name		= "num_mbm_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_num_mbm_cntrs_show,
+	},
 	{
 		.name		= "min_cbm_bits",
 		.mode		= 0444,
-- 
2.34.1



* [PATCH v16 12/34] fs/resctrl: Introduce mbm_cntr_cfg to track assignable counters per domain
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (10 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters Babu Moger
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The "mbm_event" counter assignment mode allows users to assign a hardware
counter to an RMID, event pair and monitor bandwidth usage as long as it is
assigned.  The hardware continues to track the assigned counter until it is
explicitly unassigned by the user. Counters are assigned/unassigned at
monitoring domain level.

Manage a monitoring domain's hardware counters using a per monitoring
domain array of struct mbm_cntr_cfg that is indexed by the hardware
counter ID. A hardware counter's configuration contains the MBM event
ID and points to the monitoring group that it is assigned to, with a NULL
pointer meaning that the hardware counter is available for assignment.

There is no direct way to determine which hardware counters are assigned
to a particular monitoring group. Check every entry of every hardware
counter configuration array in every monitoring domain to query which
MBM events of a monitoring group are tracked by hardware. Such queries are
acceptable because of a very small number of assignable counters (32
to 64).
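
As a rough illustration of such a query, the following userspace C sketch
scans one domain's counter configuration array for a given group and event.
The struct layouts are cut-down stand-ins for the kernel types, the numeric
event IDs are illustrative, and cntr_lookup() is a hypothetical helper name,
not a function from this series.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
enum resctrl_event_id {
	QOS_L3_MBM_TOTAL_EVENT_ID = 2,	/* illustrative value */
	QOS_L3_MBM_LOCAL_EVENT_ID = 3,	/* illustrative value */
};

struct rdtgroup { int id; };		/* hypothetical placeholder */

struct mbm_cntr_cfg {
	enum resctrl_event_id	evtid;	/* only valid if rdtgrp is not NULL */
	struct rdtgroup		*rdtgrp;	/* NULL: counter is free */
};

/*
 * Return the ID of the counter tracking (grp, evtid) in one domain,
 * or -1 if no counter is assigned. grp must not be NULL.
 */
static int cntr_lookup(const struct mbm_cntr_cfg *cfg, size_t num_cntrs,
		       const struct rdtgroup *grp, enum resctrl_event_id evtid)
{
	assert(cfg != NULL && grp != NULL);

	/* The array is indexed by hardware counter ID. */
	for (size_t i = 0; i < num_cntrs; i++) {
		if (cfg[i].rdtgrp == grp && cfg[i].evtid == evtid)
			return (int)i;
	}
	return -1;
}
```

A caller would repeat the lookup for every monitoring domain, which stays
cheap given the small number of counters per domain.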

Suggested-by: Peter Newman <peternewman@google.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: Minor changelog update.
     Removed evt_cfg from struct mbm_cntr_cfg based on the discussion.
     https://lore.kernel.org/lkml/887bad33-7f4a-4b6d-95a7-fdfe0451f42b@intel.com/

v14: Updated code documentation and changelog.
     Fixed up the indentation in resctrl.h.
     Changed subject line to fs/resctrl.

v13: Resolved conflicts caused by the recent FS/ARCH code restructure.
     The files monitor.c/rdtgroup.c have been split between FS and ARCH directories.

v12: Fixed the struct mbm_cntr_cfg code documentation.
     Removed few strange characters in changelog.
     Added the counter range for better understanding.
     Moved the struct mbm_cntr_cfg definition to resctrl/internal.h as
     suggested by James.

v11: Refined the change log based on Reinette's feedback.
     Fixed few style issues.

v10: Patch changed completely to handle the counters at domain level.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     Removed Reviewed-by tag.
     Did not see the need to add cntr_id in mbm_state structure. Not used in the code.

v9: Added Reviewed-by tag. No other changes.

v8: Minor commit message changes.

v7: Added check mbm_cntr_assignable for allocating bitmap mbm_cntr_map

v6: New patch to add domain level assignment.
---
 fs/resctrl/rdtgroup.c   |  8 ++++++++
 include/linux/resctrl.h | 15 +++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index acbda73a9b9d..a09566720d4f 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -4086,6 +4086,7 @@ static void domain_destroy_mon_state(struct rdt_mon_domain *d)
 {
 	int idx;
 
+	kfree(d->cntr_cfg);
 	bitmap_free(d->rmid_busy_llc);
 	for_each_mbm_idx(idx) {
 		kfree(d->mbm_states[idx]);
@@ -4169,6 +4170,13 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
 			goto cleanup;
 	}
 
+	if (resctrl_is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
+		tsize = sizeof(*d->cntr_cfg);
+		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
+		if (!d->cntr_cfg)
+			goto cleanup;
+	}
+
 	return 0;
 cleanup:
 	bitmap_free(d->rmid_busy_llc);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 919806122c50..e013caba6641 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -156,6 +156,18 @@ struct rdt_ctrl_domain {
 	u32				*mbps_val;
 };
 
+/**
+ * struct mbm_cntr_cfg - Assignable counter configuration.
+ * @evtid:		MBM event to which the counter is assigned. Only valid
+ *			if @rdtgrp is not NULL.
+ * @rdtgrp:		resctrl group assigned to the counter. NULL if the
+ *			counter is free.
+ */
+struct mbm_cntr_cfg {
+	enum resctrl_event_id	evtid;
+	struct rdtgroup		*rdtgrp;
+};
+
 /**
  * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
  * @hdr:		common header for different domain types
@@ -168,6 +180,8 @@ struct rdt_ctrl_domain {
  * @cqm_limbo:		worker to periodically read CQM h/w counters
  * @mbm_work_cpu:	worker CPU for MBM h/w counters
  * @cqm_work_cpu:	worker CPU for CQM h/w counters
+ * @cntr_cfg:		array of assignable counters' configuration (indexed
+ *			by counter ID)
  */
 struct rdt_mon_domain {
 	struct rdt_domain_hdr		hdr;
@@ -178,6 +192,7 @@ struct rdt_mon_domain {
 	struct delayed_work		cqm_limbo;
 	int				mbm_work_cpu;
 	int				cqm_work_cpu;
+	struct mbm_cntr_cfg		*cntr_cfg;
 };
 
 /**
-- 
2.34.1



* [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (11 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 12/34] fs/resctrl: Introduce mbm_cntr_cfg to track assignable counters per domain Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-08-06 21:19   ` Moger, Babu
  2025-07-25 18:29 ` [PATCH v16 14/34] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Introduce the "available_mbm_cntrs" resctrl file to display the number of
counters available for assignment in each domain when "mbm_event" mode is
enabled.
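
For readers consuming this file from userspace, here is a minimal C sketch
that parses the "<domain>=<count>;<domain>=<count>" line format shown in the
documentation example of this patch. parse_domain_counts() is a hypothetical
helper written for illustration; it is not part of the kernel or of any
existing tool.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a line such as "0=30;1=30\n". Stores at most max entries in
 * ids[] and counts[] and returns the number of entries parsed, or -1
 * if the input is malformed or holds more than max entries.
 */
static int parse_domain_counts(const char *line, int *ids,
			       unsigned int *counts, int max)
{
	int n = 0;

	assert(line && ids && counts);

	while (*line && *line != '\n') {
		char *end;
		long id;
		unsigned long cnt;

		if (n == max)
			return -1;

		/* Domain ID, followed by '='. */
		id = strtol(line, &end, 10);
		if (end == line || *end != '=')
			return -1;
		line = end + 1;

		/* Counter count for that domain. */
		cnt = strtoul(line, &end, 10);
		if (end == line)
			return -1;

		ids[n] = (int)id;
		counts[n] = (unsigned int)cnt;
		n++;

		/* Entries are separated by ';'. */
		if (*end == ';')
			end++;
		line = end;
	}
	return n;
}
```

The same parser also fits the num_mbm_cntrs file, since both files use the
same per-domain layout.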

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: Minor changelog text update.
     Minor resctrl.rst text update and corrected the error text in
     resctrl_available_mbm_cntrs_show().
     Changed the goto label to out_unlock for consistency.

v14: Minor changelog update.
     Changed subject line to fs/resctrl.

v13: Resolved conflicts caused by the recent FS/ARCH code restructure.
     The files monitor.c and rdtgroup.c have now been split between
     the FS and ARCH directories.

v12: Minor change to change log.
     Updated the documentation text with an example.
     Replaced seq_puts(s, ";") with seq_putc(s, ';');
     Added missing rdt_last_cmd_clear() in resctrl_available_mbm_cntrs_show().

v11: Rename rdtgroup_available_mbm_cntrs_show() to resctrl_available_mbm_cntrs_show().
     Few minor text changes.

v10: Patch changed to handle the counters at domain level.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     So, display logic also changed now.

v9: New patch
---
 Documentation/filesystems/resctrl.rst | 11 ++++++
 fs/resctrl/monitor.c                  |  2 ++
 fs/resctrl/rdtgroup.c                 | 48 +++++++++++++++++++++++++++
 3 files changed, 61 insertions(+)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 4eb27530be6f..446736dbd97f 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -299,6 +299,17 @@ with the following files:
 	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
 	  0=32;1=32
 
+"available_mbm_cntrs":
+	The number of counters available for assignment in each domain when mbm_event
+	mode is enabled on the system.
+
+	For example, on a system with 30 available [hardware] assignable counters
+	in each of its L3 domains:
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
+	  0=30;1=30
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 4539b08db7b9..a0b0ea45c7b4 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -931,6 +931,8 @@ int resctrl_mon_resource_init(void)
 			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
 		resctrl_file_fflags_init("num_mbm_cntrs",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+		resctrl_file_fflags_init("available_mbm_cntrs",
+					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 	}
 
 	return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index a09566720d4f..15d10c346307 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1853,6 +1853,48 @@ static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
+					    struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	struct rdt_mon_domain *dom;
+	bool sep = false;
+	u32 cntrs, i;
+	int ret = 0;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+		if (sep)
+			seq_putc(s, ';');
+
+		cntrs = 0;
+		for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
+			if (!dom->cntr_cfg[i].rdtgrp)
+				cntrs++;
+		}
+
+		seq_printf(s, "%d=%u", dom->hdr.id, cntrs);
+		sep = true;
+	}
+	seq_putc(s, '\n');
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -1876,6 +1918,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= rdt_mon_features_show,
 		.fflags		= RFTYPE_MON_INFO,
 	},
+	{
+		.name		= "available_mbm_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_available_mbm_cntrs_show,
+	},
 	{
 		.name		= "num_rmids",
 		.mode		= 0444,
-- 
2.34.1



* [PATCH v16 14/34] x86/resctrl: Add data structures and definitions for ABMC assignment
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (12 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 15/34] fs/resctrl: Introduce event configuration field in struct mon_evt Babu Moger
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The ABMC feature allows users to assign a hardware counter to an RMID,
event pair and monitor bandwidth usage as long as it is assigned. The
hardware continues to track the assigned counter until it is explicitly
unassigned by the user.

The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
ABMC counter assignment is done by setting the counter id, bandwidth
source (RMID) and bandwidth configuration.

Attempts to read or write the MSR when ABMC is not enabled will result
in a #GP(0) exception.

Introduce the data structures and definitions for MSR L3_QOS_ABMC_CFG
(C000_03FDh):
=========================================================================
Bits 	Mnemonic	Description			Access Reset
							Type   Value
=========================================================================
63 	CfgEn 		Configuration Enable 		R/W 	0

62 	CtrEn 		Enable/disable counting		R/W 	0

61:53 	– 		Reserved 			MBZ 	0

52:48 	CtrID 		Counter Identifier		R/W	0

47 	IsCOS		BwSrc field is a CLOSID		R/W	0
			(not an RMID)

46:44 	–		Reserved			MBZ	0

43:32	BwSrc		Bandwidth Source		R/W	0
			(RMID or CLOSID)

31:0	BwType		Bandwidth configuration		R/W	0
			tracked by the CtrID
==========================================================================
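
The layout in the table above can also be sketched with explicit shifts
instead of bitfields. The following standalone C example builds the 64-bit
value for an RMID-based assignment; the macro names and abmc_cfg_pack() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the table above; names are hypothetical. */
#define ABMC_CFG_EN		(UINT64_C(1) << 63)
#define ABMC_CTR_EN		(UINT64_C(1) << 62)
#define ABMC_CTR_ID_SHIFT	48	/* CtrID, bits 52:48 */
#define ABMC_BW_SRC_SHIFT	32	/* BwSrc, bits 43:32 */

/*
 * Build an L3_QOS_ABMC_CFG value with CfgEn set so the configuration is
 * applied. IsCOS is left clear, so bw_src is interpreted as an RMID.
 * count selects whether CtrEn is set.
 */
static uint64_t abmc_cfg_pack(unsigned int cntr_id, unsigned int bw_src,
			      uint32_t bw_type, int count)
{
	uint64_t val = ABMC_CFG_EN;

	/* CtrID is 5 bits wide, BwSrc is 12 bits wide. */
	assert(cntr_id < 32 && bw_src < 4096);

	if (count)
		val |= ABMC_CTR_EN;
	val |= (uint64_t)cntr_id << ABMC_CTR_ID_SHIFT;
	val |= (uint64_t)bw_src << ABMC_BW_SRC_SHIFT;
	val |= bw_type;			/* BwType, bits 31:0 */

	return val;
}
```

Explicit shifts avoid relying on compiler bitfield ordering, at the cost of
being more verbose than the union used in the patch.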

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: Minor changelog update.

v14: Removed BMEC reference in internal.h.
     Updated the changelog and code documentation.

v13: Removed the Reviewed-by tag as there is commit log change to remove
     BMEC reference.

v12: No changes.

v11: No changes.

v10: No changes.

v9: Removed the references of L3_QOS_ABMC_DSC.
    Text changes about configuration in kernel doc.

v8: Update the configuration notes in kernel_doc.
    Few commit message updates.

v7: Removed the reference of L3_QOS_ABMC_DSC as it is not used anymore.
    Moved the configuration notes to kernel_doc.
    Adjusted the tabs for l3_qos_abmc_cfg and checkpatch seems happy.

v6: Removed all the fs related changes.
    Added note on CfgEn,CtrEn.
    Removed the definitions which are not used.
    Removed cntr_id initialization.

v5: Moved assignment flags here (patch 10/19 of v4).
    Added MON_CNTR_UNSET definition to initialize cntr_id's.
    More details in commit log.
    Renamed few fields in l3_qos_abmc_cfg for readability.

v4: Added more descriptions.
    Changed the name abmc_ctr_id to ctr_id.
    Added L3_QOS_ABMC_DSC. Used for reading the configuration.

v3: No changes.

v2: No changes.
---
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h | 36 ++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index c998cf0e1375..3ad7f37b8ad8 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1222,6 +1222,7 @@
 /* - AMD: */
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
+#define MSR_IA32_L3_QOS_ABMC_CFG	0xc00003fd
 #define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index a79a487e639c..6bf6042f11b6 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -164,6 +164,42 @@ union cpuid_0x10_x_edx {
 	unsigned int full;
 };
 
+/*
+ * ABMC counters are configured by writing to L3_QOS_ABMC_CFG.
+ *
+ * @bw_type		: Event configuration that represent the memory
+ *			  transactions being tracked by the @cntr_id.
+ * @bw_src		: Bandwidth source (RMID or CLOSID).
+ * @reserved1		: Reserved.
+ * @is_clos		: @bw_src field is a CLOSID (not an RMID).
+ * @cntr_id		: Counter identifier.
+ * @reserved		: Reserved.
+ * @cntr_en		: Counting enable bit.
+ * @cfg_en		: Configuration enable bit.
+ *
+ * Configuration and counting:
+ * Counter can be configured across multiple writes to MSR. Configuration
+ * is applied only when @cfg_en = 1. Counter @cntr_id is reset when the
+ * configuration is applied.
+ * @cfg_en = 1, @cntr_en = 0 : Apply @cntr_id configuration but do not
+ *                             count events.
+ * @cfg_en = 1, @cntr_en = 1 : Apply @cntr_id configuration and start
+ *                             counting events.
+ */
+union l3_qos_abmc_cfg {
+	struct {
+		unsigned long bw_type  :32,
+			      bw_src   :12,
+			      reserved1: 3,
+			      is_clos  : 1,
+			      cntr_id  : 5,
+			      reserved : 9,
+			      cntr_en  : 1,
+			      cfg_en   : 1;
+	} split;
+	unsigned long full;
+};
+
 void rdt_ctrl_update(void *arg);
 
 int rdt_get_mon_l3_config(struct rdt_resource *r);
-- 
2.34.1



* [PATCH v16 15/34] fs/resctrl: Introduce event configuration field in struct mon_evt
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (13 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 14/34] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-25 18:29 ` [PATCH v16 16/34] x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

When supported, mbm_event counter assignment mode allows the user to
configure events to track specific types of memory transactions.

Introduce the evt_cfg field in struct mon_evt to define the type of memory
transactions tracked by a monitoring event. Also add a helper function to
get the evt_cfg value.
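
To make the idea of an event configuration concrete, the sketch below treats
evt_cfg as a bitmask of memory transaction types and mirrors the default
that this patch gives the local MBM event. The bit positions are assumptions
made for the illustration, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions below are assumptions made for this sketch. */
#define READS_TO_LOCAL_MEM		(1u << 0)
#define READS_TO_REMOTE_MEM		(1u << 1)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	(1u << 2)
#define READS_TO_LOCAL_S_MEM		(1u << 4)
#define MAX_EVT_CONFIG_BITS		0x7fu

/* Default for the local MBM event, as initialised in this patch. */
static uint32_t default_local_evt_cfg(void)
{
	return READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM |
	       NON_TEMP_WRITE_TO_LOCAL_MEM;
}

/* Return non-zero if evt_cfg tracks every transaction type in type. */
static int evt_cfg_tracks(uint32_t evt_cfg, uint32_t type)
{
	/* Reject bits outside the supported configuration mask. */
	assert((type & ~MAX_EVT_CONFIG_BITS) == 0);

	return (evt_cfg & type) == type;
}
```

Under these assumed bit positions the total event, initialised to
MAX_EVT_CONFIG_BITS, tracks every transaction type, while the local event
excludes remote reads.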

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v16: Added Reviewed-by tag.

v15: Updated the changelog.
     Removed resctrl_set_mon_evt_cfg().
     Moved the event initialization to resctrl_mon_resource_init().

v14: This is an updated version of the previous patch.
     https://lore.kernel.org/lkml/95b7f4e9d72773e8fda327fc80b429646efc3a8a.1747349530.git.babu.moger@amd.com/
     Removed mbm_mode as it is not required anymore.
     Added resctrl_get_mon_evt_cfg() and resctrl_set_mon_evt_cfg().

v13: New patch to handle different event configuration types with
     mbm_cntr_assign mode.
---
 fs/resctrl/internal.h   | 5 +++++
 fs/resctrl/monitor.c    | 9 +++++++++
 include/linux/resctrl.h | 2 ++
 3 files changed, 16 insertions(+)

diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 4f315b7e9ec0..db3a0f12ad77 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -56,6 +56,10 @@ static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
  * @evtid:		event id
  * @rid:		resource id for this event
  * @name:		name of the event
+ * @evt_cfg:		Event configuration value that represents the
+ *			memory transactions (e.g., READS_TO_LOCAL_MEM,
+ *			READS_TO_REMOTE_MEM) being tracked by @evtid.
+ *			Only valid if @evtid is an MBM event.
  * @configurable:	true if the event is configurable
  * @enabled:		true if the event is enabled
  */
@@ -63,6 +67,7 @@ struct mon_evt {
 	enum resctrl_event_id	evtid;
 	enum resctrl_res_level	rid;
 	char			*name;
+	u32			evt_cfg;
 	bool			configurable;
 	bool			enabled;
 };
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index a0b0ea45c7b4..a089867262fa 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -884,6 +884,11 @@ bool resctrl_is_mon_event_enabled(enum resctrl_event_id eventid)
 	       mon_event_all[eventid].enabled;
 }
 
+u32 resctrl_get_mon_evt_cfg(enum resctrl_event_id evtid)
+{
+	return mon_event_all[evtid].evt_cfg;
+}
+
 /**
  * resctrl_mon_resource_init() - Initialise global monitoring structures.
  *
@@ -929,6 +934,10 @@ int resctrl_mon_resource_init(void)
 			resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
 		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
+		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = MAX_EVT_CONFIG_BITS;
+		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
+								   READS_TO_LOCAL_S_MEM |
+								   NON_TEMP_WRITE_TO_LOCAL_MEM;
 		resctrl_file_fflags_init("num_mbm_cntrs",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 		resctrl_file_fflags_init("available_mbm_cntrs",
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index e013caba6641..87daa4ca312d 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -409,6 +409,8 @@ static inline bool resctrl_is_mbm_event(enum resctrl_event_id eventid)
 		eventid <= QOS_L3_MBM_LOCAL_EVENT_ID);
 }
 
+u32 resctrl_get_mon_evt_cfg(enum resctrl_event_id eventid);
+
 /* Iterate over all memory bandwidth events */
 #define for_each_mbm_event_id(eventid)				\
 	for (eventid = QOS_L3_MBM_TOTAL_EVENT_ID;		\
-- 
2.34.1



* [PATCH v16 16/34] x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (14 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 15/34] fs/resctrl: Introduce event configuration field in struct mon_evt Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:50   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events Babu Moger
                   ` (18 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The ABMC feature allows users to assign a hardware counter to an RMID,
event pair and monitor bandwidth usage as long as it is assigned. The
hardware continues to track the assigned counter until it is explicitly
unassigned by the user.

Implement an x86 architecture-specific handler to configure a counter. This
architecture specific handler is called by resctrl fs when a counter is
assigned or unassigned as well as when an already assigned counter's
configuration should be updated. Configure counters by writing to the
L3_QOS_ABMC_CFG MSR, specifying the counter ID, bandwidth source (RMID),
and event configuration.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
    Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Updated the changelog.
     Reset the architectural state in resctrl_arch_config_cntr() in both
     assign and unassign cases.

v15: Minor changelog update.
     Added few code comments in include/linux/resctrl.h.

v14: Removed evt_cfg parameter in resctrl_arch_config_cntr(). Get evt_cfg
     only when assign is required.
     Minor update to changelog.

v13: Moved resctrl_arch_config_cntr() prototype to include/linux/resctrl.h.
     Changed resctrl_arch_config_cntr() to return void instead of int.
     Updated the kernel doc for the prototype.
     Updated the code comment.

v12: Added the check to reset the architecture-specific state only when
     assign is requested.
     Added evt_cfg as the parameter as the user will be passing the event
     configuration from /info/L3_MON/event_configs/.

v11: Moved resctrl_arch_assign_cntr() and resctrl_abmc_config_one_amd() to
     monitor.c.
     Added the code to reset the arch state in resctrl_arch_assign_cntr().
     Also removed resctrl_arch_reset_rmid() inside IPI as the counters are
     reset from the callers.
     Re-wrote commit message.

v10: Added call resctrl_arch_reset_rmid() to reset the RMID in the domain
     inside IPI call.
     SMP and non-SMP call support is not required in resctrl_arch_config_cntr
     with new domain specific assign approach/data structure.
     Commit message update.

v9: Removed the code to reset the architectural state. It will be done
    in another patch.

v8: Rename resctrl_arch_assign_cntr to resctrl_arch_config_cntr.

v7: Separated arch and fs functions. This patch only has arch implementation.
    Added struct rdt_resource to the interface resctrl_arch_assign_cntr.
    Rename rdtgroup_abmc_cfg() to resctrl_abmc_config_one_amd().

v6: Removed mbm_cntr_alloc() from this patch to keep fs and arch code
    separate.
    Added code to update the counter assignment at domain level.

v5: Few name changes to match cntr_id.
    Changed the function names to
      rdtgroup_assign_cntr
      resctrl_arch_assign_cntr
      More comments on commit log.
      Added function summary.

v4: Commit message update.
      Used bitmap APIs where applicable.
      Changed the interfaces considering MPAM(arm).
      Added domain specific assignment.

v3: Removed the static from the prototype of rdtgroup_assign_abmc.
      The function is not called directly from user anymore. These
      changes are related to global assignment interface.

v2: Minor text changes in commit message.
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 36 +++++++++++++++++++++++++++
 include/linux/resctrl.h               | 19 ++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index cce35a0ad455..ed295a6c5e66 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -444,3 +444,39 @@ bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
 {
 	return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
 }
+
+static void resctrl_abmc_config_one_amd(void *info)
+{
+	union l3_qos_abmc_cfg *abmc_cfg = info;
+
+	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
+}
+
+/*
+ * Send an IPI to the domain to assign the counter to RMID, event pair.
+ */
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			      u32 cntr_id, bool assign)
+{
+	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+	union l3_qos_abmc_cfg abmc_cfg = { 0 };
+	struct arch_mbm_state *am;
+
+	abmc_cfg.split.cfg_en = 1;
+	abmc_cfg.split.cntr_en = assign ? 1 : 0;
+	abmc_cfg.split.cntr_id = cntr_id;
+	abmc_cfg.split.bw_src = rmid;
+	if (assign)
+		abmc_cfg.split.bw_type = resctrl_get_mon_evt_cfg(evtid);
+
+	smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
+
+	/*
+	 * The hardware counter is reset (because cfg_en == 1) so there is no
+	 * need to record initial non-zero counts.
+	 */
+	am = get_arch_mbm_state(hw_dom, rmid, evtid);
+	if (am)
+		memset(am, 0, sizeof(*am));
+}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 87daa4ca312d..50e38445183a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -594,6 +594,25 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
  */
 void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
 
+/**
+ * resctrl_arch_config_cntr() - Configure the counter with its new RMID
+ *				and event details.
+ * @r:			Resource structure.
+ * @d:			The domain in which counter with ID @cntr_id should be configured.
+ * @evtid:		Monitoring event type (e.g., QOS_L3_MBM_TOTAL_EVENT_ID
+ *			or QOS_L3_MBM_LOCAL_EVENT_ID).
+ * @rmid:		RMID.
+ * @closid:		CLOSID.
+ * @cntr_id:		Counter ID to configure.
+ * @assign:		True to assign the counter or update an existing assignment,
+ *			false to unassign the counter.
+ *
+ * This can be called from any CPU.
+ */
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			      u32 cntr_id, bool assign);
+
 extern unsigned int resctrl_rmid_realloc_threshold;
 extern unsigned int resctrl_rmid_realloc_limit;
 
-- 
2.34.1



* [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (15 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 16/34] x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:52   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign " Babu Moger
                   ` (17 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

When supported, "mbm_event" counter assignment mode offers "num_mbm_cntrs"
counters, each of which can be assigned to an RMID, event pair and monitors
bandwidth usage for as long as it is assigned.

Add the functionality to allocate and assign a counter to an RMID, event
pair in the domain.

If all the counters are in use, the kernel logs the error message
"Failed to allocate counter for <event> in domain <id>" in
/sys/fs/resctrl/info/last_cmd_status when a new assignment is requested.
When assigning counters across all the domains, exit on the first failure.
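
The allocation behaviour described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
types, the function names cntr_get(), cntr_alloc() and alloc_assign(), and
the fixed NUM_MBM_CNTRS value are hypothetical simplifications, not the
kernel's definitions. It shows that an existing assignment is reused and
that a full table yields -ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_MBM_CNTRS 4	/* assumed; the real count is reported by hardware */

/* Simplified stand-in for one entry of a per-domain counter table. */
struct cntr_cfg {
	const void *group;	/* owning group, NULL when the counter is free */
	int evtid;		/* event tracked by this counter */
};

struct domain {
	struct cntr_cfg cntr_cfg[NUM_MBM_CNTRS];
};

/* Return the counter ID assigned to (group, evtid), or -ENOENT. */
static int cntr_get(const struct domain *d, const void *group, int evtid)
{
	assert(group != NULL);	/* NULL marks a free slot, never an owner */

	for (int i = 0; i < NUM_MBM_CNTRS; i++) {
		if (d->cntr_cfg[i].group == group &&
		    d->cntr_cfg[i].evtid == evtid)
			return i;
	}
	return -ENOENT;
}

/* Claim the first free counter for (group, evtid), or return -ENOSPC. */
static int cntr_alloc(struct domain *d, const void *group, int evtid)
{
	for (int i = 0; i < NUM_MBM_CNTRS; i++) {
		if (!d->cntr_cfg[i].group) {
			d->cntr_cfg[i].group = group;
			d->cntr_cfg[i].evtid = evtid;
			return i;
		}
	}
	return -ENOSPC;
}

/* Assign a counter; succeed without changes if one is already assigned. */
static int alloc_assign(struct domain *d, const void *group, int evtid)
{
	int id;

	if (cntr_get(d, group, evtid) >= 0)
		return 0;

	id = cntr_alloc(d, group, evtid);
	return id < 0 ? id : 0;
}
```

In this model a caller that walks several domains would stop at the first
negative return value, leaving earlier domains assigned.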

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Function renames:
     resctrl_config_cntr() -> rdtgroup_assign_cntr()
     rdtgroup_alloc_config_cntr() -> rdtgroup_alloc_assign_cntr()
     Passed struct mevt to rdtgroup_alloc_assign_cntr so it can print event name on failure.
     Minor code comment update.

v15: Updated the changelog.
     Added the check !r->mon.mbm_cntr_assignable in mbm_cntr_get() to return error.
     Removed the check to verify evt_cfg in the domain as it is not required anymore.
     https://lore.kernel.org/lkml/887bad33-7f4a-4b6d-95a7-fdfe0451f42b@intel.com/
     Return success if the counter is already assigned.
     Rename resctrl_assign_cntr_event() -> rdtgroup_assign_cntr_event().
     Removed the parameter struct rdt_resource. It can be obtained from mevt->rid.

v14: Updated the changelog a little bit.
     Updated the code documentation for mbm_cntr_alloc() and mbm_cntr_get().
     Passed struct mon_evt to resctrl_assign_cntr_event() to avoid
     back and forth calls to get event details.
     Updated the code documentation about the failure when counters are exhausted.
     Changed subject line to fs/resctrl.

v13: Updated changelog.
     Changed resctrl_arch_config_cntr() to return void instead of int.
     Pass only evtid to resctrl_alloc_config_cntr() and
     resctrl_assign_cntr_event(). The event configuration value can be easily
     obtained from the mon_evt list.
     Introduced new function mbm_get_mon_event() to get event configuration value.
     Added prototype descriptions to mbm_cntr_get() and mbm_cntr_alloc().
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The files monitor.c/rdtgroup.c have been split between FS and ARCH directories.

v12: Fixed typo in the subject line.
     Replaced several counters with "num_mbm_cntrs" counters.
     Changed the check in resctrl_alloc_config_cntr() to reduce the indentation.
     Fixed the error handling on first failure.
     Added domain id and event id on failure.
     Fixed the return error override.
     Added new parameter event configuration (evt_cfg) to get the event configuration
     from user space.

v11: Patch changed again quite a bit.
     Moved the functions to monitor.c.
     Renamed rdtgroup_assign_cntr_event() to resctrl_assign_cntr_event().
     Refactored the resctrl_assign_cntr_event().
     Added functionality to exit on the first error during assignment.
     Simplified mbm_cntr_free().
     Removed the function mbm_cntr_assigned(). Will be using mbm_cntr_get() to
     figure out if the counter is assigned or not.
     Updated commit message and code comments.

v10: Patch changed completely.
     Counters are managed at the domain based on the discussion.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     Reset non-architectural MBM state.
     Commit message update.

v9: Introduced new function resctrl_config_cntr to assign the counter, update
    the bitmap and reset the architectural state.
    Took care of error handling (freeing the counter) when assignment fails.
    Moved mbm_cntr_assigned_to_domain here as it is used in this patch.
    Minor text changes.

v8: Renamed rdtgroup_assign_cntr() to rdtgroup_assign_cntr_event().
    Added the code to return the error if rdtgroup_assign_cntr_event fails.
    Moved definition of MBM_EVENT_ARRAY_INDEX to resctrl/internal.h.
    Updated typo in the comments.

v7: New patch. Moved all the FS code here.
    Merged rdtgroup_assign_cntr and rdtgroup_alloc_cntr.
    Added new #define MBM_EVENT_ARRAY_INDEX.
---
 fs/resctrl/internal.h |   3 +
 fs/resctrl/monitor.c  | 130 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 133 insertions(+)

diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index db3a0f12ad77..419423bdabdc 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -387,6 +387,9 @@ bool closid_allocated(unsigned int closid);
 
 int resctrl_find_cleanest_closid(void);
 
+int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
+			       struct mon_evt *mevt);
+
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index a089867262fa..8b0aa2469643 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -953,3 +953,133 @@ void resctrl_mon_resource_exit(void)
 
 	dom_data_exit(r);
 }
+
+/*
+ * rdtgroup_assign_cntr() - Assign/unassign the counter ID for the event, RMID
+ * pair in the domain.
+ *
+ * Assign the counter if @assign is true else unassign the counter. Reset the
+ * associated non-architectural state.
+ */
+static void rdtgroup_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+				 enum resctrl_event_id evtid, u32 rmid, u32 closid,
+				 u32 cntr_id, bool assign)
+{
+	struct mbm_state *m;
+
+	resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, assign);
+
+	m = get_mbm_state(d, closid, rmid, evtid);
+	if (m)
+		memset(m, 0, sizeof(*m));
+}
+
+/*
+ * mbm_cntr_get() - Return the counter ID for the matching @evtid and @rdtgrp.
+ *
+ * Return:
+ * Valid counter ID on success, or -ENOENT on failure.
+ */
+static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
+			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
+{
+	int cntr_id;
+
+	if (!r->mon.mbm_cntr_assignable)
+		return -ENOENT;
+
+	if (!resctrl_is_mbm_event(evtid))
+		return -ENOENT;
+
+	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
+		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
+		    d->cntr_cfg[cntr_id].evtid == evtid)
+			return cntr_id;
+	}
+
+	return -ENOENT;
+}
+
+/*
+ * mbm_cntr_alloc() - Initialize and return a new counter ID in the domain @d.
+ * Caller must ensure that the specified event is not assigned already.
+ *
+ * Return:
+ * Valid counter ID on success, or -ENOSPC on failure.
+ */
+static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
+			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
+{
+	int cntr_id;
+
+	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
+		if (!d->cntr_cfg[cntr_id].rdtgrp) {
+			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
+			d->cntr_cfg[cntr_id].evtid = evtid;
+			return cntr_id;
+		}
+	}
+
+	return -ENOSPC;
+}
+
+/*
+ * rdtgroup_alloc_assign_cntr() - Allocate a counter ID and assign it to the event
+ * pointed to by @mevt and the resctrl group @rdtgrp within the domain @d.
+ *
+ * Return:
+ * 0 on success, < 0 on failure.
+ */
+static int rdtgroup_alloc_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+				      struct rdtgroup *rdtgrp, struct mon_evt *mevt)
+{
+	int cntr_id;
+
+	/* No action required if the counter is assigned already. */
+	cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid);
+	if (cntr_id >= 0)
+		return 0;
+
+	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, mevt->evtid);
+	if (cntr_id < 0) {
+		rdt_last_cmd_printf("Failed to allocate counter for %s in domain %d\n",
+				    mevt->name, d->hdr.id);
+		return cntr_id;
+	}
+
+	rdtgroup_assign_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid, rdtgrp->closid, cntr_id, true);
+
+	return 0;
+}
+
+/*
+ * rdtgroup_assign_cntr_event() - Assign a hardware counter for the event in
+ * @mevt to the resctrl group @rdtgrp. Assign counters to all domains if @d is
+ * NULL; otherwise, assign the counter to the specified domain @d.
+ *
+ * If all counters in a domain are already in use, rdtgroup_alloc_assign_cntr()
+ * will fail. The assignment process will abort at the first failure encountered
+ * during domain traversal, which may result in the event being only partially
+ * assigned.
+ *
+ * Return:
+ * 0 on success, < 0 on failure.
+ */
+int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
+			       struct mon_evt *mevt)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(mevt->rid);
+	int ret = 0;
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = rdtgroup_alloc_assign_cntr(r, d, rdtgrp, mevt);
+	}
+
+	return ret;
+}
-- 
2.34.1



* [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign MBM events
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (16 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:53   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 19/34] fs/resctrl: Pass struct rdtgroup instead of individual members Babu Moger
                   ` (16 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The "mbm_event" counter assignment mode offers "num_mbm_cntrs" counters,
each of which can be assigned to an RMID, event pair and monitors bandwidth
usage for as long as it is assigned. If all the counters are in use, the
kernel logs the error message "Failed to allocate counter for <event> in
domain <id>" in /sys/fs/resctrl/info/last_cmd_status when a new assignment
is requested.

To make space for a new assignment, users must unassign an already
assigned counter and retry the assignment.

Add the functionality to unassign and free the counters in the domain.
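
The effect of unassigning can be sketched with a small user-space model.
Everything here is an illustrative assumption rather than the kernel code:
the table layout, the two-counter limit, and the helper names cntr_find(),
cntr_claim() and cntr_release() are hypothetical. The sketch shows that
clearing an entry makes the same counter ID available to the next request,
and that releasing an unassigned event does nothing.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define NUM_MBM_CNTRS 2	/* assumed small table to make exhaustion easy to see */

struct cntr_cfg {
	const void *group;	/* owning group, NULL when the counter is free */
	int evtid;		/* event tracked by this counter */
};

struct domain {
	struct cntr_cfg cntr_cfg[NUM_MBM_CNTRS];
};

/* Return the counter ID assigned to (group, evtid), or -ENOENT. */
static int cntr_find(const struct domain *d, const void *group, int evtid)
{
	assert(group != NULL);	/* NULL marks a free slot, never an owner */

	for (int i = 0; i < NUM_MBM_CNTRS; i++) {
		if (d->cntr_cfg[i].group == group &&
		    d->cntr_cfg[i].evtid == evtid)
			return i;
	}
	return -ENOENT;
}

/* Claim the first free counter, or return -ENOSPC when none is left. */
static int cntr_claim(struct domain *d, const void *group, int evtid)
{
	for (int i = 0; i < NUM_MBM_CNTRS; i++) {
		if (!d->cntr_cfg[i].group) {
			d->cntr_cfg[i].group = group;
			d->cntr_cfg[i].evtid = evtid;
			return i;
		}
	}
	return -ENOSPC;
}

/* Clear the entry for (group, evtid); do nothing if it is not assigned. */
static void cntr_release(struct domain *d, const void *group, int evtid)
{
	int id = cntr_find(d, group, evtid);

	if (id < 0)
		return;

	memset(&d->cntr_cfg[id], 0, sizeof(*d->cntr_cfg));
}
```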

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Function rename rdtgroup_free_config_cntr() -> rdtgroup_free_unassign_cntr().
     Updated rdtgroup_free_unassign_cntr() to pass struct mon_evt to match
     rdtgroup_alloc_assign_cntr() prototype.

v15: Updated the changelog.
     Changed code in mbm_cntr_free() to use sizeof(*d->cntr_cfg).
     Removed unnecessary return in resctrl_free_config_cntr().
     Rename resctrl_unassign_cntr_event() -> rdtgroup_unassign_cntr_event().
     Removed the parameter struct rdt_resource. It can be obtained from mevt->rid.

v14: Passing the struct mon_evt to resctrl_free_config_cntr() and removed
     the need for mbm_get_mon_event() call.
     Corrected the code documentation for mbm_cntr_free().
     Changed resctrl_free_config_cntr() and resctrl_unassign_cntr_event()
     to return void.
     Changed subject line to fs/resctrl.
     Updated the changelog.

v13: Moved mbm_cntr_free() to this patch as it is used in here first.
     Not required to pass evt_cfg to resctrl_unassign_cntr_event(). It is
     available via mbm_get_mon_event().
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The monitor.c file has now been split between the FS and ARCH directories.

v12: Updated the commit text to make it a bit clearer.
     Replaced several counters with "num_mbm_cntrs" counters.
     Fixed typo in the subject line.
     Fixed the error handling on first failure.
     Added domain id and event id on failure.
     Added new parameter event configuration (evt_cfg) to provide the event from
     user space.

v11: Moved the functions to monitor.c.
     Renamed rdtgroup_unassign_cntr_event() to resctrl_unassign_cntr_event().
     Refactored the resctrl_unassign_cntr_event().
     Updated commit message and code comments.


v10: Patch changed again.
     Counters are managed at the domain based on the discussion.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     commit message update.

v9: Changes related to the addition of new function resctrl_config_cntr().
    Removed rdtgroup_mbm_cntr_is_assigned() as it was already
    introduced.
    Text changes to address comments.

v8: Renamed rdtgroup_mbm_cntr_is_assigned to mbm_cntr_assigned_to_domain
    Added return error handling in resctrl_arch_config_cntr().

v7: Merged rdtgroup_unassign_cntr and rdtgroup_free_cntr functions.
    Renamed rdtgroup_mbm_cntr_test() to rdtgroup_mbm_cntr_is_assigned().
    Reworded the commit log a little bit.

v6: Removed mbm_cntr_free from this patch.
    Added counter test in all the domains and free if it is not assigned to
    any domains.

v5: Few name changes to match cntr_id.
    Changed the function names to rdtgroup_unassign_cntr
    More comments on commit log.

v4: Added domain specific unassign feature.
    Few name changes.

v3: Removed the static from the prototype of rdtgroup_unassign_abmc.
    The function is not called directly from user anymore. These
    changes are related to global assignment interface.

v2: No changes.
---
 fs/resctrl/internal.h |  2 ++
 fs/resctrl/monitor.c  | 46 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 419423bdabdc..216588842444 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -389,6 +389,8 @@ int resctrl_find_cleanest_closid(void);
 
 int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
 			       struct mon_evt *mevt);
+void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
+				  struct mon_evt *mevt);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 8b0aa2469643..049a82729c0b 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1023,6 +1023,14 @@ static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
 	return -ENOSPC;
 }
 
+/*
+ * mbm_cntr_free() - Clear the counter ID configuration details in the domain @d.
+ */
+static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
+{
+	memset(&d->cntr_cfg[cntr_id], 0, sizeof(*d->cntr_cfg));
+}
+
 /*
  * rdtgroup_alloc_assign_cntr() - Allocate a counter ID and assign it to the event
  * pointed to by @mevt and the resctrl group @rdtgrp within the domain @d.
@@ -1083,3 +1091,41 @@ int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp
 
 	return ret;
 }
+
+/*
+ * rdtgroup_free_unassign_cntr() - Unassign and reset the counter ID configuration
+ * for the event pointed to by @mevt within the domain @d and resctrl group @rdtgrp.
+ */
+static void rdtgroup_free_unassign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+					struct rdtgroup *rdtgrp, struct mon_evt *mevt)
+{
+	int cntr_id;
+
+	cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid);
+
+	/* If there is no cntr_id assigned, nothing to do */
+	if (cntr_id < 0)
+		return;
+
+	rdtgroup_assign_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid, rdtgrp->closid, cntr_id, false);
+
+	mbm_cntr_free(d, cntr_id);
+}
+
+/*
+ * rdtgroup_unassign_cntr_event() - Unassign a hardware counter associated with
+ * the event structure @mevt from the domain @d and the group @rdtgrp. Unassign
+ * the counters from all the domains if @d is NULL else unassign from @d.
+ */
+void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
+				  struct mon_evt *mevt)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(mevt->rid);
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list)
+			rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
+	} else {
+		rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
+	}
+}
-- 
2.34.1



* [PATCH v16 19/34] fs/resctrl: Pass struct rdtgroup instead of individual members
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (17 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign " Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:54   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode Babu Moger
                   ` (15 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Reading monitoring data for a monitoring group requires both the RMID and
CLOSID. The RMID and CLOSID are members of struct rdtgroup but passed
separately to several functions involved in retrieving event data.

When "mbm_event" counter assignment mode is enabled, a counter ID is
required to read event data. The counter ID is obtained through
mbm_cntr_get(), which expects a struct rdtgroup pointer.

Provide a pointer to the struct rdtgroup as parameter to functions involved
in retrieving event data to simplify access to RMID, CLOSID, and counter
ID.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Minor code comment update.

v15: Rephrased the changelog. Thanks to Reinette.

v14: A few text updates to the commit log.

v13: New patch to pass the entire struct rdtgroup to __mon_event_count(),
     mbm_update(), and related functions.
---
 fs/resctrl/monitor.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 049a82729c0b..070965d45770 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -356,9 +356,11 @@ static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 closid,
 	return state ? &state[idx] : NULL;
 }
 
-static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
+static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
 {
 	int cpu = smp_processor_id();
+	u32 closid = rdtgrp->closid;
+	u32 rmid = rdtgrp->mon.rmid;
 	struct rdt_mon_domain *d;
 	struct cacheinfo *ci;
 	struct mbm_state *m;
@@ -420,8 +422,8 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 /*
  * mbm_bw_count() - Update bw count from values previously read by
  *		    __mon_event_count().
- * @closid:	The closid used to identify the cached mbm_state.
- * @rmid:	The rmid used to identify the cached mbm_state.
+ * @rdtgrp:	resctrl group associated with the CLOSID and RMID to identify
+ *		the cached mbm_state.
  * @rr:		The struct rmid_read populated by __mon_event_count().
  *
  * Supporting function to calculate the memory bandwidth
@@ -429,9 +431,11 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
  * __mon_event_count() is compared with the chunks value from the previous
  * invocation. This must be called once per second to maintain values in MBps.
  */
-static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
+static void mbm_bw_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
 {
 	u64 cur_bw, bytes, cur_bytes;
+	u32 closid = rdtgrp->closid;
+	u32 rmid = rdtgrp->mon.rmid;
 	struct mbm_state *m;
 
 	m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
@@ -460,7 +464,7 @@ void mon_event_count(void *info)
 
 	rdtgrp = rr->rgrp;
 
-	ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr);
+	ret = __mon_event_count(rdtgrp, rr);
 
 	/*
 	 * For Ctrl groups read data from child monitor groups and
@@ -471,8 +475,7 @@ void mon_event_count(void *info)
 
 	if (rdtgrp->type == RDTCTRL_GROUP) {
 		list_for_each_entry(entry, head, mon.crdtgrp_list) {
-			if (__mon_event_count(entry->closid, entry->mon.rmid,
-					      rr) == 0)
+			if (__mon_event_count(entry, rr) == 0)
 				ret = 0;
 		}
 	}
@@ -603,7 +606,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 }
 
 static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
-				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
+				 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
 {
 	struct rmid_read rr = {0};
 
@@ -617,30 +620,30 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
 		return;
 	}
 
-	__mon_event_count(closid, rmid, &rr);
+	__mon_event_count(rdtgrp, &rr);
 
 	/*
 	 * If the software controller is enabled, compute the
 	 * bandwidth for this event id.
 	 */
 	if (is_mba_sc(NULL))
-		mbm_bw_count(closid, rmid, &rr);
+		mbm_bw_count(rdtgrp, &rr);
 
 	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
 }
 
 static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
-		       u32 closid, u32 rmid)
+		       struct rdtgroup *rdtgrp)
 {
 	/*
 	 * This is protected from concurrent reads from user as both
 	 * the user and overflow handler hold the global mutex.
 	 */
 	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
-		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
+		mbm_update_one_event(r, d, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
 
 	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
-		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
+		mbm_update_one_event(r, d, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
 }
 
 /*
@@ -713,11 +716,11 @@ void mbm_handle_overflow(struct work_struct *work)
 	d = container_of(work, struct rdt_mon_domain, mbm_over.work);
 
 	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
-		mbm_update(r, d, prgrp->closid, prgrp->mon.rmid);
+		mbm_update(r, d, prgrp);
 
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list)
-			mbm_update(r, d, crgrp->closid, crgrp->mon.rmid);
+			mbm_update(r, d, crgrp);
 
 		if (is_mba_sc(NULL))
 			update_mba_bw(prgrp, d);
-- 
2.34.1



* [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (18 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 19/34] fs/resctrl: Pass struct rdtgroup instead of individual members Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:59   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 21/34] x86/resctrl: Refactor resctrl_arch_rmid_read() Babu Moger
                   ` (14 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

When supported, "mbm_event" counter assignment mode allows users to assign
a hardware counter to an RMID, event pair and monitor the bandwidth usage
as long as it is assigned. The hardware continues to track the assigned
counter until it is explicitly unassigned by the user.

Introduce the architecture calls resctrl_arch_cntr_read() and
resctrl_arch_reset_cntr() to read and reset event counters when "mbm_event"
mode is supported. Function names are chosen to match existing
resctrl_arch_rmid_read() and resctrl_arch_reset_rmid().

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Updated the changelog.
     Removed lots of copied and unnecessary text from resctrl.h.
     Also removed references to LLC occupancy.
     Removed arch_mon_ctx from resctrl_arch_cntr_read().

v15: New patch to add arch calls resctrl_arch_cntr_read() and resctrl_arch_reset_cntr()
     with mbm_event mode.
     https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/
---
 include/linux/resctrl.h | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 50e38445183a..4d37827121a6 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -613,6 +613,44 @@ void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
 			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
 			      u32 cntr_id, bool assign);
 
+/**
+ * resctrl_arch_cntr_read() - Read the event data corresponding to the counter ID
+ *			      assigned to the RMID, event pair for this resource
+ *			      and domain.
+ * @r:			Resource that the counter should be read from.
+ * @d:			Domain that the counter should be read from.
+ * @closid:		CLOSID that matches the RMID.
+ * @rmid:		RMID used for counter ID assignment.
+ * @cntr_id:		The counter ID whose event data should be read. Valid when
+ *			"mbm_event" mode is enabled and @eventid is an MBM event.
+ * @eventid:		eventid used for counter ID assignment, such as
+ *			QOS_L3_MBM_TOTAL_EVENT_ID or QOS_L3_MBM_LOCAL_EVENT_ID.
+ * @val:		Result of the counter read in bytes.
+ *
+ * Return:
+ * 0 on success, or -EIO, -EINVAL etc on error.
+ */
+int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+			   u32 closid, u32 rmid, int cntr_id,
+			   enum resctrl_event_id eventid, u64 *val);
+
+/**
+ * resctrl_arch_reset_cntr() - Reset any private state associated with counter ID.
+ * @r:		The domain's resource.
+ * @d:		The counter ID's domain.
+ * @closid:	CLOSID that matches the RMID.
+ * @rmid:	RMID used for counter ID assignment.
+ * @cntr_id:	The counter ID whose event data should be reset. Valid when
+ *		"mbm_event" mode is enabled and @eventid is an MBM event.
+ * @eventid:	eventid used for counter ID assignment, such as
+ *		QOS_L3_MBM_TOTAL_EVENT_ID or QOS_L3_MBM_LOCAL_EVENT_ID.
+ *
+ * This can be called from any CPU.
+ */
+void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     u32 closid, u32 rmid, int cntr_id,
+			     enum resctrl_event_id eventid);
+
 extern unsigned int resctrl_rmid_realloc_threshold;
 extern unsigned int resctrl_rmid_realloc_limit;
 
-- 
2.34.1



* [PATCH v16 21/34] x86/resctrl: Refactor resctrl_arch_rmid_read()
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (19 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 19:59   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() Babu Moger
                   ` (13 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

resctrl_arch_rmid_read() adjusts the value obtained from MSR_IA32_QM_CTR to
account for counter overflow of MBM events and applies counter scaling to
all events. This logic is common to both reading an RMID and reading a
hardware counter directly.

Refactor the hardware value adjustment logic into get_corrected_val() to
prepare for support of reading a hardware counter.
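
The two adjustments mentioned above, overflow handling and scaling, can be
sketched in user space as follows. This is a simplified model under stated
assumptions, not the kernel implementation: struct mbm_state_model and the
function names overflow_delta() and corrected_bytes() are hypothetical, and
the counter width and scale factor are passed in as plain parameters. The
delta is computed modulo 2^width so that a wrapped counter still yields the
correct increment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-counter software state kept between reads. */
struct mbm_state_model {
	uint64_t prev_msr;	/* raw value seen at the previous read */
	uint64_t chunks;	/* running total of counted chunks */
};

/*
 * Difference between two raw readings of a counter that is @width bits
 * wide. Shifting both values to the top of a 64-bit word makes the
 * subtraction wrap at 2^width.
 */
static uint64_t overflow_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
	unsigned int shift;

	assert(width > 0 && width <= 64);
	shift = 64 - width;

	return ((cur << shift) - (prev << shift)) >> shift;
}

/* Accumulate the new reading and convert the running total to bytes. */
static uint64_t corrected_bytes(struct mbm_state_model *st, uint64_t msr_val,
				unsigned int width, uint64_t scale)
{
	st->chunks += overflow_delta(st->prev_msr, msr_val, width);
	st->prev_msr = msr_val;

	return st->chunks * scale;
}
```

Because the helper takes only the raw value and the saved state, the same
routine can serve a read keyed by RMID and a read keyed by counter ID.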

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Rephrased the changelog.
     Fixed alignment.
     Renamed mbm_corrected_val() -> get_corrected_val().

v15: New patch to add arch calls resctrl_arch_cntr_read() and resctrl_arch_reset_cntr()
     with mbm_event mode.
     https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 38 ++++++++++++++++-----------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ed295a6c5e66..1f77fd58e707 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -217,24 +217,13 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
 	return chunks >> shift;
 }
 
-int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
-			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
-			   u64 *val, void *ignored)
+static u64 get_corrected_val(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     u32 rmid, enum resctrl_event_id eventid, u64 msr_val)
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
-	int cpu = cpumask_any(&d->hdr.cpu_mask);
 	struct arch_mbm_state *am;
-	u64 msr_val, chunks;
-	u32 prmid;
-	int ret;
-
-	resctrl_arch_rmid_read_context_check();
-
-	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
-	ret = __rmid_read_phys(prmid, eventid, &msr_val);
-	if (ret)
-		return ret;
+	u64 chunks;
 
 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
 	if (am) {
@@ -246,7 +235,26 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
 		chunks = msr_val;
 	}
 
-	*val = chunks * hw_res->mon_scale;
+	return chunks * hw_res->mon_scale;
+}
+
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
+			   u64 *val, void *ignored)
+{
+	int cpu = cpumask_any(&d->hdr.cpu_mask);
+	u64 msr_val;
+	u32 prmid;
+	int ret;
+
+	resctrl_arch_rmid_read_context_check();
+
+	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
+	ret = __rmid_read_phys(prmid, eventid, &msr_val);
+	if (ret)
+		return ret;
+
+	*val = get_corrected_val(r, d, rmid, eventid, msr_val);
 
 	return 0;
 }
-- 
2.34.1



* [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read()
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (20 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 21/34] x86/resctrl: Refactor resctrl_arch_rmid_read() Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:01   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode Babu Moger
                   ` (12 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

System software can read resctrl event data for a particular resource by
writing the RMID and Event Identifier (EvtID) to the QM_EVTSEL register and
then reading the event data from the QM_CTR register.

In ABMC mode, the event data of a specific counter ID can be read by
setting the following fields: QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID
= L3CacheABMC (=1) and setting [RMID] to the desired counter ID. Reading
QM_CTR will then return the contents of the specified counter ID. The
RMID_VAL_ERROR bit will be set if the counter configuration was invalid, or
if an invalid counter ID was set in the QM_EVTSEL[RMID] field. If the
counter data is currently unavailable, the RMID_VAL_UNAVAIL bit will be
set.

Introduce resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() to reset
and read event data for a specific counter.
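
As an illustration of the register encoding described above, the following
user-space sketch builds the QM_EVTSEL value for a counter ID and decodes
the status bits of a raw QM_CTR value. It is not the kernel code: the helper
names abmc_evtsel() and abmc_decode_ctr() are hypothetical, and the positions
of the error and unavailable bits (63 and 62) are assumed to match
RMID_VAL_ERROR and RMID_VAL_UNAVAIL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Field layout taken from the QM_EVTSEL table in this patch. */
#define QM_EVTSEL_EXT_EVT_ID	(UINT64_C(1) << 31)	/* ExtendedEvtID */
#define QM_EVTSEL_EVT_ID_ABMC	UINT64_C(1)		/* EvtID = L3CacheABMC */
#define QM_EVTSEL_RMID_SHIFT	32
#define QM_EVTSEL_RMID_MASK	UINT64_C(0xfff)		/* bits 43:32 */

/* Assumption: bit 63 reports an error, bit 62 reports "unavailable". */
#define QM_CTR_ERROR		(UINT64_C(1) << 63)
#define QM_CTR_UNAVAIL		(UINT64_C(1) << 62)

/* Hypothetical helper: QM_EVTSEL value that selects counter cntr_id. */
uint64_t abmc_evtsel(uint32_t cntr_id)
{
	return ((cntr_id & QM_EVTSEL_RMID_MASK) << QM_EVTSEL_RMID_SHIFT) |
	       QM_EVTSEL_EXT_EVT_ID | QM_EVTSEL_EVT_ID_ABMC;
}

/* Hypothetical helper: map a raw QM_CTR value to an errno or a count. */
int abmc_decode_ctr(uint64_t msr_val, uint64_t *val)
{
	assert(val != NULL);

	if (msr_val & QM_CTR_ERROR)
		return -EIO;	/* invalid counter ID or configuration */
	if (msr_val & QM_CTR_UNAVAIL)
		return -EINVAL;	/* data currently unavailable */

	*val = msr_val;
	return 0;
}
```

For counter ID 5, abmc_evtsel() yields 0x0000000580000001. This is the same
value the patch writes with wrmsr() by passing ExtendedEvtID | EvtID as the
low 32 bits and the counter ID as the high 32 bits.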

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Updated the changelog.
     Removed the call resctrl_arch_rmid_read_context_check();
     Added the text about RMID_VAL_UNAVAIL error.

v15: Updated patch to add arch calls resctrl_arch_cntr_read() and resctrl_arch_reset_cntr()
     with mbm_event mode.
     https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/

v14: Updated the context in changelog. Added text in imperative tone.
     Added WARN_ON_ONCE() when cntr_id < 0.
     Improved code documentation in include/linux/resctrl.h.
     Added the check in mbm_update() to skip overflow handler when counter is unassigned.

v13: Split the patch into 2. First one to handle the passing of rdtgroup structure to a few
     functions (__mon_event_count() and mbm_update()). Second one to handle ABMC counter reading.
     Added new function __cntr_id_read_phys() to handle ABMC event reading.
     Updated kernel doc for resctrl_arch_reset_rmid() and resctrl_arch_rmid_read().
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The monitor.c file has now been split between the FS and ARCH directories.

v12: New patch to support extended event mode when ABMC is enabled.
---
 arch/x86/kernel/cpu/resctrl/internal.h |  6 +++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 68 ++++++++++++++++++++++++++
 2 files changed, 74 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6bf6042f11b6..ae4003d44df4 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -40,6 +40,12 @@ struct arch_mbm_state {
 /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
 #define ABMC_ENABLE_BIT			0
 
+/*
+ * QoS Event Identifiers.
+ */
+#define ABMC_EXTENDED_EVT_ID		BIT(31)
+#define ABMC_EVT_ID			BIT(0)
+
 /**
  * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
  *			       a resource for a control function
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1f77fd58e707..57c8409a8247 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -259,6 +259,74 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
 	return 0;
 }
 
+static int __cntr_id_read(u32 cntr_id, u64 *val)
+{
+	u64 msr_val;
+
+	/*
+	 * QM_EVTSEL Register definition:
+	 * =======================================================
+	 * Bits    Mnemonic        Description
+	 * =======================================================
+	 * 63:44   --              Reserved
+	 * 43:32   RMID            Resource Monitoring Identifier
+	 * 31      ExtEvtID        Extended Event Identifier
+	 * 30:8    --              Reserved
+	 * 7:0     EvtID           Event Identifier
+	 * =======================================================
+	 * The contents of a specific counter can be read by setting the
+	 * following fields: QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID =
+	 * L3CacheABMC (=1), and QM_EVTSEL[RMID] = the desired counter ID.
+	 * Reading QM_CTR will then return the contents of the specified
+	 * counter. The RMID_VAL_ERROR bit will be set if the counter
+	 * configuration was invalid, or if an invalid counter ID was set
+	 * in the QM_EVTSEL[RMID] field. If the counter data is currently
+	 * unavailable, the RMID_VAL_UNAVAIL bit will be set.
+	 */
+	wrmsr(MSR_IA32_QM_EVTSEL, ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID, cntr_id);
+	rdmsrl(MSR_IA32_QM_CTR, msr_val);
+
+	if (msr_val & RMID_VAL_ERROR)
+		return -EIO;
+	if (msr_val & RMID_VAL_UNAVAIL)
+		return -EINVAL;
+
+	*val = msr_val;
+	return 0;
+}
+
+void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     u32 unused, u32 rmid, int cntr_id,
+			     enum resctrl_event_id eventid)
+{
+	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+	struct arch_mbm_state *am;
+
+	am = get_arch_mbm_state(hw_dom, rmid, eventid);
+	if (am) {
+		memset(am, 0, sizeof(*am));
+
+		/* Record any initial, non-zero count value. */
+		__cntr_id_read(cntr_id, &am->prev_msr);
+	}
+}
+
+int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
+			   u32 unused, u32 rmid, int cntr_id,
+			   enum resctrl_event_id eventid, u64 *val)
+{
+	u64 msr_val;
+	int ret;
+
+	ret = __cntr_id_read(cntr_id, &msr_val);
+	if (ret)
+		return ret;
+
+	*val = get_corrected_val(r, d, rmid, eventid, msr_val);
+
+	return 0;
+}
+
 /*
  * The power-on reset value of MSR_RMID_SNC_CONFIG is 0x1
  * which indicates that RMIDs are configured in legacy mode.
-- 
2.34.1



* [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (21 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:03   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration Babu Moger
                   ` (11 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

When "mbm_event" counter assignment mode is enabled, the architecture
requires a counter ID to read the event data.

Introduce an is_mbm_cntr field in struct rmid_read to indicate whether
counter assignment mode is in use.

Update the logic to call resctrl_arch_cntr_read() and
resctrl_arch_reset_cntr() when the assignment mode is active. Report
'Unassigned' in case the user attempts to read the event without assigning
a hardware counter.

Declare mbm_cntr_get() in fs/resctrl/internal.h to make it accessible to
other functions within fs/resctrl.
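
The reporting rule described above can be sketched in user space as follows.
This is only an illustration of how the error codes map to the text shown to
the user; mon_format_result() is a hypothetical helper and not part of the
patch, which does the equivalent with seq_puts() and seq_printf() in
rdtgroup_mondata_show().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the result of an event read into buf.
 * Returns the value returned by snprintf().
 */
int mon_format_result(int err, unsigned long long val, char *buf, size_t len)
{
	const char *text = NULL;

	assert(buf != NULL && len > 0);

	if (err == -EIO)
		text = "Error";
	else if (err == -EINVAL)
		text = "Unavailable";
	else if (err == -ENOENT)
		text = "Unassigned";	/* no counter assigned to the event */

	if (text)
		return snprintf(buf, len, "%s\n", text);

	/* No error: print the counter value. */
	return snprintf(buf, len, "%llu\n", val);
}
```

Keeping -ENOENT distinct from -EINVAL is what lets the user tell an event
without an assigned counter apart from a counter whose data is temporarily
unavailable.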

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Squashed two patches here.
     https://lore.kernel.org/lkml/df215f02db88cad714755cd5275f20cf0ee4ae26.1752013061.git.babu.moger@amd.com/
     https://lore.kernel.org/lkml/296c435e9bf63fc5031114cced00fbb4837ad327.1752013061.git.babu.moger@amd.com/
     Changed is_cntr field in struct rmid_read to is_mbm_cntr.
     Fixed the memory leak with arch_mon_ctx.
     Updated the resctrl.rst user doc.
     Updated the changelog.
     Report Unassigned only if none of the events in CTRL_MON and MON are assigned.

v15: New patch to add is_cntr in rmid_read as discussed in
     https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/
---
 Documentation/filesystems/resctrl.rst |  6 ++++
 fs/resctrl/ctrlmondata.c              | 22 +++++++++---
 fs/resctrl/internal.h                 |  5 +++
 fs/resctrl/monitor.c                  | 52 ++++++++++++++++++++-------
 4 files changed, 67 insertions(+), 18 deletions(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 446736dbd97f..4c24c5f3f4c1 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -434,6 +434,12 @@ When monitoring is enabled all MON groups will also contain:
 	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
 	where "YY" is the node number.
 
+	When the 'mbm_event' counter assignment mode is enabled, reading
+	an MBM event of a MON group returns 'Unassigned' if no hardware
+	counter is assigned to it. For CTRL_MON groups, 'Unassigned' is
+	returned if the MBM event does not have an assigned counter in the
+	CTRL_MON group nor in any of its associated MON groups.
+
 "mon_hw_id":
 	Available only with debug option. The identifier used by hardware
 	for the monitor group. On x86 this is the RMID.
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index ad7ffc6acf13..31787ce6ec91 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -563,10 +563,15 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 	rr->r = r;
 	rr->d = d;
 	rr->first = first;
-	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
-	if (IS_ERR(rr->arch_mon_ctx)) {
-		rr->err = -EINVAL;
-		return;
+	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
+	    resctrl_is_mbm_event(evtid)) {
+		rr->is_mbm_cntr = true;
+	} else {
+		rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
+		if (IS_ERR(rr->arch_mon_ctx)) {
+			rr->err = -EINVAL;
+			return;
+		}
 	}
 
 	cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
@@ -582,7 +587,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 	else
 		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
 
-	resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
+	if (rr->arch_mon_ctx)
+		resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
 }
 
 int rdtgroup_mondata_show(struct seq_file *m, void *arg)
@@ -653,10 +659,16 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 
 checkresult:
 
+	/*
+	 * -ENOENT is a special case, set only when "mbm_event" counter assignment
+	 * mode is enabled and no counter has been assigned.
+	 */
 	if (rr.err == -EIO)
 		seq_puts(m, "Error\n");
 	else if (rr.err == -EINVAL)
 		seq_puts(m, "Unavailable\n");
+	else if (rr.err == -ENOENT)
+		seq_puts(m, "Unassigned\n");
 	else
 		seq_printf(m, "%llu\n", rr.val);
 
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 216588842444..eeee83a5067a 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -110,6 +110,8 @@ struct mon_data {
  *	   domains in @r sharing L3 @ci.id
  * @evtid: Which monitor event to read.
  * @first: Initialize MBM counter when true.
+ * @is_mbm_cntr: Is the counter valid? true if "mbm_event" counter assignment mode is
+ *	   enabled and it is an MBM event.
  * @ci_id: Cacheinfo id for L3. Only set when @d is NULL. Used when summing domains.
  * @err:   Error encountered when reading counter.
  * @val:   Returned value of event counter. If @rgrp is a parent resource group,
@@ -124,6 +126,7 @@ struct rmid_read {
 	struct rdt_mon_domain	*d;
 	enum resctrl_event_id	evtid;
 	bool			first;
+	bool			is_mbm_cntr;
 	unsigned int		ci_id;
 	int			err;
 	u64			val;
@@ -391,6 +394,8 @@ int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp
 			       struct mon_evt *mevt);
 void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
 				  struct mon_evt *mevt);
+int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
+		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 070965d45770..a8b53b0ad0b7 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -362,13 +362,25 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
 	u32 closid = rdtgrp->closid;
 	u32 rmid = rdtgrp->mon.rmid;
 	struct rdt_mon_domain *d;
+	int cntr_id = -ENOENT;
 	struct cacheinfo *ci;
 	struct mbm_state *m;
 	int err, ret;
 	u64 tval = 0;
 
+	if (rr->is_mbm_cntr) {
+		cntr_id = mbm_cntr_get(rr->r, rr->d, rdtgrp, rr->evtid);
+		if (cntr_id < 0) {
+			rr->err = -ENOENT;
+			return -EINVAL;
+		}
+	}
+
 	if (rr->first) {
-		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
+		if (rr->is_mbm_cntr)
+			resctrl_arch_reset_cntr(rr->r, rr->d, closid, rmid, cntr_id, rr->evtid);
+		else
+			resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
 		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
 		if (m)
 			memset(m, 0, sizeof(struct mbm_state));
@@ -379,8 +391,12 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
 		/* Reading a single domain, must be on a CPU in that domain. */
 		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
 			return -EINVAL;
-		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
-						 rr->evtid, &tval, rr->arch_mon_ctx);
+		if (rr->is_mbm_cntr)
+			rr->err = resctrl_arch_cntr_read(rr->r, rr->d, closid, rmid, cntr_id,
+							 rr->evtid, &tval);
+		else
+			rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
+							 rr->evtid, &tval, rr->arch_mon_ctx);
 		if (rr->err)
 			return rr->err;
 
@@ -405,8 +421,12 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
 	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
 		if (d->ci_id != rr->ci_id)
 			continue;
-		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
-					     rr->evtid, &tval, rr->arch_mon_ctx);
+		if (rr->is_mbm_cntr)
+			err = resctrl_arch_cntr_read(rr->r, d, closid, rmid, cntr_id,
+						     rr->evtid, &tval);
+		else
+			err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
+						     rr->evtid, &tval, rr->arch_mon_ctx);
 		if (!err) {
 			rr->val += tval;
 			ret = 0;
@@ -613,11 +633,16 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
 	rr.r = r;
 	rr.d = d;
 	rr.evtid = evtid;
-	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
-	if (IS_ERR(rr.arch_mon_ctx)) {
-		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
-				    PTR_ERR(rr.arch_mon_ctx));
-		return;
+	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
+	    resctrl_is_mbm_event(evtid)) {
+		rr.is_mbm_cntr = true;
+	} else {
+		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
+		if (IS_ERR(rr.arch_mon_ctx)) {
+			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
+					    PTR_ERR(rr.arch_mon_ctx));
+			return;
+		}
 	}
 
 	__mon_event_count(rdtgrp, &rr);
@@ -629,7 +654,8 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
 	if (is_mba_sc(NULL))
 		mbm_bw_count(rdtgrp, &rr);
 
-	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
+	if (rr.arch_mon_ctx)
+		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
 }
 
 static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
@@ -983,8 +1009,8 @@ static void rdtgroup_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *
  * Return:
  * Valid counter ID on success, or -ENOENT on failure.
  */
-static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
-			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
+int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
+		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
 {
 	int cntr_id;
 
-- 
2.34.1



* [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (22 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:03   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
                   ` (10 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The "mbm_event" counter assignment mode allows the user to assign a
hardware counter to an RMID, event pair and monitor the bandwidth as long
as it is assigned. The user can specify the memory transaction(s) for the
counter to track.

Add the definitions for supported memory transactions (e.g., read, write,
etc.) the counter can be configured with.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Minor code comment update.

v15: Updated the changelog.
     Moved NUM_MBM_TRANSACTIONS to include/linux/resctrl_types.h
     Changed struct mbm_config_value to struct mbm_transaction.

v14: Changed the term memory events to memory transactions to be consistent.
     Changed the name of the structure to mbm_config_value (from mbm_evt_value).
     Changed name to memory transactions where applicable.
     Changed subject line to fs/resctrl.

v13: Updated the changelog.
     Removed the definitions from resctrl_types.h and moved to internal.h.
     Removed mbm_assign_config definition. Configurations will be part of
     mon_evt list.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The rdtgroup.c file has now been split between the FS and ARCH directories.

v12: New patch to support event configurations via new counter_configs
     method.
---
 fs/resctrl/internal.h         | 11 +++++++++++
 fs/resctrl/monitor.c          | 11 +++++++++++
 include/linux/resctrl_types.h |  3 +++
 3 files changed, 25 insertions(+)

diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index eeee83a5067a..693268bcbad2 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -216,6 +216,17 @@ struct rdtgroup {
 	struct pseudo_lock_region	*plr;
 };
 
+/**
+ * struct mbm_transaction - Memory transaction an MBM event can be configured with.
+ * @name:	Name of memory transaction (read, write ...).
+ * @val:	The bit (e.g. READS_TO_LOCAL_MEM or READS_TO_REMOTE_MEM) used to
+ *		represent the memory transaction within an event's configuration.
+ */
+struct mbm_transaction {
+	char	name[32];
+	u32	val;
+};
+
 /* rdtgroup.flags */
 #define	RDT_DELETED		1
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index a8b53b0ad0b7..16bcfeeb89e6 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -918,6 +918,17 @@ u32 resctrl_get_mon_evt_cfg(enum resctrl_event_id evtid)
 	return mon_event_all[evtid].evt_cfg;
 }
 
+/* Decoded values for each type of memory transaction. */
+struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
+	{"local_reads", READS_TO_LOCAL_MEM},
+	{"remote_reads", READS_TO_REMOTE_MEM},
+	{"local_non_temporal_writes", NON_TEMP_WRITE_TO_LOCAL_MEM},
+	{"remote_non_temporal_writes", NON_TEMP_WRITE_TO_REMOTE_MEM},
+	{"local_reads_slow_memory", READS_TO_LOCAL_S_MEM},
+	{"remote_reads_slow_memory", READS_TO_REMOTE_S_MEM},
+	{"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
+};
+
 /**
  * resctrl_mon_resource_init() - Initialise global monitoring structures.
  *
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index d98351663c2c..acfe07860b34 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -34,6 +34,9 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
 
+/* Number of memory transactions that an MBM event can be configured with */
+#define NUM_MBM_TRANSACTIONS		7
+
 /* Event IDs */
 enum resctrl_event_id {
 	/* Must match value of first event below */
-- 
2.34.1



* [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (23 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:04   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations Babu Moger
                   ` (9 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

When "mbm_event" counter assignment mode is supported, the
/sys/fs/resctrl/info/L3_MON/event_configs directory contains a
sub-directory for each MBM event that can be assigned to a counter.
The MBM event sub-directory contains a file named "event_filter" that
is used to view and modify which memory transactions the MBM event is
configured with.

Create /sys/fs/resctrl/info/L3_MON/event_configs directory on resctrl
mount and pre-populate it with directories for the two existing MBM events:
mbm_total_bytes and mbm_local_bytes. Create the "event_filter" file within
each MBM event directory with the needed *show() that displays the memory
transactions with which the MBM event is configured.

Example:
$ mount -t resctrl resctrl /sys/fs/resctrl
$ cd /sys/fs/resctrl/
$ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter
  local_reads,remote_reads,local_non_temporal_writes,
  remote_non_temporal_writes,local_reads_slow_memory,
  remote_reads_slow_memory,dirty_victim_writes_all

$ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter
  local_reads,local_non_temporal_writes,local_reads_slow_memory
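
The output above can be reproduced with a small user-space sketch of the
formatting logic. The bit positions (bit 0 for local_reads through bit 6 for
dirty_victim_writes_all) are an assumption made for this example, and
format_event_filter() is a hypothetical stand-in for event_filter_show()
that writes into a buffer instead of a seq_file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct mbm_transaction {
	const char *name;
	uint32_t val;
};

/* Assumed bit assignment: one bit per transaction type, in table order. */
static const struct mbm_transaction mbm_transactions[] = {
	{"local_reads",                1u << 0},
	{"remote_reads",               1u << 1},
	{"local_non_temporal_writes",  1u << 2},
	{"remote_non_temporal_writes", 1u << 3},
	{"local_reads_slow_memory",    1u << 4},
	{"remote_reads_slow_memory",   1u << 5},
	{"dirty_victim_writes_all",    1u << 6},
};

/*
 * Hypothetical helper: write the comma-separated names of all transactions
 * set in evt_cfg, followed by a newline. Returns the number of characters
 * written (excluding the terminating NUL), or -1 if buf is too small.
 */
int format_event_filter(uint32_t evt_cfg, char *buf, size_t len)
{
	size_t n = sizeof(mbm_transactions) / sizeof(mbm_transactions[0]);
	size_t off = 0;
	int sep = 0;

	assert(buf != NULL);

	for (size_t i = 0; i < n; i++) {
		int w;

		if (!(evt_cfg & mbm_transactions[i].val))
			continue;

		w = snprintf(buf + off, len - off, "%s%s",
			     sep ? "," : "", mbm_transactions[i].name);
		if (w < 0 || (size_t)w >= len - off)
			return -1;
		off += (size_t)w;
		sep = 1;
	}

	/* Room is needed for the newline and the terminating NUL. */
	if (off + 2 > len)
		return -1;
	buf[off++] = '\n';
	buf[off] = '\0';

	return (int)off;
}
```

Under the assumed bit layout, a configuration of 0x15 (bits 0, 2 and 4)
produces the mbm_local_bytes line shown above, and 0x7f produces the
mbm_total_bytes line.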

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Moved event_filter_show() to fs/resctrl/monitor.c
     Changed the goto label out_config to out.
     Added rdtgroup_mutex in event_filter_show().
     Removed extern for mbm_transactions. Not required.
     Added prototype of rdt_kn_parent_priv() in fs/resctrl/internal.h so it can be
     called from monitor.c.

v15: Fixed the event_filter display with proper spacing.
     Updated the changelog.
     Changed the function name resctrl_mkdir_counter_configs() to
     resctrl_mkdir_event_configs().
     Called resctrl_mkdir_event_configs from rdtgroup_mkdir_info_resdir().
     It avoids the call kernfs_find_and_get() to get the node for info directory.
     Used for_each_mon_event() where applicable.

v14: Updated the changelog with context. Thanks to Reinette.
     Changed the name of directory to event_configs from counter_config.
     Updated user doc about the memory transactions supported by assignment.
     Removed mbm_mode from struct mon_evt. Not required anymore.

v13: Updated user doc (resctrl.rst).
     Changed the name of the function resctrl_mkdir_info_configs to
     resctrl_mkdir_counter_configs().
     Replaced seq_puts() with seq_putc() where applicable.
     Removed RFTYPE_MON_CONFIG definition. Not required.
     Changed the name of the flag RFTYPE_CONFIG to RFTYPE_ASSIGN_CONFIG.
     Reinette suggested RFTYPE_MBM_EVENT_CONFIG but RFTYPE_ASSIGN_CONFIG
     seemed shorter and more precise.
     The configuration is created using evt_list.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The monitor.c/rdtgroup.c files have been split between the FS and ARCH directories.

v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
---
 Documentation/filesystems/resctrl.rst | 32 +++++++++++++++
 fs/resctrl/internal.h                 |  6 +++
 fs/resctrl/monitor.c                  | 24 +++++++++++
 fs/resctrl/rdtgroup.c                 | 58 ++++++++++++++++++++++++++-
 4 files changed, 119 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 4c24c5f3f4c1..3dfc177f9792 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -310,6 +310,38 @@ with the following files:
 	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
 	  0=30;1=30
 
+"event_configs":
+	Directory that exists when "mbm_event" counter assignment mode is supported.
+	Contains a sub-directory for each MBM event that can be assigned to a counter.
+
+	Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
+	Each MBM event's sub-directory contains a file named "event_filter" that is
+	used to view and modify which memory transactions the MBM event is configured
+	with.
+
+	List of memory transaction types supported:
+
+	==========================  ========================================================
+	Name			    Description
+	==========================  ========================================================
+	dirty_victim_writes_all     Dirty Victims from the QOS domain to all types of memory
+	remote_reads_slow_memory    Reads to slow memory in the non-local NUMA domain
+	local_reads_slow_memory     Reads to slow memory in the local NUMA domain
+	remote_non_temporal_writes  Non-temporal writes to non-local NUMA domain
+	local_non_temporal_writes   Non-temporal writes to local NUMA domain
+	remote_reads                Reads to memory in the non-local NUMA domain
+	local_reads                 Reads to memory in the local NUMA domain
+	==========================  ========================================================
+
+	For example::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+	  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
+	  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
+
+	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+	  local_reads,local_non_temporal_writes,local_reads_slow_memory
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 693268bcbad2..e082d8718199 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -252,6 +252,8 @@ struct mbm_transaction {
 
 #define RFTYPE_DEBUG			BIT(10)
 
+#define RFTYPE_ASSIGN_CONFIG		BIT(11)
+
 #define RFTYPE_CTRL_INFO		(RFTYPE_INFO | RFTYPE_CTRL)
 
 #define RFTYPE_MON_INFO			(RFTYPE_INFO | RFTYPE_MON)
@@ -408,6 +410,10 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
 int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
 		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
 
+void *rdt_kn_parent_priv(struct kernfs_node *kn);
+
+int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v);
+
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 16bcfeeb89e6..fa5f63126682 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
 	{"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
 };
 
+int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
+{
+	struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
+	bool sep = false;
+	int i;
+
+	mutex_lock(&rdtgroup_mutex);
+
+	for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
+		if (mevt->evt_cfg & mbm_transactions[i].val) {
+			if (sep)
+				seq_putc(seq, ',');
+			seq_printf(seq, "%s", mbm_transactions[i].name);
+			sep = true;
+		}
+	}
+	seq_putc(seq, '\n');
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}
+
 /**
  * resctrl_mon_resource_init() - Initialise global monitoring structures.
  *
@@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void)
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 		resctrl_file_fflags_init("available_mbm_cntrs",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
 	}
 
 	return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 15d10c346307..11fc8e362ead 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -975,7 +975,7 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
 	return 0;
 }
 
-static void *rdt_kn_parent_priv(struct kernfs_node *kn)
+void *rdt_kn_parent_priv(struct kernfs_node *kn)
 {
 	/*
 	 * The parent pointer is only valid within RCU section since it can be
@@ -2019,6 +2019,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= mbm_local_bytes_config_show,
 		.write		= mbm_local_bytes_config_write,
 	},
+	{
+		.name		= "event_filter",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= event_filter_show,
+	},
 	{
 		.name		= "mbm_assign_mode",
 		.mode		= 0444,
@@ -2279,10 +2285,48 @@ int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
 	return ret;
 }
 
+static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_node *l3_mon_kn)
+{
+	struct kernfs_node *kn_subdir, *kn_subdir2;
+	struct mon_evt *mevt;
+	int ret;
+
+	kn_subdir = kernfs_create_dir(l3_mon_kn, "event_configs", l3_mon_kn->mode, NULL);
+	if (IS_ERR(kn_subdir))
+		return PTR_ERR(kn_subdir);
+
+	ret = rdtgroup_kn_set_ugid(kn_subdir);
+	if (ret)
+		return ret;
+
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
+			continue;
+
+		kn_subdir2 = kernfs_create_dir(kn_subdir, mevt->name, kn_subdir->mode, mevt);
+		if (IS_ERR(kn_subdir2)) {
+			ret = PTR_ERR(kn_subdir2);
+			goto out;
+		}
+
+		ret = rdtgroup_kn_set_ugid(kn_subdir2);
+		if (ret)
+			goto out;
+
+		ret = rdtgroup_add_files(kn_subdir2, RFTYPE_ASSIGN_CONFIG);
+		if (ret)
+			break;
+	}
+
+out:
+	return ret;
+}
+
 static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 				      unsigned long fflags)
 {
 	struct kernfs_node *kn_subdir;
+	struct rdt_resource *r;
 	int ret;
 
 	kn_subdir = kernfs_create_dir(kn_info, name,
@@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 		return ret;
 
 	ret = rdtgroup_add_files(kn_subdir, fflags);
+	if (ret)
+		return ret;
+
+	if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) {
+		r = priv;
+		if (r->mon.mbm_cntr_assignable) {
+			ret = resctrl_mkdir_event_configs(r, kn_subdir);
+			if (ret)
+				return ret;
+		}
+	}
+
 	if (!ret)
 		kernfs_activate(kn_subdir);
 
-- 
2.34.1



* [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (24 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:05   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir Babu Moger
                   ` (8 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

When "mbm_event" counter assignment mode is supported, users can modify
the event configuration by writing to the 'event_filter' resctrl file.
The event configurations for mbm_event mode are located in
/sys/fs/resctrl/info/L3_MON/event_configs/.

Update the assignments of all CTRL_MON and MON resource groups when the
event configuration is modified.

Example:
$ mount -t resctrl resctrl /sys/fs/resctrl

$ cd /sys/fs/resctrl/

$ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter
  local_reads,local_non_temporal_writes,local_reads_slow_memory

$ echo "local_reads,local_non_temporal_writes" >
  info/L3_MON/event_configs/mbm_total_bytes/event_filter

$ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter
  local_reads,local_non_temporal_writes
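
For readers who want to try the parsing rules outside the kernel, here is a
minimal userspace sketch of turning a comma-separated filter string into a
bitmask. It is not the code from this patch: txn_table, trim() and
parse_filter() are hypothetical names, and the bit positions are illustrative
assumptions, not the hardware encoding. Like the kernel parser, it tolerates
whitespace around names and rejects the whole string on an unknown name.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name/bit table; the bit positions are illustrative only. */
struct txn_name {
	const char *name;
	uint32_t val;
};

static const struct txn_name txn_table[] = {
	{ "local_reads",               1u << 0 },
	{ "local_non_temporal_writes", 1u << 1 },
	{ "local_reads_slow_memory",   1u << 2 },
};

#define NUM_TXN (sizeof(txn_table) / sizeof(txn_table[0]))

/* Trim leading and trailing whitespace in place. */
static char *trim(char *s)
{
	char *end;

	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';
	return s;
}

/*
 * Parse the comma-separated list in @buf (modified in place) into a bitmask.
 * Returns 0 and stores the mask in @val on success, -1 on an unknown name.
 * @val is only written on success, so a bad string never leaves a partial
 * configuration behind.
 */
static int parse_filter(char *buf, uint32_t *val)
{
	uint32_t mask = 0;

	while (buf && *buf) {
		char *comma = strchr(buf, ',');
		char *tok;
		size_t i;

		if (comma)
			*comma = '\0';
		tok = trim(buf);
		buf = comma ? comma + 1 : NULL;

		for (i = 0; i < NUM_TXN; i++) {
			if (!strcmp(txn_table[i].name, tok)) {
				mask |= txn_table[i].val;
				break;
			}
		}
		if (i == NUM_TXN)
			return -1;
	}

	*val = mask;
	return 0;
}
```

Calling parse_filter() on a writable copy of
"local_reads, local_non_temporal_writes" sets the two corresponding bits,
while a string containing an unknown name fails and leaves the caller's
value unchanged.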

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Moved resctrl_process_configs() and event_filter_write()
     to fs/resctrl/monitor.c.
     Renamed resctrl_process_configs() -> resctrl_parse_mem_transactions().
     A few minor code comment updates.

v15: Updated changelog.
     Updated spacing in resctrl.rst.
     Corrected the name counter_configs -> event_configs.
     Changed the name rdtgroup_assign_cntr() -> rdtgroup_update_cntr_event().
     Removed the code to check d->cntr_cfg[cntr_id].evt_cfg.
     Fixed the partial initialization of val in resctrl_process_configs().
     Passed mon_evt where applicable. The struct rdt_resource can be obtained from mon_evt::rid.

v14: Passed struct mon_evt where applicable instead of just the event type.
     Made a few text corrections about the memory transaction type.
     Renamed few functions resctrl_group_assign() -> rdtgroup_assign_cntr()
     resctrl_update_assign() -> resctrl_assign_cntr_allrdtgrp()
     Removed few extra bases.

v13: Updated changelog for imperative mode.
     Added function description in the prototype.
     Updated the user doc resctrl.rst to address few feedback.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The rdtgroup.c/monitor.c file has now been split between the FS and ARCH directories.

v12: New patch to modify event configurations.
---
 Documentation/filesystems/resctrl.rst |  12 +++
 fs/resctrl/internal.h                 |   4 +
 fs/resctrl/monitor.c                  | 114 ++++++++++++++++++++++++++
 fs/resctrl/rdtgroup.c                 |   3 +-
 4 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 3dfc177f9792..37dbad4d50f7 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -342,6 +342,18 @@ with the following files:
 	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
 	  local_reads,local_non_temporal_writes,local_reads_slow_memory
 
+	Modify the event configuration by writing to the "event_filter" file within
+	the "event_configs" directory. The read/write "event_filter" file contains the
+	configuration of the event that reflects which memory transactions are counted by it.
+
+	For example::
+
+	  # echo "local_reads, local_non_temporal_writes" >
+	    /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+
+	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+	   local_reads,local_non_temporal_writes
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index e082d8718199..e2e3fc0c5fab 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -409,11 +409,15 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
 				  struct mon_evt *mevt);
 int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
 		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
+void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt);
 
 void *rdt_kn_parent_priv(struct kernfs_node *kn);
 
 int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v);
 
+ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
+			   loff_t off);
+
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index fa5f63126682..8efbeb910f77 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -952,6 +952,77 @@ int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v
 	return 0;
 }
 
+static int resctrl_parse_mem_transactions(char *tok, u32 *val)
+{
+	u32 temp_val = 0;
+	char *evt_str;
+	bool found;
+	int i;
+
+next_config:
+	if (!tok || tok[0] == '\0') {
+		*val = temp_val;
+		return 0;
+	}
+
+	/* Start processing the strings for each memory transaction type */
+	evt_str = strim(strsep(&tok, ","));
+	found = false;
+	for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
+		if (!strcmp(mbm_transactions[i].name, evt_str)) {
+			temp_val |= mbm_transactions[i].val;
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		rdt_last_cmd_printf("Invalid memory transaction type %s\n", evt_str);
+		return -EINVAL;
+	}
+
+	goto next_config;
+}
+
+ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
+			   loff_t off)
+{
+	struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
+	struct rdt_resource *r;
+	u32 evt_cfg = 0;
+	int ret = 0;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	r = resctrl_arch_get_resource(mevt->rid);
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event counter assignment mode is not enabled\n");
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = resctrl_parse_mem_transactions(buf, &evt_cfg);
+	if (!ret && mevt->evt_cfg != evt_cfg) {
+		mevt->evt_cfg = evt_cfg;
+		resctrl_update_cntr_allrdtgrp(mevt);
+	}
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 /**
  * resctrl_mon_resource_init() - Initialise global monitoring structures.
  *
@@ -1193,3 +1264,46 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
 		rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
 	}
 }
+
+/*
+ * rdtgroup_update_cntr_event - Update the counter assignments for the event
+ *				in a group.
+ * @r:		Resource to which update needs to be done.
+ * @rdtgrp:	Resctrl group.
+ * @evtid:	MBM monitor event.
+ */
+static void rdtgroup_update_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+				       enum resctrl_event_id evtid)
+{
+	struct rdt_mon_domain *d;
+	int cntr_id;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
+		if (cntr_id >= 0)
+			resctrl_arch_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
+						 rdtgrp->closid, cntr_id, true);
+	}
+}
+
+/*
+ * resctrl_update_cntr_allrdtgrp - Update the counter assignments for the event
+ *				   for all the groups.
+ * @mevt:	MBM monitor event.
+ */
+void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(mevt->rid);
+	struct rdtgroup *prgrp, *crgrp;
+
+	/*
+	 * Find all the groups where the event is assigned and update the
+	 * configuration of existing assignments.
+	 */
+	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
+		rdtgroup_update_cntr_event(r, prgrp, mevt->evtid);
+
+		list_for_each_entry(crgrp, &prgrp->mon.crdtgrp_list, mon.crdtgrp_list)
+			rdtgroup_update_cntr_event(r, crgrp, mevt->evtid);
+	}
+}
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 11fc8e362ead..c3d6540c3280 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2021,9 +2021,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "event_filter",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= event_filter_show,
+		.write		= event_filter_write,
 	},
 	{
 		.name		= "mbm_assign_mode",
-- 
2.34.1



* [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (25 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:08   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal Babu Moger
                   ` (7 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The "mbm_event" counter assignment mode allows users to assign a hardware
counter to an RMID, event pair and monitor the bandwidth as long as it is
assigned.

Introduce a user-configurable option that determines if a counter will
automatically be assigned to an RMID, event pair when its associated
monitor group is created via mkdir.

Suggested-by: Peter Newman <peternewman@google.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Fixed the return in resctrl_mbm_assign_on_mkdir_write().

v15: Fixed the static checker warning in resctrl_mbm_assign_on_mkdir_write() reported in
     https://lore.kernel.org/lkml/dd4a1021-b996-438e-941c-69dfcea5f22a@intel.com/

v14: Added rdtgroup_mutex in resctrl_mbm_assign_on_mkdir_show().
     Updated resctrl.rst for clarity.
     Fixed squashing of few previous changes.
     Added more code documentation.

v13: Added Suggested-by tag.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The rdtgroup.c/monitor.c file has now been split between the FS and ARCH directories.

v12: New patch. Added after the discussion on the list.
     https://lore.kernel.org/lkml/CALPaoCh8siZKjL_3yvOYGL4cF_n_38KpUFgHVGbQ86nD+Q2_SA@mail.gmail.com/
---
 Documentation/filesystems/resctrl.rst | 16 ++++++++++
 fs/resctrl/monitor.c                  |  2 ++
 fs/resctrl/rdtgroup.c                 | 43 +++++++++++++++++++++++++++
 include/linux/resctrl.h               |  3 ++
 4 files changed, 64 insertions(+)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 37dbad4d50f7..165e0d315af7 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -354,6 +354,22 @@ with the following files:
 	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
 	   local_reads,local_non_temporal_writes
 
+"mbm_assign_on_mkdir":
+	Determines if a counter will automatically be assigned to an RMID, event pair
+	when its associated monitor group is created via mkdir. It is enabled by default
+	on boot and users can disable it by writing to the interface.
+
+	"0":
+		Auto assignment is disabled.
+	"1":
+		Auto assignment is enabled.
+
+	Example::
+
+	  # echo 0 > /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
+	  0
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 8efbeb910f77..6205bbfe08fb 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1077,6 +1077,8 @@ int resctrl_mon_resource_init(void)
 		resctrl_file_fflags_init("available_mbm_cntrs",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
+		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
+					 RFTYPE_RES_CACHE);
 	}
 
 	return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index c3d6540c3280..bf04235d2603 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1895,6 +1895,42 @@ static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+static int resctrl_mbm_assign_on_mkdir_show(struct kernfs_open_file *of,
+					    struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+
+	mutex_lock(&rdtgroup_mutex);
+	rdt_last_cmd_clear();
+
+	seq_printf(s, "%u\n", r->mon.mbm_assign_on_mkdir);
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}
+
+static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
+						 char *buf, size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	bool value;
+	int ret;
+
+	ret = kstrtobool(buf, &value);
+	if (ret)
+		return ret;
+
+	mutex_lock(&rdtgroup_mutex);
+	rdt_last_cmd_clear();
+
+	r->mon.mbm_assign_on_mkdir = value;
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return nbytes;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -1904,6 +1940,13 @@ static struct rftype res_common_files[] = {
 		.seq_show	= rdt_last_cmd_status_show,
 		.fflags		= RFTYPE_TOP_INFO,
 	},
+	{
+		.name		= "mbm_assign_on_mkdir",
+		.mode		= 0644,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_mbm_assign_on_mkdir_show,
+		.write		= resctrl_mbm_assign_on_mkdir_write,
+	},
 	{
 		.name		= "num_closids",
 		.mode		= 0444,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 4d37827121a6..632b9ee5466a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -277,12 +277,15 @@ enum resctrl_schema_fmt {
  *			monitoring events can be configured.
  * @num_mbm_cntrs:	Number of assignable counters.
  * @mbm_cntr_assignable:Is system capable of supporting counter assignment?
+ * @mbm_assign_on_mkdir:True if counters should automatically be assigned to MBM
+ *			events of monitor groups created via mkdir.
  */
 struct resctrl_mon {
 	int			num_rmid;
 	unsigned int		mbm_cfg_mask;
 	int			num_mbm_cntrs;
 	bool			mbm_cntr_assignable;
+	bool			mbm_assign_on_mkdir;
 };
 
 /**
-- 
2.34.1



* [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (26 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:08   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
                   ` (6 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Resctrl provides a user-configurable option mbm_assign_on_mkdir that
determines if a counter will automatically be assigned to an RMID, event
pair when its associated monitor group is created via mkdir.

Enable mbm_assign_on_mkdir by default to automatically assign counters to
the two default events (MBM total and MBM local) of a new monitoring group
created via mkdir. This maintains backward compatibility with original
resctrl support for these two events.

Unassign and free counters belonging to a monitoring group when the group
is deleted.

Monitor group creation does not fail if a counter cannot be assigned to one
or both events. There may be limited counters and users have the
flexibility to modify counter assignments at a later time. Log the error
message "Failed to allocate counter for <event> in domain <id>" in
/sys/fs/resctrl/info/last_cmd_status when a new monitoring group is created
but counter assignment failed.
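
The best-effort policy described above can be illustrated with a small
userspace model. This is a sketch under stated assumptions, not the code in
this patch: it models a single domain, NUM_CNTRS is an arbitrary pool size,
and cntr_alloc(), group_create(), group_remove() and last_status are
hypothetical names. It only shows that group creation never fails when the
pool is exhausted, and that removing a group returns its counters.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_CNTRS 3	/* hypothetical pool size for one domain */

enum evt { EVT_TOTAL, EVT_LOCAL, EVT_MAX };

struct group {
	int cntr[EVT_MAX];	/* counter index per event, -1 when unassigned */
};

static bool cntr_in_use[NUM_CNTRS];
static char last_status[64];	/* stand-in for info/last_cmd_status */

/* Return a free counter index, or -1 when the pool is exhausted. */
static int cntr_alloc(void)
{
	for (int i = 0; i < NUM_CNTRS; i++) {
		if (!cntr_in_use[i]) {
			cntr_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/*
 * Create a group. When @assign_on_mkdir is set, try to assign a counter to
 * each event. A failed assignment is recorded in last_status but does not
 * fail the creation, so the function returns void.
 */
static void group_create(struct group *g, bool assign_on_mkdir)
{
	static const char * const names[EVT_MAX] = {
		"mbm_total_bytes", "mbm_local_bytes",
	};

	for (int e = 0; e < EVT_MAX; e++) {
		g->cntr[e] = assign_on_mkdir ? cntr_alloc() : -1;
		if (assign_on_mkdir && g->cntr[e] < 0)
			snprintf(last_status, sizeof(last_status),
				 "Failed to allocate counter for %s", names[e]);
	}
}

/* Remove a group and return its counters to the pool. */
static void group_remove(struct group *g)
{
	for (int e = 0; e < EVT_MAX; e++) {
		if (g->cntr[e] >= 0) {
			cntr_in_use[g->cntr[e]] = false;
			g->cntr[e] = -1;
		}
	}
}
```

With three counters, the first group gets both events assigned, the second
group gets only its total event assigned, and removing the first group makes
its two counters available again.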

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Updated the changelog. Thanks to Reinette.
     Moved r->mon.mbm_assign_on_mkdir initialization to resctrl_mon_resource_init().
     Minor code comment update.
     Updated the subject line to fs/resctrl:

v15: Updated the subject line.
     Updated changelog to add unassign part.
     Fixed the check in rdtgroup_assign_cntrs() to call assign correctly.
     Renamed resctrl_assign_cntr_event() -> rdtgroup_assign_cntr_event()
             resctrl_unassign_cntr_event() -> rdtgroup_unassign_cntr_event().

v14: Updated the changelog with changed name mbm_event.
     Update code comments with changed name mbm_event.
     Changed the code to reflect Tony's struct mon_evt changes.

v13: Changes due to calling of resctrl_assign_cntr_event() and resctrl_unassign_cntr_event().
     It only takes evtid. evt_cfg is not required anymore.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The monitor.c/rdtgroup.c files have been split between the FS and ARCH directories.

v12: Removed mbm_cntr_reset() as it is not required while removing the group.
     Update the commit text.
     Added r->mon_capable check in rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs().

v11: Moved mbm_cntr_reset() to monitor.c.
     Added code to reset non-architectural state in mbm_cntr_reset().
     Added missing rdtgroup_unassign_cntrs() calls on failure path.

v10: Assigned the counter before exposing the event files.
    Moved the call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
    This is called for both CTRL_MON and MON group creation.
    Call mbm_cntr_reset() when unmounted to clear all the assignments.
    Taken care of few other feedback comments.

v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
    Updated couple of rdtgroup_unassign_cntrs() calls properly.
    Updated function comments.

v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
    Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
    Fixed the problem with unassigning the child MON groups of CTRL_MON group.

v7: Reworded the commit message.
    Removed the reference of ABMC with mbm_cntr_assign.
    Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.

v6: Removed the redundant comments on all the calls of
    rdtgroup_assign_cntrs. Updated the commit message.
    Dropped printing error message on every call of rdtgroup_assign_cntrs.

v5: Removed the code to enable/disable ABMC during the mount.
    That will be another patch.
    Added arch callers to get the arch specific data.
    Renamed functions to match the other ABMC functions.
    Added code comments for assignment failures.

v4: Few name changes based on the upstream discussion.
    Commit message update.

v3: This is a new patch. Patch addresses the upstream comment to enable
    ABMC feature by default if the feature is available.
---
 fs/resctrl/monitor.c  |  1 +
 fs/resctrl/rdtgroup.c | 70 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 6205bbfe08fb..5cf1b79c17f5 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1072,6 +1072,7 @@ int resctrl_mon_resource_init(void)
 		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
 								   READS_TO_LOCAL_S_MEM |
 								   NON_TEMP_WRITE_TO_LOCAL_MEM;
+		r->mon.mbm_assign_on_mkdir = true;
 		resctrl_file_fflags_init("num_mbm_cntrs",
 					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 		resctrl_file_fflags_init("available_mbm_cntrs",
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index bf04235d2603..d087ba990cd3 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2792,6 +2792,54 @@ static void schemata_list_destroy(void)
 	}
 }
 
+/*
+ * rdtgroup_assign_cntrs() - Assign counters to MBM events. Called when
+ *			     a new group is created.
+ * If "mbm_event" counter assignment mode is enabled, counters should be
+ * automatically assigned if the "mbm_assign_on_mkdir" is set.
+ * Each group can accommodate two counters per domain: one for the total
+ * event and one for the local event. Assignments may fail due to the limited
+ * number of counters. However, it is not necessary to fail the group creation
+ * and thus no failure is returned. Users have the option to modify the
+ * counter assignments after the group has been created.
+ */
+static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+	if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r) ||
+	    !r->mon.mbm_assign_on_mkdir)
+		return;
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+		rdtgroup_assign_cntr_event(NULL, rdtgrp,
+					   &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+		rdtgroup_assign_cntr_event(NULL, rdtgrp,
+					   &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
+}
+
+/*
+ * rdtgroup_unassign_cntrs() - Unassign the counters associated with MBM events.
+ *			       Called when a group is deleted.
+ */
+static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+	if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r))
+		return;
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+		rdtgroup_unassign_cntr_event(NULL, rdtgrp,
+					     &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
+
+	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+		rdtgroup_unassign_cntr_event(NULL, rdtgrp,
+					     &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
+}
+
 static int rdt_get_tree(struct fs_context *fc)
 {
 	struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -2848,6 +2896,8 @@ static int rdt_get_tree(struct fs_context *fc)
 		if (ret < 0)
 			goto out_info;
 
+		rdtgroup_assign_cntrs(&rdtgroup_default);
+
 		ret = mkdir_mondata_all(rdtgroup_default.kn,
 					&rdtgroup_default, &kn_mondata);
 		if (ret < 0)
@@ -2886,8 +2936,10 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mondata);
 out_mongrp:
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(&rdtgroup_default);
 		kernfs_remove(kn_mongrp);
+	}
 out_info:
 	kernfs_remove(kn_info);
 out_closid_exit:
@@ -3033,6 +3085,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
 
 	head = &rdtgrp->mon.crdtgrp_list;
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
+		rdtgroup_unassign_cntrs(sentry);
 		free_rmid(sentry->closid, sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);
 
@@ -3073,6 +3126,8 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
+		rdtgroup_unassign_cntrs(rdtgrp);
+
 		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 		kernfs_remove(rdtgrp->kn);
@@ -3157,6 +3212,7 @@ static void resctrl_fs_teardown(void)
 		return;
 
 	rmdir_all_sub();
+	rdtgroup_unassign_cntrs(&rdtgroup_default);
 	mon_put_kn_priv();
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
@@ -3637,9 +3693,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 	}
 	rdtgrp->mon.rmid = ret;
 
+	rdtgroup_assign_cntrs(rdtgrp);
+
 	ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
 	if (ret) {
 		rdt_last_cmd_puts("kernfs subdir error\n");
+		rdtgroup_unassign_cntrs(rdtgrp);
 		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 		return ret;
 	}
@@ -3649,8 +3708,10 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 
 static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
 {
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(rgrp);
 		free_rmid(rgrp->closid, rgrp->mon.rmid);
+	}
 }
 
 /*
@@ -3926,6 +3987,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	update_closid_rmid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
+
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	/*
@@ -3973,6 +4037,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);
 
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
 
-- 
2.34.1



* [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (27 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:09   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify " Babu Moger
                   ` (5 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Introduce the mbm_L3_assignments resctrl file associated with CTRL_MON and
MON resource groups to display the counter assignment states of the
resource group when "mbm_event" counter assignment mode is enabled.

The list is displayed in the following format:
<Event>:<Domain id>=<Assignment state>;<Domain id>=<Assignment state>

Event: A valid MBM event listed in
       /sys/fs/resctrl/info/L3_MON/event_configs directory.

Domain ID: A valid domain ID.

The assignment state can be one of the following:

_ : No counter assigned.

e : Counter assigned exclusively.

Example:
To list the assignment states for the default group
$ cd /sys/fs/resctrl
$ cat /sys/fs/resctrl/mbm_L3_assignments
mbm_total_bytes:0=e;1=e
mbm_local_bytes:0=e;1=e
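
As an illustration of the format above, a userspace tool could decode one
line of mbm_L3_assignments roughly as follows. This is only a sketch:
struct assign_line, parse_assign_line() and MAX_DOMAINS are hypothetical
names and limits chosen for the example, and only the two states listed
above ('_' and 'e') are accepted.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DOMAINS 8	/* hypothetical upper bound for this sketch */

struct assign_line {
	char event[32];			/* e.g. "mbm_total_bytes" */
	int ndom;			/* number of domains parsed */
	int dom_id[MAX_DOMAINS];
	char state[MAX_DOMAINS];	/* '_' or 'e' */
};

/*
 * Parse "<Event>:<Domain id>=<state>;<Domain id>=<state>..." into @out.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_assign_line(const char *line, struct assign_line *out)
{
	const char *p = strchr(line, ':');
	size_t len;

	if (!p)
		return -1;

	len = (size_t)(p - line);
	if (len == 0 || len >= sizeof(out->event))
		return -1;
	memcpy(out->event, line, len);
	out->event[len] = '\0';
	out->ndom = 0;
	p++;

	while (*p && *p != '\n') {
		int id, used;
		char st;

		if (out->ndom == MAX_DOMAINS)
			return -1;
		/* %n records how many characters this entry consumed. */
		if (sscanf(p, "%d=%c%n", &id, &st, &used) != 2)
			return -1;
		if (st != '_' && st != 'e')
			return -1;

		out->dom_id[out->ndom] = id;
		out->state[out->ndom] = st;
		out->ndom++;

		p += used;
		if (*p == ';')
			p++;
	}

	return 0;
}
```

Feeding it the line "mbm_total_bytes:0=e;1=e" yields the event name and two
domain entries, both in state 'e'.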

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Fixed minor merge conflicts with code displacement.
     Changed the check with mbm_cntr_get() to "< 0" from " >=".

v15: Updated the changelog with Reinette's text.
     Updated the event format list to list multiple domains.
     Changed the goto out_assing to out_unlock.
     Updated to use new loop for_each_mon_event() instead of hardcoding.

v14: Added missed rdtgroup_kn_lock_live on failure case.
     Updated the user doc resctrl.rst to clarify counter assignments.
     Updated the changelog.

v13: Changelog update.
     Few changes in mbm_L3_assignments_show() after moving the event config to evt_list.
     Resolved conflicts caused by the recent FS/ARCH code restructure.
     The rdtgroup.c/monitor.c files have been split between the FS and ARCH directories.

v12: New patch:
     Assignment interface moved inside the group based on the discussion
     https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/#t
---
 Documentation/filesystems/resctrl.rst | 28 ++++++++++++++
 fs/resctrl/monitor.c                  |  1 +
 fs/resctrl/rdtgroup.c                 | 54 +++++++++++++++++++++++++++
 3 files changed, 83 insertions(+)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 165e0d315af7..0b8ce942f112 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -514,6 +514,34 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
 	/sys/fs/resctrl/info/L3_MON/mon_features changes the input
 	event.
 
+"mbm_L3_assignments":
+	Exists when "mbm_event" counter assignment mode is supported and lists the
+	counter assignment states of the group.
+
+	The assignment list is displayed in the following format:
+
+	<Event>:<Domain ID>=<Assignment state>;<Domain ID>=<Assignment state>
+
+	Event: A valid MBM event in the
+	       /sys/fs/resctrl/info/L3_MON/event_configs directory.
+
+	Domain ID: A valid domain ID.
+
+	Assignment states:
+
+	_ : No counter assigned.
+
+	e : Counter assigned exclusively.
+
+	Example:
+	To display the counter assignment states for the default group.
+	::
+
+	 # cd /sys/fs/resctrl
+	 # cat /sys/fs/resctrl/mbm_L3_assignments
+	   mbm_total_bytes:0=e;1=e
+	   mbm_local_bytes:0=e;1=e
+
 Resource allocation rules
 -------------------------
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 5cf1b79c17f5..ebc049105949 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1080,6 +1080,7 @@ int resctrl_mon_resource_init(void)
 		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
 		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
 					 RFTYPE_RES_CACHE);
+		resctrl_file_fflags_init("mbm_L3_assignments", RFTYPE_MON_BASE);
 	}
 
 	return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index d087ba990cd3..47716e623a9c 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1931,6 +1931,54 @@ static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
 	return nbytes;
 }
 
+static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdt_mon_domain *d;
+	struct rdtgroup *rdtgrp;
+	struct mon_evt *mevt;
+	int ret = 0;
+	bool sep;
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (!rdtgrp) {
+		ret = -ENOENT;
+		goto out_unlock;
+	}
+
+	rdt_last_cmd_clear();
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
+		ret = -ENOENT;
+		goto out_unlock;
+	}
+
+	for_each_mon_event(mevt) {
+		if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
+			continue;
+
+		sep = false;
+		seq_printf(s, "%s:", mevt->name);
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			if (sep)
+				seq_putc(s, ';');
+
+			if (mbm_cntr_get(r, d, rdtgrp, mevt->evtid) < 0)
+				seq_printf(s, "%d=_", d->hdr.id);
+			else
+				seq_printf(s, "%d=e", d->hdr.id);
+
+			sep = true;
+		}
+		seq_putc(s, '\n');
+	}
+
+out_unlock:
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -2069,6 +2117,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= event_filter_show,
 		.write		= event_filter_write,
 	},
+	{
+		.name		= "mbm_L3_assignments",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= mbm_L3_assignments_show,
+	},
 	{
 		.name		= "mbm_assign_mode",
 		.mode		= 0444,
-- 
2.34.1
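
For illustration only: the display format documented in this patch
("<Event>:<Domain ID>=<Assignment state>;...") can be modeled in user
space. The sketch below is not kernel code. format_assignments() and its
parameters are hypothetical names, and the trailing newline that the
kernel emits after each event is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace model of one mbm_L3_assignments output line.
 * assigned[i] says whether domain ids[i] has a counter for @event.
 * Returns the number of characters written (excluding the NUL), or -1
 * if @out is too small.
 */
static int format_assignments(char *out, size_t len, const char *event,
			      const int *ids, const bool *assigned,
			      size_t ndoms)
{
	size_t pos;
	int n;

	assert(out && event && (ndoms == 0 || (ids && assigned)));

	n = snprintf(out, len, "%s:", event);
	if (n < 0 || (size_t)n >= len)
		return -1;
	pos = (size_t)n;

	for (size_t i = 0; i < ndoms; i++) {
		/* ';' separates domains; none before the first one. */
		n = snprintf(out + pos, len - pos, "%s%d=%c", i ? ";" : "",
			     ids[i], assigned[i] ? 'e' : '_');
		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += (size_t)n;
	}

	return (int)pos;
}
```

The separator handling mirrors the "sep" flag in
mbm_L3_assignments_show(): the first domain is printed bare and every
later domain is prefixed with ';'.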



* [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify assignments in a group
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (28 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:10   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled Babu Moger
                   ` (4 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Enable the mbm_L3_assignments resctrl file to be used to modify counter
assignments of CTRL_MON and MON groups when the "mbm_event" counter
assignment mode is enabled.

The assignment modifications are done in the following format:
<Event>:<Domain ID>=<Assignment state>

Event: A valid MBM event in the
       /sys/fs/resctrl/info/L3_MON/event_configs directory.

Domain ID: A valid domain ID. When writing, '*' applies the changes
	   to all domains.

Assignment states:

    _ : Unassign a counter.

    e : Assign a counter exclusively.

Examples:

$ cd /sys/fs/resctrl
$ cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=e;1=e
  mbm_local_bytes:0=e;1=e

To unassign the counter associated with the mbm_total_bytes event on
domain 0:

$ echo "mbm_total_bytes:0=_" > mbm_L3_assignments
$ cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=_;1=e
  mbm_local_bytes:0=e;1=e

To unassign the counter associated with the mbm_total_bytes event on
all the domains:

$ echo "mbm_total_bytes:*=_" > mbm_L3_assignments
$ cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=_;1=_
  mbm_local_bytes:0=e;1=e

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Updated the changelog for minor corrections.
     Updated resctrl.rst with a few corrections for consistency.
     Fixed a few references of counter_configs -> event_configs.
     Renamed resctrl_process_assign() to resctrl_parse_mbm_assignment().
     Moved resctrl_parse_mbm_assignment() and rdtgroup_modify_assign_state() to monitor.c.

v15: Updated the changelog a little bit.
     Fixed the spacing in event_filter display.
     Removed the enum ASSIGN_NONE etc. Not required anymore.
     Moved mbm_get_mon_event_by_name() to fs/resctrl/monitor.c
     Used the new macro for_each_mon_event().
     Renamed resctrl_get_assign_state() -> rdtgroup_modify_assign_state().
     Quite a few changes in resctrl_process_assign().
     Removed the found and domain variables.
     Called rdtgroup_modify_assign_state() directly where applicable.
     Removed couple of goto statements.

v14: Fixed the problem reported by Peter.
     Updated the changelog.
     Updated the user doc resctrl.rst.
     Added example section on how to use resctrl with mbm_assign_mode.

v13: Few changes in mbm_L3_assignments_write() after moving the event config to evt_list.
     Resolved conflicts caused by the recent FS/ARCH code restructure.

v12: New patch:
     Assignment interface moved inside the group based on the discussion
     https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/#t
---
 Documentation/filesystems/resctrl.rst | 146 +++++++++++++++++++++++++-
 fs/resctrl/internal.h                 |   3 +
 fs/resctrl/monitor.c                  |  94 +++++++++++++++++
 fs/resctrl/rdtgroup.c                 |  48 ++++++++-
 4 files changed, 289 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 0b8ce942f112..0c8701103214 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -525,7 +525,8 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
 	Event: A valid MBM event in the
 	       /sys/fs/resctrl/info/L3_MON/event_configs directory.
 
-	Domain ID: A valid domain ID.
+	Domain ID: A valid domain ID. When writing, '*' applies the changes
+		   to all the domains.
 
 	Assignment states:
 
@@ -542,6 +543,34 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
 	   mbm_total_bytes:0=e;1=e
 	   mbm_local_bytes:0=e;1=e
 
+	Assignments can be modified by writing to the interface.
+
+	Example:
+	To unassign the counter associated with the mbm_total_bytes event on domain 0:
+	::
+
+	 # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
+	 # cat /sys/fs/resctrl/mbm_L3_assignments
+	   mbm_total_bytes:0=_;1=e
+	   mbm_local_bytes:0=e;1=e
+
+	To unassign the counter associated with the mbm_total_bytes event on all the domains:
+	::
+
+	 # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
+	 # cat /sys/fs/resctrl/mbm_L3_assignments
+	   mbm_total_bytes:0=_;1=_
+	   mbm_local_bytes:0=e;1=e
+
+	To assign a counter associated with the mbm_total_bytes event on all domains in
+	exclusive mode:
+	::
+
+	 # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
+	 # cat /sys/fs/resctrl/mbm_L3_assignments
+	   mbm_total_bytes:0=e;1=e
+	   mbm_local_bytes:0=e;1=e
+
 Resource allocation rules
 -------------------------
 
@@ -1577,6 +1606,121 @@ View the llc occupancy snapshot::
   # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
   11234000
 
+
+Examples on working with mbm_assign_mode
+========================================
+
+a. Check if MBM counter assignment mode is supported.
+::
+
+  # mount -t resctrl resctrl /sys/fs/resctrl/
+
+  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+  [mbm_event]
+  default
+
+The "mbm_event" mode is detected and enabled.
+
+b. Check how many assignable counters are supported.
+::
+
+  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
+  0=32;1=32
+
+c. Check how many assignable counters are available for assignment in each domain.
+::
+
+  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
+  0=30;1=30
+
+d. To list the default group's assign states.
+::
+
+  # cat /sys/fs/resctrl/mbm_L3_assignments
+  mbm_total_bytes:0=e;1=e
+  mbm_local_bytes:0=e;1=e
+
+e. To unassign the counter associated with the mbm_total_bytes event on domain 0.
+::
+
+  # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
+  # cat /sys/fs/resctrl/mbm_L3_assignments
+  mbm_total_bytes:0=_;1=e
+  mbm_local_bytes:0=e;1=e
+
+f. To unassign the counter associated with the mbm_total_bytes event on all domains.
+::
+
+  # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
+  # cat /sys/fs/resctrl/mbm_L3_assignments
+  mbm_total_bytes:0=_;1=_
+  mbm_local_bytes:0=e;1=e
+
+g. To assign a counter associated with the mbm_total_bytes event on all domains in
+exclusive mode.
+::
+
+  # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
+  # cat /sys/fs/resctrl/mbm_L3_assignments
+  mbm_total_bytes:0=e;1=e
+  mbm_local_bytes:0=e;1=e
+
+h. Read the events mbm_total_bytes and mbm_local_bytes of the default group. There is
+no change in reading the events with the assignment.  If the event is unassigned when
+reading, then the read will come back as "Unassigned".
+::
+
+  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
+  779247936
+  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+  765207488
+
+i. Check the event configurations.
+::
+
+  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
+  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
+
+  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+  local_reads,local_non_temporal_writes,local_reads_slow_memory
+
+j. Change the event configuration for mbm_local_bytes.
+::
+
+  # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" > \
+  /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+
+  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+  local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
+
+This will update all (across all domains of all monitor groups) counter assignments
+associated with the mbm_local_bytes event.
+
+k. Now read the local event again. The first read may come back with "Unavailable"
+status. The subsequent read of mbm_local_bytes will display the current value.
+::
+
+  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+  Unavailable
+  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+  314101
+
+l. Users have the option to go back to 'default' mbm_assign_mode if required. This can be
+done using the following command. Note that switching the mbm_assign_mode may reset all
+the MBM counters (and thus all MBM events) of all the resctrl groups.
+::
+
+  # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+  mbm_event
+  [default]
+
+m. Unmount the resctrl filesystem.
+::
+
+  # umount /sys/fs/resctrl/
+
 Intel RDT Errata
 ================
 
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index e2e3fc0c5fab..1350fc273258 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -418,6 +418,9 @@ int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v
 ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
 			   loff_t off);
 
+int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+				 char *event, char *tok);
+
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index ebc049105949..1e4f8e3bedc6 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1311,3 +1311,97 @@ void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt)
 			rdtgroup_update_cntr_event(r, crgrp, mevt->evtid);
 	}
 }
+
+/*
+ * mbm_get_mon_event_by_name() - Return the mon_evt entry for the matching
+ * event name.
+ */
+static struct mon_evt *mbm_get_mon_event_by_name(struct rdt_resource *r, char *name)
+{
+	struct mon_evt *mevt;
+
+	for_each_mon_event(mevt) {
+		if (mevt->rid == r->rid && mevt->enabled &&
+		    resctrl_is_mbm_event(mevt->evtid) &&
+		    !strcmp(mevt->name, name))
+			return mevt;
+	}
+
+	return NULL;
+}
+
+static int rdtgroup_modify_assign_state(char *assign, struct rdt_mon_domain *d,
+					struct rdtgroup *rdtgrp, struct mon_evt *mevt)
+{
+	int ret = 0;
+
+	if (!assign || strlen(assign) != 1)
+		return -EINVAL;
+
+	switch (*assign) {
+	case 'e':
+		ret = rdtgroup_assign_cntr_event(d, rdtgrp, mevt);
+		break;
+	case '_':
+		rdtgroup_unassign_cntr_event(d, rdtgrp, mevt);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+				 char *event, char *tok)
+{
+	struct rdt_mon_domain *d;
+	unsigned long dom_id = 0;
+	char *dom_str, *id_str;
+	struct mon_evt *mevt;
+	int ret;
+
+	mevt = mbm_get_mon_event_by_name(r, event);
+	if (!mevt) {
+		rdt_last_cmd_printf("Invalid event %s\n", event);
+		return -ENOENT;
+	}
+
+next:
+	if (!tok || tok[0] == '\0')
+		return 0;
+
+	/* Start processing the strings for each domain */
+	dom_str = strim(strsep(&tok, ";"));
+
+	id_str = strsep(&dom_str, "=");
+
+	/* Check for domain id '*' which means all domains */
+	if (id_str && *id_str == '*') {
+		ret = rdtgroup_modify_assign_state(dom_str, NULL, rdtgrp, mevt);
+		if (ret)
+			rdt_last_cmd_printf("Assign operation '%s:*=%s' failed\n",
+					    event, dom_str);
+		return ret;
+	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
+		rdt_last_cmd_puts("Missing domain id\n");
+		return -EINVAL;
+	}
+
+	/* Verify if the dom_id is valid */
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		if (d->hdr.id == dom_id) {
+			ret = rdtgroup_modify_assign_state(dom_str, d, rdtgrp, mevt);
+			if (ret) {
+				rdt_last_cmd_printf("Assign operation '%s:%lu=%s' failed\n",
+						    event, dom_id, dom_str);
+				return ret;
+			}
+			goto next;
+		}
+	}
+
+	rdt_last_cmd_printf("Invalid domain id %lu\n", dom_id);
+	return -EINVAL;
+}
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 47716e623a9c..2d2b91cd1f67 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1979,6 +1979,51 @@ static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file
 	return ret;
 }
 
+static ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf,
+					size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdtgroup *rdtgrp;
+	char *token, *event;
+	int ret = 0;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (!rdtgrp) {
+		rdtgroup_kn_unlock(of->kn);
+		return -ENOENT;
+	}
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
+		rdtgroup_kn_unlock(of->kn);
+		return -EINVAL;
+	}
+
+	while ((token = strsep(&buf, "\n")) != NULL) {
+		/*
+		 * The write command follows the following format:
+		 * "<Event>:<Domain ID>=<Assignment state>"
+		 * Extract the event name first.
+		 */
+		event = strsep(&token, ":");
+
+		ret = resctrl_parse_mbm_assignment(r, rdtgrp, event, token);
+		if (ret)
+			break;
+	}
+
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret ?: nbytes;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -2119,9 +2164,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_L3_assignments",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= mbm_L3_assignments_show,
+		.write		= mbm_L3_assignments_write,
 	},
 	{
 		.name		= "mbm_assign_mode",
-- 
2.34.1
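
For illustration only: the write format accepted by this patch
("<Event>:<Domain ID>=<Assignment state>", with ';' between domains and
'*' meaning all domains) can be parsed in user space as sketched below.
This is not the kernel implementation. struct assign_req, ALL_DOMAINS and
parse_assignment_line() are hypothetical names, and unlike
resctrl_parse_mbm_assignment() the sketch only records the requests
instead of applying them, and it does not stop after a '*' entry.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define ALL_DOMAINS (-1)	/* hypothetical marker for the '*' domain */

struct assign_req {
	int dom_id;		/* domain ID, or ALL_DOMAINS */
	char state;		/* 'e' (assign) or '_' (unassign) */
};

/*
 * Parse "<Event>:<Domain ID>=<state>[;<Domain ID>=<state>...]" in place.
 * On success *event points at the event name inside @line and the
 * number of parsed requests is returned. Returns -1 on malformed input.
 */
static int parse_assignment_line(char *line, const char **event,
				 struct assign_req *reqs, int max_reqs)
{
	char *colon, *p;
	int n = 0;

	assert(line && event && reqs && max_reqs > 0);

	colon = strchr(line, ':');
	if (!colon || colon == line)
		return -1;
	*colon = '\0';
	*event = line;

	p = colon + 1;
	while (p && *p) {
		char *next = strchr(p, ';');
		char *eq, *end;
		long id;

		if (next)
			*next++ = '\0';

		/* Exactly one state character must follow '='. */
		eq = strchr(p, '=');
		if (!eq || eq == p || strlen(eq + 1) != 1)
			return -1;
		if (eq[1] != 'e' && eq[1] != '_')
			return -1;
		if (n == max_reqs)
			return -1;
		*eq = '\0';

		if (!strcmp(p, "*")) {
			reqs[n].dom_id = ALL_DOMAINS;
		} else {
			errno = 0;
			id = strtol(p, &end, 10);
			if (errno || *end != '\0' || id < 0 || id > INT_MAX)
				return -1;
			reqs[n].dom_id = (int)id;
		}
		reqs[n].state = eq[1];
		n++;
		p = next;
	}

	return n;
}
```

As in the kernel parser, the event name is split off at the first ':'
and each domain entry is split at '='. A state string that is not
exactly one character, or is neither 'e' nor '_', is rejected.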



* [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (29 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify " Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:11   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes Babu Moger
                   ` (3 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The BMEC (Bandwidth Monitoring Event Configuration) feature enables
per-domain event configuration. With BMEC the MBM events are configured
using the mbm_total_bytes_config or mbm_local_bytes_config files in
/sys/fs/resctrl/info/L3_MON/ and the per-domain event configuration affects
all monitor resource groups.

The mbm_event counter assignment mode enables counters to be assigned to
RMID (i.e. a monitor resource group), event pairs, with potentially unique
event configurations associated with every counter.

There may be systems that support both BMEC and mbm_event counter
assignment mode, but resctrl supporting both concurrently will present a
conflicting interface to the user with both per-domain and per RMID, event
configurations active at the same time.

The mbm_event counter assignment mode provides the most flexibility to
user space and aligns with Arm's counter support. On systems that support
both, disable BMEC event configuration when mbm_event mode is enabled by
hiding the mbm_total_bytes_config and mbm_local_bytes_config files. Ensure
mon_features always displays accurate information about monitor features.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Added new comment in resctrl_bmec_files_show() about kernfs_find_and_get failure.
     Added the parameter to resctrl_bmec_files_show() to pass the kernfs_node.

v15: Updated the changelog.
     Moved resctrl_bmec_files_show() inside rdtgroup_mkdir_info_resdir().
     Removed the unnecessary kernfs_get() call.

v14: Updated the changelog for change in mbm_assign_modes.
     Added check in rdt_mon_features_show to hide bmec related feature.

v13: New patch to hide BMEC related files.
---
 fs/resctrl/rdtgroup.c | 44 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 2d2b91cd1f67..1aeac350774d 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1150,7 +1150,8 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
 		if (mevt->rid != r->rid || !mevt->enabled)
 			continue;
 		seq_printf(seq, "%s\n", mevt->name);
-		if (mevt->configurable)
+		if (mevt->configurable &&
+		    !resctrl_arch_mbm_cntr_assign_enabled(r))
 			seq_printf(seq, "%s_config\n", mevt->name);
 	}
 
@@ -1799,6 +1800,41 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+/*
+ * resctrl_bmec_files_show() - Controls the visibility of BMEC-related resctrl
+ * files. When @show is true, the files are displayed; when false, the files
+ * are hidden.
+ * Don't treat kernfs_find_and_get failure as an error, since this function may
+ * be called regardless of whether BMEC is supported or the event is enabled.
+ */
+static void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn,
+				    bool show)
+{
+	struct kernfs_node *kn_config;
+	char name[32];
+
+	if (!l3_mon_kn) {
+		sprintf(name, "%s_MON", r->name);
+		l3_mon_kn = kernfs_find_and_get(kn_info, name);
+		if (!l3_mon_kn)
+			return;
+	}
+
+	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_total_bytes_config");
+	if (kn_config) {
+		kernfs_show(kn_config, show);
+		kernfs_put(kn_config);
+	}
+
+	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_local_bytes_config");
+	if (kn_config) {
+		kernfs_show(kn_config, show);
+		kernfs_put(kn_config);
+	}
+
+	kernfs_put(l3_mon_kn);
+}
+
 static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
 					struct seq_file *s, void *v)
 {
@@ -2492,6 +2528,12 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 			ret = resctrl_mkdir_event_configs(r, kn_subdir);
 			if (ret)
 				return ret;
+			/*
+			 * Hide BMEC related files if mbm_event mode
+			 * is enabled.
+			 */
+			if (resctrl_arch_mbm_cntr_assign_enabled(r))
+				resctrl_bmec_files_show(r, kn_subdir, false);
 		}
 	}
 
-- 
2.34.1
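
For illustration only: the mon_features behaviour described in this patch
(list "<event>_config" only while mbm_event mode is disabled) can be
modeled in user space as below. This is a sketch, not kernel code.
struct mon_event and list_mon_features() are hypothetical stand-ins for
struct mon_evt and rdt_mon_features_show().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct mon_event {		/* hypothetical stand-in for struct mon_evt */
	const char *name;
	bool enabled;
	bool configurable;	/* a BMEC-style <name>_config file exists */
};

/* Print the feature list to @out and return the number of lines written. */
static int list_mon_features(FILE *out, const struct mon_event *evts,
			     size_t nevts, bool mbm_event_mode)
{
	int lines = 0;

	assert(out && (nevts == 0 || evts));

	for (size_t i = 0; i < nevts; i++) {
		if (!evts[i].enabled)
			continue;

		fprintf(out, "%s\n", evts[i].name);
		lines++;

		/* The _config entry is advertised only outside mbm_event mode. */
		if (evts[i].configurable && !mbm_event_mode) {
			fprintf(out, "%s_config\n", evts[i].name);
			lines++;
		}
	}

	return lines;
}
```

With three enabled events of which two are configurable, the model
prints five lines in "default" mode and three in "mbm_event" mode.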



* [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (30 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:11   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported Babu Moger
                   ` (2 subsequent siblings)
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

The resctrl subsystem can support two monitoring modes, "mbm_event" or
"default". In mbm_event mode, a monitoring event can only accumulate data
while it is backed by a hardware counter. In "default" mode, resctrl
assumes there is a hardware counter for each event within every CTRL_MON
and MON group.

Introduce mbm_assign_mode resctrl file to switch between mbm_event and
default modes.

Example:
To list the MBM monitor modes supported:
$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_event]
default

To enable the "mbm_event" counter assignment mode:
$ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

To enable the "default" monitoring mode:
$ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

MBM event counters are automatically reset as part of changing the mode.
Clear both architectural and non-architectural event states to prevent
overflow conditions during the next event read. Also clear assignable
counter configuration on all the domains.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Minor changelog update.
     Minor update in resctrl.rst.
     Updated resctrl_bmec_files_show() to pass NULL for kn_fs_node.

v15: Minor changelog update.
     Minor user doc resctrl.rst update.
     Fixed stray hunks.

v14: Updated the changelog to reflect the change in monitor mode naming.
     Added the call resctrl_bmec_files_show() to enable/disable files
     related to BMEC.
     Added resctrl_set_mon_evt_cfg() to reset event configuration values
     when mode is changed.

v13: Resolved the conflicts due to FS/ARCH restructure.
     Introduced the new resctrl_init_evt_configuration() to initialize
     the event modes and configuration values.
     Added the call to resctrl_bmec_files_show() to hide/show BMEC related
     files.

v12: Fixed the documentation for consistency.
     Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear
     counters and non-architectural states when monitor mode is changed.
     https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/

v11: Changed the name of the function rdtgroup_mbm_assign_mode_write() to
     resctrl_mbm_assign_mode_write().
     Rewrote the commit message with context.
     Added few more details in resctrl.rst about mbm_cntr_assign mode.
     Re-arranged the text in resctrl.rst file.

v10: The call mbm_cntr_reset() has been moved to earlier patch.
     Minor documentation update.

v9: Fixed extra spaces in user documentation.
    Fixed problem changing the mode to mbm_cntr_assign mode when it is
    not supported. Added extra checks to detect if the system supports it.
    Used the rdtgroup_cntr_id_init to initialize cntr_id.

v8: Reset the internal counters after mbm_cntr_assign mode is changed.
    Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset()
    Updated the documentation to make text generic.

v7: Changed the interface name to mbm_assign_mode.
    Removed the references of ABMC.
    Added the changes to reset global and domain bitmaps.
    Added the changes to reset rmid.

v6: Changed the mode name to mbm_cntr_assign.
    Moved all the FS related code here.
    Added changes to reset mbm_cntr_map and resctrl group counters.

v5: Change log and mode description text correction.

v4: Minor commit text changes. Keep the default to ABMC when supported.
    Fixed comments to reflect changed interface "mbm_mode".

v3: New patch to address the review comments from upstream.
---
 Documentation/filesystems/resctrl.rst | 22 +++++++-
 fs/resctrl/internal.h                 |  2 +
 fs/resctrl/monitor.c                  | 27 ++++++++++
 fs/resctrl/rdtgroup.c                 | 72 ++++++++++++++++++++++++++-
 4 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 0c8701103214..35bd58af5c61 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -259,7 +259,8 @@ with the following files:
 
 "mbm_assign_mode":
 	The supported counter assignment modes. The enclosed brackets indicate which mode
-	is enabled.
+	is enabled. The MBM events associated with counters may reset when "mbm_assign_mode"
+	is changed.
 	::
 
 	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
@@ -279,6 +280,15 @@ with the following files:
 	of counters available is described in the "num_mbm_cntrs" file. Changing the
 	mode may cause all counters on the resource to reset.
 
+	Moving to mbm_event counter assignment mode requires users to assign the counters
+	to the events. Otherwise, the MBM event counters will return 'Unassigned' when read.
+
+	The mode is beneficial for AMD platforms that support more CTRL_MON
+	and MON groups than available hardware counters. By default, this
+	feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth
+	Monitoring Counters) capability, ensuring counters remain assigned even
+	when the corresponding RMID is not actively used by any processor.
+
 	"default":
 
 	In default mode, resctrl assumes there is a hardware counter for each
@@ -288,6 +298,16 @@ with the following files:
 	result in misleading values or display "Unavailable" if no counter is assigned
 	to the event.
 
+	* To enable "mbm_event" counter assignment mode:
+	  ::
+
+	    # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
+	* To enable "default" monitoring mode:
+	  ::
+
+	    # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
 "num_mbm_cntrs":
 	The maximum number of counters (total of available and assigned counters) in
 	each domain when the system supports mbm_event mode.
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 1350fc273258..c666aaf7858f 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -410,6 +410,8 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
 int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
 		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
 void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt);
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d);
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d);
 
 void *rdt_kn_parent_priv(struct kernfs_node *kn);
 
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 1e4f8e3bedc6..a4411a128431 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1093,6 +1093,33 @@ void resctrl_mon_resource_exit(void)
 	dom_data_exit(r);
 }
 
+/*
+ * mbm_cntr_free_all() - Clear all the counter ID configuration details in the
+ *			 domain @d. Called when mbm_assign_mode is changed.
+ */
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+	memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs);
+}
+
+/*
+ * resctrl_reset_rmid_all() - Reset all non-architecture states for all the
+ *			      supported RMIDs.
+ */
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+	enum resctrl_event_id evt;
+	int idx;
+
+	for_each_mbm_event_id(evt) {
+		if (!resctrl_is_mon_event_enabled(evt))
+			continue;
+		idx = MBM_STATE_IDX(evt);
+		memset(d->mbm_states[idx], 0, sizeof(*d->mbm_states[0]) * idx_limit);
+	}
+}
+
 /*
  * rdtgroup_assign_cntr() - Assign/unassign the counter ID for the event, RMID
  * pair in the domain.
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 1aeac350774d..68ba08e95a54 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1865,6 +1865,75 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of,
+					     char *buf, size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+	struct rdt_mon_domain *d;
+	int ret = 0;
+	bool enable;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!strcmp(buf, "default")) {
+		enable = false;
+	} else if (!strcmp(buf, "mbm_event")) {
+		if (r->mon.mbm_cntr_assignable) {
+			enable = true;
+		} else {
+			ret = -EINVAL;
+			rdt_last_cmd_puts("mbm_event mode is not supported\n");
+			goto out_unlock;
+		}
+	} else {
+		ret = -EINVAL;
+		rdt_last_cmd_puts("Unsupported assign mode\n");
+		goto out_unlock;
+	}
+
+	if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
+		if (ret)
+			goto out_unlock;
+
+		/* Update the visibility of BMEC related files */
+		resctrl_bmec_files_show(r, NULL, !enable);
+
+		/*
+		 * Initialize the default memory transaction values for
+		 * total and local events.
+		 */
+		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+			mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = MAX_EVT_CONFIG_BITS;
+		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+			mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
+									   READS_TO_LOCAL_S_MEM |
+									   NON_TEMP_WRITE_TO_LOCAL_MEM;
+		/*
+		 * Reset all the non-architectural RMID state and assignable counters.
+		 */
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			mbm_cntr_free_all(r, d);
+			resctrl_reset_rmid_all(r, d);
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
 				      struct seq_file *s, void *v)
 {
@@ -2207,9 +2276,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_assign_mode",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= resctrl_mbm_assign_mode_show,
+		.write		= resctrl_mbm_assign_mode_write,
 		.fflags		= RFTYPE_MON_INFO | RFTYPE_RES_CACHE,
 	},
 	{
-- 
2.34.1


* [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (31 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:11   ` Reinette Chatre
  2025-07-25 18:29 ` [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer Babu Moger
  2025-07-30 19:47 ` [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

Configure mbm_event mode on AMD platforms. Using mbm_event mode, if
supported, is recommended on AMD platforms to prevent the hardware from
resetting counters between reads. Such resets can result in misleading
values, or in reads displaying "Unavailable" if no counter is assigned
to the event.

The mbm_event mode, referred to as ABMC (Assignable Bandwidth Monitoring
Counters) on AMD, is enabled by default when supported by the system.

Update ABMC across all logical processors within the resctrl domain to
ensure proper functionality.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Fixed a minor conflict in arch/x86/kernel/cpu/resctrl/monitor.c.

v15: Minor comment update.

v14: Updated the changelog to reflect the change in name of the monitor mode
     to mbm_event.

v13: Added the call to resctrl_init_evt_configuration() to set up the event
     configuration during init.
     Resolved conflicts caused by the recent FS/ARCH code restructure.

v12: Moved the resctrl_arch_mbm_cntr_assign_set_one to domain_add_cpu_mon().
     Updated the commit log.

v11: Commit text in imperative tone. Added a few more details.
     Moved resctrl_arch_mbm_cntr_assign_set_one() to monitor.c.

v10: Commit text in imperative tone.

v9: Minor code change due to merge. Actual code did not change.

v8: Renamed resctrl_arch_mbm_cntr_assign_configure to
        resctrl_arch_mbm_cntr_assign_set_one.
    Added r->mon_capable check.
    Commit message update.

v7: Introduced resctrl_arch_mbm_cntr_assign_configure() to configure.
    Moved the default settings to rdt_get_mon_l3_config(). It should be
    done before the hotplug handler is called. It cannot be done at
    rdtgroup_init().

v6: Keeping the default enablement in arch init code for now.
     This may need some discussion.
     Renamed resctrl_arch_configure_abmc to resctrl_arch_mbm_cntr_assign_configure.

v5: New patch to enable ABMC by default.
---
 arch/x86/kernel/cpu/resctrl/core.c     | 7 +++++++
 arch/x86/kernel/cpu/resctrl/internal.h | 1 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 8 ++++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 09cb5a70b1cb..bb707a762c53 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -520,6 +520,9 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 		d = container_of(hdr, struct rdt_mon_domain, hdr);
 
 		cpumask_set_cpu(cpu, &d->hdr.cpu_mask);
+		/* Update the mbm_assign_mode state for the CPU if supported */
+		if (r->mon.mbm_cntr_assignable)
+			resctrl_arch_mbm_cntr_assign_set_one(r);
 		return;
 	}
 
@@ -539,6 +542,10 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 	d->ci_id = ci->id;
 	cpumask_set_cpu(cpu, &d->hdr.cpu_mask);
 
+	/* Update the mbm_assign_mode state for the CPU if supported */
+	if (r->mon.mbm_cntr_assignable)
+		resctrl_arch_mbm_cntr_assign_set_one(r);
+
 	arch_mon_domain_online(r, d);
 
 	if (arch_domain_mbm_alloc(r->mon.num_rmid, hw_dom)) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index ae4003d44df4..ee81c2d3f058 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -215,5 +215,6 @@ bool rdt_cpu_has(int flag);
 void __init intel_rdt_mbm_apply_quirk(void);
 
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
+void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
 
 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 57c8409a8247..61c0ac5f0be1 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -455,6 +455,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 		r->mon.mbm_cntr_assignable = true;
 		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
 		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
+		hw_res->mbm_cntr_assign_enabled = true;
 	}
 
 	r->mon_capable = true;
@@ -556,3 +557,10 @@ void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
 	if (am)
 		memset(am, 0, sizeof(*am));
 }
+
+void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r)
+{
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	resctrl_abmc_set_one_amd(&hw_res->mbm_cntr_assign_enabled);
+}
-- 
2.34.1


* [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (32 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported Babu Moger
@ 2025-07-25 18:29 ` Babu Moger
  2025-07-30 20:14   ` Reinette Chatre
  2025-07-30 19:47 ` [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
  34 siblings, 1 reply; 93+ messages in thread
From: Babu Moger @ 2025-07-25 18:29 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, babu.moger, tao1.su, sohil.mehta, kai.huang,
	xiaoyao.li, peterz, xin3.li, kan.liang, mario.limonciello,
	thomas.lendacky, perry.yuan, gautham.shenoy, chang.seok.bae,
	linux-doc, linux-kernel, peternewman, eranian

I have been contributing to resctrl for some time now and I would like to
help with code reviews as well.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Reinette suggested adding me as a reviewer. I am glad to help as a reviewer.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f697a0c51721..70a2f83145db 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20866,6 +20866,7 @@ M:	Tony Luck <tony.luck@intel.com>
 M:	Reinette Chatre <reinette.chatre@intel.com>
 R:	Dave Martin <Dave.Martin@arm.com>
 R:	James Morse <james.morse@arm.com>
+R:	Babu Moger <babu.moger@amd.com>
 L:	linux-kernel@vger.kernel.org
 S:	Supported
 F:	Documentation/filesystems/resctrl.rst
-- 
2.34.1


* Re: [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (33 preceding siblings ...)
  2025-07-25 18:29 ` [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer Babu Moger
@ 2025-07-30 19:47 ` Reinette Chatre
  2025-07-30 23:31   ` Moger, Babu
  34 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:47 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> i. Change the event configuration for mbm_local_bytes.
> 
> 	# echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
> 	/sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> 
> 	# cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> 	local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads

Above are some more "counter_configs" stragglers.

Also, while considering our exchange in [1], I encountered quite a few functions doing
counter management work for which I believe monitor.c would be more appropriate. Centralizing
MBM counter management code to monitor.c was something that you planned for this version
so I may be missing why you decided to keep some of these functions in rdtgroup.c? I
highlighted these functions as I noticed them. 


Reinette

[1] https://lore.kernel.org/lkml/74f1b542-d489-4ff9-802c-5d6d5b8d50b4@amd.com/

* Re: [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions
  2025-07-25 18:29 ` [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions Babu Moger
@ 2025-07-30 19:47   ` Reinette Chatre
  2025-07-30 20:23     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:47 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> From: Tony Luck <tony.luck@intel.com>
> 
> There are currently only three monitor events, all associated with
> the RDT_RESOURCE_L3 resource. Growing support for additional events
> will be easier with some restructuring to have a single point in
> file system code where all attributes of all events are defined.
> 
> Place all event descriptions into an array mon_event_all[]. Doing
> this has the beneficial side effect of removing the need for
> rdt_resource::evt_list.
> 
> Add resctrl_event_id::QOS_FIRST_EVENT for a lower bound on range
> checks for event ids and as the starting index to scan mon_event_all[].
> 
> Drop the code that builds evt_list and change the two places where
> the list is scanned to scan mon_event_all[] instead using a new
> helper macro for_each_mon_event().
> 
> Architecture code now informs file system code which events are
> available with resctrl_enable_mon_event().
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Please add your "Signed-off-by" to the commit tags of the first four patches.
When you do, ensure it follows the expected ordering
per "Ordering of commit tags" in Documentation/process/maintainer-tip.rst.

Reinette

* Re: [PATCH v16 05/34] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-07-25 18:29 ` [PATCH v16 05/34] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2025-07-30 19:47   ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:47 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> Users can create as many monitor groups as RMIDs supported by the hardware.
> However, bandwidth monitoring feature on AMD system only guarantees that
> RMIDs currently assigned to a processor will be tracked by hardware. The
> counters of any other RMIDs which are no longer being tracked will be reset
> to zero. The MBM event counters return "Unavailable" for the RMIDs that are
> not tracked by hardware. So, there can be only limited number of groups
> that can give guaranteed monitoring numbers. With ever changing
> configurations there is no way to definitely know which of these groups are
> being tracked during a particular time. Users do not have the option to
> monitor a group or set of groups for a certain period of time without
> worrying about RMID being reset in between.
> 
> The ABMC feature allows users to assign a hardware counter to an RMID,
> event pair and monitor bandwidth usage as long as it is assigned. The
> hardware continues to track the assigned counter until it is explicitly
> unassigned by the user. There is no need to worry about counters being
> reset during this period. Additionally, the user can specify the type of
> memory transactions (e.g., reads, writes) for the counter to track.
> 
> Without ABMC enabled, monitoring will work in current mode without
> assignment option.
> 
> The Linux resctrl subsystem provides an interface that allows monitoring of
> up to two memory bandwidth events per group, selected from a combination of
> available total and local events. When ABMC is enabled, two events will be
> assigned to each group by default, in line with the current interface
> design. Users will also have the option to configure which types of memory
> transactions are counted by these events.
> 
> Due to the limited number of available counters (32), users may quickly
> exhaust the available counters. If the system runs out of assignable ABMC
> counters, the kernel will report an error. In such cases, users will need
> to unassign one or more active counters to free up counters for new
> assignments. resctrl will provide options to assign or unassign events
> through the group-specific interface file.
> 
> The feature is detected via CPUID_Fn80000020_EBX_x00 bit 5.
> Bits Description
> 5    ABMC (Assignable Bandwidth Monitoring Counters)
> 
> The feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


* Re: [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2025-07-25 18:29 ` [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
@ 2025-07-30 19:49   ` Reinette Chatre
  2025-08-06 21:04     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:49 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.
> Bits Description
> 15:0 MAX_ABMC Maximum Supported Assignable Bandwidth
>      Monitoring Counter ID + 1
> 
> The feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Detect the feature and number of assignable counters supported. For
> backward compatibility, upon detecting the assignable counter feature,
> enable the mbm_total_bytes and mbm_local_bytes events that users are
> familiar with as part of original L3 MBM support.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 267e9206a999..09cb5a70b1cb 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -883,6 +883,8 @@ static __init bool get_rdt_mon_resources(void)
>  		resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
>  		ret = true;
>  	}
> +	if (rdt_cpu_has(X86_FEATURE_ABMC))
> +		ret = true;
>  
>  	if (!ret)
>  		return false;
> @@ -990,7 +992,8 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
>  

To complement the change below, shouldn't the snippet that precedes it look like:
	if (!cpu_has(c, X86_FEATURE_CQM_LLC) && !cpu_has(c, X86_FEATURE_ABMC)) {
		...
		return;
	}

>  	if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
>  	    cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
> -	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
> +	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL) ||
> +	    cpu_has(c, X86_FEATURE_ABMC)) {
>  		u32 eax, ebx, ecx, edx;
>  
>  		/* QoS sub-leaf, EAX=0Fh, ECX=1 */
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 2558b1bdef8b..0a695ce68f46 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -339,6 +339,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>  	unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>  	unsigned int threshold;
> +	u32 eax, ebx, ecx, edx;
>  
>  	snc_nodes_per_l3_cache = snc_get_config();
>  
> @@ -368,14 +369,18 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>  	 */
>  	resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
>  
> -	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
> -		u32 eax, ebx, ecx, edx;
> -
> +	if (rdt_cpu_has(X86_FEATURE_BMEC) || rdt_cpu_has(X86_FEATURE_ABMC)) {
>  		/* Detect list of bandwidth sources that can be tracked */
>  		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
>  		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;

I interpret this mbm_cfg_mask initialization to mean that an ABMC system will report
which of the memory transactions can be monitored.
In patch #15 "fs/resctrl: Introduce event configuration field in struct mon_evt"
the event configurations of memory transactions that should be monitored are hardcoded
as below without taking into account what the system supports:

	resctrl_mon_resource_init() {
		...
		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = MAX_EVT_CONFIG_BITS;
		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
								   READS_TO_LOCAL_S_MEM |
								   NON_TEMP_WRITE_TO_LOCAL_MEM;
		...
	}

A system may thus not support all the memory transactions it is configured to
monitor. It seems to me that the initialization done in resctrl_mon_resource_init() needs
to take r->mon.mbm_cfg_mask (what the system supports) into account? If so, then
the same hardcoding done by patch #32 in resctrl_mbm_assign_mode_write() should
also be changed.
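
A minimal sketch of that masking, written as standalone C rather than kernel
code. The bit values and the two helper names below are assumptions for
illustration only; in the series the defaults are assigned directly to
mon_event_all[].evt_cfg and the supported mask is r->mon.mbm_cfg_mask.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define READS_TO_LOCAL_MEM		(1U << 0)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	(1U << 2)
#define READS_TO_LOCAL_S_MEM		(1U << 4)
#define MAX_EVT_CONFIG_BITS		0x7fU

/* Hypothetical helper: default "total" config limited to what is supported. */
static inline uint32_t default_total_cfg(uint32_t mbm_cfg_mask)
{
	return MAX_EVT_CONFIG_BITS & mbm_cfg_mask;
}

/* Hypothetical helper: default "local" config limited to what is supported. */
static inline uint32_t default_local_cfg(uint32_t mbm_cfg_mask)
{
	uint32_t cfg = READS_TO_LOCAL_MEM |
		       READS_TO_LOCAL_S_MEM |
		       NON_TEMP_WRITE_TO_LOCAL_MEM;

	/* Drop any transaction type the system did not enumerate. */
	return cfg & mbm_cfg_mask;
}
```

With a mask that lacks the slow-memory bit, the local default would simply lose
that bit instead of requesting an unsupported transaction type.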
	
Reinette


* Re: [PATCH v16 16/34] x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-07-25 18:29 ` [PATCH v16 16/34] x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
@ 2025-07-30 19:50   ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:50 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> The ABMC feature allows users to assign a hardware counter to an RMID,
> event pair and monitor bandwidth usage as long as it is assigned. The
> hardware continues to track the assigned counter until it is explicitly
> unassigned by the user.
> 
> Implement an x86 architecture-specific handler to configure a counter. This
> architecture specific handler is called by resctrl fs when a counter is
> assigned or unassigned as well as when an already assigned counter's
> configuration should be updated. Configure counters by writing to the
> L3_QOS_ABMC_CFG MSR, specifying the counter ID, bandwidth source (RMID),
> and event configuration.
> 
> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>     Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>     Monitoring (ABMC).
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


* Re: [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events
  2025-07-25 18:29 ` [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events Babu Moger
@ 2025-07-30 19:52   ` Reinette Chatre
  2025-08-07 18:29     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:52 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> When supported, "mbm_event" counter assignment mode offers "num_mbm_cntrs"
> number of counters that can be assigned to RMID, event pairs and monitor
> bandwidth usage as long as it is assigned.
> 
> Add the functionality to allocate and assign a counter to an RMID, event
> pair in the domain.
> 
> If all the counters are in use, kernel will log the error message

I think dropping "kernel will" will help the text to be imperative.

> "Failed to allocate counter for <event> in domain <id>" in
> /sys/fs/resctrl/info/last_cmd_status when a new assignment is requested.

"when a new assignment is requested" can be dropped. Or alternatively:
	Log the error message "Failed to allocate counter for <event> in domain
	<id>" in /sys/fs/resctrl/info/last_cmd_status if all the counters
	are in use.

> Exit on the first failure when assigning counters across all the domains.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  fs/resctrl/internal.h |   3 +
>  fs/resctrl/monitor.c  | 130 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 133 insertions(+)
> 
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index db3a0f12ad77..419423bdabdc 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -387,6 +387,9 @@ bool closid_allocated(unsigned int closid);
>  
>  int resctrl_find_cleanest_closid(void);
>  
> +int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
> +			       struct mon_evt *mevt);
> +

This internal.h change does not look necessary? Looking ahead this is because 
rdtgroup.c:rdtgroup_assign_cntrs() needs it, but rdtgroup_assign_cntrs()
also belongs in monitor.c, no? 

>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>  

...

> +/*
> + * rdtgroup_alloc_assign_cntr() - Allocate a counter ID and assign it to the event
> + * pointed to by @mevt and the resctrl group @rdtgrp within the domain @d.
> + *
> + * Return:
> + * 0 on success, < 0 on failure.
> + */
> +static int rdtgroup_alloc_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +				      struct rdtgroup *rdtgrp, struct mon_evt *mevt)
> +{
> +	int cntr_id;
> +
> +	/* No action required if the counter is assigned already. */
> +	cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid);
> +	if (cntr_id >= 0)
> +		return 0;
> +
> +	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, mevt->evtid);
> +	if (cntr_id <  0) {

Extra space above.

> +		rdt_last_cmd_printf("Failed to allocate counter for %s in domain %d\n",
> +				    mevt->name, d->hdr.id);
> +		return cntr_id;
> +	}
> +
> +	rdtgroup_assign_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid, rdtgrp->closid, cntr_id, true);
> +
> +	return 0;
> +}
> +

Reinette

* Re: [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign MBM events
  2025-07-25 18:29 ` [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign " Babu Moger
@ 2025-07-30 19:53   ` Reinette Chatre
  2025-08-07 18:33     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:53 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> The "mbm_event" counter assignment mode offers "num_mbm_cntrs" number of
> counters that can be assigned to RMID, event pairs and monitor bandwidth
> usage as long as it is assigned. If all the counters are in use, the
> kernel logs the error message "Unable to allocate counter in domain" in

Needs an update to match new message.

> /sys/fs/resctrl/info/last_cmd_status when a new assignment is requested.
> 
> To make space for a new assignment, users must unassign an already
> assigned counter and retry the assignment again.
> 
> Add the functionality to unassign and free the counters in the domain.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---


...


> ---
>  fs/resctrl/internal.h |  2 ++
>  fs/resctrl/monitor.c  | 46 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 419423bdabdc..216588842444 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -389,6 +389,8 @@ int resctrl_find_cleanest_closid(void);
>  
>  int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
>  			       struct mon_evt *mevt);
> +void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
> +				  struct mon_evt *mevt);
>  

Similar comment as previous patch. Please try to keep all monitoring code in
monitor.c. The caller rdtgroup_unassign_cntrs() can move to monitor.c and it
can instead be made available via internal.h


>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);

Reinette


* Re: [PATCH v16 19/34] fs/resctrl: Pass struct rdtgroup instead of individual members
  2025-07-25 18:29 ` [PATCH v16 19/34] fs/resctrl: Pass struct rdtgroup instead of individual members Babu Moger
@ 2025-07-30 19:54   ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:54 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> Reading monitoring data for a monitoring group requires both the RMID and
> CLOSID. The RMID and CLOSID are members of struct rdtgroup but passed
> separately to several functions involved in retrieving event data.
> 
> When "mbm_event" counter assignment mode is enabled, a counter ID is
> required to read event data. The counter ID is obtained through
> mbm_cntr_get(), which expects a struct rdtgroup pointer.
> 
> Provide a pointer to the struct rdtgroup as parameter to functions involved
> in retrieving event data to simplify access to RMID, CLOSID, and counter
> ID.
> 
> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette

* Re: [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode
  2025-07-25 18:29 ` [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode Babu Moger
@ 2025-07-30 19:59   ` Reinette Chatre
  2025-08-07 19:59     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:59 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> When supported, "mbm_event" counter assignment mode allows users to assign
> a hardware counter to an RMID, event pair and monitor the bandwidth usage
> as long as it is assigned. The hardware continues to track the assigned
> counter until it is explicitly unassigned by the user.
> 
> Introduce the architecture calls resctrl_arch_cntr_read() and
> resctrl_arch_reset_cntr() to read and reset event counters when "mbm_event"
> mode is supported. Function names are chosen to match existing

(apologies if I gave you the text ... trying to polish with more focus on
imperative tone now)
"Function names are chosen to match" -> "Function names match"?

> resctrl_arch_rmid_read() and resctrl_arch_reset_rmid().
> 
> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  include/linux/resctrl.h | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 50e38445183a..4d37827121a6 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -613,6 +613,44 @@ void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>  			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
>  			      u32 cntr_id, bool assign);
>  
> +/**
> + * resctrl_arch_cntr_read() - Read the event data corresponding to the counter ID
> + *			      assigned to the RMID, event pair for this resource
> + *			      and domain.
> + * @r:			Resource that the counter should be read from.
> + * @d:			Domain that the counter should be read from.
> + * @closid:		CLOSID that matches the RMID.
> + * @rmid:		RMID used for counter ID assignment.

Can this be more specific, for example:
			The RMID to which @cntr_id is assigned.

> + * @cntr_id:		The counter ID whose event data should be read. Valid when
> + *			"mbm_event" mode is enabled and @eventid is MBM event.

Would the counter ID not always be valid? Specifically, resctrl_arch_cntr_read() is
_only_ called when "mbm_event" mode is enabled and @eventid is _always_
an MBM event, no? If you agree, the @cntr_id description can be something like below
with the calling context details moved to general function description:

	 @cntr_id: The counter to read.

> + * @eventid:		eventid used for counter ID assignment, such as
> + *			QOS_L3_MBM_TOTAL_EVENT_ID or QOS_L3_MBM_LOCAL_EVENT_ID.

The "@eventid is an MBM event" can move here? For example:
			The MBM event to which @cntr_id is assigned.			

> + * @val:		Result of the counter read in bytes.
> + *

It looks to me as though some of the @cntr_id text could move to be the
function description. The description can also be expanded to include where this
will be called from. For example, 

	Called on a CPU that belongs to domain @d when "mbm_event" mode is enabled.
	Called from a non-migratable process context via smp_call_on_cpu() unless
	all CPUs are nohz_full, in which case it is called via IPI (smp_call_function_any()).
	
The goal is to make information specific. Please feel free to improve.

> + * Return:
> + * 0 on success, or -EIO, -EINVAL etc on error.
> + */
> +int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			   u32 closid, u32 rmid, int cntr_id,
> +			   enum resctrl_event_id eventid, u64 *val);
> +
> +/**
> + * resctrl_arch_reset_cntr() - Reset any private state associated with counter ID.
> + * @r:		The domain's resource.
> + * @d:		The counter ID's domain.
> + * @closid:	CLOSID that matches the RMID.
> + * @rmid:	RMID used for counter ID assignment.
> + * @cntr_id:	The counter ID whose event data should be reset. Valid when
> + *		"mbm_event" mode is enabled and @eventid is MBM event.
> + * @eventid:	eventid used for counter ID assignment, such as
> + *		QOS_L3_MBM_TOTAL_EVENT_ID or QOS_L3_MBM_LOCAL_EVENT_ID.

Above should similarly be specific.

> + *
> + * This can be called from any CPU.
> + */
> +void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			     u32 closid, u32 rmid, int cntr_id,
> +			     enum resctrl_event_id eventid);
> +
>  extern unsigned int resctrl_rmid_realloc_threshold;
>  extern unsigned int resctrl_rmid_realloc_limit;
>  

Reinette

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 21/34] x86/resctrl: Refactor resctrl_arch_rmid_read()
  2025-07-25 18:29 ` [PATCH v16 21/34] x86/resctrl: Refactor resctrl_arch_rmid_read() Babu Moger
@ 2025-07-30 19:59   ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 19:59 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> resctrl_arch_rmid_read() adjusts the value obtained from MSR_IA32_QM_CTR to
> account for the overflow for MBM events and apply counter scaling for all
> the events. This logic is common to both reading an RMID and reading a
> hardware counter directly.
> 
> Refactor the hardware value adjustment logic into get_corrected_val() to
> prepare for support of reading a hardware counter.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read()
  2025-07-25 18:29 ` [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() Babu Moger
@ 2025-07-30 20:01   ` Reinette Chatre
  2025-08-08  2:05     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:01 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> System software can read resctrl event data for a particular resource by

"can read" -> "reads"

> writing the RMID and Event Identifier (EvtID) to the QM_EVTSEL register and
> then reading the event data from the QM_CTR register.
> 
> In ABMC mode, the event data of a specific counter ID can be read by

"can be read" -> "is read"

> setting the following fields: QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID
> = L3CacheABMC (=1) and setting [RMID] to the desired counter ID. Reading

"[RMID]" -> "QM_EVTSEL.RMID"

> QM_CTR will then return the contents of the specified counter ID. The

"will then return" -> "then returns"

> RMID_VAL_ERROR bit will be set if the counter configuration was invalid, or

"will be set" -> "is set"
"was invalid" -> "is invalid"

> if an invalid counter ID was set in the QM_EVTSEL[RMID] field. If the

"was set" -> "is set"

"in the QM_EVTSEL[RMID] field" -> "in QM_EVTSEL.RMID"


> counter data is currently unavailable, the RMID_VAL_UNAVAIL bit will be
> set.

"The RMID_VAL_UNAVAIL bit is set if the counter data is unavailable."

After these changes, please check that everything is coherent and in imperative
tone, and make the same adjustments to the duplicate text in the patch.

> 
> Introduce resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() to reset
> and read event data for a specific counter.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  arch/x86/kernel/cpu/resctrl/internal.h |  6 +++
>  arch/x86/kernel/cpu/resctrl/monitor.c  | 68 ++++++++++++++++++++++++++
>  2 files changed, 74 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 6bf6042f11b6..ae4003d44df4 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -40,6 +40,12 @@ struct arch_mbm_state {
>  /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
>  #define ABMC_ENABLE_BIT			0
>  
> +/*
> + * Qos Event Identifiers.
> + */
> +#define ABMC_EXTENDED_EVT_ID		BIT(31)
> +#define ABMC_EVT_ID			BIT(0)
> +
>  /**
>   * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
>   *			       a resource for a control function
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 1f77fd58e707..57c8409a8247 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -259,6 +259,74 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>  	return 0;
>  }
>  
> +static int __cntr_id_read(u32 cntr_id, u64 *val)
> +{
> +	u64 msr_val;
> +
> +	/*
> +	 * QM_EVTSEL Register definition:
> +	 * =======================================================
> +	 * Bits    Mnemonic        Description
> +	 * =======================================================
> +	 * 63:44   --              Reserved
> +	 * 43:32   RMID            Resource Monitoring Identifier
> +	 * 31      ExtEvtID        Extended Event Identifier
> +	 * 30:8    --              Reserved
> +	 * 7:0     EvtID           Event Identifier
> +	 * =======================================================
> +	 * The contents of a specific counter can be read by setting the
> +	 * following fields in QM_EVTSEL.ExtendedEvtID(=1) and

ExtEvtID vs ExtendedEvtID ... either the definition or the text should change so
that both use the same name.
Can the description of RMID be expanded to note that it may
contain an RMID or a counter ID?

> +	 * QM_EVTSEL.EvtID = L3CacheABMC (=1) and setting [RMID] to the
> +	 * desired counter ID. Reading QM_CTR will then return the
> +	 * contents of the specified counter. The RMID_VAL_ERROR bit will
> +	 * be set if the counter configuration was invalid, or if an invalid
> +	 * counter ID was set in the QM_EVTSEL[RMID] field. If the counter
> +	 * data is currently unavailable, the RMID_VAL_UNAVAIL bit will be set.
> +	 */
> +	wrmsr(MSR_IA32_QM_EVTSEL, ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID, cntr_id);
> +	rdmsrl(MSR_IA32_QM_CTR, msr_val);
> +
> +	if (msr_val & RMID_VAL_ERROR)
> +		return -EIO;
> +	if (msr_val & RMID_VAL_UNAVAIL)
> +		return -EINVAL;
> +
> +	*val = msr_val;
> +	return 0;
> +}
> +
> +void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			     u32 unused, u32 rmid, int cntr_id,
> +			     enum resctrl_event_id eventid)
> +{
> +	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
> +	struct arch_mbm_state *am;
> +
> +	am = get_arch_mbm_state(hw_dom, rmid, eventid);
> +	if (am) {
> +		memset(am, 0, sizeof(*am));
> +
> +		/* Record any initial, non-zero count value. */
> +		__cntr_id_read(cntr_id, &am->prev_msr);
> +	}
> +}
> +
> +int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			   u32 unused, u32 rmid, int cntr_id,
> +			   enum resctrl_event_id eventid, u64 *val)
> +{
> +	u64 msr_val;
> +	int ret;
> +
> +	ret = __cntr_id_read(cntr_id, &msr_val);
> +	if (ret)
> +		return ret;
> +
> +	*val = get_corrected_val(r, d, rmid, eventid, msr_val);
> +
> +	return 0;
> +}
> +
>  /*
>   * The power-on reset value of MSR_RMID_SNC_CONFIG is 0x1
>   * which indicates that RMIDs are configured in legacy mode.

code looks good.
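
For reference, a stand-alone sketch of how the two 32-bit wrmsr() arguments
above map onto the QM_EVTSEL layout given in the quoted comment. The macro
names and the qm_evtsel_abmc() helper are made up for this illustration and
are not part of the patch; only the field positions come from the table in
the comment. Unlike the patch, the sketch masks the counter ID to the 12-bit
RMID field, which is an assumption about what a defensive caller might do.

```c
#include <stdint.h>

/* Field positions taken from the QM_EVTSEL table in the quoted comment. */
#define QM_EVTSEL_EXT_EVT_ID	(1ULL << 31)	/* ExtEvtID, bit 31 */
#define QM_EVTSEL_EVT_ID_ABMC	1ULL		/* EvtID = L3CacheABMC (=1), bits 7:0 */
#define QM_EVTSEL_RMID_SHIFT	32		/* RMID field starts at bit 32 */
#define QM_EVTSEL_RMID_MASK	0xfffULL	/* RMID field is bits 43:32, 12 bits */

/*
 * Hypothetical helper: build the 64-bit QM_EVTSEL value that selects
 * counter @cntr_id in ABMC mode. The low 32 bits correspond to the
 * "ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID" argument of wrmsr() in the patch,
 * the high 32 bits to its cntr_id argument.
 */
static inline uint64_t qm_evtsel_abmc(uint32_t cntr_id)
{
	uint64_t rmid_field = (uint64_t)cntr_id & QM_EVTSEL_RMID_MASK;

	return (rmid_field << QM_EVTSEL_RMID_SHIFT) |
	       QM_EVTSEL_EXT_EVT_ID | QM_EVTSEL_EVT_ID_ABMC;
}
```

With this layout, counter ID 5 yields 0x0000000580000001: the counter ID sits
in the RMID field, which is why the RMID description in the comment could
mention that the field may carry either an RMID or a counter ID.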

Reinette

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode
  2025-07-25 18:29 ` [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode Babu Moger
@ 2025-07-30 20:03   ` Reinette Chatre
  2025-08-08  2:20     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:03 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> When "mbm_event" counter assignment mode is enabled, the architecture
> requires a counter ID to read the event data.
> 
> Introduce an is_mbm_cntr field in struct rmid_read to indicate whether
> counter assignment mode is in use.
> 
> Update the logic to call resctrl_arch_cntr_read() and
> resctrl_arch_reset_cntr() when the assignment mode is active. Report
> 'Unassigned' in case the user attempts to read the event without assigning
> a hardware counter.
> 
> Declare mbm_cntr_get() in fs/resctrl/internal.h to make it accessible to
> other functions within fs/resctrl.

From what I can tell this is not needed by this patch. It is also a hint that
there may be some monitoring-specific code outside of monitor.c. It looks like this
is done to support the later patch #29 "fs/resctrl: Introduce mbm_L3_assignments to
list assignments in a group", where mbm_L3_assignments_show() should rather
be in monitor.c.

> 
> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...

> ---
>  Documentation/filesystems/resctrl.rst |  6 ++++
>  fs/resctrl/ctrlmondata.c              | 22 +++++++++---
>  fs/resctrl/internal.h                 |  5 +++
>  fs/resctrl/monitor.c                  | 52 ++++++++++++++++++++-------
>  4 files changed, 67 insertions(+), 18 deletions(-)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 446736dbd97f..4c24c5f3f4c1 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -434,6 +434,12 @@ When monitoring is enabled all MON groups will also contain:
>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>  	where "YY" is the node number.
>  
> +	When the 'mbm_event' counter assignment mode is enabled, reading
> +	an MBM event of a MON group returns 'Unassigned' if no hardware
> +	counter is assigned to it. For CTRL_MON groups, 'Unassigned' is
> +	returned if the MBM event does not have an assigned counter in the
> +	CTRL_MON group nor in any of its associated MON groups.
> +
>  "mon_hw_id":
>  	Available only with debug option. The identifier used by hardware
>  	for the monitor group. On x86 this is the RMID.
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index ad7ffc6acf13..31787ce6ec91 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -563,10 +563,15 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  	rr->r = r;
>  	rr->d = d;
>  	rr->first = first;
> -	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
> -	if (IS_ERR(rr->arch_mon_ctx)) {
> -		rr->err = -EINVAL;
> -		return;
> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
> +	    resctrl_is_mbm_event(evtid)) {
> +		rr->is_mbm_cntr = true;
> +	} else {
> +		rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
> +		if (IS_ERR(rr->arch_mon_ctx)) {
> +			rr->err = -EINVAL;
> +			return;
> +		}
>  	}
>  
>  	cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
> @@ -582,7 +587,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  	else
>  		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
>  
> -	resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
> +	if (rr->arch_mon_ctx)
> +		resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
>  }
>  
>  int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> @@ -653,10 +659,16 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  
>  checkresult:
>  
> +	/*
> +	 * -ENOENT is a special case, set only when "mbm_event" counter assignment
> +	 * mode is enabled and no counter has been assigned.
> +	 */
>  	if (rr.err == -EIO)
>  		seq_puts(m, "Error\n");
>  	else if (rr.err == -EINVAL)
>  		seq_puts(m, "Unavailable\n");
> +	else if (rr.err == -ENOENT)
> +		seq_puts(m, "Unassigned\n");
>  	else
>  		seq_printf(m, "%llu\n", rr.val);
>  
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 216588842444..eeee83a5067a 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -110,6 +110,8 @@ struct mon_data {
>   *	   domains in @r sharing L3 @ci.id
>   * @evtid: Which monitor event to read.
>   * @first: Initialize MBM counter when true.
> + * @is_mbm_cntr: Is the counter valid? true if "mbm_event" counter assignment mode is
> + *	   enabled and it is an MBM event.

Since a counter may not be assigned to the event being read, I do not believe that "Is the counter
valid?" is accurate, and it should rather be dropped. The rest of the text looks accurate to me.

>   * @ci_id: Cacheinfo id for L3. Only set when @d is NULL. Used when summing domains.
>   * @err:   Error encountered when reading counter.
>   * @val:   Returned value of event counter. If @rgrp is a parent resource group,
> @@ -124,6 +126,7 @@ struct rmid_read {
>  	struct rdt_mon_domain	*d;
>  	enum resctrl_event_id	evtid;
>  	bool			first;
> +	bool			is_mbm_cntr;
>  	unsigned int		ci_id;
>  	int			err;
>  	u64			val;
> @@ -391,6 +394,8 @@ int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp
>  			       struct mon_evt *mevt);
>  void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
>  				  struct mon_evt *mevt);
> +int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> +		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
>  

Not necessary? mbm_cntr_get() can remain internal to monitor.c

>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 070965d45770..a8b53b0ad0b7 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -362,13 +362,25 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>  	u32 closid = rdtgrp->closid;
>  	u32 rmid = rdtgrp->mon.rmid;
>  	struct rdt_mon_domain *d;
> +	int cntr_id = -ENOENT;
>  	struct cacheinfo *ci;
>  	struct mbm_state *m;
>  	int err, ret;
>  	u64 tval = 0;
>  
> +	if (rr->is_mbm_cntr) {
> +		cntr_id = mbm_cntr_get(rr->r, rr->d, rdtgrp, rr->evtid);
> +		if (cntr_id < 0) {
> +			rr->err = -ENOENT;
> +			return -EINVAL;
> +		}
> +	}
> +
>  	if (rr->first) {
> -		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
> +		if (rr->is_mbm_cntr)
> +			resctrl_arch_reset_cntr(rr->r, rr->d, closid, rmid, cntr_id, rr->evtid);
> +		else
> +			resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
>  		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
>  		if (m)
>  			memset(m, 0, sizeof(struct mbm_state));
> @@ -379,8 +391,12 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>  		/* Reading a single domain, must be on a CPU in that domain. */
>  		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
>  			return -EINVAL;
> -		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
> -						 rr->evtid, &tval, rr->arch_mon_ctx);
> +		if (rr->is_mbm_cntr)
> +			rr->err = resctrl_arch_cntr_read(rr->r, rr->d, closid, rmid, cntr_id,
> +							 rr->evtid, &tval);
> +		else
> +			rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
> +							 rr->evtid, &tval, rr->arch_mon_ctx);
>  		if (rr->err)
>  			return rr->err;
>  
> @@ -405,8 +421,12 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>  	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
>  		if (d->ci_id != rr->ci_id)
>  			continue;
> -		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
> -					     rr->evtid, &tval, rr->arch_mon_ctx);
> +		if (rr->is_mbm_cntr)
> +			err = resctrl_arch_cntr_read(rr->r, d, closid, rmid, cntr_id,
> +						     rr->evtid, &tval);
> +		else
> +			err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
> +						     rr->evtid, &tval, rr->arch_mon_ctx);
>  		if (!err) {
>  			rr->val += tval;
>  			ret = 0;
> @@ -613,11 +633,16 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
>  	rr.r = r;
>  	rr.d = d;
>  	rr.evtid = evtid;
> -	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> -	if (IS_ERR(rr.arch_mon_ctx)) {
> -		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> -				    PTR_ERR(rr.arch_mon_ctx));
> -		return;
> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
> +	    resctrl_arch_mbm_cntr_assign_enabled(r)) {

Duplicate check?

> +		rr.is_mbm_cntr = true;
> +	} else {
> +		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> +		if (IS_ERR(rr.arch_mon_ctx)) {
> +			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> +					    PTR_ERR(rr.arch_mon_ctx));
> +			return;
> +		}
>  	}
>  
>  	__mon_event_count(rdtgrp, &rr);
> @@ -629,7 +654,8 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
>  	if (is_mba_sc(NULL))
>  		mbm_bw_count(rdtgrp, &rr);
>  
> -	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
> +	if (rr.arch_mon_ctx)
> +		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
>  }
>  
>  static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> @@ -983,8 +1009,8 @@ static void rdtgroup_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *
>   * Return:
>   * Valid counter ID on success, or -ENOENT on failure.
>   */
> -static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> -			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> +		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>  {
>  	int cntr_id;
>  

Not necessary?
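
On the duplicated condition in mbm_update_one_event() above: a stand-alone
sketch of the check as it appears in mon_event_read() in the same patch,
which is presumably what was intended. The types and the two helpers below
are simplified stand-ins so that the snippet compiles on its own, and
read_uses_mbm_cntr() is a hypothetical name used only here. Since
mbm_update_one_event() looks to be called only for MBM events, simply
dropping the second operand may be an equally valid fix.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel definitions; only the shape matters. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

struct rdt_resource {
	bool mbm_cntr_assign_enabled;	/* stand-in for the arch state */
};

static bool resctrl_arch_mbm_cntr_assign_enabled(const struct rdt_resource *r)
{
	return r->mbm_cntr_assign_enabled;
}

static bool resctrl_is_mbm_event(enum resctrl_event_id evtid)
{
	return evtid == QOS_L3_MBM_TOTAL_EVENT_ID ||
	       evtid == QOS_L3_MBM_LOCAL_EVENT_ID;
}

/*
 * Hypothetical helper: true when the read must go through an assigned
 * counter, i.e. when rr->is_mbm_cntr would be set. Both conditions are
 * tested once each, mirroring mon_event_read().
 */
static bool read_uses_mbm_cntr(const struct rdt_resource *r,
			       enum resctrl_event_id evtid)
{
	return resctrl_arch_mbm_cntr_assign_enabled(r) &&
	       resctrl_is_mbm_event(evtid);
}
```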

Reinette

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration
  2025-07-25 18:29 ` [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration Babu Moger
@ 2025-07-30 20:03   ` Reinette Chatre
  2025-08-08  2:24     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:03 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> The "mbm_event" counter assignment mode allows the user to assign a
> hardware counter to an RMID, event pair and monitor the bandwidth as long
> as it is assigned. The user can specify the memory transaction(s) for the
> counter to track.
> 
> Add the definitions for supported memory transactions (e.g., read, write,
> etc.) the counter can be configured with.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  fs/resctrl/internal.h         | 11 +++++++++++
>  fs/resctrl/monitor.c          | 11 +++++++++++
>  include/linux/resctrl_types.h |  3 +++
>  3 files changed, 25 insertions(+)
> 
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index eeee83a5067a..693268bcbad2 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h

Looks like only monitoring code in monitor.c needs to know about
struct mbm_transaction so this can stay within monitor.c?

> @@ -216,6 +216,17 @@ struct rdtgroup {
>  	struct pseudo_lock_region	*plr;
>  };
>  
> +/**
> + * struct mbm_transaction - Memory transaction an MBM event can be configured with.
> + * @name:	Name of memory transaction (read, write ...).
> + * @val:	The bit (eg. READS_TO_LOCAL_MEM or READS_TO_REMOTE_MEM) used to
> + *		represent the memory transaction within an event's configuration.
> + */
> +struct mbm_transaction {
> +	char	name[32];
> +	u32	val;
> +};
> +
>  /* rdtgroup.flags */
>  #define	RDT_DELETED		1
>  

Reinette


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-07-25 18:29 ` [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
@ 2025-07-30 20:04   ` Reinette Chatre
  2025-08-08 13:56     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:04 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:


> ---
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 4c24c5f3f4c1..3dfc177f9792 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -310,6 +310,38 @@ with the following files:
>  	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>  	  0=30;1=30
>  
> +"event_configs":
> +	Directory that exists when "mbm_event" counter assignment mode is supported.
> +	Contains sub-directory for each MBM event that can be assigned to a counter.

"Contains sub-directory" -> "Contains a sub-directory"?

> +
> +	Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
> +	Each MBM event's sub-directory contains a file named "event_filter" that is
> +	used to view and modify which memory transactions the MBM event is configured
> +	with.
> +
> +	List of memory transaction types supported:
> +
> +	==========================  ========================================================
> +	Name			    Description
> +	==========================  ========================================================
> +	dirty_victim_writes_all     Dirty Victims from the QOS domain to all types of memory
> +	remote_reads_slow_memory    Reads to slow memory in the non-local NUMA domain
> +	local_reads_slow_memory     Reads to slow memory in the local NUMA domain
> +	remote_non_temporal_writes  Non-temporal writes to non-local NUMA domain
> +	local_non_temporal_writes   Non-temporal writes to local NUMA domain
> +	remote_reads                Reads to memory in the non-local NUMA domain
> +	local_reads                 Reads to memory in the local NUMA domain
> +	==========================  ========================================================
> +
> +	For example::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
> +	  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
> +	  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
> +	  local_reads,local_non_temporal_writes,local_reads_slow_memory
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy

...

> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 16bcfeeb89e6..fa5f63126682 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>  	{"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>  };
>  
> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
> +{
> +	struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
> +	bool sep = false;
> +	int i;
> +
> +	mutex_lock(&rdtgroup_mutex);
> +

There is an inconsistency among the files introduced in how the
"mbm_event mode disabled" case is handled. Some files return failure
from their _show()/_write() when "mbm_event" mode is disabled, some don't.

The "event_filter" file always prints the MBM transactions monitored
when assignable counters are supported, whether mbm_event mode is enabled
or not. This means that the MBM event's configuration values are printed
when "default" mode is enabled. I have two concerns about this:
1) This is potentially very confusing since switching to "default" will
   make the BMEC files visible that will enable the user to modify the
   event configurations per domain. Having this file print a global event
   configuration while there are potentially various different domain-specific
   configurations active will be confusing.
2) Can it be guaranteed that the MBM events will monitor the default
   assignable counter memory transactions when in "default" mode? It has
   never been possible to query which memory transactions are monitored by
   the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
   so this seems to use one feature to deduce capabilities of another?
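
To make concrete what the file prints, here is a user-space sketch of the
bitmask-to-name formatting that event_filter_show() below performs. The
transaction names and their order follow the documentation table and the
example output quoted above; the bit positions (bit 0 for local_reads up to
bit 6 for dirty_victim_writes_all) and the format_event_filter() helper are
assumptions made for this illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct mbm_transaction {
	char		name[32];
	uint32_t	val;
};

/* Bit positions are assumed for this sketch, not taken from the patch. */
static const struct mbm_transaction mbm_transactions[] = {
	{"local_reads",			1u << 0},
	{"remote_reads",		1u << 1},
	{"local_non_temporal_writes",	1u << 2},
	{"remote_non_temporal_writes",	1u << 3},
	{"local_reads_slow_memory",	1u << 4},
	{"remote_reads_slow_memory",	1u << 5},
	{"dirty_victim_writes_all",	1u << 6},
};

/*
 * Hypothetical helper: write the comma-separated names of all transactions
 * set in @evt_cfg into @buf. Returns the string length, or -1 if @buf is
 * too small.
 */
static int format_event_filter(uint32_t evt_cfg, char *buf, size_t len)
{
	size_t n = sizeof(mbm_transactions) / sizeof(mbm_transactions[0]);
	size_t off = 0;
	size_t i;
	int sep = 0;

	if (!len)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		int w;

		if (!(evt_cfg & mbm_transactions[i].val))
			continue;

		w = snprintf(buf + off, len - off, "%s%s",
			     sep ? "," : "", mbm_transactions[i].name);
		if (w < 0 || (size_t)w >= len - off)
			return -1;

		off += (size_t)w;
		sep = 1;
	}

	return (int)off;
}
```

Under the assumed bit positions, a configuration of 0x15 formats as
"local_reads,local_non_temporal_writes,local_reads_slow_memory", the same
list shown for mbm_local_bytes in the documentation example. Nothing in this
formatting depends on which mode is active, which is the source of the
concerns above.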



> +	for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
> +		if (mevt->evt_cfg & mbm_transactions[i].val) {
> +			if (sep)
> +				seq_putc(seq, ',');
> +			seq_printf(seq, "%s", mbm_transactions[i].name);
> +			sep = true;
> +		}
> +	}
> +	seq_putc(seq, '\n');
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +
> +	return 0;
> +}
> +
>  /**
>   * resctrl_mon_resource_init() - Initialise global monitoring structures.
>   *
> @@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void)
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>  		resctrl_file_fflags_init("available_mbm_cntrs",
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
> +		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
>  	}
>  
>  	return 0;

...

> @@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>  		return ret;
>  
>  	ret = rdtgroup_add_files(kn_subdir, fflags);
> +	if (ret)
> +		return ret;
> +
> +	if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) {
> +		r = priv;
> +		if (r->mon.mbm_cntr_assignable) {
> +			ret = resctrl_mkdir_event_configs(r, kn_subdir);
> +			if (ret)
> +				return ret;
> +		}
> +	}
> +
>  	if (!ret)
>  		kernfs_activate(kn_subdir);
>  

Looks like the "if (!ret)" above can be dropped to always call "kernfs_activate(kn_subdir)"
on exit making it clear that this is success path and function exits early on any error.
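
A stand-alone model of the resulting flow, in case it helps. The struct and
function names below are invented for this sketch; the return values of
rdtgroup_add_files() and resctrl_mkdir_event_configs() are injected through
a struct so that the control flow can be exercised outside the kernel, and
kernfs_activate() is modelled as setting a flag.

```c
#include <stdbool.h>

/* Injected outcome of each step; all names here are hypothetical. */
struct steps {
	int	add_files_ret;		/* stands in for rdtgroup_add_files() */
	bool	mon_info;		/* (fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO */
	bool	cntr_assignable;	/* r->mon.mbm_cntr_assignable */
	int	mkdir_configs_ret;	/* stands in for resctrl_mkdir_event_configs() */
};

struct node {
	bool	activated;		/* set where kernfs_activate() would run */
};

static int mkdir_info_resdir_tail(const struct steps *s, struct node *kn_subdir)
{
	int ret;

	ret = s->add_files_ret;
	if (ret)
		return ret;

	if (s->mon_info && s->cntr_assignable) {
		ret = s->mkdir_configs_ret;
		if (ret)
			return ret;
	}

	/* Only reached on success, so no "if (!ret)" is needed. */
	kn_subdir->activated = true;

	return 0;
}
```

Every error returns early, so the activation at the end is unconditional and
the final return value is always 0.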

Reinette




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations
  2025-07-25 18:29 ` [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations Babu Moger
@ 2025-07-30 20:05   ` Reinette Chatre
  2025-08-08 18:27     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:05 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> When "mbm_event" counter assignment mode is supported, users can modify

"supported" -> "enabled"?

> the event configuration by writing to the 'event_filter' resctrl file.
> The event configurations for mbm_event mode are located in
> /sys/fs/resctrl/info/L3_MON/event_configs/.
> 
> Update the assignments of all CTRL_MON and MON resource groups when the
> event configuration is modified.
> 
> Example:
> $ mount -t resctrl resctrl /sys/fs/resctrl
> 
> $ cd /sys/fs/resctrl/
> 
> $ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter
>   local_reads,local_non_temporal_writes,local_reads_slow_memory
> 
> $ echo "local_reads,local_non_temporal_writes" >
>   info/L3_MON/event_configs/mbm_total_bytes/event_filter
> 
> $ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter
>   local_reads,local_non_temporal_writes
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  Documentation/filesystems/resctrl.rst |  12 +++
>  fs/resctrl/internal.h                 |   4 +
>  fs/resctrl/monitor.c                  | 114 ++++++++++++++++++++++++++
>  fs/resctrl/rdtgroup.c                 |   3 +-
>  4 files changed, 132 insertions(+), 1 deletion(-)
> 

...

> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index e082d8718199..e2e3fc0c5fab 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -409,11 +409,15 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
>  				  struct mon_evt *mevt);
>  int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>  		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
> +void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt);

Is there some code ordering issue in monitor.c? Looks like this function
is only used in monitor.c so seeing it here is unexpected.

>  
>  void *rdt_kn_parent_priv(struct kernfs_node *kn);
>  
>  int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v);
>  
> +ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
> +			   loff_t off);
> +
>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>  
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index fa5f63126682..8efbeb910f77 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c

...

> @@ -1193,3 +1264,46 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
>  		rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
>  	}
>  }
> +
> +/*
> + * rdtgroup_update_cntr_event - Update the counter assignments for the event
> + *				in a group.
> + * @r:		Resource to which update needs to be done.
> + * @rdtgrp:	Resctrl group.
> + * @evtid:	MBM monitor event.
> + */
> +static void rdtgroup_update_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp,
> +				       enum resctrl_event_id evtid)
> +{
> +	struct rdt_mon_domain *d;
> +	int cntr_id;
> +
> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
> +		if (cntr_id >= 0)
> +			resctrl_arch_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
> +						 rdtgrp->closid, cntr_id, true);

Should non-arch MBM state be reset here?

> +	}
> +}
> +
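
For reference, the event_filter strings shown in the example of this patch can be
turned into a bitmask roughly as sketched below. This is a user-space illustration
under assumptions: the name list is taken from the documentation example in this
series, while the bit positions and the helper name parse_event_filter() are
assumptions and not the kernel implementation. The sketch does not trim whitespace
around names.

```c
#include <stddef.h>
#include <string.h>

/* Bit positions are an assumption made for illustration only. */
struct filter_name { const char *name; unsigned int bit; };

static const struct filter_name filter_names[] = {
	{ "local_reads",			1u << 0 },
	{ "remote_reads",			1u << 1 },
	{ "local_non_temporal_writes",		1u << 2 },
	{ "remote_non_temporal_writes",		1u << 3 },
	{ "local_reads_slow_memory",		1u << 4 },
	{ "remote_reads_slow_memory",		1u << 5 },
	{ "dirty_victim_writes_all",		1u << 6 },
};

/*
 * Parse a comma-separated filter list into a bitmask.
 * Returns 0 on success and -1 on an unknown name; *mask is only
 * written on success.
 */
static int parse_event_filter(const char *buf, unsigned int *mask)
{
	const size_t n = sizeof(filter_names) / sizeof(filter_names[0]);
	unsigned int m = 0;

	while (*buf) {
		size_t len = strcspn(buf, ",");
		size_t i;

		for (i = 0; i < n; i++) {
			if (strlen(filter_names[i].name) == len &&
			    !strncmp(filter_names[i].name, buf, len))
				break;
		}
		if (i == n)
			return -1;

		m |= filter_names[i].bit;
		buf += len;
		if (*buf == ',')
			buf++;
	}

	*mask = m;
	return 0;
}
```

Leaving *mask untouched on failure keeps a rejected write from partially changing
the stored configuration.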

Reinette


* Re: [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
  2025-07-25 18:29 ` [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir Babu Moger
@ 2025-07-30 20:08   ` Reinette Chatre
  2025-08-08 20:29     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:08 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> The "mbm_event" counter assignment mode allows users to assign a hardware
> counter to an RMID, event pair and monitor the bandwidth as long as it is
> assigned.

The above implies this addition is in support of "mbm_event" mode, while the
implementation applies to any and all assignable counter modes, including
"default" and, for example, the upcoming "soft-ABMC". It is clear to me
how this is used and interpreted when "mbm_event" mode is enabled, but not
for the others (more below).

> 
> Introduce a user-configurable option that determines if a counter will
> automatically be assigned to an RMID, event pair when its associated
> monitor group is created via mkdir.
> 
> Suggested-by: Peter Newman <peternewman@google.com>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  Documentation/filesystems/resctrl.rst | 16 ++++++++++
>  fs/resctrl/monitor.c                  |  2 ++
>  fs/resctrl/rdtgroup.c                 | 43 +++++++++++++++++++++++++++
>  include/linux/resctrl.h               |  3 ++
>  4 files changed, 64 insertions(+)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 37dbad4d50f7..165e0d315af7 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -354,6 +354,22 @@ with the following files:
>  	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>  	   local_reads,local_non_temporal_writes
>  
> +"mbm_assign_on_mkdir":

Needs an "Exists when "mbm_event" counter assignment mode is supported."?
Also needs clarification on behavior when "mbm_event" is enabled vs. disabled.

> +	Determines if a counter will automatically be assigned to an RMID, event pair

"will automatically be" -> "is automatically"
"RMID, event" -> "RMID, MBM event"

> +	when its associated monitor group is created via mkdir. It is enabled by default
> +	on boot and users can disable by writing to the interface.

"users can disable" -> "users can disable this capability" or "can be disabled"?

This implementation enables the user to read/write this file/property when "mbm_event" mode is
disabled. Considering this explanation, I do not think it is clear how this file reflects
system behavior when in "default" mode. There is no difference between mbm_assign_on_mkdir
enabled/disabled when in "default" mode, no?
Should interactions with "mbm_assign_on_mkdir" be restricted to when
"mbm_event" mode is enabled? If so, the next question would likely be whether the value
should change during an "mbm_event" enable->disable or disable->enable switch. The above states
clearly that it is enabled on boot, and it may be reasonable to have it keep (but not always
expose) the user's setting when switching between modes.

Restricting it to "mbm_event" mode now gives us some flexibility, once soft-ABMC follows,
on if/how it can/should support this. What do you think?
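
One possible reading of that restriction, as a user-space sketch only. The enum,
struct and function names here are hypothetical, and -EINVAL is just one plausible
choice of error code:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two counter assignment modes. */
enum assign_mode { MODE_DEFAULT, MODE_MBM_EVENT };

struct mon_state {
	enum assign_mode mode;
	bool assign_on_mkdir;	/* enabled at boot, kept across mode switches */
};

/* Read the setting; only allowed while "mbm_event" mode is enabled. */
static int assign_on_mkdir_show(const struct mon_state *s, bool *val)
{
	if (s->mode != MODE_MBM_EVENT)
		return -EINVAL;
	*val = s->assign_on_mkdir;
	return 0;
}

/* Change the setting; only allowed while "mbm_event" mode is enabled. */
static int assign_on_mkdir_write(struct mon_state *s, bool val)
{
	if (s->mode != MODE_MBM_EVENT)
		return -EINVAL;
	s->assign_on_mkdir = val;
	return 0;
}

/* Switching modes leaves the stored setting untouched. */
static void set_mode(struct mon_state *s, enum assign_mode mode)
{
	s->mode = mode;
}
```

In this model the user's setting is kept but not exposed while in "default" mode,
and becomes visible again after switching back to "mbm_event".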

> +
> +	"0":
> +		Auto assignment is disabled.
> +	"1":
> +		Auto assignment is enabled.
> +
> +	Example::
> +
> +	  # echo 0 > /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
> +	  0
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 8efbeb910f77..6205bbfe08fb 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -1077,6 +1077,8 @@ int resctrl_mon_resource_init(void)
>  		resctrl_file_fflags_init("available_mbm_cntrs",
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>  		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
> +		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
> +					 RFTYPE_RES_CACHE);
>  	}
>  
>  	return 0;
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index c3d6540c3280..bf04235d2603 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c

Please move resctrl_mbm_assign_on_mkdir_show() and resctrl_mbm_assign_on_mkdir_write() to monitor.c

Reinette


* Re: [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal
  2025-07-25 18:29 ` [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal Babu Moger
@ 2025-07-30 20:08   ` Reinette Chatre
  2025-08-11 23:39     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:08 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:

> ---
>  fs/resctrl/monitor.c  |  1 +
>  fs/resctrl/rdtgroup.c | 70 +++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 69 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 6205bbfe08fb..5cf1b79c17f5 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -1072,6 +1072,7 @@ int resctrl_mon_resource_init(void)
>  		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
>  								   READS_TO_LOCAL_S_MEM |
>  								   NON_TEMP_WRITE_TO_LOCAL_MEM;
> +		r->mon.mbm_assign_on_mkdir = true;
>  		resctrl_file_fflags_init("num_mbm_cntrs",
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>  		resctrl_file_fflags_init("available_mbm_cntrs",
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index bf04235d2603..d087ba990cd3 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c

Please move rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to
be with counter management code in monitor.c

Reinette


* Re: [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group
  2025-07-25 18:29 ` [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
@ 2025-07-30 20:09   ` Reinette Chatre
  2025-08-11 23:42     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:09 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> Introduce the mbm_L3_assignments resctrl file associated with CTRL_MON and
> MON resource groups to display the counter assignment states of the
> resource group when "mbm_event" counter assignment mode is enabled.
> 
> The list is displayed in the following format:

needs imperative:
 "Display the list ..."

> <Event>:<Domain id>=<Assignment state>;<Domain id>=<Assignment state>
> 
> Event: A valid MBM event listed in
>        /sys/fs/resctrl/info/L3_MON/event_configs directory.
> 
> Domain ID: A valid domain ID.
> 
> The assignment state can be one of the following:
> 
> _ : No counter assigned.
> 
> e : Counter assigned exclusively.
> 
> Example:
> To list the assignment states for the default group
> $ cd /sys/fs/resctrl
> $ cat /sys/fs/resctrl/mbm_L3_assignments
> mbm_total_bytes:0=e;1=e
> mbm_local_bytes:0=e;1=e
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...


> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 5cf1b79c17f5..ebc049105949 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -1080,6 +1080,7 @@ int resctrl_mon_resource_init(void)
>  		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
>  		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
>  					 RFTYPE_RES_CACHE);
> +		resctrl_file_fflags_init("mbm_L3_assignments", RFTYPE_MON_BASE);
>  	}
>  
>  	return 0;
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index d087ba990cd3..47716e623a9c 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1931,6 +1931,54 @@ static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
>  	return nbytes;
>  }
>  
> +static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file *s, void *v)

Please move to monitor.c (then mbm_cntr_get() can be private to monitor.c also).

> +{
> +	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> +	struct rdt_mon_domain *d;
> +	struct rdtgroup *rdtgrp;
> +	struct mon_evt *mevt;
> +	int ret = 0;
> +	bool sep;
> +
> +	rdtgrp = rdtgroup_kn_lock_live(of->kn);
> +	if (!rdtgrp) {
> +		ret = -ENOENT;
> +		goto out_unlock;
> +	}
> +
> +	rdt_last_cmd_clear();
> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
> +		ret = -ENOENT;

The error returned by the files when "mbm_event" is disabled (but supported) is
inconsistent. All but this one return EINVAL. Please make return code consistent.

> +		goto out_unlock;
> +	}
> +
> +	for_each_mon_event(mevt) {
> +		if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
> +			continue;
> +
> +		sep = false;
> +		seq_printf(s, "%s:", mevt->name);
> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +			if (sep)
> +				seq_putc(s, ';');
> +
> +			if (mbm_cntr_get(r, d, rdtgrp, mevt->evtid) < 0)
> +				seq_printf(s, "%d=_", d->hdr.id);
> +			else
> +				seq_printf(s, "%d=e", d->hdr.id);
> +
> +			sep = true;
> +		}
> +		seq_putc(s, '\n');
> +	}
> +
> +out_unlock:
> +	rdtgroup_kn_unlock(of->kn);
> +
> +	return ret;
> +}
> +
>  /* rdtgroup information files for one cache resource. */
>  static struct rftype res_common_files[] = {
>  	{
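
The "<Event>:<Domain id>=<Assignment state>;..." line built by the seq_printf()
loop above can be modelled in user space as follows. This is an illustrative
sketch: format_assignments() and its parameters are hypothetical, and snprintf()
stands in for the seq_file helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format one line as "<event>:<id>=<state>;<id>=<state>".
 * assigned[i] tells whether domain ids[i] has a counter ('e') or not ('_').
 * Returns the number of characters written (excluding the NUL), or -1 if
 * the buffer is too small.
 */
static int format_assignments(char *out, size_t size, const char *event,
			      const int *ids, const bool *assigned, size_t n)
{
	size_t pos, i;
	int w;

	w = snprintf(out, size, "%s:", event);
	if (w < 0 || (size_t)w >= size)
		return -1;
	pos = (size_t)w;

	for (i = 0; i < n; i++) {
		/* A ';' separates entries, so none precedes the first one. */
		w = snprintf(out + pos, size - pos, "%s%d=%c",
			     i ? ";" : "", ids[i], assigned[i] ? 'e' : '_');
		if (w < 0 || (size_t)w >= size - pos)
			return -1;
		pos += (size_t)w;
	}

	return (int)pos;
}
```

For domains 0 and 1 with only domain 1 assigned, the buffer holds
"mbm_total_bytes:0=_;1=e", matching the format in the changelog.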

Reinette


* Re: [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify assignments in a group
  2025-07-25 18:29 ` [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify " Babu Moger
@ 2025-07-30 20:10   ` Reinette Chatre
  2025-08-11 23:51     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:10 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> Enable the mbm_l3_assignments resctrl file to be used to modify counter
> assignments of CTRL_MON and MON groups when the "mbm_event" counter
> assignment mode is enabled.
> 
> The assignment modifications are done in the following format:

(needs imperative)

> <Event>:<Domain id>=<Assignment state>
> 
> Event: A valid MBM event in the
>        /sys/fs/resctrl/info/L3_MON/event_configs directory.
> 
> Domain ID: A valid domain ID. When writing, '*' applies the changes
> 	   to all domains.
> 
> Assignment states:
> 
>     _ : Unassign a counter.
> 
>     e : Assign a counter exclusively.
> 
> Examples:
> 
> $ cd /sys/fs/resctrl
> $ cat /sys/fs/resctrl/mbm_L3_assignments
>   mbm_total_bytes:0=e;1=e
>   mbm_local_bytes:0=e;1=e
> 
> To unassign the counter associated with the mbm_total_bytes event on
> domain 0:
> 
> $ echo "mbm_total_bytes:0=_" > mbm_L3_assignments
> $ cat /sys/fs/resctrl/mbm_L3_assignments
>   mbm_total_bytes:0=_;1=e
>   mbm_local_bytes:0=e;1=e
> 
> To unassign the counter associated with the mbm_total_bytes event on
> all the domains:
> 
> $ echo "mbm_total_bytes:*=_" > mbm_L3_assignments
> $ cat /sys/fs/resctrl/mbm_L3_assignments
>   mbm_total_bytes:0=_;1=_
>   mbm_local_bytes:0=e;1=e
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  Documentation/filesystems/resctrl.rst | 146 +++++++++++++++++++++++++-
>  fs/resctrl/internal.h                 |   3 +
>  fs/resctrl/monitor.c                  |  94 +++++++++++++++++
>  fs/resctrl/rdtgroup.c                 |  48 ++++++++-
>  4 files changed, 289 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 0b8ce942f112..0c8701103214 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -525,7 +525,8 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
>  	Event: A valid MBM event in the
>  	       /sys/fs/resctrl/info/L3_MON/event_configs directory.
>  
> -	Domain ID: A valid domain ID.
> +	Domain ID: A valid domain ID. When writing, '*' applies the changes
> +		   to all the domains.
>  
>  	Assignment states:
>  
> @@ -542,6 +543,34 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
>  	   mbm_total_bytes:0=e;1=e
>  	   mbm_local_bytes:0=e;1=e
>  
> +	Assignments can be modified by writing to the interface.
> +
> +	Example:
> +	To unassign the counter associated with the mbm_total_bytes event on domain 0:

The alignment is off when looking at the generated html. What seems to be intended is that
"Example" is some sort of heading, but it ends up just being part of the sentence that follows
and thus does not apply to the other examples that follow.
It can also be "Examples" since there is more than one.

> +	::
> +
> +	 # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
> +	   mbm_total_bytes:0=_;1=e
> +	   mbm_local_bytes:0=e;1=e
> +
> +	To unassign the counter associated with the mbm_total_bytes event on all the domains:
> +	::
> +
> +	 # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
> +	   mbm_total_bytes:0=_;1=_
> +	   mbm_local_bytes:0=e;1=e
> +
> +	To assign a counter associated with the mbm_total_bytes event on all domains in
> +	exclusive mode:
> +	::
> +
> +	 # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
> +	   mbm_total_bytes:0=e;1=e
> +	   mbm_local_bytes:0=e;1=e
> +
>  Resource allocation rules
>  -------------------------
>  
> @@ -1577,6 +1606,121 @@ View the llc occupancy snapshot::
>    # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
>    11234000
>  
> +
> +Examples on working with mbm_assign_mode
> +========================================
> +
> +a. Check if MBM counter assignment mode is supported.
> +::
> +
> +  # mount -t resctrl resctrl /sys/fs/resctrl/
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +  [mbm_event]
> +  default
> +
> +The "mbm_event" mode is detected and enabled.
> +
> +b. Check how many assignable counters are supported.
> +::
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
> +  0=32;1=32
> +
> +c. Check how many assignable counters are available for assignment in each domain.
> +::
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
> +  0=30;1=30
> +
> +d. To list the default group's assign states.
> +::
> +
> +  # cat /sys/fs/resctrl/mbm_L3_assignments
> +  mbm_total_bytes:0=e;1=e
> +  mbm_local_bytes:0=e;1=e
> +
> +e.  To unassign the counter associated with the mbm_total_bytes event on domain 0.
> +::
> +
> +  # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
> +  # cat /sys/fs/resctrl/mbm_L3_assignments
> +  mbm_total_bytes:0=_;1=e
> +  mbm_local_bytes:0=e;1=e
> +
> +f. To unassign the counter associated with the mbm_total_bytes event on all domains.
> +::
> +
> +  # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
> +  # cat /sys/fs/resctrl/mbm_L3_assignment
> +  mbm_total_bytes:0=_;1=_
> +  mbm_local_bytes:0=e;1=e
> +
> +g. To assign a counter associated with the mbm_total_bytes event on all domains in
> +exclusive mode.
> +::
> +
> +  # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
> +  # cat /sys/fs/resctrl/mbm_L3_assignments
> +  mbm_total_bytes:0=e;1=e
> +  mbm_local_bytes:0=e;1=e
> +
> +h. Read the events mbm_total_bytes and mbm_local_bytes of the default group. There is
> +no change in reading the events with the assignment.  If the event is unassigned when
> +reading, then the read will come back as "Unassigned".

While this example is for a single resource group, the supporting text goes back
and forth between being specific to one resource group and describing what happens
when there are multiple resource groups (see (j)). If it is just one resource group then the above is
fine, but with multiple groups there is much more involved with "Unassigned". This is the same as what
was mentioned during the previous version.

> +::
> +
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> +  779247936
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> +  765207488
> +
> +i. Check the event configurations.
> +::
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
> +  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
> +  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
> +  local_reads,local_non_temporal_writes,local_reads_slow_memory
> +
> +j. Change the event configuration for mbm_local_bytes.
> +::
> +
> +  # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
> +  /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
> +  local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
> +
> +This will update all (across all domains of all monitor groups) counter assignments
> +associated with the mbm_local_bytes event.
> +
> +k. Now read the local event again. The first read may come back with "Unavailable"
> +status. The subsequent read of mbm_local_bytes will display the current value.
> +::
> +
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> +  Unavailable
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> +  314101
> +
> +l. Users have the option to go back to 'default' mbm_assign_mode if required. This can be
> +done using the following command. Note that switching the mbm_assign_mode may reset all
> +the MBM counters (and thus all MBM events) of all the resctrl groups.
> +::
> +
> +  # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +  mbm_event
> +  [default]
> +
> +m. Unmount the resctrl filesystem.
> +::
> +
> +  # umount /sys/fs/resctrl/
> +
>  Intel RDT Errata
>  ================
>  
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index e2e3fc0c5fab..1350fc273258 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -418,6 +418,9 @@ int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v
>  ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
>  			   loff_t off);
>  
> +int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
> +				 char *event, char *tok);
> +
>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>  
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index ebc049105949..1e4f8e3bedc6 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -1311,3 +1311,97 @@ void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt)
>  			rdtgroup_update_cntr_event(r, crgrp, mevt->evtid);
>  	}
>  }
> +
> +/*
> + * mbm_get_mon_event_by_name() - Return the mon_evt entry for the matching
> + * event name.
> + */
> +static struct mon_evt *mbm_get_mon_event_by_name(struct rdt_resource *r, char *name)
> +{
> +	struct mon_evt *mevt;
> +
> +	for_each_mon_event(mevt) {
> +		if (mevt->rid == r->rid && mevt->enabled &&
> +		    resctrl_is_mbm_event(mevt->evtid) &&
> +		    !strcmp(mevt->name, name))
> +			return mevt;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int rdtgroup_modify_assign_state(char *assign, struct rdt_mon_domain *d,
> +					struct rdtgroup *rdtgrp, struct mon_evt *mevt)
> +{
> +	int ret = 0;
> +
> +	if (!assign || strlen(assign) != 1)
> +		return -EINVAL;
> +
> +	switch (*assign) {
> +	case 'e':
> +		ret = rdtgroup_assign_cntr_event(d, rdtgrp, mevt);
> +		break;
> +	case '_':
> +		rdtgroup_unassign_cntr_event(d, rdtgrp, mevt);
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
> +				 char *event, char *tok)
> +{
> +	struct rdt_mon_domain *d;
> +	unsigned long dom_id = 0;
> +	char *dom_str, *id_str;
> +	struct mon_evt *mevt;
> +	int ret;
> +
> +	mevt = mbm_get_mon_event_by_name(r, event);
> +	if (!mevt) {
> +		rdt_last_cmd_printf("Invalid event %s\n", event);
> +		return  -ENOENT;

Extra space

> +	}
> +
> +next:
> +	if (!tok || tok[0] == '\0')
> +		return 0;
> +
> +	/* Start processing the strings for each domain */
> +	dom_str = strim(strsep(&tok, ";"));
> +
> +	id_str = strsep(&dom_str, "=");
> +
> +	/* Check for domain id '*' which means all domains */
> +	if (id_str && *id_str == '*') {
> +		ret = rdtgroup_modify_assign_state(dom_str, NULL, rdtgrp, mevt);
> +		if (ret)
> +			rdt_last_cmd_printf("Assign operation '%s:*=%s' failed\n",
> +					    event, dom_str);
> +		return ret;
> +	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
> +		rdt_last_cmd_puts("Missing domain id\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Verify if the dom_id is valid */
> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +		if (d->hdr.id == dom_id) {
> +			ret = rdtgroup_modify_assign_state(dom_str, d, rdtgrp, mevt);
> +			if (ret) {
> +				rdt_last_cmd_printf("Assign operation '%s:%ld=%s' failed\n",
> +						    event, dom_id, dom_str);
> +				return ret;
> +			}
> +			goto next;
> +		}
> +	}
> +
> +	rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
> +	return -EINVAL;
> +}
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 47716e623a9c..2d2b91cd1f67 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1979,6 +1979,51 @@ static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file
>  	return ret;
>  }
>  
> +static ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf,

Please move to monitor.c
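
As a reading aid for the "<Domain id>=<Assignment state>" handling in
resctrl_parse_mbm_assignment() above, here is a simplified user-space model.
It is a sketch under assumptions: NR_DOMAINS, the state[] array and
parse_assignments() are hypothetical, tokens are not trimmed, and '*' must be
the whole domain id rather than just its first character.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NR_DOMAINS 2	/* hypothetical: domains 0 and 1, as in the examples */

/*
 * Apply one "<id>=<state>[;<id>=<state>...]" list to state[], where
 * state[i] is 'e' (assigned) or '_' (unassigned). An id of "*" applies
 * the state to every domain and ends parsing, as in the quoted code.
 * The input string is modified in place.
 */
static int parse_assignments(char *tok, char state[NR_DOMAINS])
{
	while (tok && *tok) {
		char *next = strchr(tok, ';');
		char *eq, *end;
		unsigned long id;
		int i;

		if (next)
			*next++ = '\0';

		eq = strchr(tok, '=');
		if (!eq)
			return -EINVAL;
		*eq++ = '\0';

		/* Exactly one state character: 'e' or '_'. */
		if (strlen(eq) != 1 || (*eq != 'e' && *eq != '_'))
			return -EINVAL;

		if (!strcmp(tok, "*")) {
			for (i = 0; i < NR_DOMAINS; i++)
				state[i] = *eq;
			return 0;
		}

		id = strtoul(tok, &end, 10);
		if (end == tok || *end != '\0' || id >= NR_DOMAINS)
			return -EINVAL;
		state[id] = *eq;

		tok = next;
	}

	return 0;
}
```

Entries before a failing one stay applied in this model, which mirrors the
one-domain-at-a-time processing of the quoted loop.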

Reinette


* Re: [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled
  2025-07-25 18:29 ` [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled Babu Moger
@ 2025-07-30 20:11   ` Reinette Chatre
  2025-08-12 19:16     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:11 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> @@ -1799,6 +1800,41 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>  	return ret ?: nbytes;
>  }
>  
> +/*
> + * resctrl_bmec_files_show() — Controls the visibility of BMEC-related resctrl
> + * files. When @show is true, the files are displayed; when false, the files
> + * are hidden.
> + * Don't treat kernfs_find_and_get failure as an error, since this function may
> + * be called regardless of whether BMEC is supported or the event is enabled.
> + */
> +static void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn,
> +				    bool show)
> +{
> +	struct kernfs_node *kn_config;
> +	char name[32];
> +
> +	if (!l3_mon_kn) {
> +		sprintf(name, "%s_MON", r->name);
> +		l3_mon_kn = kernfs_find_and_get(kn_info, name);
> +		if (!l3_mon_kn)
> +			return;
> +	}
> +
> +	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_total_bytes_config");
> +	if (kn_config) {
> +		kernfs_show(kn_config, show);
> +		kernfs_put(kn_config);
> +	}
> +
> +	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_local_bytes_config");
> +	if (kn_config) {
> +		kernfs_show(kn_config, show);
> +		kernfs_put(kn_config);
> +	}
> +
> +	kernfs_put(l3_mon_kn);

Looks like this will drop an extra reference if l3_mon_kn was provided as a parameter.
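
One way to avoid that, sketched in user space with a toy reference count. The
struct node type and the node_find_and_get()/node_put() helpers are hypothetical
stand-ins for the kernfs calls, not the real API:

```c
#include <stddef.h>

/* Toy reference-counted node; a stand-in for struct kernfs_node. */
struct node { int refs; };

/* The lookup takes a reference on the node it returns. */
static struct node *node_find_and_get(struct node *n)
{
	if (n)
		n->refs++;
	return n;
}

/* Drop a reference; a NULL node is ignored. */
static void node_put(struct node *n)
{
	if (n)
		n->refs--;
}

/*
 * A @parent passed in by the caller keeps the caller's reference. Only a
 * reference taken by the local lookup is dropped before returning.
 */
static void files_show(struct node *parent, struct node *lookup_target)
{
	struct node *local = NULL;

	if (!parent) {
		local = node_find_and_get(lookup_target);
		if (!local)
			return;
		parent = local;
	}

	/* ... show or hide the config files under parent ... */
	(void)parent;

	node_put(local);	/* no-op when the caller provided parent */
}
```

Tracking the locally acquired reference in a separate pointer keeps the final
put balanced in both call paths.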

Reinette




* Re: [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes
  2025-07-25 18:29 ` [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes Babu Moger
@ 2025-07-30 20:11   ` Reinette Chatre
  2025-08-12 19:18     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:11 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 1aeac350774d..68ba08e95a54 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1865,6 +1865,75 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>  	return 0;
>  }
>  
> +static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of,
> +					     char *buf, size_t nbytes, loff_t off)

Please move to monitor.c

> +{
> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
> +	struct rdt_mon_domain *d;
> +	int ret = 0;
> +	bool enable;
> +
> +	/* Valid input requires a trailing newline */
> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
> +		return -EINVAL;
> +
> +	buf[nbytes - 1] = '\0';
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	rdt_last_cmd_clear();
> +
> +	if (!strcmp(buf, "default")) {
> +		enable = 0;
> +	} else if (!strcmp(buf, "mbm_event")) {
> +		if (r->mon.mbm_cntr_assignable) {
> +			enable = 1;
> +		} else {
> +			ret = -EINVAL;
> +			rdt_last_cmd_puts("mbm_event mode is not supported\n");
> +			goto out_unlock;
> +		}
> +	} else {
> +		ret = -EINVAL;
> +		rdt_last_cmd_puts("Unsupported assign mode\n");
> +		goto out_unlock;
> +	}
> +
> +	if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +		ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
> +		if (ret)
> +			goto out_unlock;
> +
> +		/* Update the visibility of BMEC related files */
> +		resctrl_bmec_files_show(r, NULL, !enable);
> +
> +		/*
> +		 * Initialize the default memory transaction values for
> +		 * total and local events.
> +		 */
> +		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
> +			mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = MAX_EVT_CONFIG_BITS;
> +		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
> +			mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
> +									   READS_TO_LOCAL_S_MEM |
> +									   NON_TEMP_WRITE_TO_LOCAL_MEM;

This needs to take into account the configurations that
hardware supports.

Reinette

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported
  2025-07-25 18:29 ` [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported Babu Moger
@ 2025-07-30 20:11   ` Reinette Chatre
  2025-08-12 19:21     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:11 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> Configure mbm_event mode on AMD platforms. On AMD platforms, it is
> recommended to use the mbm_event mode, if supported, to prevent the
> hardware from resetting counters between reads. This can result in
> misleading values or display "Unavailable" if no counter is assigned
> to the event.
> 
> The mbm_event mode, referred to as ABMC (Assignable Bandwidth Monitoring
> Counters) on AMD, is enabled by default when supported by the system.

The changelog needs to be written in imperative mood.

> 
> Update ABMC across all logical processors within the resctrl domain to
> ensure proper functionality.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

Patch looks good.

Reinette


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer
  2025-07-25 18:29 ` [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer Babu Moger
@ 2025-07-30 20:14   ` Reinette Chatre
  2025-08-12 19:23     ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 20:14 UTC (permalink / raw)
  To: Babu Moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:
> I have been contributing to resctrl for sometime now and I would like to
> help with code reviews as well.

You do not need to be in MAINTAINERS file to help with code reviews. I do believe
it is important that you are cc'd on all future contributions since I am not able
to test these new features you are enabling so having you keep an eye on the health
of these areas is greatly appreciated.   

> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v16: Reinette suggested to add me as a reviewer. I am glad to help as a reviewer.
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f697a0c51721..70a2f83145db 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -20866,6 +20866,7 @@ M:	Tony Luck <tony.luck@intel.com>
>  M:	Reinette Chatre <reinette.chatre@intel.com>
>  R:	Dave Martin <Dave.Martin@arm.com>
>  R:	James Morse <james.morse@arm.com>
> +R:	Babu Moger <babu.moger@amd.com>
>  L:	linux-kernel@vger.kernel.org
>  S:	Supported
>  F:	Documentation/filesystems/resctrl.rst

Acked-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions
  2025-07-30 19:47   ` Reinette Chatre
@ 2025-07-30 20:23     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-07-30 20:23 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 14:47, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> From: Tony Luck <tony.luck@intel.com>
>>
>> There are currently only three monitor events, all associated with
>> the RDT_RESOURCE_L3 resource. Growing support for additional events
>> will be easier with some restructuring to have a single point in
>> file system code where all attributes of all events are defined.
>>
>> Place all event descriptions into an array mon_event_all[]. Doing
>> this has the beneficial side effect of removing the need for
>> rdt_resource::evt_list.
>>
>> Add resctrl_event_id::QOS_FIRST_EVENT for a lower bound on range
>> checks for event ids and as the starting index to scan mon_event_all[].
>>
>> Drop the code that builds evt_list and change the two places where
>> the list is scanned to scan mon_event_all[] instead using a new
>> helper macro for_each_mon_event().
>>
>> Architecture code now informs file system code which events are
>> available with resctrl_enable_mon_event().
>>
>> Signed-off-by: Tony Luck <tony.luck@intel.com>
>> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> 
> Please add your "Signed-off-by" to the commit tags of the first four patches.
> When you do, ensure it follows the expected ordering
> per "Ordering of commit tags" in Documentation/process/maintainer-tip.rst.
> 

Sure. It appears I am the patch handler in this case, so this seems to be
the correct order:

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2025-07-30 19:47 ` [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
@ 2025-07-30 23:31   ` Moger, Babu
  2025-07-30 23:57     ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-07-30 23:31 UTC (permalink / raw)
  To: Reinette Chatre, Babu Moger, corbet, tony.luck, james.morse, tglx,
	mingo, bp, dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 2:47 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> i. Change the event configuration for mbm_local_bytes.
>>
>> 	# echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
>> 	/sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>
>> 	# cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> 	local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
> 
> Above are some more "counter_configs" stragglers.

Yea. Sure. Missed that.

> 
> Also, while considering our exchange in [1], I encountered quite a few functions doing
> counter management work for which I believe monitor.c would be more appropriate. Centralizing
> MBM counter management code to monitor.c was something that you planned for this version
> so I may be missing why you decided to keep some of these functions in rdtgroup.c? I
> highlighted these functions as I noticed them.
> 

I looked at them. Most of the functions you mentioned are directly
referenced in res_common_files[] (show or write), and some of them are
even named rdtgroup_<>, so I was not sure about moving them.
Sure, I will move them one by one to monitor.c.
Thanks
Babu

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2025-07-30 23:31   ` Moger, Babu
@ 2025-07-30 23:57     ` Reinette Chatre
  2025-07-31 14:17       ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-07-30 23:57 UTC (permalink / raw)
  To: Moger, Babu, Babu Moger, corbet, tony.luck, james.morse, tglx,
	mingo, bp, dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 7/30/25 4:31 PM, Moger, Babu wrote:
> Hi Reinette,,
> 
> On 7/30/2025 2:47 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>> i. Change the event configuration for mbm_local_bytes.
>>>
>>>     # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
>>>     /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>
>>>     # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>     local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
>>
>> Above are some more "counter_configs" stragglers.
> 
> Yea. Sure. Missed that.
> 
>>
>> Also, while considering our exchange in [1], I encountered quite a few functions doing
>> counter management work for which I believe monitor.c would be more appropriate. Centralizing
>> MBM counter management code to monitor.c was something that you planned for this version
>> so I may be missing why you decided to keep some of these functions in rdtgroup.c? I
>> highlighted these functions as I noticed them.
>>
> 
> I looked at them. Most of the functions you mentioned are directly referenced in res_common_files[] (show or write) and some of them are even named as rdtgroup_<>. So, was not sure about moving them.

If you prefer a precedent you can compare with rdtgroup_schemata_write()/rdtgroup_schemata_show()
that is directly referenced in res_common_files[] while the implementation can be found in
ctrlmondata.c that is intended to contain the "Cache allocation code".

I assumed we agreed on this since I specifically highlighted the topic of the handlers in [1] and you
responded by referring to the handler event_filter_show() and mentioned that you plan to consider the
others. This version thus looks different from what I thought we agreed on :/

> Sure, I will move them one by one to monitor.c

[1] https://lore.kernel.org/lkml/0fa9a12b-e900-4ceb-b59c-e653ec3db0ca@intel.com/

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2025-07-30 23:57     ` Reinette Chatre
@ 2025-07-31 14:17       ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-07-31 14:17 UTC (permalink / raw)
  To: Reinette Chatre, Babu Moger, corbet, tony.luck, james.morse, tglx,
	mingo, bp, dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 6:57 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/30/25 4:31 PM, Moger, Babu wrote:
>> Hi Reinette,,
>>
>> On 7/30/2025 2:47 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>>> i. Change the event configuration for mbm_local_bytes.
>>>>
>>>>      # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
>>>>      /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>>
>>>>      # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>>      local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
>>>
>>> Above are some more "counter_configs" stragglers.
>>
>> Yea. Sure. Missed that.
>>
>>>
>>> Also, while considering our exchange in [1], I encountered quite a few functions doing
>>> counter management work for which I believe monitor.c would be more appropriate. Centralizing
>>> MBM counter management code to monitor.c was something that you planned for this version
>>> so I may be missing why you decided to keep some of these functions in rdtgroup.c? I
>>> highlighted these functions as I noticed them.
>>>
>>
>> I looked at them. Most of the functions you mentioned are directly referenced in res_common_files[] (show or write) and some of them are even named as rdtgroup_<>. So, was not sure about moving them.
> 
> If you prefer a precedent you can compare with rdtgroup_schemata_write()/rdtgroup_schemata_show()
> that is directly referenced in res_common_files[] while the implementation can be found in
> ctrlmondata.c that is intended to contain the "Cache allocation code".
> 
> I assumed we agreed on this since I specifically highlighted the topic of the handlers in [1] and you
> responded to referring to the handler event_filter_show() and mentioned that you plan to consider the
> others. This version thus looks different from what I thought we agreed on :/

Looks like I misunderstood a few things here. I will take care of it in
the next revision.

Thanks
Babu

> 
>> Sure, I will move them one by one to monitor.c
> 
> [1] https://lore.kernel.org/lkml/0fa9a12b-e900-4ceb-b59c-e653ec3db0ca@intel.com/
> 


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes
  2025-07-25 18:29 ` [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes Babu Moger
@ 2025-08-06 21:02   ` Moger, Babu
  2025-08-06 21:30     ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-06 21:02 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/25/25 13:29, Babu Moger wrote:
> Introduce the resctrl file "mbm_assign_mode" to list the supported counter
> assignment modes.
> 
> The "mbm_event" counter assignment mode allows users to assign a hardware
> counter to an RMID, event pair and monitor bandwidth usage as long as it is
> assigned. The hardware continues to track the assigned counter until it is
> explicitly unassigned by the user. Each event within a resctrl group can be
> assigned independently in this mode.
> 
> On AMD systems "mbm_event" mode is backed by the ABMC (Assignable
> Bandwidth Monitoring Counters) hardware feature and is enabled by default.
> 
> The "default" mode is the existing mode that works without the explicit
> counter assignment, instead relying on dynamic counter assignment by
> hardware that may result in hardware not dedicating a counter resulting
> in monitoring data reads returning "Unavailable".
> 
> Provide an interface to display the monitor modes on the system.
> 
> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> [mbm_event]
> default
> 
> Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to support Arm64.
> 
> On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined. On Arm64, it will be
> defined when the "mbm_event" mode is supported.
> 
> Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check early to ensure the user
> interface remains compatible with upcoming Arm64 support. IS_ENABLED()
> safely evaluates to 0 when the configuration is not defined.
> 
> As a result, for MPAM, the display would be either:
> [default]
> or
> [mbm_event]
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> v16: Update with Reviewed-by tag.
> 
> v15: Minor text changes in changelog and resctrl.rst.
> 
> v14: Changed the name of the monitor mode to mbm_cntr_evt_assign based on the discussion.
>      https://lore.kernel.org/lkml/7628cec8-5914-4895-8289-027e7821777e@amd.com/
>      Changed the name of the mbm_assign_mode's.
>      Updated resctrl.rst for mbm_event mode.
>      Changed subject line to fs/resctrl.
> 
> v13: Updated the commit log with motivation for adding CONFIG_RESCTRL_ASSIGN_FIXED.
>      Added fflag RFTYPE_RES_CACHE for mbm_assign_mode file.
>      Updated user doc. Removed the references to "mbm_assign_control".
>      Resolved the conflicts with latest FS/ARCH code restructure.
> 
> v12: Minor text update in change log and user documentation.
>      Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
>      This will be defined only in arm and not in x86.
> 
> v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
>      Removed few texts in resctrl.rst about AMD specific information.
>      Updated few texts.
> 
> v10: Added few more text to user documentation clarify on the default mode.
> 
> v9: Updated user documentation based on comments.
> 
> v8: Commit message update.
> 
> v7: Updated the descriptions/commit log in resctrl.rst to generic text.
>     Thanks to James and Reinette.
>     Rename mbm_mode to mbm_assign_mode.
>     Introduced mutex lock in rdtgroup_mbm_mode_show().
> 
> v6: Added documentation for mbm_cntr_assign and legacy mode.
>     Moved mbm_mode fflags initialization to static initialization.
> 
> v5: Changed interface name to mbm_mode.
>     It will be always available even if ABMC feature is not supported.
>     Added description in resctrl.rst about ABMC mode.
>     Fixed display abmc and legacy consistantly.
> 
> v4: Fixed the checks for legacy and abmc mode. Default it ABMC.
> 
> v3: New patch to display ABMC capability.
> ---
>  Documentation/filesystems/resctrl.rst | 31 ++++++++++++++++++++++
>  fs/resctrl/rdtgroup.c                 | 37 +++++++++++++++++++++++++++
>  2 files changed, 68 insertions(+)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index c97fd77a107d..b692829fec5f 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -257,6 +257,37 @@ with the following files:
>  	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>  	    0=0x30;1=0x30;3=0x15;4=0x15
>  
> +"mbm_assign_mode":
> +	The supported counter assignment modes. The enclosed brackets indicate which mode
> +	is enabled.
> +	::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +	  [mbm_event]
> +	  default
> +
> +	"mbm_event":
> +
> +	mbm_event mode allows users to assign a hardware counter to an RMID, event
> +	pair and monitor the bandwidth usage as long as it is assigned. The hardware
> +	continues to track the assigned counter until it is explicitly unassigned by
> +	the user. Each event within a resctrl group can be assigned independently.
> +
> +	In this mode, a monitoring event can only accumulate data while it is backed
> +	by a hardware counter. Use "mbm_L3_assignments" found in each CTRL_MON and MON
> +	group to specify which of the events should have a counter assigned. The number
> +	of counters available is described in the "num_mbm_cntrs" file. Changing the
> +	mode may cause all counters on the resource to reset.
> +
> +	"default":
> +
> +	In default mode, resctrl assumes there is a hardware counter for each
> +	event within every CTRL_MON and MON group. On AMD platforms, it is
> +	recommended to use the mbm_event mode, if supported, to prevent reset of MBM
> +	events between reads resulting from hardware re-allocating counters. This can
> +	result in misleading values or display "Unavailable" if no counter is assigned
> +	to the event.
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index ca0475b75390..c7ca9113a12a 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1799,6 +1799,36 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>  	return ret ?: nbytes;
>  }
>  
> +static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
> +					struct seq_file *s, void *v)
> +{
> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
> +	bool enabled;
> +
> +	mutex_lock(&rdtgroup_mutex);
> +	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
> +
> +	if (r->mon.mbm_cntr_assignable) {
> +		if (enabled)
> +			seq_puts(s, "[mbm_event]\n");
> +		else
> +			seq_puts(s, "[default]\n");
> +
> +		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
> +			if (enabled)
> +				seq_puts(s, "default\n");
> +			else
> +				seq_puts(s, "mbm_event\n");
> +		}
> +	} else {
> +		seq_puts(s, "[default]\n");
> +	}
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +
> +	return 0;
> +}

resctrl_mbm_assign_mode_show() can also be moved to monitor.c.

What do you think?
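
For reference, user space is expected to find the active mode by looking
for the bracketed entry in this file. A minimal sketch of that parsing in
plain C follows. active_assign_mode() is a hypothetical helper made up for
illustration; it is not part of this series or of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) mode name found in the contents of
 * info/L3_MON/mbm_assign_mode into out.
 *
 * Returns 0 on success, -1 if no well-formed "[name]" entry is present
 * or if out is too small to hold the name plus the terminating NUL.
 *
 * Hypothetical helper, for illustration only.
 */
int active_assign_mode(const char *contents, char *out, size_t outlen)
{
	const char *open, *close;
	size_t len;

	assert(contents && out);

	open = strchr(contents, '[');
	if (!open)
		return -1;

	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	/* Length of the text between the brackets. */
	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

Fed the example output from the changelog ("[mbm_event]" followed by
"default"), this would yield "mbm_event". On a system that prints only
"[default]" it would yield "default".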



-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2025-07-30 19:49   ` Reinette Chatre
@ 2025-08-06 21:04     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-06 21:04 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 14:49, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.
>> Bits Description
>> 15:0 MAX_ABMC Maximum Supported Assignable Bandwidth
>>      Monitoring Counter ID + 1
>>
>> The feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Detect the feature and number of assignable counters supported. For
>> backward compatibility, upon detecting the assignable counter feature,
>> enable the mbm_total_bytes and mbm_local_bytes events that users are
>> familiar with as part of original L3 MBM support.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 267e9206a999..09cb5a70b1cb 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -883,6 +883,8 @@ static __init bool get_rdt_mon_resources(void)
>>  		resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
>>  		ret = true;
>>  	}
>> +	if (rdt_cpu_has(X86_FEATURE_ABMC))
>> +		ret = true;
>>  
>>  	if (!ret)
>>  		return false;
>> @@ -990,7 +992,8 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
>>  
> 
> To complement the change below, shouldn't the snippet that precedes it look like:
> 	if (!cpu_has(c, X86_FEATURE_CQM_LLC) && !cpu_has(c, X86_FEATURE_ABMC)) {
> 		...
> 		return;
> 	}


Sure. Added now.

> 
>>  	if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
>>  	    cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
>> -	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
>> +	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL) ||
>> +	    cpu_has(c, X86_FEATURE_ABMC)) {
>>  		u32 eax, ebx, ecx, edx;
>>  
>>  		/* QoS sub-leaf, EAX=0Fh, ECX=1 */
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 2558b1bdef8b..0a695ce68f46 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -339,6 +339,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>>  	unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
>>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>  	unsigned int threshold;
>> +	u32 eax, ebx, ecx, edx;
>>  
>>  	snc_nodes_per_l3_cache = snc_get_config();
>>  
>> @@ -368,14 +369,18 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>>  	 */
>>  	resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
>>  
>> -	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
>> -		u32 eax, ebx, ecx, edx;
>> -
>> +	if (rdt_cpu_has(X86_FEATURE_BMEC) || rdt_cpu_has(X86_FEATURE_ABMC)) {
>>  		/* Detect list of bandwidth sources that can be tracked */
>>  		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
>>  		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
> 
> I interpret this mbm_cfg_mask initialization that an ABMC system will report which of
> the memory transactions can be monitored. 
> In patch #15 "fs/resctrl: Introduce event configuration field in struct mon_evt"
> the event configurations of memory transactions that should be monitored are hardcoded
> as below without taking into account what the system supports:
> 
> 	resctrl_mon_resource_init() {
> 		...
> 		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = MAX_EVT_CONFIG_BITS;
> 		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
> 								   READS_TO_LOCAL_S_MEM |
> 								   NON_TEMP_WRITE_TO_LOCAL_MEM;
> 		...
> 	}

That is correct. I changed the assignment as follows:

	mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
	mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask &
							   (READS_TO_LOCAL_MEM |
							    READS_TO_LOCAL_S_MEM |
							    NON_TEMP_WRITE_TO_LOCAL_MEM);
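
To illustrate the effect of the masking outside the kernel, here is a small
standalone sketch. The bit positions are assumptions meant to mirror the
kernel's memory transaction definitions, and struct evt_defaults and
default_evt_cfg() are hypothetical names that exist only in this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, intended to mirror the kernel's definitions. */
#define READS_TO_LOCAL_MEM		(1u << 0)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	(1u << 2)
#define READS_TO_LOCAL_S_MEM		(1u << 4)
#define MAX_EVT_CONFIG_BITS		0x7fu	/* bits 6:0 */

/* Hypothetical container for the two default event configurations. */
struct evt_defaults {
	uint32_t total;	/* default for mbm_total_bytes */
	uint32_t local;	/* default for mbm_local_bytes */
};

/*
 * Derive the default event configurations from the mask of memory
 * transactions that the hardware reports as supported, so that no
 * unsupported transaction type ends up in a default configuration.
 */
struct evt_defaults default_evt_cfg(uint32_t mbm_cfg_mask)
{
	struct evt_defaults d;

	/* The caller is expected to have limited the mask to valid bits. */
	assert((mbm_cfg_mask & ~MAX_EVT_CONFIG_BITS) == 0);

	d.total = mbm_cfg_mask;
	d.local = mbm_cfg_mask & (READS_TO_LOCAL_MEM |
				  READS_TO_LOCAL_S_MEM |
				  NON_TEMP_WRITE_TO_LOCAL_MEM);
	return d;
}
```

With all seven bits supported this gives 0x7f and 0x15, which matches the
previous hardcoded values. On a hypothetical system that reports only 0x0f,
the local default drops to 0x05 because slow memory reads are not supported.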



> 
> It may thus be that a system may not support all memory transactions it is configured to
> monitor. It seems to me that the initialization done in resctrl_mon_resource_init() needs
> to take r->mon.mbm_cfg_mask (what the system supports) into account? If so, then
> the same hardcoding done by patch #32 in resctrl_mbm_assign_mode_write() should
> also be changed.

Yes. Sure.

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters
  2025-07-25 18:29 ` [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters Babu Moger
@ 2025-08-06 21:12   ` Moger, Babu
  2025-08-06 21:31     ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-06 21:12 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/25/25 13:29, Babu Moger wrote:
> The "mbm_event" counter assignment mode allows users to assign a hardware
> counter to an RMID, event pair and monitor bandwidth usage as long as it is
> assigned.  The hardware continues to track the assigned counter until it is
> explicitly unassigned by the user.
> 
> Create 'num_mbm_cntrs' resctrl file that displays the number of counters
> supported in each domain. 'num_mbm_cntrs' is only visible to user space
> when the system supports "mbm_event" mode.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> v16: Added Reviewed-by tag.
> 
> v15: Changed "assign a hardware counter ID" to "assign a hardware counter"
>      in couple of places.
> 
> v14: Minor update to changelog and user doc (resctrl.rst).
>      Changed subject line to fs/resctrl.
> 
> v13: Updated the changelog.
>      Added fflags RFTYPE_RES_CACHE to the file num_mbm_cntrs.
>      Replaced seq_puts from seq_putc where applicable.
>      Resolved conflicts caused by the recent FS/ARCH code restructure.
>      The files monitor.c/rdtgroup.c have been split between FS and ARCH directories.
> 
> v12: Changed the code to display the max supported monitoring counters in
>      each domain. Also updated the documentation.
>      Resolved the conflict with the latest code.
> 
> v11: Renamed rdtgroup_num_mbm_cntrs_show() to resctrl_num_mbm_cntrs_show().
>      Few monor text updates.
> 
> v10: No changes.
> 
> v9: Updated user document based on the comments.
>     Will add a new file available_mbm_cntrs later in the series.
> 
> v8: Commit message update and documentation update.
> 
> v7: Minor commit log text changes.
> 
> v6: No changes.
> 
> v5: Changed the display name from num_cntrs to num_mbm_cntrs.
>     Updated the commit message.
>     Moved the patch after mbm_mode is introduced.
> 
> v4: Changed the counter name to num_cntrs. And few text changes.
> 
> v3: Changed the field name to mbm_assign_cntrs.
> 
> v2: Changed the field name to mbm_assignable_counters from abmc_counter.
> ---
>  Documentation/filesystems/resctrl.rst | 11 ++++++++++
>  fs/resctrl/monitor.c                  |  2 ++
>  fs/resctrl/rdtgroup.c                 | 30 +++++++++++++++++++++++++++
>  3 files changed, 43 insertions(+)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index b692829fec5f..4eb27530be6f 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -288,6 +288,17 @@ with the following files:
>  	result in misleading values or display "Unavailable" if no counter is assigned
>  	to the event.
>  
> +"num_mbm_cntrs":
> +	The maximum number of counters (total of available and assigned counters) in
> +	each domain when the system supports mbm_event mode.
> +
> +	For example, on a system with maximum of 32 memory bandwidth monitoring
> +	counters in each of its L3 domains:
> +	::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
> +	  0=32;1=32
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 66c8c635f4b3..4539b08db7b9 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -929,6 +929,8 @@ int resctrl_mon_resource_init(void)
>  			resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID);
>  		if (!resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
>  			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
> +		resctrl_file_fflags_init("num_mbm_cntrs",
> +					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>  	}
>  
>  	return 0;
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index c7ca9113a12a..acbda73a9b9d 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1829,6 +1829,30 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>  	return 0;
>  }
>  
> +static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
> +				      struct seq_file *s, void *v)
> +{
> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
> +	struct rdt_mon_domain *dom;
> +	bool sep = false;
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
> +		if (sep)
> +			seq_putc(s, ';');
> +
> +		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
> +		sep = true;
> +	}
> +	seq_putc(s, '\n');
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +	return 0;
> +}

How about moving this also to monitor.c?
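
As a side note for anyone consuming this file from user space: here is a
minimal sketch, assuming the "<domain id>=<count>" pairs separated by ';'
shown in the documentation hunk above, of parsing one line of output.
parse_dom_counts() and struct dom_count are hypothetical names used only
for illustration; they are not part of resctrl.

```c
#include <stdio.h>

/* Hypothetical helper type: one "<domain id>=<count>" pair. */
struct dom_count {
	int id;
	unsigned int count;
};

/*
 * Parse a line such as "0=32;1=32\n" into at most @max entries.
 * Returns the number of entries parsed, or -1 on malformed input
 * or when the line holds more than @max entries.
 */
static int parse_dom_counts(const char *line, struct dom_count *out, int max)
{
	int n = 0;

	while (*line && *line != '\n') {
		unsigned int count;
		int id, consumed;

		if (n == max)
			return -1;
		/* %n records how many characters the pair occupied. */
		if (sscanf(line, "%d=%u%n", &id, &count, &consumed) != 2)
			return -1;

		out[n].id = id;
		out[n].count = count;
		n++;

		line += consumed;
		if (*line == ';')
			line++;
		else if (*line && *line != '\n')
			return -1;
	}

	return n;
}
```

The same helper should work for any resctrl file that uses this
per-domain layout, since it only depends on the separator characters.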


>  /* rdtgroup information files for one cache resource. */
>  static struct rftype res_common_files[] = {
>  	{
> @@ -1866,6 +1890,12 @@ static struct rftype res_common_files[] = {
>  		.seq_show	= rdt_default_ctrl_show,
>  		.fflags		= RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
>  	},
> +	{
> +		.name		= "num_mbm_cntrs",
> +		.mode		= 0444,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= resctrl_num_mbm_cntrs_show,
> +	},
>  	{
>  		.name		= "min_cbm_bits",
>  		.mode		= 0444,

-- 
Thanks
Babu Moger


* Re: [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters
  2025-07-25 18:29 ` [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters Babu Moger
@ 2025-08-06 21:19   ` Moger, Babu
  2025-08-06 21:31     ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-06 21:19 UTC (permalink / raw)
  To: corbet, tony.luck, reinette.chatre, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/25/25 13:29, Babu Moger wrote:
> Introduce the "available_mbm_cntrs" resctrl file to display the number of
> counters available for assignment in each domain when "mbm_event" mode is
> enabled.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> v16: Added Reviewed-by tag.
> 
> v15: Minor changelog text update.
>      Minor resctrl.rst text update and corrected the error text in
>      resctrl_available_mbm_cntrs_show().
>      Changed the goto label to out_unlock for consistency.
> 
> v14: Minor changelog update.
>      Changed subject line to fs/resctrl.
> 
> v13: Resolved conflicts caused by the recent FS/ARCH code restructure.
>      The monitor.c and rdtgroup.c files have now been split between
>      the FS and ARCH directories.
> 
> v12: Minor change to change log.
>      Updated the documentation text with an example.
>      Replaced seq_puts(s, ";") with seq_putc(s, ';');
>      Added missing rdt_last_cmd_clear() in resctrl_available_mbm_cntrs_show().
> 
> v11: Rename rdtgroup_available_mbm_cntrs_show() to resctrl_available_mbm_cntrs_show().
>      Few minor text changes.
> 
> v10: Patch changed to handle the counters at domain level.
>      https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
>      So, display logic also changed now.
> 
> v9: New patch
> ---
>  Documentation/filesystems/resctrl.rst | 11 ++++++
>  fs/resctrl/monitor.c                  |  2 ++
>  fs/resctrl/rdtgroup.c                 | 48 +++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 4eb27530be6f..446736dbd97f 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -299,6 +299,17 @@ with the following files:
>  	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
>  	  0=32;1=32
>  
> +"available_mbm_cntrs":
> +	The number of counters available for assignment in each domain when mbm_event
> +	mode is enabled on the system.
> +
> +	For example, on a system with 30 available [hardware] assignable counters
> +	in each of its L3 domains:
> +	::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
> +	  0=30;1=30
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 4539b08db7b9..a0b0ea45c7b4 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -931,6 +931,8 @@ int resctrl_mon_resource_init(void)
>  			resctrl_enable_mon_event(QOS_L3_MBM_LOCAL_EVENT_ID);
>  		resctrl_file_fflags_init("num_mbm_cntrs",
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
> +		resctrl_file_fflags_init("available_mbm_cntrs",
> +					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>  	}
>  
>  	return 0;
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index a09566720d4f..15d10c346307 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1853,6 +1853,48 @@ static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
>  	return 0;
>  }
>  
> +static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
> +					    struct seq_file *s, void *v)
> +{
> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
> +	struct rdt_mon_domain *dom;
> +	bool sep = false;
> +	u32 cntrs, i;
> +	int ret = 0;
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	rdt_last_cmd_clear();
> +
> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
> +		if (sep)
> +			seq_putc(s, ';');
> +
> +		cntrs = 0;
> +		for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
> +			if (!dom->cntr_cfg[i].rdtgrp)
> +				cntrs++;
> +		}
> +
> +		seq_printf(s, "%d=%u", dom->hdr.id, cntrs);
> +		sep = true;
> +	}
> +	seq_putc(s, '\n');
> +
> +out_unlock:
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +
> +	return ret;
> +}
> +

This also can be moved to monitor.c.  What do you think?
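
To illustrate the counting in the hunk above outside the kernel, here is a
simplified sketch. It assumes a counter is free exactly when its rdtgrp
pointer is NULL, as the loop above does. The structures are cut-down
stand-ins holding only the field used here, and count_free_cntrs() is a
hypothetical name.

```c
#include <stddef.h>

/* Stand-in for the kernel's resctrl group; contents do not matter here. */
struct rdtgroup {
	int unused;
};

/* Stand-in for one per-domain counter slot. */
struct mbm_cntr_cfg {
	struct rdtgroup *rdtgrp;	/* NULL when the counter is free */
};

/* Count the slots in @cfg[0..num_mbm_cntrs) that have no owner. */
static unsigned int count_free_cntrs(const struct mbm_cntr_cfg *cfg,
				     unsigned int num_mbm_cntrs)
{
	unsigned int i, free_cntrs = 0;

	for (i = 0; i < num_mbm_cntrs; i++) {
		if (!cfg[i].rdtgrp)
			free_cntrs++;
	}

	return free_cntrs;
}
```

With 32 slots of which two have an owner, this returns 30, which matches
the "0=30;1=30" example in the documentation hunk.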


>  /* rdtgroup information files for one cache resource. */
>  static struct rftype res_common_files[] = {
>  	{
> @@ -1876,6 +1918,12 @@ static struct rftype res_common_files[] = {
>  		.seq_show	= rdt_mon_features_show,
>  		.fflags		= RFTYPE_MON_INFO,
>  	},
> +	{
> +		.name		= "available_mbm_cntrs",
> +		.mode		= 0444,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= resctrl_available_mbm_cntrs_show,
> +	},
>  	{
>  		.name		= "num_rmids",
>  		.mode		= 0444,

-- 
Thanks
Babu Moger


* Re: [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes
  2025-08-06 21:02   ` Moger, Babu
@ 2025-08-06 21:30     ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-08-06 21:30 UTC (permalink / raw)
  To: babu.moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/6/25 2:02 PM, Moger, Babu wrote:
> On 7/25/25 13:29, Babu Moger wrote:
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index ca0475b75390..c7ca9113a12a 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1799,6 +1799,36 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>>  	return ret ?: nbytes;
>>  }
>>  
>> +static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>> +					struct seq_file *s, void *v)
>> +{
>> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
>> +	bool enabled;
>> +
>> +	mutex_lock(&rdtgroup_mutex);
>> +	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
>> +
>> +	if (r->mon.mbm_cntr_assignable) {
>> +		if (enabled)
>> +			seq_puts(s, "[mbm_event]\n");
>> +		else
>> +			seq_puts(s, "[default]\n");
>> +
>> +		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
>> +			if (enabled)
>> +				seq_puts(s, "default\n");
>> +			else
>> +				seq_puts(s, "mbm_event\n");
>> +		}
>> +	} else {
>> +		seq_puts(s, "[default]\n");
>> +	}
>> +
>> +	mutex_unlock(&rdtgroup_mutex);
>> +
>> +	return 0;
>> +}
> 
> The resctrl_mbm_assign_mode_show() can also be moved to monitor.c.
> 
> What do you think?

I agree. monitor.c is appropriate for this code.
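
For readers of the output format: as I read the handler above, the active
mode is printed in square brackets and any other selectable mode follows on
its own line. A rough user-space sketch of picking out the active mode is
shown here; active_mode() is a hypothetical helper, not an existing
interface.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) mode from text such as
 * "[mbm_event]\ndefault\n" into @buf as a NUL-terminated string.
 * Returns 0 on success, -1 if no complete "[...]" entry is found
 * or if it does not fit in @buf.
 */
static int active_mode(const char *text, char *buf, size_t len)
{
	const char *open = strchr(text, '[');
	const char *close;
	size_t n;

	if (!open)
		return -1;

	close = strchr(open, ']');
	if (!close)
		return -1;

	n = (size_t)(close - open - 1);
	if (n >= len)
		return -1;

	memcpy(buf, open + 1, n);
	buf[n] = '\0';
	return 0;
}
```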

Reinette


* Re: [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters
  2025-08-06 21:12   ` Moger, Babu
@ 2025-08-06 21:31     ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-08-06 21:31 UTC (permalink / raw)
  To: babu.moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/6/25 2:12 PM, Moger, Babu wrote:
> On 7/25/25 13:29, Babu Moger wrote:
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index c7ca9113a12a..acbda73a9b9d 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1829,6 +1829,30 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>>  	return 0;
>>  }
>>  
>> +static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
>> +				      struct seq_file *s, void *v)
>> +{
>> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
>> +	struct rdt_mon_domain *dom;
>> +	bool sep = false;
>> +
>> +	cpus_read_lock();
>> +	mutex_lock(&rdtgroup_mutex);
>> +
>> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>> +		if (sep)
>> +			seq_putc(s, ';');
>> +
>> +		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
>> +		sep = true;
>> +	}
>> +	seq_putc(s, '\n');
>> +
>> +	mutex_unlock(&rdtgroup_mutex);
>> +	cpus_read_unlock();
>> +	return 0;
>> +}
> 
> How about moving this also to monitor.c?

Yes, this sounds good to me. This is what I had in mind when suggesting in [1]
that the monitoring related handlers should be located in monitor.c

Reinette

[1] https://lore.kernel.org/lkml/0fa9a12b-e900-4ceb-b59c-e653ec3db0ca@intel.com/


* Re: [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters
  2025-08-06 21:19   ` Moger, Babu
@ 2025-08-06 21:31     ` Reinette Chatre
  2025-08-06 22:04       ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-08-06 21:31 UTC (permalink / raw)
  To: babu.moger, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/6/25 2:19 PM, Moger, Babu wrote:
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index a09566720d4f..15d10c346307 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1853,6 +1853,48 @@ static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
>>  	return 0;
>>  }
>>  
>> +static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
>> +					    struct seq_file *s, void *v)
>> +{
>> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
>> +	struct rdt_mon_domain *dom;
>> +	bool sep = false;
>> +	u32 cntrs, i;
>> +	int ret = 0;
>> +
>> +	cpus_read_lock();
>> +	mutex_lock(&rdtgroup_mutex);
>> +
>> +	rdt_last_cmd_clear();
>> +
>> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
>> +		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
>> +		ret = -EINVAL;
>> +		goto out_unlock;
>> +	}
>> +
>> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>> +		if (sep)
>> +			seq_putc(s, ';');
>> +
>> +		cntrs = 0;
>> +		for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
>> +			if (!dom->cntr_cfg[i].rdtgrp)
>> +				cntrs++;
>> +		}
>> +
>> +		seq_printf(s, "%d=%u", dom->hdr.id, cntrs);
>> +		sep = true;
>> +	}
>> +	seq_putc(s, '\n');
>> +
>> +out_unlock:
>> +	mutex_unlock(&rdtgroup_mutex);
>> +	cpus_read_unlock();
>> +
>> +	return ret;
>> +}
>> +
> 
> This also can be moved to monitor.c.  What do you think?

Yes, I believe monitor.c is most appropriate for all the monitoring
related handlers [1].

Reinette

[1] https://lore.kernel.org/lkml/0fa9a12b-e900-4ceb-b59c-e653ec3db0ca@intel.com/

* Re: [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters
  2025-08-06 21:31     ` Reinette Chatre
@ 2025-08-06 22:04       ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-06 22:04 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 8/6/25 16:31, Reinette Chatre wrote:
> Hi Babu,
> 
> On 8/6/25 2:19 PM, Moger, Babu wrote:
>>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>>> index a09566720d4f..15d10c346307 100644
>>> --- a/fs/resctrl/rdtgroup.c
>>> +++ b/fs/resctrl/rdtgroup.c
>>> @@ -1853,6 +1853,48 @@ static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
>>>  	return 0;
>>>  }
>>>  
>>> +static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
>>> +					    struct seq_file *s, void *v)
>>> +{
>>> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
>>> +	struct rdt_mon_domain *dom;
>>> +	bool sep = false;
>>> +	u32 cntrs, i;
>>> +	int ret = 0;
>>> +
>>> +	cpus_read_lock();
>>> +	mutex_lock(&rdtgroup_mutex);
>>> +
>>> +	rdt_last_cmd_clear();
>>> +
>>> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>> +		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
>>> +		ret = -EINVAL;
>>> +		goto out_unlock;
>>> +	}
>>> +
>>> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>> +		if (sep)
>>> +			seq_putc(s, ';');
>>> +
>>> +		cntrs = 0;
>>> +		for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
>>> +			if (!dom->cntr_cfg[i].rdtgrp)
>>> +				cntrs++;
>>> +		}
>>> +
>>> +		seq_printf(s, "%d=%u", dom->hdr.id, cntrs);
>>> +		sep = true;
>>> +	}
>>> +	seq_putc(s, '\n');
>>> +
>>> +out_unlock:
>>> +	mutex_unlock(&rdtgroup_mutex);
>>> +	cpus_read_unlock();
>>> +
>>> +	return ret;
>>> +}
>>> +
>>
>> This also can be moved to monitor.c.  What do you think?
> 
> Yes, I believe monitor.c is most appropriate for all the monitoring
> related handlers [1].

Sure. Will do.

> 
> Reinette
> 
> [1] https://lore.kernel.org/lkml/0fa9a12b-e900-4ceb-b59c-e653ec3db0ca@intel.com/
> 

-- 
Thanks
Babu Moger


* Re: [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events
  2025-07-30 19:52   ` Reinette Chatre
@ 2025-08-07 18:29     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-07 18:29 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 14:52, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> When supported, "mbm_event" counter assignment mode offers "num_mbm_cntrs"
>> number of counters that can be assigned to RMID, event pairs and monitor
>> bandwidth usage as long as it is assigned.
>>
>> Add the functionality to allocate and assign a counter to an RMID, event
>> pair in the domain.
>>
>> If all the counters are in use, kernel will log the error message
> 
> I think dropping "kernel will" will help the text to be imperative.
> 
>> "Failed to allocate counter for <event> in domain <id>" in
>> /sys/fs/resctrl/info/last_cmd_status when a new assignment is requested.
> 
> "when a new assignment is requested" can be dropped. Or alternatively:
> 	Log the error message "Failed to allocate counter for <event> in domain
> 	<id>" in /sys/fs/resctrl/info/last_cmd_status if all the counters
> 	are in use.
> 

Sure. Will do.

>> Exit on the first failure when assigning counters across all the domains.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> ---
>>  fs/resctrl/internal.h |   3 +
>>  fs/resctrl/monitor.c  | 130 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 133 insertions(+)
>>
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index db3a0f12ad77..419423bdabdc 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -387,6 +387,9 @@ bool closid_allocated(unsigned int closid);
>>  
>>  int resctrl_find_cleanest_closid(void);
>>  
>> +int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
>> +			       struct mon_evt *mevt);
>> +
> 
> This internal.h change does not look necessary? Looking ahead this is because 
> rdtgroup.c:rdtgroup_assign_cntrs() needs it, but rdtgroup_assign_cntrs()
> also belongs in monitor.c, no? 

Yes. I brought rdtgroup_assign_cntrs() into this patch for completeness
and moved everything into monitor.c.

> 
>>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>>  
> 
> ...
> 
>> +/*
>> + * rdtgroup_alloc_assign_cntr() - Allocate a counter ID and assign it to the event
>> + * pointed to by @mevt and the resctrl group @rdtgrp within the domain @d.
>> + *
>> + * Return:
>> + * 0 on success, < 0 on failure.
>> + */
>> +static int rdtgroup_alloc_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +				      struct rdtgroup *rdtgrp, struct mon_evt *mevt)
>> +{
>> +	int cntr_id;
>> +
>> +	/* No action required if the counter is assigned already. */
>> +	cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid);
>> +	if (cntr_id >= 0)
>> +		return 0;
>> +
>> +	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, mevt->evtid);
>> +	if (cntr_id <  0) {
> 
> Extra space above.

Sure.

> 
>> +		rdt_last_cmd_printf("Failed to allocate counter for %s in domain %d\n",
>> +				    mevt->name, d->hdr.id);
>> +		return cntr_id;
>> +	}
>> +
>> +	rdtgroup_assign_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid, rdtgrp->closid, cntr_id, true);
>> +
>> +	return 0;
>> +}
>> +
> 
> Reinette
> 
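
The get-then-allocate flow in rdtgroup_alloc_assign_cntr() can be modelled
in user space roughly as follows. This is only a sketch under assumptions:
the slot array, the helper names and the -ENOENT/-ENOSPC return codes are
invented for illustration (the patch only says "< 0 on failure"), and the
hardware programming step is left as a comment.

```c
#include <errno.h>
#include <stddef.h>

#define NUM_CNTRS 4	/* arbitrary size for the sketch */

/* Stand-in for one counter slot in a domain. */
struct cntr_slot {
	const void *owner;	/* stand-in for the group; NULL when free */
	int evtid;
};

/* Return the slot already assigned to (@owner, @evtid), or -ENOENT. */
static int cntr_get(const struct cntr_slot *c, const void *owner, int evtid)
{
	for (int i = 0; i < NUM_CNTRS; i++) {
		if (c[i].owner == owner && c[i].evtid == evtid)
			return i;
	}
	return -ENOENT;
}

/* Claim the first free slot for (@owner, @evtid), or return -ENOSPC. */
static int cntr_alloc(struct cntr_slot *c, const void *owner, int evtid)
{
	for (int i = 0; i < NUM_CNTRS; i++) {
		if (!c[i].owner) {
			c[i].owner = owner;
			c[i].evtid = evtid;
			return i;
		}
	}
	return -ENOSPC;
}

/* Idempotent assignment: succeed if already assigned, else allocate. */
static int alloc_assign(struct cntr_slot *c, const void *owner, int evtid)
{
	int id = cntr_get(c, owner, evtid);

	if (id >= 0)
		return 0;

	id = cntr_alloc(c, owner, evtid);
	if (id < 0)
		return id;

	/* The real code would program the hardware counter here. */
	return 0;
}
```

The point of the early return is that repeating an assignment does not
consume a second counter, so only a genuinely new (group, event) pair can
fail once all slots are taken.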

-- 
Thanks
Babu Moger


* Re: [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign MBM events
  2025-07-30 19:53   ` Reinette Chatre
@ 2025-08-07 18:33     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-07 18:33 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 14:53, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> The "mbm_event" counter assignment mode offers "num_mbm_cntrs" number of
>> counters that can be assigned to RMID, event pairs and monitor bandwidth
>> usage as long as it is assigned. If all the counters are in use, the
>> kernel logs the error message "Unable to allocate counter in domain" in
> 
> Needs an update to match new message.

Sure.

> 
>> /sys/fs/resctrl/info/last_cmd_status when a new assignment is requested.
>>
>> To make space for a new assignment, users must unassign an already
>> assigned counter and retry the assignment again.
>>
>> Add the functionality to unassign and free the counters in the domain.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> 
> ...
> 
> 
>> ---
>>  fs/resctrl/internal.h |  2 ++
>>  fs/resctrl/monitor.c  | 46 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 48 insertions(+)
>>
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index 419423bdabdc..216588842444 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -389,6 +389,8 @@ int resctrl_find_cleanest_closid(void);
>>  
>>  int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
>>  			       struct mon_evt *mevt);
>> +void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
>> +				  struct mon_evt *mevt);
>>  
> 
> Similar comment as previous patch. Please try to keep all monitoring code in
> monitor.c. The caller rdtgroup_unassign_cntrs() can move to monitor.c and it
> can instead be made available via internal.h

Yes. I brought rdtgroup_unassign_cntrs() into this patch for completeness
and moved everything into monitor.c.
> 
> 
>>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
> 
> Reinette
> 
> 

-- 
Thanks
Babu Moger


* Re: [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode
  2025-07-30 19:59   ` Reinette Chatre
@ 2025-08-07 19:59     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-07 19:59 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 14:59, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> When supported, "mbm_event" counter assignment mode allows users to assign
>> a hardware counter to an RMID, event pair and monitor the bandwidth usage
>> as long as it is assigned. The hardware continues to track the assigned
>> counter until it is explicitly unassigned by the user.
>>
>> Introduce the architecture calls resctrl_arch_cntr_read() and
>> resctrl_arch_reset_cntr() to read and reset event counters when "mbm_event"
>> mode is supported. Function names are chosen to match existing
> 
> (apologies if I gave you the text ... trying to polish with more focus on
> imperative tone now)
> "Function names are chosen to match" -> "Function names match"?

Looks good.

> 
>> resctrl_arch_rmid_read() and resctrl_arch_reset_rmid().
>>
>> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> ---
>>  include/linux/resctrl.h | 38 ++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 38 insertions(+)
>>
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 50e38445183a..4d37827121a6 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -613,6 +613,44 @@ void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>>  			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
>>  			      u32 cntr_id, bool assign);
>>  
>> +/**
>> + * resctrl_arch_cntr_read() - Read the event data corresponding to the counter ID
>> + *			      assigned to the RMID, event pair for this resource
>> + *			      and domain.
>> + * @r:			Resource that the counter should be read from.
>> + * @d:			Domain that the counter should be read from.
>> + * @closid:		CLOSID that matches the RMID.
>> + * @rmid:		RMID used for counter ID assignment.
> 
> Can this be more specific, for example:
> 			The RMID to which @cntr_id is assigned.

Sure.

> 
>> + * @cntr_id:		The counter ID whose event data should be read. Valid when
>> + *			"mbm_event" mode is enabled and @eventid is MBM event.
> 
> Would the counter ID not always be valid? Specifically,  resctrl_arch_cntr_read() is
> _only_ called when "mbm_event" mode is enabled and @eventid is _always_
> an MBM event, no? If you agree, the @cntr_id description can be something like below
> with the calling context details moved to general function description:
> 
> 	 @cntr_id: The counter to read.

Yes, that is fine.

> 
>> + * @eventid:		eventid used for counter ID assignment, such as
>> + *			QOS_L3_MBM_TOTAL_EVENT_ID or QOS_L3_MBM_LOCAL_EVENT_ID.
> 
> The "@eventid is an MBM event" can move here? For example:
> 			The MBM event to which @cntr_id is assigned.	

Sure.
		
> 
>> + * @val:		Result of the counter read in bytes.
>> + *
> 
> It looks to me as though some of the @cntr_id text could move to be the
> function description. The description can also be expanded to include where this
> will be called from. For example, 
> 
> 	Called on a CPU that belongs to domain @d when "mbm_event" mode is enabled.
> 	Called from a non-migratable process context via smp_call_on_cpu() unless
> 	all CPUs are nohz_full, in which case it is called via IPI (smp_call_function_any()).
> 	
> The goal is to make information specific. Please feel free to improve.

Looks good.

> 
>> + * Return:
>> + * 0 on success, or -EIO, -EINVAL etc on error.
>> + */
>> +int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			   u32 closid, u32 rmid, int cntr_id,
>> +			   enum resctrl_event_id eventid, u64 *val);
>> +
>> +/**
>> + * resctrl_arch_reset_cntr() - Reset any private state associated with counter ID.
>> + * @r:		The domain's resource.
>> + * @d:		The counter ID's domain.
>> + * @closid:	CLOSID that matches the RMID.
>> + * @rmid:	RMID used for counter ID assignment.
>> + * @cntr_id:	The counter ID whose event data should be reset. Valid when
>> + *		"mbm_event" mode is enabled and @eventid is MBM event.
>> + * @eventid:	eventid used for counter ID assignment, such as
>> + *		QOS_L3_MBM_TOTAL_EVENT_ID or QOS_L3_MBM_LOCAL_EVENT_ID.
> 
> Above should similarly be specific.
> 

Sure.

>> + *
>> + * This can be called from any CPU.
>> + */
>> +void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			     u32 closid, u32 rmid, int cntr_id,
>> +			     enum resctrl_event_id eventid);
>> +
>>  extern unsigned int resctrl_rmid_realloc_threshold;
>>  extern unsigned int resctrl_rmid_realloc_limit;
>>  
> 
> Reinette
> 

-- 
Thanks
Babu Moger


* Re: [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read()
  2025-07-30 20:01   ` Reinette Chatre
@ 2025-08-08  2:05     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-08  2:05 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:01, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> System software can read resctrl event data for a particular resource by
> 
> "can read" -> "reads"
> 

Sure.

>> writing the RMID and Event Identifier (EvtID) to the QM_EVTSEL register and
>> then reading the event data from the QM_CTR register.
>>
>> In ABMC mode, the event data of a specific counter ID can be read by
> 
> "can be read" -> "is read"
> 

Sure.

>> setting the following fields: QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID
>> = L3CacheABMC (=1) and setting [RMID] to the desired counter ID. Reading
> 
> "[RMID]" -> "QM_EVTSEL.RMID"
> 

Sure.

>> QM_CTR will then return the contents of the specified counter ID. The
> 
> "will then return" -> "then returns"
> 

Sure.

>> RMID_VAL_ERROR bit will be set if the counter configuration was invalid, or
> 
> "will be set" -> "is set"
> "was invalid" -> "is invalid"
> 

Sure.

>> if an invalid counter ID was set in the QM_EVTSEL[RMID] field. If the
> 
> "was set" -> "is set"

Sure.

> 
> "in the QM_EVTSEL[RMID] field" -> "in QM_EVTSEL.RMID"
> 
> 

Sure.

>> counter data is currently unavailable, the RMID_VAL_UNAVAIL bit will be
>> set.
> 
> "The RMID_VAL_UNAVAIL bit is set if the counter data is unavailable."
> 
> Please review after changes that all is coherent and in imperative tone and make
> same adjustments to duplicate text in patch.

OK.
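
For reference, the register programming described above can be sketched as
plain bit arithmetic. This is an illustration only: the macro and function
names are made up, and the bit positions of the error and unavailable flags
(63 and 62) are my assumption based on the existing RMID_VAL_ERROR and
RMID_VAL_UNAVAIL definitions. The field layout follows the QM_EVTSEL table
in the patch, and the error codes mirror __cntr_id_read().

```c
#include <errno.h>
#include <stdint.h>

/* QM_EVTSEL fields as laid out in the table in the patch. */
#define QM_EVTSEL_EXT_EVT_ID	(UINT64_C(1) << 31)	/* ExtendedEvtID */
#define QM_EVTSEL_EVT_ID_ABMC	UINT64_C(1)		/* EvtID = L3CacheABMC */
#define QM_EVTSEL_RMID_SHIFT	32
#define QM_EVTSEL_RMID_MASK	UINT64_C(0xfff)		/* bits 43:32 */

/* Assumed QM_CTR status bits (see the note above). */
#define QM_CTR_ERROR		(UINT64_C(1) << 63)
#define QM_CTR_UNAVAIL		(UINT64_C(1) << 62)

/* Build the QM_EVTSEL value that selects counter @cntr_id in ABMC mode. */
static uint64_t abmc_evtsel(uint32_t cntr_id)
{
	return ((cntr_id & QM_EVTSEL_RMID_MASK) << QM_EVTSEL_RMID_SHIFT) |
	       QM_EVTSEL_EXT_EVT_ID | QM_EVTSEL_EVT_ID_ABMC;
}

/* Check the status bits of a raw QM_CTR value before using it. */
static int qm_ctr_decode(uint64_t msr_val, uint64_t *val)
{
	if (msr_val & QM_CTR_ERROR)
		return -EIO;
	if (msr_val & QM_CTR_UNAVAIL)
		return -EINVAL;

	*val = msr_val;
	return 0;
}
```

This corresponds to the wrmsr() call in the patch, where the low 32 bits
carry ExtendedEvtID and EvtID and the high 32 bits carry the counter ID.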

> 
>>
>> Introduce resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() to reset
>> and read event data for a specific counter.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> ---
>>  arch/x86/kernel/cpu/resctrl/internal.h |  6 +++
>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 68 ++++++++++++++++++++++++++
>>  2 files changed, 74 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 6bf6042f11b6..ae4003d44df4 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -40,6 +40,12 @@ struct arch_mbm_state {
>>  /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
>>  #define ABMC_ENABLE_BIT			0
>>  
>> +/*
>> + * QoS Event Identifiers.
>> + */
>> +#define ABMC_EXTENDED_EVT_ID		BIT(31)
>> +#define ABMC_EVT_ID			BIT(0)
>> +
>>  /**
>>   * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
>>   *			       a resource for a control function
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 1f77fd58e707..57c8409a8247 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -259,6 +259,74 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>>  	return 0;
>>  }
>>  
>> +static int __cntr_id_read(u32 cntr_id, u64 *val)
>> +{
>> +	u64 msr_val;
>> +
>> +	/*
>> +	 * QM_EVTSEL Register definition:
>> +	 * =======================================================
>> +	 * Bits    Mnemonic        Description
>> +	 * =======================================================
>> +	 * 63:44   --              Reserved
>> +	 * 43:32   RMID            Resource Monitoring Identifier
>> +	 * 31      ExtEvtID        Extended Event Identifier
>> +	 * 30:8    --              Reserved
>> +	 * 7:0     EvtID           Event Identifier
>> +	 * =======================================================
>> +	 * The contents of a specific counter can be read by setting the
>> +	 * following fields in QM_EVTSEL.ExtendedEvtID(=1) and
> 
> ExtEvtID vs ExtendedEvtID ... either the definition or the text should change to
> use same names.

ExtendedEvtID

> Can description of RMID be expanded to note that it may
> contain RMID or counter ID?

Sure.
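
To make the dual use of the [RMID] field concrete, here is a small
user-space sketch (not kernel code) of how the value written to QM_EVTSEL
is composed when reading a counter, and how the QM_CTR status bits map to
the return codes used in this patch. The helper names are hypothetical,
and the RMID_VAL_ERROR/RMID_VAL_UNAVAIL bit positions (63 and 62) are
assumed to match the existing kernel definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Bit positions taken from the QM_EVTSEL table quoted above. */
#define ABMC_EXTENDED_EVT_ID	(UINT32_C(1) << 31)	/* ExtendedEvtID */
#define ABMC_EVT_ID		(UINT32_C(1) << 0)	/* EvtID = L3CacheABMC (1) */

/* Assumption: same positions as the kernel's QM_CTR status bits. */
#define RMID_VAL_ERROR		(UINT64_C(1) << 63)
#define RMID_VAL_UNAVAIL	(UINT64_C(1) << 62)

/* Hypothetical helper: the 64-bit value that wrmsr(msr, lo, hi) writes. */
static uint64_t qm_evtsel_for_cntr(uint32_t cntr_id)
{
	uint32_t lo = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;

	/* Bits 43:32 carry the counter ID here instead of an RMID. */
	return ((uint64_t)cntr_id << 32) | lo;
}

/* Hypothetical helper: map a raw QM_CTR value to the patch's return codes. */
static int qm_ctr_decode(uint64_t msr_val, uint64_t *val)
{
	if (msr_val & RMID_VAL_ERROR)
		return -EIO;
	if (msr_val & RMID_VAL_UNAVAIL)
		return -EINVAL;

	*val = msr_val;
	return 0;
}
```

With counter ID 5 the composed value has 5 in bits 43:32 and bits 31 and 0
set in the low half.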

> 
>> +	 * QM_EVTSEL.EvtID = L3CacheABMC (=1) and setting [RMID] to the
>> +	 * desired counter ID. Reading QM_CTR will then return the
>> +	 * contents of the specified counter. The RMID_VAL_ERROR bit will
>> +	 * be set if the counter configuration was invalid, or if an invalid
>> +	 * counter ID was set in the QM_EVTSEL[RMID] field. If the counter
>> +	 * data is currently unavailable, the RMID_VAL_UNAVAIL bit will be set.
>> +	 */
>> +	wrmsr(MSR_IA32_QM_EVTSEL, ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID, cntr_id);
>> +	rdmsrl(MSR_IA32_QM_CTR, msr_val);
>> +
>> +	if (msr_val & RMID_VAL_ERROR)
>> +		return -EIO;
>> +	if (msr_val & RMID_VAL_UNAVAIL)
>> +		return -EINVAL;
>> +
>> +	*val = msr_val;
>> +	return 0;
>> +}
>> +
>> +void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			     u32 unused, u32 rmid, int cntr_id,
>> +			     enum resctrl_event_id eventid)
>> +{
>> +	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>> +	struct arch_mbm_state *am;
>> +
>> +	am = get_arch_mbm_state(hw_dom, rmid, eventid);
>> +	if (am) {
>> +		memset(am, 0, sizeof(*am));
>> +
>> +		/* Record any initial, non-zero count value. */
>> +		__cntr_id_read(cntr_id, &am->prev_msr);
>> +	}
>> +}
>> +
>> +int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			   u32 unused, u32 rmid, int cntr_id,
>> +			   enum resctrl_event_id eventid, u64 *val)
>> +{
>> +	u64 msr_val;
>> +	int ret;
>> +
>> +	ret = __cntr_id_read(cntr_id, &msr_val);
>> +	if (ret)
>> +		return ret;
>> +
>> +	*val = get_corrected_val(r, d, rmid, eventid, msr_val);
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * The power-on reset value of MSR_RMID_SNC_CONFIG is 0x1
>>   * which indicates that RMIDs are configured in legacy mode.
> 
> code looks good.
> 
> Reinette
> 

-- 
Thanks
Babu Moger



* Re: [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode
  2025-07-30 20:03   ` Reinette Chatre
@ 2025-08-08  2:20     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-08  2:20 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:03, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> When "mbm_event" counter assignment mode is enabled, the architecture
>> requires a counter ID to read the event data.
>>
>> Introduce an is_mbm_cntr field in struct rmid_read to indicate whether
>> counter assignment mode is in use.
>>
>> Update the logic to call resctrl_arch_cntr_read() and
>> resctrl_arch_reset_cntr() when the assignment mode is active. Report
>> 'Unassigned' in case the user attempts to read the event without assigning
>> a hardware counter.
>>
>> Declare mbm_cntr_get() in fs/resctrl/internal.h to make it accessible to
>> other functions within fs/resctrl.
> 
> From what I can tell this is not needed by this patch. It is also a hint that
> there may be some monitoring specific code outside of monitor.c. Looks like this
> is done to support later patch #29 "fs/resctrl: Introduce mbm_L3_assignments to
> list assignments in a group" where mbm_L3_assignments_show() should rather
> be in monitor.c

Yes. Will move all these to monitor.c.

> 
>>
>> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
> 
>> ---
>>  Documentation/filesystems/resctrl.rst |  6 ++++
>>  fs/resctrl/ctrlmondata.c              | 22 +++++++++---
>>  fs/resctrl/internal.h                 |  5 +++
>>  fs/resctrl/monitor.c                  | 52 ++++++++++++++++++++-------
>>  4 files changed, 67 insertions(+), 18 deletions(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index 446736dbd97f..4c24c5f3f4c1 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -434,6 +434,12 @@ When monitoring is enabled all MON groups will also contain:
>>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>>  	where "YY" is the node number.
>>  
>> +	When the 'mbm_event' counter assignment mode is enabled, reading
>> +	an MBM event of a MON group returns 'Unassigned' if no hardware
>> +	counter is assigned to it. For CTRL_MON groups, 'Unassigned' is
>> +	returned if the MBM event does not have an assigned counter in the
>> +	CTRL_MON group nor in any of its associated MON groups.
>> +
>>  "mon_hw_id":
>>  	Available only with debug option. The identifier used by hardware
>>  	for the monitor group. On x86 this is the RMID.
>> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
>> index ad7ffc6acf13..31787ce6ec91 100644
>> --- a/fs/resctrl/ctrlmondata.c
>> +++ b/fs/resctrl/ctrlmondata.c
>> @@ -563,10 +563,15 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>>  	rr->r = r;
>>  	rr->d = d;
>>  	rr->first = first;
>> -	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
>> -	if (IS_ERR(rr->arch_mon_ctx)) {
>> -		rr->err = -EINVAL;
>> -		return;
>> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
>> +	    resctrl_is_mbm_event(evtid)) {
>> +		rr->is_mbm_cntr = true;
>> +	} else {
>> +		rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
>> +		if (IS_ERR(rr->arch_mon_ctx)) {
>> +			rr->err = -EINVAL;
>> +			return;
>> +		}
>>  	}
>>  
>>  	cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
>> @@ -582,7 +587,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>>  	else
>>  		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
>>  
>> -	resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
>> +	if (rr->arch_mon_ctx)
>> +		resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
>>  }
>>  
>>  int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>> @@ -653,10 +659,16 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>>  
>>  checkresult:
>>  
>> +	/*
>> +	 * -ENOENT is a special case, set only when "mbm_event" counter assignment
>> +	 * mode is enabled and no counter has been assigned.
>> +	 */
>>  	if (rr.err == -EIO)
>>  		seq_puts(m, "Error\n");
>>  	else if (rr.err == -EINVAL)
>>  		seq_puts(m, "Unavailable\n");
>> +	else if (rr.err == -ENOENT)
>> +		seq_puts(m, "Unassigned\n");
>>  	else
>>  		seq_printf(m, "%llu\n", rr.val);
>>  
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index 216588842444..eeee83a5067a 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -110,6 +110,8 @@ struct mon_data {
>>   *	   domains in @r sharing L3 @ci.id
>>   * @evtid: Which monitor event to read.
>>   * @first: Initialize MBM counter when true.
>> + * @is_mbm_cntr: Is the counter valid? true if "mbm_event" counter assignment mode is
>> + *	   enabled and it is an MBM event.
> 
> Since a counter may not be assigned to event being read I do not believe that "Is the counter
> valid?" is accurate and should rather be dropped. Rest of text looks accurate to me.  

Sure.

> 
>>   * @ci_id: Cacheinfo id for L3. Only set when @d is NULL. Used when summing domains.
>>   * @err:   Error encountered when reading counter.
>>   * @val:   Returned value of event counter. If @rgrp is a parent resource group,
>> @@ -124,6 +126,7 @@ struct rmid_read {
>>  	struct rdt_mon_domain	*d;
>>  	enum resctrl_event_id	evtid;
>>  	bool			first;
>> +	bool			is_mbm_cntr;
>>  	unsigned int		ci_id;
>>  	int			err;
>>  	u64			val;
>> @@ -391,6 +394,8 @@ int rdtgroup_assign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp
>>  			       struct mon_evt *mevt);
>>  void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
>>  				  struct mon_evt *mevt);
>> +int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
>>  
> 
> Not necessary? mbm_cntr_get() can remain internal to monitor.c

Yes. Not necessary.

> 
>>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 070965d45770..a8b53b0ad0b7 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -362,13 +362,25 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>>  	u32 closid = rdtgrp->closid;
>>  	u32 rmid = rdtgrp->mon.rmid;
>>  	struct rdt_mon_domain *d;
>> +	int cntr_id = -ENOENT;
>>  	struct cacheinfo *ci;
>>  	struct mbm_state *m;
>>  	int err, ret;
>>  	u64 tval = 0;
>>  
>> +	if (rr->is_mbm_cntr) {
>> +		cntr_id = mbm_cntr_get(rr->r, rr->d, rdtgrp, rr->evtid);
>> +		if (cntr_id < 0) {
>> +			rr->err = -ENOENT;
>> +			return -EINVAL;
>> +		}
>> +	}
>> +
>>  	if (rr->first) {
>> -		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
>> +		if (rr->is_mbm_cntr)
>> +			resctrl_arch_reset_cntr(rr->r, rr->d, closid, rmid, cntr_id, rr->evtid);
>> +		else
>> +			resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
>>  		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
>>  		if (m)
>>  			memset(m, 0, sizeof(struct mbm_state));
>> @@ -379,8 +391,12 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>>  		/* Reading a single domain, must be on a CPU in that domain. */
>>  		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
>>  			return -EINVAL;
>> -		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
>> -						 rr->evtid, &tval, rr->arch_mon_ctx);
>> +		if (rr->is_mbm_cntr)
>> +			rr->err = resctrl_arch_cntr_read(rr->r, rr->d, closid, rmid, cntr_id,
>> +							 rr->evtid, &tval);
>> +		else
>> +			rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
>> +							 rr->evtid, &tval, rr->arch_mon_ctx);
>>  		if (rr->err)
>>  			return rr->err;
>>  
>> @@ -405,8 +421,12 @@ static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
>>  	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
>>  		if (d->ci_id != rr->ci_id)
>>  			continue;
>> -		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
>> -					     rr->evtid, &tval, rr->arch_mon_ctx);
>> +		if (rr->is_mbm_cntr)
>> +			err = resctrl_arch_cntr_read(rr->r, d, closid, rmid, cntr_id,
>> +						     rr->evtid, &tval);
>> +		else
>> +			err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
>> +						     rr->evtid, &tval, rr->arch_mon_ctx);
>>  		if (!err) {
>>  			rr->val += tval;
>>  			ret = 0;
>> @@ -613,11 +633,16 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
>>  	rr.r = r;
>>  	rr.d = d;
>>  	rr.evtid = evtid;
>> -	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
>> -	if (IS_ERR(rr.arch_mon_ctx)) {
>> -		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
>> -				    PTR_ERR(rr.arch_mon_ctx));
>> -		return;
>> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
>> +	    resctrl_arch_mbm_cntr_assign_enabled(r)) {
> 
> Duplicate check?

Yes.
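
Presumably the second operand should be the MBM event check, mirroring the
condition used in mon_event_read() in this same patch. A stand-alone sketch
of the intended predicate, with hypothetical stand-ins for the kernel
helpers and event ID values that are assumed rather than quoted from the
source:

```c
#include <stdbool.h>

/* Assumed event IDs; only the ordering matters for this sketch. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 1,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 2,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 3,
};

/* Stand-in for resctrl_is_mbm_event(). */
static bool is_mbm_event(enum resctrl_event_id evtid)
{
	return evtid >= QOS_L3_MBM_TOTAL_EVENT_ID &&
	       evtid <= QOS_L3_MBM_LOCAL_EVENT_ID;
}

/*
 * Hypothetical helper: decide whether the read goes through a counter ID.
 * @assign_enabled stands in for resctrl_arch_mbm_cntr_assign_enabled(r).
 */
static bool use_cntr_read(bool assign_enabled, enum resctrl_event_id evtid)
{
	return assign_enabled && is_mbm_event(evtid);
}
```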

> 
>> +		rr.is_mbm_cntr = true;
>> +	} else {
>> +		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
>> +		if (IS_ERR(rr.arch_mon_ctx)) {
>> +			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
>> +					    PTR_ERR(rr.arch_mon_ctx));
>> +			return;
>> +		}
>>  	}
>>  
>>  	__mon_event_count(rdtgrp, &rr);
>> @@ -629,7 +654,8 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
>>  	if (is_mba_sc(NULL))
>>  		mbm_bw_count(rdtgrp, &rr);
>>  
>> -	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
>> +	if (rr.arch_mon_ctx)
>> +		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
>>  }
>>  
>>  static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
>> @@ -983,8 +1009,8 @@ static void rdtgroup_assign_cntr(struct rdt_resource *r, struct rdt_mon_domain *
>>   * Return:
>>   * Valid counter ID on success, or -ENOENT on failure.
>>   */
>> -static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>>  {
>>  	int cntr_id;
>>  
> 
> Not necessary?
> 

Yes.
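
For reference, the user-visible outcome of a read after this patch can be
modelled as below. This is a user-space sketch of the checkresult logic in
rdtgroup_mondata_show(), not the kernel function itself; mon_err_str() is
a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the string shown to the user for a given
 * rr.err, or NULL when the counter value itself should be printed.
 */
static const char *mon_err_str(int err)
{
	switch (err) {
	case -EIO:
		return "Error";
	case -EINVAL:
		return "Unavailable";
	case -ENOENT:
		/* Only set in "mbm_event" mode when no counter is assigned. */
		return "Unassigned";
	default:
		return NULL;
	}
}
```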

-- 
Thanks
Babu Moger



* Re: [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration
  2025-07-30 20:03   ` Reinette Chatre
@ 2025-08-08  2:24     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-08  2:24 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:03, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> The "mbm_event" counter assignment mode allows the user to assign a
>> hardware counter to an RMID, event pair and monitor the bandwidth as long
>> as it is assigned. The user can specify the memory transaction(s) for the
>> counter to track.
>>
>> Add the definitions for supported memory transactions (e.g., read, write,
>> etc.) the counter can be configured with.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> ---
>>  fs/resctrl/internal.h         | 11 +++++++++++
>>  fs/resctrl/monitor.c          | 11 +++++++++++
>>  include/linux/resctrl_types.h |  3 +++
>>  3 files changed, 25 insertions(+)
>>
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index eeee83a5067a..693268bcbad2 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
> 
> Looks like only monitoring code in monitor.c needs to know about
> struct mbm_transaction so this can stay within monitor.c ?

Sure.

> 
>> @@ -216,6 +216,17 @@ struct rdtgroup {
>>  	struct pseudo_lock_region	*plr;
>>  };
>>  
>> +/**
>> + * struct mbm_transaction - Memory transaction an MBM event can be configured with.
>> + * @name:	Name of memory transaction (read, write ...).
>> + * @val:	The bit (eg. READS_TO_LOCAL_MEM or READS_TO_REMOTE_MEM) used to
>> + *		represent the memory transaction within an event's configuration.
>> + */
>> +struct mbm_transaction {
>> +	char	name[32];
>> +	u32	val;
>> +};
>> +
>>  /* rdtgroup.flags */
>>  #define	RDT_DELETED		1
>>  
> 
> Reinette
> 
> 

-- 
Thanks
Babu Moger



* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-07-30 20:04   ` Reinette Chatre
@ 2025-08-08 13:56     ` Moger, Babu
  2025-08-08 15:12       ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-08 13:56 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 3:04 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/25/25 11:29 AM, Babu Moger wrote:
>
>
>> ---
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index 4c24c5f3f4c1..3dfc177f9792 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -310,6 +310,38 @@ with the following files:
>>   	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>>   	  0=30;1=30
>>   
>> +"event_configs":
>> +	Directory that exists when "mbm_event" counter assignment mode is supported.
>> +	Contains sub-directory for each MBM event that can be assigned to a counter.
> "Contains sub-directory" -> "Contains a sub-directory"?

Sure.


>> +
>> +	Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
>> +	Each MBM event's sub-directory contains a file named "event_filter" that is
>> +	used to view and modify which memory transactions the MBM event is configured
>> +	with.
>> +
>> +	List of memory transaction types supported:
>> +
>> +	==========================  ========================================================
>> +	Name			    Description
>> +	==========================  ========================================================
>> +	dirty_victim_writes_all     Dirty Victims from the QOS domain to all types of memory
>> +	remote_reads_slow_memory    Reads to slow memory in the non-local NUMA domain
>> +	local_reads_slow_memory     Reads to slow memory in the local NUMA domain
>> +	remote_non_temporal_writes  Non-temporal writes to non-local NUMA domain
>> +	local_non_temporal_writes   Non-temporal writes to local NUMA domain
>> +	remote_reads                Reads to memory in the non-local NUMA domain
>> +	local_reads                 Reads to memory in the local NUMA domain
>> +	==========================  ========================================================
>> +
>> +	For example::
>> +
>> +	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>> +	  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
>> +	  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
>> +
>> +	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
>> +	  local_reads,local_non_temporal_writes,local_reads_slow_memory
>> +
>>   "max_threshold_occupancy":
>>   		Read/write file provides the largest value (in
>>   		bytes) at which a previously used LLC_occupancy
> ...
>
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 16bcfeeb89e6..fa5f63126682 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>>   	{"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>>   };
>>   
>> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>> +{
>> +	struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
>> +	bool sep = false;
>> +	int i;
>> +
>> +	mutex_lock(&rdtgroup_mutex);
>> +
> There is inconsistency among the files introduced on how
> "mbm_event mode disabled" case is handled. Some files return failure
> from their _show()/_write() when "mbm_event mode is disabled", some don't.
>
> The "event_filter" file always prints the MBM transactions monitored
> when assignable counters are supported, whether mbm_event mode is enabled
> or not. This means that the MBM event's configuration values are printed
> when "default" mode is enabled.  I have two concerns about this
> 1) This is potentially very confusing since switching to "default" will
>     make the BMEC files visible that will enable the user to modify the
>     event configurations per domain. Having this file print a global event
>     configuration while there are potentially various different domain-specific
>     configuration active will be confusing.
Yes. I agree.
> 2) Can it be guaranteed that the MBM events will monitor the default
>     assignable counter memory transactions when in "default" mode? It has
>     never been possible to query which memory transactions are monitored by
>     the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
>     so this seems to use one feature to deduce capabilities or another?

So, initialize both total and local event configuration to default 
values when switched to "default mode"?

Something like this?

mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;

mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | 
READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM;

We are already doing that right (in patch 32)?


>
>
>> +	for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
>> +		if (mevt->evt_cfg & mbm_transactions[i].val) {
>> +			if (sep)
>> +				seq_putc(seq, ',');
>> +			seq_printf(seq, "%s", mbm_transactions[i].name);
>> +			sep = true;
>> +		}
>> +	}
>> +	seq_putc(seq, '\n');
>> +
>> +	mutex_unlock(&rdtgroup_mutex);
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * resctrl_mon_resource_init() - Initialise global monitoring structures.
>>    *
>> @@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void)
>>   					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>>   		resctrl_file_fflags_init("available_mbm_cntrs",
>>   					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>> +		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
>>   	}
>>   
>>   	return 0;
> ...
>
>> @@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>>   		return ret;
>>   
>>   	ret = rdtgroup_add_files(kn_subdir, fflags);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) {
>> +		r = priv;
>> +		if (r->mon.mbm_cntr_assignable) {
>> +			ret = resctrl_mkdir_event_configs(r, kn_subdir);
>> +			if (ret)
>> +				return ret;
>> +		}
>> +	}
>> +
>>   	if (!ret)
>>   		kernfs_activate(kn_subdir);
>>   
> Looks like the "if (!ret)" above can be dropped to always call "kernfs_activate(kn_subdir)"
> on exit making it clear that this is success path and function exits early on any error.

Sure. Will do.

Thanks

Babu



* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-08-08 13:56     ` Moger, Babu
@ 2025-08-08 15:12       ` Reinette Chatre
  2025-08-08 17:47         ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-08-08 15:12 UTC (permalink / raw)
  To: Moger, Babu, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/8/25 6:56 AM, Moger, Babu wrote:
> On 7/30/2025 3:04 PM, Reinette Chatre wrote:
>> On 7/25/25 11:29 AM, Babu Moger wrote:

>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>>> index 16bcfeeb89e6..fa5f63126682 100644
>>> --- a/fs/resctrl/monitor.c
>>> +++ b/fs/resctrl/monitor.c
>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>>>       {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>>>   };
>>>   +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>>> +{
>>> +    struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
>>> +    bool sep = false;
>>> +    int i;
>>> +
>>> +    mutex_lock(&rdtgroup_mutex);
>>> +
>> There is inconsistency among the files introduced on how
>> "mbm_event mode disabled" case is handled. Some files return failure
>> from their _show()/_write() when "mbm_event mode is disabled", some don't.
>>
>> The "event_filter" file always prints the MBM transactions monitored
>> when assignable counters are supported, whether mbm_event mode is enabled
>> or not. This means that the MBM event's configuration values are printed
>> when "default" mode is enabled.  I have two concerns about this
>> 1) This is potentially very confusing since switching to "default" will
>>     make the BMEC files visible that will enable the user to modify the
>>     event configurations per domain. Having this file print a global event
>>     configuration while there are potentially various different domain-specific
>>     configuration active will be confusing.
> Yes. I agree.

hmmm ... ok, you say that you agree but I cannot tell if you are going to
do anything about it.

On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config
files should be the *only* source of MBM event configuration information, no?

It may be ok to have event_filter print configuration when assignable counters are disabled
if BMEC is not supported but that would require that this information will always be
known for a "default" monitoring setup. While this may be true for AMD it is not obvious
to me that this is universally true. Once this file exists in this form resctrl will always
be required to provide data for the event configuration and it is not clear to me that
this can always be guaranteed.

>> 2) Can it be guaranteed that the MBM events will monitor the default
>>     assignable counter memory transactions when in "default" mode? It has
>>     never been possible to query which memory transactions are monitored by
>>     the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
>>     so this seems to use one feature to deduce capabilities or another?
> 
> So, initialize both total and local event configuration to default values when switched to "default mode"?
> 
> Something like this?
> 
> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
> 
> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM;
> 
> We are already doing that right (in patch 32)?

Yes, but it creates this strange dependency where the "default" monitoring mode
(that has been supported long before configurable events and assignable counters came
along) requires support of "assignable counter mode".

Consider it from another view, if resctrl wants to make event configuration available
for the "default" mode then the "event_filter" file could always be visible when MBM
local/total is supported to give users insight into what is monitored, whether
assignable counters are supported or not. This is not possible on systems that do
not support ABMC though, right?

Reinette



* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-08-08 15:12       ` Reinette Chatre
@ 2025-08-08 17:47         ` Moger, Babu
  2025-08-08 18:23           ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-08 17:47 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 8/8/2025 10:12 AM, Reinette Chatre wrote:
> Hi Babu,
>
> On 8/8/25 6:56 AM, Moger, Babu wrote:
>> On 7/30/2025 3:04 PM, Reinette Chatre wrote:
>>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>>>> index 16bcfeeb89e6..fa5f63126682 100644
>>>> --- a/fs/resctrl/monitor.c
>>>> +++ b/fs/resctrl/monitor.c
>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>>>>        {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>>>>    };
>>>>    +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>>>> +{
>>>> +    struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
>>>> +    bool sep = false;
>>>> +    int i;
>>>> +
>>>> +    mutex_lock(&rdtgroup_mutex);
>>>> +
>>> There is inconsistency among the files introduced on how
>>> "mbm_event mode disabled" case is handled. Some files return failure
>>> from their _show()/_write() when "mbm_event mode is disabled", some don't.
>>>
>>> The "event_filter" file always prints the MBM transactions monitored
>>> when assignable counters are supported, whether mbm_event mode is enabled
>>> or not. This means that the MBM event's configuration values are printed
>>> when "default" mode is enabled.  I have two concerns about this
>>> 1) This is potentially very confusing since switching to "default" will
>>>      make the BMEC files visible that will enable the user to modify the
>>>      event configurations per domain. Having this file print a global event
>>>      configuration while there are potentially various different domain-specific
>>>      configuration active will be confusing.
>> Yes. I agree.
> hmmm ... ok, you say that you agree but I cannot tell if you are going to
> do anything about it.
>
> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config
> files should be the *only* source of MBM event configuration information, no?

That is correct.


>
> It may be ok to have event_filter print configuration when assignable counters are disabled
> if BMEC is not supported but that would require that this information will always be
> known for a "default" monitoring setup. While this may be true for AMD it is not obvious
> to me that this is universally true. Once this file exists in this form resctrl will always
> be required to provide data for the event configuration and it is not clear to me that
> this can always be guaranteed.

Yes, it is not universally true. I don't know what these values are
for Intel and ARM.

>
>>> 2) Can it be guaranteed that the MBM events will monitor the default
>>>      assignable counter memory transactions when in "default" mode? It has
>>>      never been possible to query which memory transactions are monitored by
>>>      the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
>>>      so this seems to use one feature to deduce capabilities or another?
>> So, initialize both total and local event configuration to default values when switched to "default mode"?
>>
>> Something like this?
>>
>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
>>
>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM;
>>
>> We are already doing that right (in patch 32)?
> Yes, but it creates this strange dependency where the "default" monitoring mode
> (that has been supported long before configurable events and assignable counters came
> along) requires support of "assignable counter mode".
>
> Consider it from another view, if resctrl wants to make event configuration available
> for the "default" mode then the "event_filter" file could always be visible when MBM
> local/total is supported to give users insight into what is monitored, whether
> assignable counters are supported or not. This is not possible on systems that do
> not support ABMC though, right?

That is correct. With BMEC, each domain can have its own settings.  So, 
printing the default will not be accurate.

What options do we have here?

How about adding a resctrl_arch_mbm_cntr_assign_enabled() check?
Print the values only when ABMC is enabled; otherwise report the reason
in "last_cmd_status".
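
Roughly, as a user-space model rather than the actual kernel change: the
function below formats the filter only when assignment mode is enabled and
otherwise hands back a status message and an error. The function name, the
-EINVAL/-ENOSPC return codes, the status text and the bit values of the
transactions are all assumptions for illustration; the names and their
order follow the resctrl.rst example in this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct mbm_transaction {
	const char	*name;
	unsigned int	val;
};

/* Assumed bit assignments, ordered as in the documentation example. */
static const struct mbm_transaction mbm_transactions[] = {
	{"local_reads",			1u << 0},
	{"remote_reads",		1u << 1},
	{"local_non_temporal_writes",	1u << 2},
	{"remote_non_temporal_writes",	1u << 3},
	{"local_reads_slow_memory",	1u << 4},
	{"remote_reads_slow_memory",	1u << 5},
	{"dirty_victim_writes_all",	1u << 6},
};

/* Hypothetical model of event_filter_show() with the proposed gate. */
static int event_filter_format(bool assign_enabled, unsigned int evt_cfg,
			       char *buf, size_t len, const char **status)
{
	size_t used = 0, i;
	bool sep = false;
	int n;

	if (!assign_enabled) {
		/* Stands in for writing to last_cmd_status. */
		*status = "mbm_event counter assignment mode is not enabled\n";
		return -EINVAL;
	}

	for (i = 0; i < sizeof(mbm_transactions) / sizeof(mbm_transactions[0]); i++) {
		if (!(evt_cfg & mbm_transactions[i].val))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     sep ? "," : "", mbm_transactions[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return -ENOSPC;
		used += n;
		sep = true;
	}

	n = snprintf(buf + used, len - used, "\n");
	if (n < 0 || (size_t)n >= len - used)
		return -ENOSPC;

	return 0;
}
```

With these assumed bit values, a configuration of 0x15 yields the same list
as the mbm_local_bytes example in the documentation hunk.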

Thanks

Babu



* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-08-08 17:47         ` Moger, Babu
@ 2025-08-08 18:23           ` Reinette Chatre
  2025-08-08 18:48             ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-08-08 18:23 UTC (permalink / raw)
  To: Moger, Babu, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/8/25 10:47 AM, Moger, Babu wrote:
> On 8/8/2025 10:12 AM, Reinette Chatre wrote:
>> On 8/8/25 6:56 AM, Moger, Babu wrote:
>>> On 7/30/2025 3:04 PM, Reinette Chatre wrote:
>>>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>>>>> index 16bcfeeb89e6..fa5f63126682 100644
>>>>> --- a/fs/resctrl/monitor.c
>>>>> +++ b/fs/resctrl/monitor.c
>>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>>>>>        {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>>>>>    };
>>>>>    +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>>>>> +{
>>>>> +    struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
>>>>> +    bool sep = false;
>>>>> +    int i;
>>>>> +
>>>>> +    mutex_lock(&rdtgroup_mutex);
>>>>> +
>>>> There is inconsistency among the files introduced on how
>>>> "mbm_event mode disabled" case is handled. Some files return failure
>>>> from their _show()/_write() when "mbm_event mode is disabled", some don't.
>>>>
>>>> The "event_filter" file always prints the MBM transactions monitored
>>>> when assignable counters are supported, whether mbm_event mode is enabled
>>>> or not. This means that the MBM event's configuration values are printed
>>>> when "default" mode is enabled.  I have two concerns about this
>>>> 1) This is potentially very confusing since switching to "default" will
>>>>      make the BMEC files visible that will enable the user to modify the
>>>>      event configurations per domain. Having this file print a global event
>>>>      configuration while there are potentially various different domain-specific
>>>>      configuration active will be confusing.
>>> Yes. I agree.
>> hmmm ... ok, you say that you agree but I cannot tell if you are going to
>> do anything about it.
>>
>> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config
>> files should be the *only* source of MBM event configuration information, no?
> 
> That is correct.
> 
> 
>>
>> It may be ok to have event_filter print configuration when assignable counters are disabled
>> if BMEC is not supported but that would require that this information will always be
>> known for a "default" monitoring setup. While this may be true for AMD it is not obvious
>> to me that this is universally true. Once this file exists in this form resctrl will always
>> be required to provide data for the event configuration and it is not clear to me that
>> this can always be guaranteed.
> 
> Yea. It is not true universally true. I don't know what these values are for Intel and ARM.
> 
>>
>>>> 2) Can it be guaranteed that the MBM events will monitor the default
>>>>      assignable counter memory transactions when in "default" mode? It has
>>>>      never been possible to query which memory transactions are monitored by
>>>>      the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
>>>>      so this seems to use one feature to deduce capabilities or another?
>>> So, initialize both total and local event configuration to default values when switched to "default mode"?
>>>
>>> Something like this?
>>>
>>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
>>>
>>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM;
>>>
>>> We are already doing that right (in patch 32)?
>> Yes, but it creates this strange dependency where the "default" monitoring mode
>> (that has been supported long before configurable events and assignable counters came
>> along) requires support of "assignable counter mode".
>>
>> Consider it from another view, if resctrl wants to make event configuration available
>> for the "default" mode then the "event_filter" file could always be visible when MBM
>> local/total is supported to give users insight into what is monitored, whether
>> assignable counters are supported or not. This is not possible on systems that do
>> not support ABMC though, right?
> 
> That is correct. With BMEC, each domain can have its own settings.  So, printing the default will not be accurate.
> 
> What options do we have here.
> 
> How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())?  Only print the values when ABMC is supported else print information in "last_cmd_status".
> 

Did you perhaps intend to write "Only print the values when ABMC is *enabled* else print
information in "last_cmd_status".? If this is what you actually meant then I agree. I think
doing so places clear boundary on this feature that gives us more options/flexibility for
future changes.

Reinette



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations
  2025-07-30 20:05   ` Reinette Chatre
@ 2025-08-08 18:27     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-08 18:27 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 3:05 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> When "mbm_event" counter assignment mode is supported, users can modify
> "supported" -> "enabled"?

Sure.


>> the event configuration by writing to the 'event_filter' resctrl file.
>> The event configurations for mbm_event mode are located in
>> /sys/fs/resctrl/info/L3_MON/event_configs/.
>>
>> Update the assignments of all CTRL_MON and MON resource groups when the
>> event configuration is modified.
>>
>> Example:
>> $ mount -t resctrl resctrl /sys/fs/resctrl
>>
>> $ cd /sys/fs/resctrl/
>>
>> $ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter
>>    local_reads,local_non_temporal_writes,local_reads_slow_memory
>>
>> $ echo "local_reads,local_non_temporal_writes" >
>>    info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>
>> $ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>    local_reads,local_non_temporal_writes
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
>
>> ---
>>   Documentation/filesystems/resctrl.rst |  12 +++
>>   fs/resctrl/internal.h                 |   4 +
>>   fs/resctrl/monitor.c                  | 114 ++++++++++++++++++++++++++
>>   fs/resctrl/rdtgroup.c                 |   3 +-
>>   4 files changed, 132 insertions(+), 1 deletion(-)
>>
> ...
>
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index e082d8718199..e2e3fc0c5fab 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -409,11 +409,15 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
>>   				  struct mon_evt *mevt);
>>   int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>>   		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
>> +void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt);
> Is there some code ordering issue in monitor.c? Looks like this function
> is only used in monitor.c so seeing it here is unexpected.
Yes. Not required anymore.
>
>>   
>>   void *rdt_kn_parent_priv(struct kernfs_node *kn);
>>   
>>   int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v);
>>   
>> +ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
>> +			   loff_t off);
>> +
>>   #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>   int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>>   
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index fa5f63126682..8efbeb910f77 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
> ...
>
>> @@ -1193,3 +1264,46 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
>>   		rdtgroup_free_unassign_cntr(r, d, rdtgrp, mevt);
>>   	}
>>   }
>> +
>> +/*
>> + * rdtgroup_update_cntr_event - Update the counter assignments for the event
>> + *				in a group.
>> + * @r:		Resource to which update needs to be done.
>> + * @rdtgrp:	Resctrl group.
>> + * @evtid:	MBM monitor event.
>> + */
>> +static void rdtgroup_update_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp,
>> +				       enum resctrl_event_id evtid)
>> +{
>> +	struct rdt_mon_domain *d;
>> +	int cntr_id;
>> +
>> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>> +		if (cntr_id >= 0)
>> +			resctrl_arch_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
>> +						 rdtgrp->closid, cntr_id, true);
> Should non-arch MBM state be reset here?

Yes. Added it now.
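
For readers following the thread, the update-and-reset step agreed on above can be modelled roughly as follows. This is only a user-space sketch and not the code from the next revision: the types `cntr_model` and `domain_model`, the per-counter `sw_state` array, and the function names are all hypothetical stand-ins for the kernel's domain list, mbm_cntr_get(), resctrl_arch_config_cntr() and the non-arch MBM state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CNTRS 4

/* Hypothetical model of one assignable hardware counter. */
struct cntr_model {
	int in_use;		/* non-zero when assigned to an (rmid, evtid) pair */
	uint32_t rmid;
	int evtid;
	uint32_t hw_cfg;	/* stands in for the configuration programmed by the arch code */
};

/* Hypothetical model of one monitoring domain. */
struct domain_model {
	struct cntr_model cntrs[NUM_CNTRS];
	uint64_t sw_state[NUM_CNTRS];	/* stands in for software-tracked MBM state */
};

/* Return the counter index assigned to (rmid, evtid) in this domain, or -1. */
static int cntr_get_model(const struct domain_model *d, uint32_t rmid, int evtid)
{
	int i;

	for (i = 0; i < NUM_CNTRS; i++) {
		if (d->cntrs[i].in_use && d->cntrs[i].rmid == rmid &&
		    d->cntrs[i].evtid == evtid)
			return i;
	}
	return -1;
}

/*
 * For every domain that has a counter assigned to (rmid, evtid), program the
 * new configuration and reset the software state so that stale byte counts
 * from the old configuration are not carried over.
 * Returns the number of counters that were updated.
 */
static int update_cntr_event_model(struct domain_model *doms, size_t ndoms,
				   uint32_t rmid, int evtid, uint32_t new_cfg)
{
	int updated = 0;
	size_t i;

	for (i = 0; i < ndoms; i++) {
		int cntr_id = cntr_get_model(&doms[i], rmid, evtid);

		if (cntr_id < 0)
			continue;
		doms[i].cntrs[cntr_id].hw_cfg = new_cfg;
		doms[i].sw_state[cntr_id] = 0;
		updated++;
	}
	return updated;
}
```

Domains without an assigned counter are skipped untouched, which mirrors the `cntr_id >= 0` test in the quoted patch.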

Thanks

Babu


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-08-08 18:23           ` Reinette Chatre
@ 2025-08-08 18:48             ` Moger, Babu
  2025-08-08 20:26               ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-08 18:48 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 8/8/2025 1:23 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 8/8/25 10:47 AM, Moger, Babu wrote:
>> On 8/8/2025 10:12 AM, Reinette Chatre wrote:
>>> On 8/8/25 6:56 AM, Moger, Babu wrote:
>>>> On 7/30/2025 3:04 PM, Reinette Chatre wrote:
>>>>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>>>>>> index 16bcfeeb89e6..fa5f63126682 100644
>>>>>> --- a/fs/resctrl/monitor.c
>>>>>> +++ b/fs/resctrl/monitor.c
>>>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>>>>>>         {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>>>>>>     };
>>>>>>     +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>>>>>> +{
>>>>>> +    struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
>>>>>> +    bool sep = false;
>>>>>> +    int i;
>>>>>> +
>>>>>> +    mutex_lock(&rdtgroup_mutex);
>>>>>> +
>>>>> There is inconsistency among the files introduced on how
>>>>> "mbm_event mode disabled" case is handled. Some files return failure
>>>>> from their _show()/_write() when "mbm_event mode is disabled", some don't.
>>>>>
>>>>> The "event_filter" file always prints the MBM transactions monitored
>>>>> when assignable counters are supported, whether mbm_event mode is enabled
>>>>> or not. This means that the MBM event's configuration values are printed
>>>>> when "default" mode is enabled.  I have two concerns about this
>>>>> 1) This is potentially very confusing since switching to "default" will
>>>>>       make the BMEC files visible that will enable the user to modify the
>>>>>       event configurations per domain. Having this file print a global event
>>>>>       configuration while there are potentially various different domain-specific
>>>>>       configuration active will be confusing.
>>>> Yes. I agree.
>>> hmmm ... ok, you say that you agree but I cannot tell if you are going to
>>> do anything about it.
>>>
>>> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config
>>> files should be the *only* source of MBM event configuration information, no?
>> That is correct.
>>
>>
>>> It may be ok to have event_filter print configuration when assignable counters are disabled
>>> if BMEC is not supported but that would require that this information will always be
>>> known for a "default" monitoring setup. While this may be true for AMD it is not obvious
>>> to me that this is universally true. Once this file exists in this form resctrl will always
>>> be required to provide data for the event configuration and it is not clear to me that
>>> this can always be guaranteed.
>> Yea. It is not true universally true. I don't know what these values are for Intel and ARM.
>>
>>>>> 2) Can it be guaranteed that the MBM events will monitor the default
>>>>>       assignable counter memory transactions when in "default" mode? It has
>>>>>       never been possible to query which memory transactions are monitored by
>>>>>       the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
>>>>>       so this seems to use one feature to deduce capabilities or another?
>>>> So, initialize both total and local event configuration to default values when switched to "default mode"?
>>>>
>>>> Something like this?
>>>>
>>>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
>>>>
>>>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM;
>>>>
>>>> We are already doing that right (in patch 32)?
>>> Yes, but it creates this strange dependency where the "default" monitoring mode
>>> (that has been supported long before configurable events and assignable counters came
>>> along) requires support of "assignable counter mode".
>>>
>>> Consider it from another view, if resctrl wants to make event configuration available
>>> for the "default" mode then the "event_filter" file could always be visible when MBM
>>> local/total is supported to give users insight into what is monitored, whether
>>> assignable counters are supported or not. This is not possible on systems that do
>>> not support ABMC though, right?
>> That is correct. With BMEC, each domain can have its own settings.  So, printing the default will not be accurate.
>>
>> What options do we have here.
>>
>> How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())?  Only print the values when ABMC is supported else print information in "last_cmd_status".
>>
> Did you perhaps intend to write "Only print the values when ABMC is *enabled* else print
> information in "last_cmd_status".? If this is what you actually meant then I agree. I think
> doing so places clear boundary on this feature that gives us more options/flexibility for
> future changes.

Yes. That is what I meant.  We have the same check in event_filter_write().
Will add the same check in event_filter_show().
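
As an illustration of the behavior agreed on here, a minimal user-space sketch of a gated show handler might look like the following. It is not the actual patch: `event_filter_show_model()`, the `transactions` table with its bit positions, the status message text, and the choice of -EINVAL are assumptions made for the example, and a plain `bool` stands in for resctrl_arch_mbm_cntr_assign_enabled().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical subset of the transaction table; bit values are illustrative. */
struct mbm_transaction_model {
	const char *name;
	unsigned int val;
};

static const struct mbm_transaction_model transactions[] = {
	{ "local_reads",               1u << 0 },
	{ "local_non_temporal_writes", 1u << 2 },
	{ "local_reads_slow_memory",   1u << 4 },
};

/* Stands in for the buffer behind info/last_cmd_status. */
static char last_cmd_status[128];

/*
 * Write the comma-separated transaction names selected by evt_cfg into buf.
 * When the assignment mode is disabled, print nothing, record a reason in
 * last_cmd_status and fail, so show and write treat the case the same way.
 */
static int event_filter_show_model(bool mbm_event_enabled, unsigned int evt_cfg,
				   char *buf, size_t len)
{
	bool sep = false;
	size_t used = 0;
	size_t i;

	if (!mbm_event_enabled) {
		snprintf(last_cmd_status, sizeof(last_cmd_status),
			 "mbm_event counter assignment mode is not enabled\n");
		return -EINVAL;
	}

	if (len == 0)
		return -ENOSPC;
	buf[0] = '\0';

	for (i = 0; i < ARRAY_SIZE(transactions); i++) {
		int n;

		if (!(evt_cfg & transactions[i].val))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     sep ? "," : "", transactions[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return -ENOSPC;	/* output would be truncated */
		used += (size_t)n;
		sep = true;
	}
	return 0;
}
```

The point of the sketch is only the early return: the configuration is never printed while the mode is disabled, so user space cannot mistake a global default for the per-domain BMEC settings.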

Thanks

Babu


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/
  2025-08-08 18:48             ` Moger, Babu
@ 2025-08-08 20:26               ` Reinette Chatre
  0 siblings, 0 replies; 93+ messages in thread
From: Reinette Chatre @ 2025-08-08 20:26 UTC (permalink / raw)
  To: Moger, Babu, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/8/25 11:48 AM, Moger, Babu wrote:
> On 8/8/2025 1:23 PM, Reinette Chatre wrote:
>> On 8/8/25 10:47 AM, Moger, Babu wrote:
>>> On 8/8/2025 10:12 AM, Reinette Chatre wrote:
>>>> On 8/8/25 6:56 AM, Moger, Babu wrote:
>>>>> On 7/30/2025 3:04 PM, Reinette Chatre wrote:
>>>>>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>>>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>>>>>>> index 16bcfeeb89e6..fa5f63126682 100644
>>>>>>> --- a/fs/resctrl/monitor.c
>>>>>>> +++ b/fs/resctrl/monitor.c
>>>>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>>>>>>>         {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>>>>>>>     };
>>>>>>>     +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>>>>>>> +{
>>>>>>> +    struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
>>>>>>> +    bool sep = false;
>>>>>>> +    int i;
>>>>>>> +
>>>>>>> +    mutex_lock(&rdtgroup_mutex);
>>>>>>> +
>>>>>> There is inconsistency among the files introduced on how
>>>>>> "mbm_event mode disabled" case is handled. Some files return failure
>>>>>> from their _show()/_write() when "mbm_event mode is disabled", some don't.
>>>>>>
>>>>>> The "event_filter" file always prints the MBM transactions monitored
>>>>>> when assignable counters are supported, whether mbm_event mode is enabled
>>>>>> or not. This means that the MBM event's configuration values are printed
>>>>>> when "default" mode is enabled.  I have two concerns about this
>>>>>> 1) This is potentially very confusing since switching to "default" will
>>>>>>       make the BMEC files visible that will enable the user to modify the
>>>>>>       event configurations per domain. Having this file print a global event
>>>>>>       configuration while there are potentially various different domain-specific
>>>>>>       configuration active will be confusing.
>>>>> Yes. I agree.
>>>> hmmm ... ok, you say that you agree but I cannot tell if you are going to
>>>> do anything about it.
>>>>
>>>> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config
>>>> files should be the *only* source of MBM event configuration information, no?
>>> That is correct.
>>>
>>>
>>>> It may be ok to have event_filter print configuration when assignable counters are disabled
>>>> if BMEC is not supported but that would require that this information will always be
>>>> known for a "default" monitoring setup. While this may be true for AMD it is not obvious
>>>> to me that this is universally true. Once this file exists in this form resctrl will always
>>>> be required to provide data for the event configuration and it is not clear to me that
>>>> this can always be guaranteed.
>>> Yea. It is not true universally true. I don't know what these values are for Intel and ARM.
>>>
>>>>>> 2) Can it be guaranteed that the MBM events will monitor the default
>>>>>>       assignable counter memory transactions when in "default" mode? It has
>>>>>>       never been possible to query which memory transactions are monitored by
>>>>>>       the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features
>>>>>>       so this seems to use one feature to deduce capabilities or another?
>>>>> So, initialize both total and local event configuration to default values when switched to "default mode"?
>>>>>
>>>>> Something like this?
>>>>>
>>>>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask;
>>>>>
>>>>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM;
>>>>>
>>>>> We are already doing that right (in patch 32)?
>>>> Yes, but it creates this strange dependency where the "default" monitoring mode
>>>> (that has been supported long before configurable events and assignable counters came
>>>> along) requires support of "assignable counter mode".
>>>>
>>>> Consider it from another view, if resctrl wants to make event configuration available
>>>> for the "default" mode then the "event_filter" file could always be visible when MBM
>>>> local/total is supported to give users insight into what is monitored, whether
>>>> assignable counters are supported or not. This is not possible on systems that do
>>>> not support ABMC though, right?
>>> That is correct. With BMEC, each domain can have its own settings.  So, printing the default will not be accurate.
>>>
>>> What options do we have here.
>>>
>>> How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())?  Only print the values when ABMC is supported else print information in "last_cmd_status".
>>>
>> Did you perhaps intend to write "Only print the values when ABMC is *enabled* else print
>> information in "last_cmd_status".? If this is what you actually meant then I agree. I think
>> doing so places clear boundary on this feature that gives us more options/flexibility for
>> future changes.
> 
> Yes. That is what I meant.  We have same check in event_filer_write(). Will add the same check in event_filter_show().
> 

Thank you. This makes this specific behavior consistent and addresses the
topic that started this thread:
	> There is inconsistency among the files introduced on how
	> "mbm_event mode disabled" case is handled.
Could you please check the final work to confirm that the new resctrl
files are consistent in their handling of the "mbm_event mode supported" and
"mbm_event mode enabled" vs. "disabled" cases? For example, they should consider
the same scenarios valid/invalid and return the same error code in invalid cases.

Reinette


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
  2025-07-30 20:08   ` Reinette Chatre
@ 2025-08-08 20:29     ` Moger, Babu
  2025-08-08 21:00       ` Reinette Chatre
  0 siblings, 1 reply; 93+ messages in thread
From: Moger, Babu @ 2025-08-08 20:29 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 3:08 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> The "mbm_event" counter assignment mode allows users to assign a hardware
>> counter to an RMID, event pair and monitor the bandwidth as long as it is
>> assigned.
> Above implies this addition is in support of "mbm_event" mode while the
> implementation applies to any and all assignable counter modes, including
> the "default" and for example the upcoming "soft-ABMC". It is clear to me
> how this is used and interpreted when "mbm_event" mode is enabled, but not
> for the others (more below).
>
>> Introduce a user-configurable option that determines if a counter will
>> automatically be assigned to an RMID, event pair when its associated
>> monitor group is created via mkdir.
>>
>> Suggested-by: Peter Newman <peternewman@google.com>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
>
>> ---
>>   Documentation/filesystems/resctrl.rst | 16 ++++++++++
>>   fs/resctrl/monitor.c                  |  2 ++
>>   fs/resctrl/rdtgroup.c                 | 43 +++++++++++++++++++++++++++
>>   include/linux/resctrl.h               |  3 ++
>>   4 files changed, 64 insertions(+)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index 37dbad4d50f7..165e0d315af7 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -354,6 +354,22 @@ with the following files:
>>   	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>   	   local_reads,local_non_temporal_writes
>>   
>> +"mbm_assign_on_mkdir":
> Needs a "Exists when "mbm_event" counter assignment mode is supported."?
> Also needs clarification on on behavior when "mbm_event" is enabled vs. disabled.
I think we should allow modifying it only when "mbm_event" is enabled.
>
>> +	Determines if a counter will automatically be assigned to an RMID, event pair
> "will automatically be" -> "is automatically"
> "RMID, event" -> "RMID, MBM event"
Sure.
>> +	when its associated monitor group is created via mkdir. It is enabled by default
>> +	on boot and users can disable by writing to the interface.
> "users can disable" -> "users can disable this capability" or "can be disabled"?
Sure.
>
> This implementation enables user to read/write this file/property when "mbm_event" mode is
> disabled. Considering this explanation I do not think it is clear how this file reflects
> system behavior when in "default" mode. There is no difference between mbm_assign_on_mkdir
> enabled/disabled when in "default" mode, no?
Yes. So, we should allow modifications only when mbm_event mode is
enabled.
> Should interactions with "mbm_assign_on_mkdir" be restricted to when
> "mbm_event" mode is enabled? If so, the next question would likely be whether value
Yes.
> should change during "mbm_event" enable->disable or "disable->enable". Above states
> clearly that it is enabled on boot and it may be reasonable to have it keep (but not always
> expose) user's setting when switching between modes.
>
> Restricting it to "mbm_event" mode now gives us some flexibility when soft-ABMC follows
> on if/how it can/should support this. What do you think?

Yes. We should restrict modification to when mbm_event mode is enabled.

And always enable it when switching from

"mbm_event" disable -> enable:  r->mon.mbm_assign_on_mkdir = true;

"mbm_event" enable -> disable: "no need to modify as the value does not affect the behavior."

>
>> +
>> +	"0":
>> +		Auto assignment is disabled.
>> +	"1":
>> +		Auto assignment is enabled.
>> +
>> +	Example::
>> +
>> +	  # echo 0 > /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
>> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
>> +	  0
>> +
>>   "max_threshold_occupancy":
>>   		Read/write file provides the largest value (in
>>   		bytes) at which a previously used LLC_occupancy
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 8efbeb910f77..6205bbfe08fb 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -1077,6 +1077,8 @@ int resctrl_mon_resource_init(void)
>>   		resctrl_file_fflags_init("available_mbm_cntrs",
>>   					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>>   		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
>> +		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
>> +					 RFTYPE_RES_CACHE);
>>   	}
>>   
>>   	return 0;
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index c3d6540c3280..bf04235d2603 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
> Please move resctrl_mbm_assign_on_mkdir_show() and resctrl_mbm_assign_on_mkdir_write() to monitor.c

Sure.

Thanks

Babu


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
  2025-08-08 20:29     ` Moger, Babu
@ 2025-08-08 21:00       ` Reinette Chatre
  2025-08-08 21:10         ` Moger, Babu
  0 siblings, 1 reply; 93+ messages in thread
From: Reinette Chatre @ 2025-08-08 21:00 UTC (permalink / raw)
  To: Moger, Babu, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Babu,

On 8/8/25 1:29 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 7/30/2025 3:08 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>> The "mbm_event" counter assignment mode allows users to assign a hardware
>>> counter to an RMID, event pair and monitor the bandwidth as long as it is
>>> assigned.
>> Above implies this addition is in support of "mbm_event" mode while the
>> implementation applies to any and all assignable counter modes, including
>> the "default" and for example the upcoming "soft-ABMC". It is clear to me
>> how this is used and interpreted when "mbm_event" mode is enabled, but not
>> for the others (more below).
>>
>>> Introduce a user-configurable option that determines if a counter will
>>> automatically be assigned to an RMID, event pair when its associated
>>> monitor group is created via mkdir.
>>>
>>> Suggested-by: Peter Newman <peternewman@google.com>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>> ...
>>
>>> ---
>>>   Documentation/filesystems/resctrl.rst | 16 ++++++++++
>>>   fs/resctrl/monitor.c                  |  2 ++
>>>   fs/resctrl/rdtgroup.c                 | 43 +++++++++++++++++++++++++++
>>>   include/linux/resctrl.h               |  3 ++
>>>   4 files changed, 64 insertions(+)
>>>
>>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>>> index 37dbad4d50f7..165e0d315af7 100644
>>> --- a/Documentation/filesystems/resctrl.rst
>>> +++ b/Documentation/filesystems/resctrl.rst
>>> @@ -354,6 +354,22 @@ with the following files:
>>>         # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>>          local_reads,local_non_temporal_writes
>>>   +"mbm_assign_on_mkdir":
>> Needs a "Exists when "mbm_event" counter assignment mode is supported."?
>> Also needs clarification on on behavior when "mbm_event" is enabled vs. disabled.
> I think we should allow it to modify  only when "mbm_event" is enabled.
>>
>>> +    Determines if a counter will automatically be assigned to an RMID, event pair
>> "will automatically be" -> "is automatically"
>> "RMID, event" -> "RMID, MBM event"
> Sure.
>>> +    when its associated monitor group is created via mkdir. It is enabled by default
>>> +    on boot and users can disable by writing to the interface.
>> "users can disable" -> "users can disable this capability" or "can be disabled"?
> Sure.
>>
>> This implementation enables user to read/write this file/property when "mbm_event" mode is
>> disabled. Considering this explanation I do not think it is clear how this file reflects
>> system behavior when in "default" mode. There is no difference between mbm_assign_on_mkdir
>> enabled/disabled when in "default" mode, no?
> Yes. So, we should only allow modifications only when mbm_event mode is enabled.
>> Should interactions with "mbm_assign_on_mkdir" be restricted to when
>> "mbm_event" mode is enabled? If so, the next question would likely be whether value
> Yes.
>> should change during "mbm_event" enable->disable or "disable->enable". Above states
>> clearly that it is enabled on boot and it may be reasonable to have it keep (but not always
>> expose) user's setting when switching between modes.
>>
>> Restricting it to "mbm_event" mode now gives us some flexibility when soft-ABMC follows
>> on if/how it can/should support this. What do you think?
> 
> Yes. We should restrict it to modify only when mbm_event mode is enabled.

I agree. I also think it should not be displayed when mbm_event mode is disabled. This is
because it indicates to user space that counters are automatically assigned to RMID, event
pairs and since "default" mode depends on hardware doing this it may not be accurate when, for
example, ABMC is disabled. An alternative is to add a third value, for example "enabled", "disabled", and
"undefined"(?). This sounds a bit awkward though, so I think the simplest may be to have this file
also be consistent with the others and return an error when mbm_event mode is disabled.

> 
> And always enable it when switching from
> 
> "mbm_event" disable -> enable:  r->mon.mbm_assign_on_mkdir = true;
> 
> "mbm_event" enable -> disable: "no need to modify as the value does not affect the behavior."
> 

ok, please note this may need an update to the doc that currently only states "enabled by
default on boot" to indicate it is also automatically enabled when enabling mbm_event mode.
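
To make the semantics discussed above concrete, here is a minimal userspace
model. It is only a sketch of the proposal in this thread, not the patch: the
struct, the function names and the choice of -EINVAL as the error are all
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's data structures. */
struct mon_state {
	bool mbm_event_enabled;	/* counter assignment mode is "mbm_event" */
	bool assign_on_mkdir;	/* value behind mbm_assign_on_mkdir */
};

/* Switch between "default" (false) and "mbm_event" (true) mode. */
static void set_mbm_event_mode(struct mon_state *s, bool enable)
{
	assert(s != NULL);
	/* disable -> enable: auto-assignment is switched back on. */
	if (enable && !s->mbm_event_enabled)
		s->assign_on_mkdir = true;
	/* enable -> disable: the value is left alone, it has no effect. */
	s->mbm_event_enabled = enable;
}

/* Reading fails while mbm_event mode is disabled (error code assumed). */
static int assign_on_mkdir_show(const struct mon_state *s, bool *val)
{
	assert(s != NULL && val != NULL);
	if (!s->mbm_event_enabled)
		return -EINVAL;
	*val = s->assign_on_mkdir;
	return 0;
}

/* Writing is likewise only accepted while mbm_event mode is enabled. */
static int assign_on_mkdir_write(struct mon_state *s, bool val)
{
	assert(s != NULL);
	if (!s->mbm_event_enabled)
		return -EINVAL;
	s->assign_on_mkdir = val;
	return 0;
}
```

In this model a user's "disabled" setting does not survive a round trip
through "default" mode, which is the behavior the doc update would need to
describe.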

Reinette

* Re: [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
  2025-08-08 21:00       ` Reinette Chatre
@ 2025-08-08 21:10         ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-08 21:10 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette.

On 8/8/2025 4:00 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 8/8/25 1:29 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 7/30/2025 3:08 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 7/25/25 11:29 AM, Babu Moger wrote:
>>>> The "mbm_event" counter assignment mode allows users to assign a hardware
>>>> counter to an RMID, event pair and monitor the bandwidth as long as it is
>>>> assigned.
>>> Above implies this addition is in support of "mbm_event" mode while the
>>> implementation applies to any and all assignable counter modes, including
>>> the "default" and for example the upcoming "soft-ABMC". It is clear to me
>>> how this is used and interpreted when "mbm_event" mode is enabled, but not
>>> for the others (more below).
>>>
>>>> Introduce a user-configurable option that determines if a counter will
>>>> automatically be assigned to an RMID, event pair when its associated
>>>> monitor group is created via mkdir.
>>>>
>>>> Suggested-by: Peter Newman <peternewman@google.com>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>> ...
>>>
>>>> ---
>>>>    Documentation/filesystems/resctrl.rst | 16 ++++++++++
>>>>    fs/resctrl/monitor.c                  |  2 ++
>>>>    fs/resctrl/rdtgroup.c                 | 43 +++++++++++++++++++++++++++
>>>>    include/linux/resctrl.h               |  3 ++
>>>>    4 files changed, 64 insertions(+)
>>>>
>>>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>>>> index 37dbad4d50f7..165e0d315af7 100644
>>>> --- a/Documentation/filesystems/resctrl.rst
>>>> +++ b/Documentation/filesystems/resctrl.rst
>>>> @@ -354,6 +354,22 @@ with the following files:
>>>>          # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>>>>           local_reads,local_non_temporal_writes
>>>>    +"mbm_assign_on_mkdir":
>>> Needs a "Exists when "mbm_event" counter assignment mode is supported."?
>>> Also needs clarification on behavior when "mbm_event" is enabled vs. disabled.
>> I think we should allow it to be modified only when "mbm_event" is enabled.
>>>> +    Determines if a counter will automatically be assigned to an RMID, event pair
>>> "will automatically be" -> "is automatically"
>>> "RMID, event" -> "RMID, MBM event"
>> Sure.
>>>> +    when its associated monitor group is created via mkdir. It is enabled by default
>>>> +    on boot and users can disable by writing to the interface.
>>> "users can disable" -> "users can disable this capability" or "can be disabled"?
>> Sure.
>>> This implementation enables user to read/write this file/property when "mbm_event" mode is
>>> disabled. Considering this explanation I do not think it is clear how this file reflects
>>> system behavior when in "default" mode. There is no difference between mbm_assign_on_mkdir
>>> enabled/disabled when in "default" mode, no?
>> Yes. So, we should allow modifications only when mbm_event mode is enabled.
>>> Should interactions with "mbm_assign_on_mkdir" be restricted to when
>>> "mbm_event" mode is enabled? If so, the next question would likely be whether value
>> Yes.
>>> should change during "mbm_event" enable->disable or "disable->enable". Above states
>>> clearly that it is enabled on boot and it may be reasonable to have it keep (but not always
>>> expose) user's setting when switching between modes.
>>>
>>> Restricting it to "mbm_event" mode now gives us some flexibility when soft-ABMC follows
>>> on if/how it can/should support this. What do you think?
>> Yes. We should restrict modification to when mbm_event mode is enabled.
> I agree. I also think it should not be displayed when mbm_event mode is disabled. This is
> because it indicates to user space that counters are automatically assigned to RMID, event
> pairs and, since "default" mode depends on hardware doing this, it may not be accurate when, for
> example, ABMC is disabled. An alternative is to add a third value, for example "enabled", "disabled", and
> "undefined"(?). This sounds a bit awkward though, so I think the simplest may be to have this file
> be consistent with the others and return an error when mbm_event mode is disabled.


Yes. We can do that.


>
>> And always enable it when switching from
>>
>> "mbm_event" disable -> enable:  r->mon.mbm_assign_on_mkdir = true;
>>
>> "mbm_event" enable -> disable: "no need to modify as the value does not affect the behavior."
>>
> ok, please note this may need an update to the doc that currently only states "enabled by
> default on boot" to indicate it is also automatically enabled when enabling mbm_event mode.


Will add text about it.

Thanks

Babu


* Re: [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal
  2025-07-30 20:08   ` Reinette Chatre
@ 2025-08-11 23:39     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-11 23:39 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 3:08 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/25/25 11:29 AM, Babu Moger wrote:
>
>> ---
>>   fs/resctrl/monitor.c  |  1 +
>>   fs/resctrl/rdtgroup.c | 70 +++++++++++++++++++++++++++++++++++++++++--
>>   2 files changed, 69 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 6205bbfe08fb..5cf1b79c17f5 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -1072,6 +1072,7 @@ int resctrl_mon_resource_init(void)
>>   		mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
>>   								   READS_TO_LOCAL_S_MEM |
>>   								   NON_TEMP_WRITE_TO_LOCAL_MEM;
>> +		r->mon.mbm_assign_on_mkdir = true;
>>   		resctrl_file_fflags_init("num_mbm_cntrs",
>>   					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>>   		resctrl_file_fflags_init("available_mbm_cntrs",
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index bf04235d2603..d087ba990cd3 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
> Please move rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to
> be with counter management code in monitor.c

Sure. Taken care of.

Thanks

Babu


* Re: [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group
  2025-07-30 20:09   ` Reinette Chatre
@ 2025-08-11 23:42     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-11 23:42 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 3:09 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> Introduce the mbm_L3_assignments resctrl file associated with CTRL_MON and
>> MON resource groups to display the counter assignment states of the
>> resource group when "mbm_event" counter assignment mode is enabled.
>>
>> The list is displayed in the following format:
> needs imperative:
>   "Display the list ..."
Sure.
>
>> <Event>:<Domain id>=<Assignment state>;<Domain id>=<Assignment state>
>>
>> Event: A valid MBM event listed in
>>         /sys/fs/resctrl/info/L3_MON/event_configs directory.
>>
>> Domain ID: A valid domain ID.
>>
>> The assignment state can be one of the following:
>>
>> _ : No counter assigned.
>>
>> e : Counter assigned exclusively.
>>
>> Example:
>> To list the assignment states for the default group
>> $ cd /sys/fs/resctrl
>> $ cat /sys/fs/resctrl/mbm_L3_assignments
>> mbm_total_bytes:0=e;1=e
>> mbm_local_bytes:0=e;1=e
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
>
>
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 5cf1b79c17f5..ebc049105949 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -1080,6 +1080,7 @@ int resctrl_mon_resource_init(void)
>>   		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
>>   		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO |
>>   					 RFTYPE_RES_CACHE);
>> +		resctrl_file_fflags_init("mbm_L3_assignments", RFTYPE_MON_BASE);
>>   	}
>>   
>>   	return 0;
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index d087ba990cd3..47716e623a9c 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1931,6 +1931,54 @@ static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
>>   	return nbytes;
>>   }
>>   
>> +static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file *s, void *v)
> Please move to monitor.c (then mbm_cntr_get() can be private to monitor.c also).


Sure.

>
>> +{
>> +	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>> +	struct rdt_mon_domain *d;
>> +	struct rdtgroup *rdtgrp;
>> +	struct mon_evt *mevt;
>> +	int ret = 0;
>> +	bool sep;
>> +
>> +	rdtgrp = rdtgroup_kn_lock_live(of->kn);
>> +	if (!rdtgrp) {
>> +		ret = -ENOENT;
>> +		goto out_unlock;
>> +	}
>> +
>> +	rdt_last_cmd_clear();
>> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
>> +		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
>> +		ret = -ENOENT;
> The error returned by the files when "mbm_event" is disabled (but supported) is
> inconsistent. All but this one return EINVAL. Please make return code consistent.


Yes. Sure.

>
>> +		goto out_unlock;
>> +	}
>> +
>> +	for_each_mon_event(mevt) {
>> +		if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
>> +			continue;
>> +
>> +		sep = false;
>> +		seq_printf(s, "%s:", mevt->name);
>> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +			if (sep)
>> +				seq_putc(s, ';');
>> +
>> +			if (mbm_cntr_get(r, d, rdtgrp, mevt->evtid) < 0)
>> +				seq_printf(s, "%d=_", d->hdr.id);
>> +			else
>> +				seq_printf(s, "%d=e", d->hdr.id);
>> +
>> +			sep = true;
>> +		}
>> +		seq_putc(s, '\n');
>> +	}
>> +
>> +out_unlock:
>> +	rdtgroup_kn_unlock(of->kn);
>> +
>> +	return ret;
>> +}
>> +
>>   /* rdtgroup information files for one cache resource. */
>>   static struct rftype res_common_files[] = {
>>   	{
> Reinette
>
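
As an illustration of the output format quoted above
(<Event>:<Domain id>=<Assignment state>;...), here is a userspace sketch that
builds one such line. It does not use seq_file or any resctrl structure;
format_assignments() and its parameters are hypothetical names chosen for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<event>:<id>=<state>;<id>=<state>" into buf.
 * assigned[i] tells whether domain ids[i] has a counter ('e') or not ('_').
 * Returns the number of characters written (excluding the NUL), or -1 if
 * the line does not fit in buf.
 */
static int format_assignments(char *buf, size_t len, const char *event,
			      const int *ids, const bool *assigned, size_t ndom)
{
	size_t off;
	int n;

	assert(buf != NULL && event != NULL);

	n = snprintf(buf, len, "%s:", event);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (size_t i = 0; i < ndom; i++) {
		/* A ';' separates domains; none before the first entry. */
		n = snprintf(buf + off, len - off, "%s%d=%c",
			     i ? ";" : "", ids[i], assigned[i] ? 'e' : '_');
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	return (int)off;
}
```

The separator handling mirrors the "sep" flag in the quoted show function:
the ';' is emitted before every domain except the first.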

* Re: [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify assignments in a group
  2025-07-30 20:10   ` Reinette Chatre
@ 2025-08-11 23:51     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-11 23:51 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/2025 3:10 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> Enable the mbm_L3_assignments resctrl file to be used to modify counter
>> assignments of CTRL_MON and MON groups when the "mbm_event" counter
>> assignment mode is enabled.
>>
>> The assignment modifications are done in the following format:
> (needs imperative)


Sure.


>
>> <Event>:<Domain id>=<Assignment state>
>>
>> Event: A valid MBM event in the
>>         /sys/fs/resctrl/info/L3_MON/event_configs directory.
>>
>> Domain ID: A valid domain ID. When writing, '*' applies the changes
>> 	   to all domains.
>>
>> Assignment states:
>>
>>      _ : Unassign a counter.
>>
>>      e : Assign a counter exclusively.
>>
>> Examples:
>>
>> $ cd /sys/fs/resctrl
>> $ cat /sys/fs/resctrl/mbm_L3_assignments
>>    mbm_total_bytes:0=e;1=e
>>    mbm_local_bytes:0=e;1=e
>>
>> To unassign the counter associated with the mbm_total_bytes event on
>> domain 0:
>>
>> $ echo "mbm_total_bytes:0=_" > mbm_L3_assignments
>> $ cat /sys/fs/resctrl/mbm_L3_assignments
>>    mbm_total_bytes:0=_;1=e
>>    mbm_local_bytes:0=e;1=e
>>
>> To unassign the counter associated with the mbm_total_bytes event on
>> all the domains:
>>
>> $ echo "mbm_total_bytes:*=_" > mbm_L3_assignments
>> $ cat /sys/fs/resctrl/mbm_L3_assignments
>>    mbm_total_bytes:0=_;1=_
>>    mbm_local_bytes:0=e;1=e
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
>
>> ---
>>   Documentation/filesystems/resctrl.rst | 146 +++++++++++++++++++++++++-
>>   fs/resctrl/internal.h                 |   3 +
>>   fs/resctrl/monitor.c                  |  94 +++++++++++++++++
>>   fs/resctrl/rdtgroup.c                 |  48 ++++++++-
>>   4 files changed, 289 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index 0b8ce942f112..0c8701103214 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -525,7 +525,8 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
>>   	Event: A valid MBM event in the
>>   	       /sys/fs/resctrl/info/L3_MON/event_configs directory.
>>   
>> -	Domain ID: A valid domain ID.
>> +	Domain ID: A valid domain ID. When writing, '*' applies the changes
>> +		   to all the domains.
>>   
>>   	Assignment states:
>>   
>> @@ -542,6 +543,34 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
>>   	   mbm_total_bytes:0=e;1=e
>>   	   mbm_local_bytes:0=e;1=e
>>   
>> +	Assignments can be modified by writing to the interface.
>> +
>> +	Example:
>> +	To unassign the counter associated with the mbm_total_bytes event on domain 0:
> The alignment is off when looking at the generated html. What seems to be intended is that
> "Example" is some sort of heading but it ends up just being part of the sentence that follows
> and thus not apply to other examples that follow.
> It can also be "Examples" since there are more than one.


Checking it again.

>
>> +	::
>> +
>> +	 # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
>> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
>> +	   mbm_total_bytes:0=_;1=e
>> +	   mbm_local_bytes:0=e;1=e
>> +
>> +	To unassign the counter associated with the mbm_total_bytes event on all the domains:
>> +	::
>> +
>> +	 # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
>> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
>> +	   mbm_total_bytes:0=_;1=_
>> +	   mbm_local_bytes:0=e;1=e
>> +
>> +	To assign a counter associated with the mbm_total_bytes event on all domains in
>> +	exclusive mode:
>> +	::
>> +
>> +	 # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
>> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
>> +	   mbm_total_bytes:0=e;1=e
>> +	   mbm_local_bytes:0=e;1=e
>> +
>>   Resource allocation rules
>>   -------------------------
>>   
>> @@ -1577,6 +1606,121 @@ View the llc occupancy snapshot::
>>     # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
>>     11234000
>>   
>> +
>> +Examples on working with mbm_assign_mode
>> +========================================
>> +
>> +a. Check if MBM counter assignment mode is supported.
>> +::
>> +
>> +  # mount -t resctrl resctrl /sys/fs/resctrl/
>> +
>> +  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> +  [mbm_event]
>> +  default
>> +
>> +The "mbm_event" mode is detected and enabled.
>> +
>> +b. Check how many assignable counters are supported.
>> +::
>> +
>> +  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
>> +  0=32;1=32
>> +
>> +c. Check how many assignable counters are available for assignment in each domain.
>> +::
>> +
>> +  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>> +  0=30;1=30
>> +
>> +d. To list the default group's assign states.
>> +::
>> +
>> +  # cat /sys/fs/resctrl/mbm_L3_assignments
>> +  mbm_total_bytes:0=e;1=e
>> +  mbm_local_bytes:0=e;1=e
>> +
>> +e.  To unassign the counter associated with the mbm_total_bytes event on domain 0.
>> +::
>> +
>> +  # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
>> +  # cat /sys/fs/resctrl/mbm_L3_assignments
>> +  mbm_total_bytes:0=_;1=e
>> +  mbm_local_bytes:0=e;1=e
>> +
>> +f. To unassign the counter associated with the mbm_total_bytes event on all domains.
>> +::
>> +
>> +  # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
>> +  # cat /sys/fs/resctrl/mbm_L3_assignments
>> +  mbm_total_bytes:0=_;1=_
>> +  mbm_local_bytes:0=e;1=e
>> +
>> +g. To assign a counter associated with the mbm_total_bytes event on all domains in
>> +exclusive mode.
>> +::
>> +
>> +  # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
>> +  # cat /sys/fs/resctrl/mbm_L3_assignments
>> +  mbm_total_bytes:0=e;1=e
>> +  mbm_local_bytes:0=e;1=e
>> +
>> +h. Read the events mbm_total_bytes and mbm_local_bytes of the default group. There is
>> +no change in reading the events with the assignment.  If the event is unassigned when
>> +reading, then the read will come back as "Unassigned".
> While this example is for a single resource group the supporting text goes back
> and forth between being specific to one resource group and describing what happens
> when there are multiple resource groups (see (j)). If it is just one resource group then above is
> fine, but for multiple there are much more involved with the "unassigned". Same as what
> was mentioned during previous version.

Removed the "Unassigned" related text. Also removed the text about
multiple groups.

We already have details on "Unassigned" in the mon_data section.


>
>> +::
>> +
>> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> +  779247936
>> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>> +  765207488
>> +
>> +i. Check the event configurations.
>> +::
>> +
>> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
>> +  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
>> +  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
>> +
>> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
>> +  local_reads,local_non_temporal_writes,local_reads_slow_memory
>> +
>> +j. Change the event configuration for mbm_local_bytes.
>> +::
>> +
>> +  # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
>> +  /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
>> +
>> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
>> +  local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads
>> +
>> +This will update all (across all domains of all monitor groups) counter assignments
>> +associated with the mbm_local_bytes event.
>> +
>> +k. Now read the local event again. The first read may come back with "Unavailable"
>> +status. The subsequent read of mbm_local_bytes will display the current value.
>> +::
>> +
>> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>> +  Unavailable
>> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>> +  314101
>> +
>> +l. Users have the option to go back to 'default' mbm_assign_mode if required. This can be
>> +done using the following command. Note that switching the mbm_assign_mode may reset all
>> +the MBM counters (and thus all MBM events) of all the resctrl groups.
>> +::
>> +
>> +  # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> +  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> +  mbm_event
>> +  [default]
>> +
>> +m. Unmount the resctrl filesystem.
>> +::
>> +
>> +  # umount /sys/fs/resctrl/
>> +
>>   Intel RDT Errata
>>   ================
>>   
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index e2e3fc0c5fab..1350fc273258 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -418,6 +418,9 @@ int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v
>>   ssize_t event_filter_write(struct kernfs_open_file *of, char *buf, size_t nbytes,
>>   			   loff_t off);
>>   
>> +int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
>> +				 char *event, char *tok);
>> +
>>   #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>   int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>>   
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index ebc049105949..1e4f8e3bedc6 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -1311,3 +1311,97 @@ void resctrl_update_cntr_allrdtgrp(struct mon_evt *mevt)
>>   			rdtgroup_update_cntr_event(r, crgrp, mevt->evtid);
>>   	}
>>   }
>> +
>> +/*
>> + * mbm_get_mon_event_by_name() - Return the mon_evt entry for the matching
>> + * event name.
>> + */
>> +static struct mon_evt *mbm_get_mon_event_by_name(struct rdt_resource *r, char *name)
>> +{
>> +	struct mon_evt *mevt;
>> +
>> +	for_each_mon_event(mevt) {
>> +		if (mevt->rid == r->rid && mevt->enabled &&
>> +		    resctrl_is_mbm_event(mevt->evtid) &&
>> +		    !strcmp(mevt->name, name))
>> +			return mevt;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +static int rdtgroup_modify_assign_state(char *assign, struct rdt_mon_domain *d,
>> +					struct rdtgroup *rdtgrp, struct mon_evt *mevt)
>> +{
>> +	int ret = 0;
>> +
>> +	if (!assign || strlen(assign) != 1)
>> +		return -EINVAL;
>> +
>> +	switch (*assign) {
>> +	case 'e':
>> +		ret = rdtgroup_assign_cntr_event(d, rdtgrp, mevt);
>> +		break;
>> +	case '_':
>> +		rdtgroup_unassign_cntr_event(d, rdtgrp, mevt);
>> +		break;
>> +	default:
>> +		ret = -EINVAL;
>> +		break;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +int resctrl_parse_mbm_assignment(struct rdt_resource *r, struct rdtgroup *rdtgrp,
>> +				 char *event, char *tok)
>> +{
>> +	struct rdt_mon_domain *d;
>> +	unsigned long dom_id = 0;
>> +	char *dom_str, *id_str;
>> +	struct mon_evt *mevt;
>> +	int ret;
>> +
>> +	mevt = mbm_get_mon_event_by_name(r, event);
>> +	if (!mevt) {
>> +		rdt_last_cmd_printf("Invalid event %s\n", event);
>> +		return  -ENOENT;
> Extra space


Sure.

>
>> +	}
>> +
>> +next:
>> +	if (!tok || tok[0] == '\0')
>> +		return 0;
>> +
>> +	/* Start processing the strings for each domain */
>> +	dom_str = strim(strsep(&tok, ";"));
>> +
>> +	id_str = strsep(&dom_str, "=");
>> +
>> +	/* Check for domain id '*' which means all domains */
>> +	if (id_str && *id_str == '*') {
>> +		ret = rdtgroup_modify_assign_state(dom_str, NULL, rdtgrp, mevt);
>> +		if (ret)
>> +			rdt_last_cmd_printf("Assign operation '%s:*=%s' failed\n",
>> +					    event, dom_str);
>> +		return ret;
>> +	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
>> +		rdt_last_cmd_puts("Missing domain id\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* Verify if the dom_id is valid */
>> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +		if (d->hdr.id == dom_id) {
>> +			ret = rdtgroup_modify_assign_state(dom_str, d, rdtgrp, mevt);
>> +			if (ret) {
>> +				rdt_last_cmd_printf("Assign operation '%s:%ld=%s' failed\n",
>> +						    event, dom_id, dom_str);
>> +				return ret;
>> +			}
>> +			goto next;
>> +		}
>> +	}
>> +
>> +	rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
>> +	return -EINVAL;
>> +}
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 47716e623a9c..2d2b91cd1f67 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1979,6 +1979,51 @@ static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file
>>   	return ret;
>>   }
>>   
>> +static ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf,
> Please move to monitor.c


Sure.

Thanks

Babu
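
For readers following the write format discussed above, the sketch below
parses the "<Domain id>=<Assignment state>" list in userspace. It is a
simplified model under stated assumptions, not the kernel parser: NDOM,
set_state() and parse_assignments() are hypothetical, domain ids are assumed
to be 0..NDOM-1, and the input is assumed to carry no surrounding whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NDOM 2	/* hypothetical: two domains with ids 0 and 1 */

/* Apply one state string ("e" or "_") to one domain slot. */
static int set_state(char *slot, const char *state)
{
	if (!state || strlen(state) != 1 || (*state != 'e' && *state != '_'))
		return -EINVAL;
	*slot = *state;
	return 0;
}

/*
 * Parse "<id>=<state>[;<id>=<state>...]" and update states[], which is
 * indexed by domain id. tok is modified in place. An id of '*' applies
 * the state to all domains and ends parsing, as in the quoted patch.
 */
static int parse_assignments(char *tok, char *states)
{
	assert(tok != NULL && states != NULL);

	while (tok && *tok) {
		char *next = strchr(tok, ';');
		char *eq, *end;
		unsigned long id;
		int ret;

		if (next)
			*next++ = '\0';
		eq = strchr(tok, '=');
		if (!eq)
			return -EINVAL;
		*eq++ = '\0';

		if (tok[0] == '*') {
			for (int i = 0; i < NDOM; i++) {
				ret = set_state(&states[i], eq);
				if (ret)
					return ret;
			}
			return 0;
		}

		errno = 0;
		id = strtoul(tok, &end, 10);
		if (errno || end == tok || *end != '\0' || id >= NDOM)
			return -EINVAL;
		ret = set_state(&states[id], eq);
		if (ret)
			return ret;
		tok = next;
	}

	return 0;
}
```

Entries are applied left to right, so an error in a later entry leaves the
earlier ones applied; that matches the "goto next" flow in the quoted code.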


* Re: [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled
  2025-07-30 20:11   ` Reinette Chatre
@ 2025-08-12 19:16     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-12 19:16 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:11, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> @@ -1799,6 +1800,41 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>>  	return ret ?: nbytes;
>>  }
>>  
>> +/*
>> + * resctrl_bmec_files_show() - Controls the visibility of BMEC-related resctrl
>> + * files. When @show is true, the files are displayed; when false, the files
>> + * are hidden.
>> + * Don't treat kernfs_find_and_get failure as an error, since this function may
>> + * be called regardless of whether BMEC is supported or the event is enabled.
>> + */
>> +static void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn,
>> +				    bool show)
>> +{
>> +	struct kernfs_node *kn_config;
>> +	char name[32];
>> +
>> +	if (!l3_mon_kn) {
>> +		sprintf(name, "%s_MON", r->name);
>> +		l3_mon_kn = kernfs_find_and_get(kn_info, name);
>> +		if (!l3_mon_kn)
>> +			return;
>> +	}
>> +
>> +	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_total_bytes_config");
>> +	if (kn_config) {
>> +		kernfs_show(kn_config, show);
>> +		kernfs_put(kn_config);
>> +	}
>> +
>> +	kn_config = kernfs_find_and_get(l3_mon_kn, "mbm_local_bytes_config");
>> +	if (kn_config) {
>> +		kernfs_show(kn_config, show);
>> +		kernfs_put(kn_config);
>> +	}
>> +
>> +	kernfs_put(l3_mon_kn);
> 
> Looks like this will drop an extra reference if l3_mon_kn was provided as parameter.
> 

Yes. Fixed it now.
Thanks
Babu Moger
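
The reference imbalance pointed out above can be illustrated with a small
userspace model. This is a sketch of one possible shape of the fix, not the
code that was actually posted: struct node, node_find_and_get(), node_put()
and files_show() are hypothetical stand-ins and do not use the kernfs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted node. */
struct node {
	int refs;
};

/* Model of a lookup that returns the node with an extra reference held. */
static struct node *node_find_and_get(struct node *n)
{
	if (n)
		n->refs++;
	return n;
}

static void node_put(struct node *n)
{
	assert(n != NULL && n->refs > 0);
	n->refs--;
}

/*
 * Use the caller's node when one is provided; otherwise look it up.
 * Only the reference taken by the lookup here is dropped, so a node
 * passed in by the caller keeps the caller's reference.
 */
static void files_show(struct node *provided, struct node *lookup_target)
{
	struct node *n = provided;
	bool took_ref = false;

	if (!n) {
		n = node_find_and_get(lookup_target);
		if (!n)
			return;
		took_ref = true;
	}

	/* ... operate on n ... */

	if (took_ref)
		node_put(n);
}
```

Tracking took_ref keeps the get/put pair local to the function, so both call
patterns leave the reference count where they found it.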


* Re: [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes
  2025-07-30 20:11   ` Reinette Chatre
@ 2025-08-12 19:18     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-12 19:18 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:11, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 1aeac350774d..68ba08e95a54 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1865,6 +1865,75 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>>  	return 0;
>>  }
>>  
>> +static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of,
>> +					     char *buf, size_t nbytes, loff_t off)
> 
> Please move to monitor.c
> 

Sure.

>> +{
>> +	struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
>> +	struct rdt_mon_domain *d;
>> +	int ret = 0;
>> +	bool enable;
>> +
>> +	/* Valid input requires a trailing newline */
>> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
>> +		return -EINVAL;
>> +
>> +	buf[nbytes - 1] = '\0';
>> +
>> +	cpus_read_lock();
>> +	mutex_lock(&rdtgroup_mutex);
>> +
>> +	rdt_last_cmd_clear();
>> +
>> +	if (!strcmp(buf, "default")) {
>> +		enable = 0;
>> +	} else if (!strcmp(buf, "mbm_event")) {
>> +		if (r->mon.mbm_cntr_assignable) {
>> +			enable = 1;
>> +		} else {
>> +			ret = -EINVAL;
>> +			rdt_last_cmd_puts("mbm_event mode is not supported\n");
>> +			goto out_unlock;
>> +		}
>> +	} else {
>> +		ret = -EINVAL;
>> +		rdt_last_cmd_puts("Unsupported assign mode\n");
>> +		goto out_unlock;
>> +	}
>> +
>> +	if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
>> +		ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
>> +		if (ret)
>> +			goto out_unlock;
>> +
>> +		/* Update the visibility of BMEC related files */
>> +		resctrl_bmec_files_show(r, NULL, !enable);
>> +
>> +		/*
>> +		 * Initialize the default memory transaction values for
>> +		 * total and local events.
>> +		 */
>> +		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
>> +			mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = MAX_EVT_CONFIG_BITS;
>> +		if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
>> +			mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM |
>> +									   READS_TO_LOCAL_S_MEM |
>> +									   NON_TEMP_WRITE_TO_LOCAL_MEM;
> 
> This needs to take into account the configurations that
> hardware supports.
> 

Yes. Sure.

-- 
Thanks
Babu Moger
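
One way to read the review comment above is that the default event
configurations should be limited to what the hardware reports as supported.
The sketch below shows that idea in userspace. The bit values,
ALL_EVT_CONFIG_BITS, struct evt_defaults and default_evt_cfg() are assumptions
made for illustration; how the supported mask is obtained is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values; assumptions, not the kernel's definitions. */
#define READS_TO_LOCAL_MEM		(1u << 0)
#define NON_TEMP_WRITE_TO_LOCAL_MEM	(1u << 2)
#define READS_TO_LOCAL_S_MEM		(1u << 4)
#define ALL_EVT_CONFIG_BITS		0x7fu

struct evt_defaults {
	uint32_t total;	/* default filter for the total bandwidth event */
	uint32_t local;	/* default filter for the local bandwidth event */
};

/* Restrict the default filters to the transaction types hardware supports. */
static struct evt_defaults default_evt_cfg(uint32_t supported_mask)
{
	struct evt_defaults d;

	/* The mask is expected to stay within the known configuration bits. */
	assert((supported_mask & ~ALL_EVT_CONFIG_BITS) == 0);

	d.total = ALL_EVT_CONFIG_BITS & supported_mask;
	d.local = (READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM |
		   NON_TEMP_WRITE_TO_LOCAL_MEM) & supported_mask;
	return d;
}
```

With a mask that lacks the slow-memory bits, the local default silently drops
READS_TO_LOCAL_S_MEM instead of programming an unsupported filter.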


* Re: [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported
  2025-07-30 20:11   ` Reinette Chatre
@ 2025-08-12 19:21     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-12 19:21 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:11, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> Configure mbm_event mode on AMD platforms. On AMD platforms, it is
>> recommended to use the mbm_event mode, if supported, to prevent the
>> hardware from resetting counters between reads. This can result in
>> misleading values or display "Unavailable" if no counter is assigned
>> to the event.
>>
>> The mbm_event mode, referred to as ABMC (Assignable Bandwidth Monitoring
>> Counters) on AMD, is enabled by default when supported by the system.
> 
> needs imperative
> 

Sure.

>>
>> Update ABMC across all logical processors within the resctrl domain to
>> ensure proper functionality.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> Patch looks good.
> 
-- 
Thanks
Babu Moger



* Re: [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer
  2025-07-30 20:14   ` Reinette Chatre
@ 2025-08-12 19:23     ` Moger, Babu
  0 siblings, 0 replies; 93+ messages in thread
From: Moger, Babu @ 2025-08-12 19:23 UTC (permalink / raw)
  To: Reinette Chatre, corbet, tony.luck, james.morse, tglx, mingo, bp,
	dave.hansen
  Cc: Dave.Martin, x86, hpa, akpm, paulmck, rostedt, Neeraj.Upadhyay,
	david, arnd, fvdl, seanjc, jpoimboe, pawan.kumar.gupta, xin,
	manali.shukla, tao1.su, sohil.mehta, kai.huang, xiaoyao.li,
	peterz, xin3.li, kan.liang, mario.limonciello, thomas.lendacky,
	perry.yuan, gautham.shenoy, chang.seok.bae, linux-doc,
	linux-kernel, peternewman, eranian

Hi Reinette,

On 7/30/25 15:14, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/25/25 11:29 AM, Babu Moger wrote:
>> I have been contributing to resctrl for some time now and I would like to
>> help with code reviews as well.
> 
> You do not need to be in MAINTAINERS file to help with code reviews. I do believe
> it is important that you are cc'd on all future contributions since I am not able
> to test these new features you are enabling so having you keep an eye on the health
> of these areas is greatly appreciated.   

Agreed.

> 
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v16: Reinette suggested adding me as a reviewer. I am glad to help as a reviewer.
>> ---
>>  MAINTAINERS | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index f697a0c51721..70a2f83145db 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -20866,6 +20866,7 @@ M:	Tony Luck <tony.luck@intel.com>
>>  M:	Reinette Chatre <reinette.chatre@intel.com>
>>  R:	Dave Martin <Dave.Martin@arm.com>
>>  R:	James Morse <james.morse@arm.com>
>> +R:	Babu Moger <babu.moger@amd.com>
>>  L:	linux-kernel@vger.kernel.org
>>  S:	Supported
>>  F:	Documentation/filesystems/resctrl.rst
> 
> Acked-by: Reinette Chatre <reinette.chatre@intel.com>
> 
-- 
Thanks
Babu Moger



end of thread, other threads:[~2025-08-12 19:23 UTC | newest]

Thread overview: 93+ messages
2025-07-25 18:29 [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2025-07-25 18:29 ` [PATCH v16 01/34] x86,fs/resctrl: Consolidate monitor event descriptions Babu Moger
2025-07-30 19:47   ` Reinette Chatre
2025-07-30 20:23     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 02/34] x86,fs/resctrl: Replace architecture event enabled checks Babu Moger
2025-07-25 18:29 ` [PATCH v16 03/34] x86/resctrl: Remove 'rdt_mon_features' global variable Babu Moger
2025-07-25 18:29 ` [PATCH v16 04/34] x86,fs/resctrl: Prepare for more monitor events Babu Moger
2025-07-25 18:29 ` [PATCH v16 05/34] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2025-07-30 19:47   ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 06/34] x86/resctrl: Add ABMC feature in the command line options Babu Moger
2025-07-25 18:29 ` [PATCH v16 07/34] x86,fs/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
2025-07-25 18:29 ` [PATCH v16 08/34] x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
2025-07-30 19:49   ` Reinette Chatre
2025-08-06 21:04     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 09/34] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
2025-07-25 18:29 ` [PATCH v16 10/34] fs/resctrl: Introduce the interface to display monitoring modes Babu Moger
2025-08-06 21:02   ` Moger, Babu
2025-08-06 21:30     ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 11/34] fs/resctrl: Add resctrl file to display number of assignable counters Babu Moger
2025-08-06 21:12   ` Moger, Babu
2025-08-06 21:31     ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 12/34] fs/resctrl: Introduce mbm_cntr_cfg to track assignable counters per domain Babu Moger
2025-07-25 18:29 ` [PATCH v16 13/34] fs/resctrl: Introduce interface to display number of free MBM counters Babu Moger
2025-08-06 21:19   ` Moger, Babu
2025-08-06 21:31     ` Reinette Chatre
2025-08-06 22:04       ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 14/34] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
2025-07-25 18:29 ` [PATCH v16 15/34] fs/resctrl: Introduce event configuration field in struct mon_evt Babu Moger
2025-07-25 18:29 ` [PATCH v16 16/34] x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
2025-07-30 19:50   ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 17/34] fs/resctrl: Add the functionality to assign MBM events Babu Moger
2025-07-30 19:52   ` Reinette Chatre
2025-08-07 18:29     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 18/34] fs/resctrl: Add the functionality to unassign " Babu Moger
2025-07-30 19:53   ` Reinette Chatre
2025-08-07 18:33     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 19/34] fs/resctrl: Pass struct rdtgroup instead of individual members Babu Moger
2025-07-30 19:54   ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 20/34] fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode Babu Moger
2025-07-30 19:59   ` Reinette Chatre
2025-08-07 19:59     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 21/34] x86/resctrl: Refactor resctrl_arch_rmid_read() Babu Moger
2025-07-30 19:59   ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 22/34] x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() Babu Moger
2025-07-30 20:01   ` Reinette Chatre
2025-08-08  2:05     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 23/34] fs/resctrl: Support counter read/reset with mbm_event assignment mode Babu Moger
2025-07-30 20:03   ` Reinette Chatre
2025-08-08  2:20     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 24/34] fs/resctrl: Add definitions for MBM event configuration Babu Moger
2025-07-30 20:03   ` Reinette Chatre
2025-08-08  2:24     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 25/34] fs/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
2025-07-30 20:04   ` Reinette Chatre
2025-08-08 13:56     ` Moger, Babu
2025-08-08 15:12       ` Reinette Chatre
2025-08-08 17:47         ` Moger, Babu
2025-08-08 18:23           ` Reinette Chatre
2025-08-08 18:48             ` Moger, Babu
2025-08-08 20:26               ` Reinette Chatre
2025-07-25 18:29 ` [PATCH v16 26/34] fs/resctrl: Provide interface to update the event configurations Babu Moger
2025-07-30 20:05   ` Reinette Chatre
2025-08-08 18:27     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 27/34] fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir Babu Moger
2025-07-30 20:08   ` Reinette Chatre
2025-08-08 20:29     ` Moger, Babu
2025-08-08 21:00       ` Reinette Chatre
2025-08-08 21:10         ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 28/34] fs/resctrl: Auto assign counters on mkdir and clean up on group removal Babu Moger
2025-07-30 20:08   ` Reinette Chatre
2025-08-11 23:39     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 29/34] fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
2025-07-30 20:09   ` Reinette Chatre
2025-08-11 23:42     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 30/34] fs/resctrl: Introduce the interface to modify " Babu Moger
2025-07-30 20:10   ` Reinette Chatre
2025-08-11 23:51     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 31/34] fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled Babu Moger
2025-07-30 20:11   ` Reinette Chatre
2025-08-12 19:16     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 32/34] fs/resctrl: Introduce the interface to switch between monitor modes Babu Moger
2025-07-30 20:11   ` Reinette Chatre
2025-08-12 19:18     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 33/34] x86/resctrl: Configure mbm_event mode if supported Babu Moger
2025-07-30 20:11   ` Reinette Chatre
2025-08-12 19:21     ` Moger, Babu
2025-07-25 18:29 ` [PATCH v16 34/34] MAINTAINERS: resctrl: add myself as reviewer Babu Moger
2025-07-30 20:14   ` Reinette Chatre
2025-08-12 19:23     ` Moger, Babu
2025-07-30 19:47 ` [PATCH v16 00/34] x86,fs/resctrl: Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
2025-07-30 23:31   ` Moger, Babu
2025-07-30 23:57     ` Reinette Chatre
2025-07-31 14:17       ` Moger, Babu
