linux-doc.vger.kernel.org archive mirror
* [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
@ 2025-04-04  0:18 Babu Moger
From: Babu Moger @ 2025-04-04  0:18 UTC
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian


This series adds support for Assignable Bandwidth Monitoring Counters
(ABMC), also called the QoS RMID Pinning feature.

The series is written so that assignable features from other vendors
are easier to support.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC). The documentation is available at
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

The patches are based on top of commit
bab03103a34f1 Merge branch into tip/master: 'x86/sev'

# Introduction

Users can create as many monitor groups as the hardware supports RMIDs.
However, the bandwidth monitoring feature on AMD systems only guarantees
that RMIDs currently assigned to a processor will be tracked by hardware.
The counters of any other RMIDs which are no longer being tracked will be
reset to zero. The MBM event counters return "Unavailable" for the RMIDs
that are not tracked by hardware. So, only a limited number of groups can
give guaranteed monitoring numbers. With ever-changing configurations
there is no way to know definitively which of these groups are being
tracked at a given point in time. Users do not have the option to monitor
a group or set of groups for a certain period of time without worrying
about counters being reset in between.
    
The ABMC feature lets the user assign a hardware counter to an
(RMID, event) pair and monitor the bandwidth for as long as it is
assigned. The assigned RMID will be tracked by the hardware until the user
unassigns it manually. There is no need to worry about counters being reset
during this period. Additionally, the user can specify a bitmask identifying
the specific bandwidth types from the given source to track with the counter.

Without ABMC enabled, monitoring works in the current 'default' mode, without
the assignment option.

# History

The previous implementation of ABMC had a dependency on BMEC (Bandwidth
Monitoring Event Configuration). Peter had concerns with that implementation
because it may not be compatible with ARM's MPAM.

Here are the threads discussing the concerns and new interface to address the concerns.
https://lore.kernel.org/lkml/CALPaoCg97cLVVAcacnarp+880xjsedEWGJPXhYpy4P7=ky4MZw@mail.gmail.com/
https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/

Here are the finalized requirements based on the discussion:

*   Remove the ABMC feature's dependency on BMEC.

*   Eliminate global assignment listing. The interface
    /sys/fs/resctrl/info/L3_MON/mbm_assign_control is no longer required.

*   Create the configuration directories at /sys/fs/resctrl/info/L3_MON/counter_configs/.
    The configuration file names should be free-form, allowing users to create them as needed.

*   Perform assignment listing at the group level by introducing mbm_L3_assignments
    in each monitoring group. The listing should provide the following details:

    Event Configuration: Specifies the event configuration applied. This will be crucial
    when "mkdir" on event configuration is added in the future, leading to the creation
    of mon_data/mon_l3_*/<event configuration>.

    Domains: Identifies the domains where the configuration is applied, supporting multi-domain setups.

    Assignment Type: Indicates whether the assignment is Exclusive (e or d), Shared (s), or Unassigned (_).

*   Provide an option to enable or disable auto assignment when a new group is created.

This series tries to address all the requirements listed above.

# Implementation details

Create a generic interface to support user space assignment of scarce
counters used for monitoring. The first user of the interface is ABMC, with
the option to extend it to "soft-ABMC" and MPAM counters in the future.

The feature adds the following interface files:

/sys/fs/resctrl/info/L3_MON/mbm_assign_mode: Reports the list of assignable
monitoring features supported. The enclosed brackets indicate which
feature is enabled.

/sys/fs/resctrl/info/L3_MON/num_mbm_cntrs: Reports the number of monitoring
counters available for assignment.

/sys/fs/resctrl/info/L3_MON/available_mbm_cntrs: Reports the number of monitoring
counters free in each domain.

/sys/fs/resctrl/info/L3_MON/counter_configs : Directory to hold the counter configuration.

/sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter : Default configuration
for MBM total events.

/sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter : Default configuration
for MBM local events.

/sys/fs/resctrl/mbm_L3_assignments: Interface to list or modify assignment states on each group.

# Examples

a. Check if ABMC support is available
	#mount -t resctrl resctrl /sys/fs/resctrl/

	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
	[mbm_cntr_assign]
	default

	ABMC feature is detected and it is enabled.
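
	As a rough sketch (not part of this series), a script could pick out the
	enabled mode by looking for the bracketed entry in the mbm_assign_mode
	output. The helper name enabled_mode is made up for illustration, and the
	one-mode-per-line layout is assumed from the example output above.

```shell
# Hypothetical helper: print the enabled mode, i.e. the entry shown in
# brackets. Takes the text of mbm_assign_mode as its first argument.
enabled_mode() {
    printf '%s\n' "$1" | sed -n 's/^[[:space:]]*\[\(.*\)][[:space:]]*$/\1/p'
}

# Sample text copied from the example output above.
enabled_mode '[mbm_cntr_assign]
default'
```

	On a system with resctrl mounted, the argument would presumably be
	"$(cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode)".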

b. Check how many ABMC counters are available. 

	# cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs 
	32

c. Check how many ABMC counters are available in each domain.

	# cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs 
	0=30;1=30
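
	The per-domain output above appears to be a ';'-separated list of
	<domain id>=<count> entries. A minimal sketch of splitting it in a script
	follows; the helper name parse_domain_counts is hypothetical and the
	format is inferred only from the example output.

```shell
# Hypothetical helper: print one "domain count" pair per line from a
# string such as "0=30;1=30". Uses only POSIX parameter expansion.
parse_domain_counts() {
    rest=$1
    while [ -n "$rest" ]; do
        entry=${rest%%;*}                 # text up to the first ';'
        printf '%s %s\n' "${entry%%=*}" "${entry#*=}"
        case $rest in
            *\;*) rest=${rest#*;} ;;      # drop the entry just handled
            *) rest= ;;                   # that was the last entry
        esac
    done
}

# Sample text copied from the example output above.
parse_domain_counts '0=30;1=30'
```

	With resctrl mounted, the argument would presumably be
	"$(cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs)".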

d. Check default counter configuration.

	# cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter 
	local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
        local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all

	# cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter 
	local_reads, local_non_temporal_writes, local_reads_slow_memory
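
	To check from a script whether a given event type is part of a filter,
	something along these lines may work. This is only a sketch: the helper
	name filter_has_event is hypothetical, and it assumes the event_filter
	text is a list of names separated by commas and whitespace, as in the
	output above.

```shell
# Hypothetical helper: succeed if event name $2 appears in filter text $1.
# Commas, spaces, tabs and newlines are all treated as separators.
filter_has_event() {
    normalized=$(printf '%s' "$1" | tr -s ',\n\t ' ' ')
    case " $normalized " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample text copied from the mbm_local_bytes output above.
filter='local_reads, local_non_temporal_writes, local_reads_slow_memory'
if filter_has_event "$filter" remote_reads; then
    echo "remote_reads is counted"
else
    echo "remote_reads is not counted"
fi
```

	Matching whole names (rather than substrings) keeps local_reads from
	matching local_reads_slow_memory.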

e. Series adds a new interface file "mbm_L3_assignments" in each monitoring group
   to list and modify any group's monitoring states.

	The list is displayed in the following format:

        <Event configuration>:<Domain id>=<Assignment type>

        Event configuration: A valid event configuration listed in the
        /sys/fs/resctrl/info/L3_MON/counter_configs directory.

        Domain ID: A valid domain ID number.

        Assignment types:

        _ : No event configuration assigned

        e : Event configuration assigned in exclusive mode

	Initial group status:
	# cat /sys/fs/resctrl/mbm_L3_assignments
	mbm_total_bytes:0=e;1=e
	mbm_local_bytes:0=e;1=e

	To unassign the configuration of mbm_total_bytes on domain 0:
	# echo "mbm_total_bytes:0=_" > mbm_L3_assignments
	# cat mbm_L3_assignments
	mbm_total_bytes:0=_;1=e
	mbm_local_bytes:0=e;1=e

	To unassign the mbm_total_bytes configuration on all domains:
	# echo "mbm_total_bytes:*=_" > mbm_L3_assignments
	# cat mbm_L3_assignments
	mbm_total_bytes:0=_;1=_
	mbm_local_bytes:0=e;1=e

	To assign the mbm_total_bytes configuration on all domains in exclusive mode:
	# echo "mbm_total_bytes:*=e" > mbm_L3_assignments
	# cat mbm_L3_assignments
	mbm_total_bytes:0=e;1=e
	mbm_local_bytes:0=e;1=e
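
	The listing format described above,
	<Event configuration>:<Domain id>=<Assignment type>, can be queried from a
	script roughly as follows. This is a sketch under the assumption that each
	line has no leading whitespace and that domains are separated by ';'; the
	helper name assignment_state is hypothetical.

```shell
# Hypothetical helper: print the assignment type ("e" or "_") of one event
# configuration on one domain, given the text of mbm_L3_assignments.
assignment_state() {
    listing=$1 event=$2 domain=$3
    printf '%s\n' "$listing" | while IFS=: read -r name domains; do
        [ "$name" = "$event" ] || continue
        rest=$domains
        while [ -n "$rest" ]; do
            entry=${rest%%;*}             # one "<domain>=<type>" entry
            if [ "${entry%%=*}" = "$domain" ]; then
                printf '%s\n' "${entry#*=}"
            fi
            case $rest in
                *\;*) rest=${rest#*;} ;;
                *) rest= ;;
            esac
        done
    done
}

# Sample text copied from the domain 0 unassign example above.
listing='mbm_total_bytes:0=_;1=e
mbm_local_bytes:0=e;1=e'
assignment_state "$listing" mbm_total_bytes 0
```

	For a live group, the first argument would presumably be
	"$(cat /sys/fs/resctrl/mbm_L3_assignments)".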

g. Read the events mbm_total_bytes and mbm_local_bytes of the default group.
   There is no change in reading the events with ABMC. If the event is unassigned
   when reading, then the read will come back as "Unassigned".
	
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
	779247936
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes 
	765207488
	
h. Check the default event configurations.

	#cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
	local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
	local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all

	#cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
	local_reads, local_non_temporal_writes, local_reads_slow_memory

i. Change the event configuration for mbm_local_bytes.

	# echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" > \
	  /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter

	#cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
	local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads
	
        This will update the assignments where mbm_local_bytes is configured.
	
j. Now read the total event again. The first read may come back with "Unavailable"
   status. The subsequent read of mbm_total_bytes will display only the read events.
	
	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
	Unavailable
	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
	314101

k. Users will have the option to go back to 'default' mbm_assign_mode if required.
   This can be done using the following command. Note that switching the
   mbm_assign_mode will reset all the MBM counters of all resctrl groups.

	# echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
	mbm_cntr_assign
	[default]
	
l. Unmount resctrl.
	 
	#umount /sys/fs/resctrl/
---
v12:
   This version is effectively an RFC series with a new interface.

   Removed the Reviewed-by tag from the few patches that have changed.

   Moved BMEC related patches (1 and 2) to the beginning of the series.
   Removed the ABMC feature's dependency on BMEC.

   Removed the unnecessary initialization of the mon_config_info structure.
   Changed to use wrmsrl instead of wrmsr to address the below comment.
   https://lore.kernel.org/lkml/0fc8dbd4-07d8-40bd-8eec-402b48762807@zytor.com/

   Fixed the conflicts due to recent changes in rdt_resource data structure.
   Added new mbm_cfg_mask field to resctrl_mon.
   
   Added the code to reset arch state inside _resctrl_abmc_enable().

   Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
   This will be defined only in arm and not in x86.

   Changed the code to display the max supported monitoring counters in each domain.
   
   Fixed the struct mbm_cntr_cfg code documentation.
   Moved the struct mbm_cntr_cfg definition to resctrl/internal.h as suggested by James.

   Replaced seq_puts(s, ";") with seq_putc(s, ';');
   Added missing rdt_last_cmd_clear() in resctrl_available_mbm_cntrs_show().

   Added the check to reset the architecture-specific state only when assign is requested.

   Added evt_cfg as the parameter to resctrl_arch_config_cntr() as the user will
   be passing the event configuration from /info/L3_MON/counter_configs/.

   Changed the check in resctrl_alloc_config_cntr() to reduce the indentation.
   Fixed the handling error on first failure while assigning.
   Added new parameter event configuration (evt_cfg) to get the event configuration from user space.

   Added the support for reading ABMC counters. This is a fairly involved change and affects a lot of code.

   New patch to support event configurations via new counter_configs method.

   Removed mbm_cntr_reset() as it is not required while removing the group.

   Added new patch to handle auto assign on group creation ("mbm_assign_on_mkdir")

   Added a couple of patches to add the interface for "mbm_L3_assignments" on each mon group.

   Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear counters and
   non-architectural states when monitor mode is changed.
   https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/

   Moved the resctrl_arch_mbm_cntr_assign_set_one to domain_add_cpu_mon().

   Patches 17, 18, 19, 20, 21, 23, 24 are completely new to address the new interface requirement.

v11:
   The commit 2937f9c361f7a ("x86/resctrl: Introduce resctrl_file_fflags_init() to initialize fflags")
   is already merged. Removed from the series.
   
   Resolved minor conflicts due to code displacement in latest code.
 
   Moved the monitoring related calls to monitor.c file when possible.
   Moved some of the changes from include/linux/resctrl.h to arch/x86/kernel/cpu/resctrl/internal.h
   as requested by Reinette. These changes will be moved back when arch and non-arch code is separated.
   
   Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
   Renamed rdtgroup_num_mbm_cntrs_show() to resctrl_num_mbm_cntrs_show().

   Moved the mon_config_info structure definition to internal.h.
   Moved resctrl_arch_mon_event_config_get() and resctrl_arch_mon_event_config_set()
   to monitor.c file.

   Moved resctrl_arch_assign_cntr() and resctrl_abmc_config_one_amd() to monitor.c.
   Added the code to reset the arch state in resctrl_arch_assign_cntr().
   Also removed resctrl_arch_reset_rmid() inside IPI as the counters are reset from the callers.

   Renamed rdtgroup_assign_cntr_event() to resctrl_assign_cntr_event().
   Refactored the resctrl_assign_cntr_event().
   Added functionality to exit on the first error during assignment.
   Simplified mbm_cntr_free().
   Removed the function mbm_cntr_assigned(). Will be using mbm_cntr_get() to
   figure out if the counter is assigned or not.
   
   Renamed rdtgroup_unassign_cntr_event() to resctrl_unassign_cntr_event().
   Refactored the resctrl_unassign_cntr_event().

   Moved mbm_cntr_reset() to monitor.c.
   Added code to reset non-architectural state in mbm_cntr_reset().
   Added missing rdtgroup_unassign_cntrs() calls on failure path.

   Domain can be NULL with SNC support so moved the unassign check in rdtgroup_mondata_show().

   Renamed rdtgroup_mbm_assign_mode_write() to resctrl_mbm_assign_mode_write().
   Added more details in resctrl.rst about mbm_cntr_assign mode.
   Re-arranged the text in resctrl.rst file in section mbm_cntr_assign.

   Moved resctrl_arch_mbm_cntr_assign_set_one() to monitor.c

   Added non-arch RMID reset in mbm_config_write_domain().
   Removed resctrl_arch_reset_rmid() call in resctrl_abmc_config_one_amd(). Not required
   as reset of arch and non-arch rmid counters is done from the callers. It simplifies the IPI code.

   Fixed printing the separator after each domain while listing the group assignments.
   Renamed rdtgroup_mbm_assign_control_show to resctrl_mbm_assign_control_show().

   Fixed the static check warning with initializing dom_id in resctrl_process_flags()

   Added change log in each patch for specific changes.

v10:
   Major change is related to domain specific assignment.
   Added struct mbm_cntr_cfg inside mon domains. This will handle
   the domain specific assignments as discussed in the thread below.
   https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
   I did not see the need to add cntr_id in mbm_state structure. Not used in the code.
   Following patches take care of these changes.
   Patch 12, 13, 15, 16, 17, 18.
   
   Added __init attribute to cache_alloc_hsw_probe(). Followed function
   prototype rules (preferred order is storage class before return type).
   
   Moved the mon_config_info structure definition to resctrl.h
   
   Added call resctrl_arch_reset_rmid() to reset the RMID in the domain inside IPI call
   resctrl_abmc_config_one_amd.
   
   SMP and non-SMP call support is not required in resctrl_arch_config_cntr with new
   domain specific assign approach/data structure.
   
   Assigned the counter before exposing the event files.
   Moved the call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
   This is called both CNTR_MON and MON group creation.
   
   Call mbm_cntr_reset() when unmounted to clear all the assignments.
   
   Fixed the issue with finding the domain in multiple iterations in rdtgroup_process_flags().
   
   Printed full error message with domain information when assign fails.
   
   Taken care of other text comments in all the patches. Patch specific changes are in each patch.
   
   If I missed something, please point it out; it is not intentional.

v9:
   Patch 14 is a new addition. 
   Major change in patch 24.
   Moved the fix patch to address __init attribute to the beginning of the series.
   Fixed all the call sequences. Added additional Fixes tags.

   Added Reviewed-by where applicable.

   Took care of couple of minor merge conflicts with latest code.
   Re-ordered the MSR in couple of instances.
   Added available_mbm_cntrs (patch 14) to print the number of counters in a domain.

   Used MBM_EVENT_ARRAY_INDEX macro to get the event index.
   Introduced rdtgroup_cntr_id_init() to initialize the cntr_id

   Introduced new function resctrl_config_cntr to assign the counter, update
   the bitmap and reset the architectural state.
   Taken care of error handling(freeing the counter) when assignment fails.
  
   Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
   Updated couple of rdtgroup_unassign_cntrs() calls properly.

   Fixed problem changing the mode to mbm_cntr_assign mode when it is
   not supported. Added extra checks to detect if systems supports it.
   
   https://lore.kernel.org/lkml/03b278b5-6c15-4d09-9ab7-3317e84a409e@intel.com/
   As discussed in the above comment, introduced resctrl_mon_event_config_set to
   handle IPI. But sending another IPI inside IPI causes problem. Kernel
   reports SMP warning. So, introduced resctrl_arch_update_cntr() to send the
   command directly.

   Fixed handling of the special cases '//0=' and '//'.
   Removed extra strstr() call in rdtgroup_mbm_assign_control_write().
   Added generic failure text when assignment operation fails.
   Corrected user documentation format texts.

v8:
  Patches are getting into final stages. 
  Couple of changes Patch 8, Patch 19 and Patch 23.
  Most of the other changes are related to rename and text message updates.

  Details are in each patch. Here is the summary.

  Added __init attribute to dom_data_init() in patch 8/25.
  Moved the mbm_cntrs_init() and mbm_cntrs_exit() functionality inside
  dom_data_init() and dom_data_exit() respectively.

  Renamed resctrl_mbm_evt_config_init() to arch_mbm_evt_config_init()
  Renamed resctrl_arch_event_config_get() to resctrl_arch_mon_event_config_get().
          resctrl_arch_event_config_set() to resctrl_arch_mon_event_config_set().

  Rename resctrl_arch_assign_cntr to resctrl_arch_config_cntr.
  Renamed rdtgroup_assign_cntr() to rdtgroup_assign_cntr_event().
  Added the code to return the error if rdtgroup_assign_cntr_event fails.
  Moved definition of MBM_EVENT_ARRAY_INDEX to resctrl/internal.h.
  Renamed rdtgroup_mbm_cntr_is_assigned to mbm_cntr_assigned_to_domain
  Added return error handling in resctrl_arch_config_cntr().
  Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
  Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
  Fixed the problem with unassigning the child MON groups of CTRL_MON group.
  Reset the internal counters after mbm_cntr_assign mode is changed.
  Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset()
  Renamed resctrl_arch_mbm_cntr_assign_configure to
            resctrl_arch_mbm_cntr_assign_set_one.

  Used the same IPI as event update to modify the assignment.
  Could not do the way we discussed in the thread.
  https://lore.kernel.org/lkml/f77737ac-d3f6-3e4b-3565-564f79c86ca8@amd.com/
  Needed to figure out event type to update the configuration.

  Moved unassign first and assign during the assign modification.
  Assign none "_" takes priority. Cannot be mixed with other flags.
  Updated the documentation and .rst file format. htmldoc looks ok.

v7:
   Major changes are related to FS and arch codes separation.
   Changed few interface names based on feedback.
   Here is the summary; each patch contains the changes specific to that patch.

   Removed WARN_ON for num_mbm_cntrs. Decided to dynamically allocate the bitmap.
   WARN_ON is not required anymore.
 
   Renamed the function resctrl_arch_get_abmc_enabled() to resctrl_arch_mbm_cntr_assign_enabled().

   Merged resctrl_arch_mbm_cntr_assign_enable() and resctrl_arch_mbm_cntr_assign_disable()
   and renamed to resctrl_arch_mbm_cntr_assign_set(). Passed the struct rdt_resource
   to these functions.

   Removed resctrl_arch_reset_rmid_all() from arch code. This will be done from the FS caller.

   Updated the descriptions/commit log in resctrl.rst to generic text. Removed ABMC references.
   Renamed mbm_mode to mbm_assign_mode.
   Renamed mbm_control to  mbm_assign_control.
   Introduced mutex lock in rdtgroup_mbm_mode_show().
 
   The 'legacy' mode is called 'default' mode. 

   Removed the static allocation and now allocating bitmap mbm_cntr_free_map dynamically.

   Merged rdtgroup_assign_cntr(), rdtgroup_alloc_cntr() into one.
   Merged rdtgroup_unassign_cntr(), rdtgroup_free_cntr() into one.
   
  Added struct rdt_resource to the interface functions resctrl_arch_assign_cntr ()
  and resctrl_arch_unassign_cntr().
  Rename rdtgroup_abmc_cfg() to resctrl_abmc_config_one_amd().
   
  Added a new patch to fix counter assignment on event config changes.

  Removed the references of ABMC from user interfaces.

  Simplified the parsing (strsep(&token, "//")) in rdtgroup_mbm_assign_control_write().
  Added mutex lock in rdtgroup_mbm_assign_control_write() while processing.

  Thomas Gleixner asked us to update  https://gitlab.com/x86-cpuid.org/x86-cpuid-db. 
  It needs internal approval. We are working on it.

v6:
  We still need to finalize few interface details on mbm_assign_mode and mbm_assign_control
  in case of ABMC and Soft-ABMC. We can continue the discussion with this series.

  Added support for domain-id '*' to update all the domains at once.
  Fixed assign interface to allocate the counter if counter is
  not assigned.   
  Fixed unassign interface to free the counter if the counter is not
  assigned in any of the domains.

  Renamed abmc_capable to mbm_cntr_assignable.

  Renamed abmc_enabled to mbm_cntr_assign_enabled.
  Used msr_set_bit and msr_clear_bit for msr updates.
  Renamed resctrl_arch_abmc_enable() to resctrl_arch_mbm_cntr_assign_enable().
  Renamed resctrl_arch_abmc_disable() to resctrl_arch_mbm_cntr_assign_disable().

  Changed the display name from num_cntrs to num_mbm_cntrs.

  Removed the variable mbm_cntrs_free_map_len. This is not required.
  Removed the call mbm_cntrs_init() in arch code. This needs to be done at higher level.
  Used DECLARE_BITMAP to initialize mbm_cntrs_free_map.
  Removed unused config value definitions.

  Introduced mbm_cntr_map to track counters at domain level. With this
  we don't need to send MSR read to read the counter configuration.

  Separated all the counter id management to upper level in FS code.

  Added checks to detect "Unassigned" before reading the RMID.

  More details in each patch.

v5:
  Rebase changes (because of SNC support)

  Interface changes.
   /sys/fs/resctrl/mbm_assign to /sys/fs/resctrl/mbm_assign_mode.
   /sys/fs/resctrl/mbm_assign_control to /sys/fs/resctrl/mbm_assign_control.

  Added few arch specific routines.
  resctrl_arch_get_abmc_enabled.
  resctrl_arch_abmc_enable.
  resctrl_arch_abmc_disable.

  Few renames
   num_cntrs_free_map -> mbm_cntrs_free_map
   num_cntrs_init -> mbm_cntrs_init
   arch_domain_mbm_evt_config -> resctrl_arch_mbm_evt_config

  Introduced resctrl_arch_event_config_get and
    resctrl_arch_event_config_set() to update event configuration.

  Removed mon_state field mongroup. Added MON_CNTR_UNSET to initialize counters.

  Renamed ctr_id to cntr_id for the hardware counter.
 
  Report "Unassigned" in case the user attempts to read the events without assigning the counter.
  
  ABMC is enabled during the boot up. Can be enabled or disabled later.

  Fixed opcode and flags combination.
    "=_" is valid.
    "-_" and "+_" are not valid.

 Addressed all the comments as far as I know. If I missed something, it is not intentional.

v4: 
  Main change is domain specific event assignment.
  Kept the ABMC feature as a default.
  Dynamic switching between ABMC and mbm_legacy is still allowed.
  We are still not clear about mount option.
  Moved the monitoring related data in resctrl_mon structure from rdt_resource.
  Fixed the display of legacy and ABMC mode.
  Used bitmap APIs when possible.
  Removed event configuration read from MSRs. We can use the
  internal saved data (patch 12).
  Added more comments about L3_QOS_ABMC_CFG MSR.
  Added IPIs to read the assignment status for each domain (patch 18 and 19)
  More details in each patch.

v3:
   This series adds the support for global assignment mode discussed in
   the thread. https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
   Removed the individual assignment mode and included the global assignment interface.
   Added following interface files.
   a. /sys/fs/resctrl/info/L3_MON/mbm_assign
      Used for displaying the current assignment mode and switch between
      ABMC and legacy mode.
   b. /sys/fs/resctrl/info/L3_MON/mbm_assign_control
      Used for listing the groups' assignment mode and modifying the assignment states.
   c. Most of the changes are related to the new interface.
   d. Addressed the comments from Reinette, James and Peter.
   e. Hope I have addressed most of the major feedback discussed. If I missed
      something then it is not intentional. Please feel free to comment.
   f. Sending this as an RFC as per Reinette's comment. So, this is still open
      for discussion.

v2:
   a. Major change is the way ABMC is enabled. Earlier, user needed to remount
      with -o abmc to enable ABMC feature. Removed that option now.
      Now users can enable ABMC by "echo 1 > /sys/fs/resctrl/info/L3_MON/mbm_assign_enable".
     
   b. Added new word 21 to x86/cpufeatures.h.

   c. Display unsupported if user attempts to read the events when ABMC is enabled
      and event is not assigned.

   d. Display monitor_state as "Unsupported" when ABMC is disabled.
  
   e. Text updates and rebase to latest tip tree (as of Jan 18).
 
   f. This series is still work in progress. I am yet to hear from ARM developers. 

--------------------------------------------------------------------------------------

Previous revisions:
v11: https://lore.kernel.org/lkml/cover.1737577229.git.babu.moger@amd.com/
v10: https://lore.kernel.org/lkml/cover.1734034524.git.babu.moger@amd.com/
v9: https://lore.kernel.org/lkml/cover.1730244116.git.babu.moger@amd.com/
v8: https://lore.kernel.org/lkml/cover.1728495588.git.babu.moger@amd.com/
v7: https://lore.kernel.org/lkml/cover.1725488488.git.babu.moger@amd.com/
v6: https://lore.kernel.org/lkml/cover.1722981659.git.babu.moger@amd.com/
v5: https://lore.kernel.org/lkml/cover.1720043311.git.babu.moger@amd.com/
v4: https://lore.kernel.org/lkml/cover.1716552602.git.babu.moger@amd.com/
v3: https://lore.kernel.org/lkml/cover.1711674410.git.babu.moger@amd.com/  
v2: https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
v1: https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/

Babu Moger (26):
  x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct
    rdt_hw_mon_domain
  x86/resctrl: Remove MSR reading of event configuration value
  x86/cpufeatures: Add support for Assignable Bandwidth Monitoring
    Counters (ABMC)
  x86/resctrl: Add ABMC feature in the command line options
  x86/resctrl: Consolidate monitoring related data from rdt_resource
  x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  x86/resctrl: Add support to enable/disable AMD ABMC feature
  x86/resctrl: Introduce the interface to display monitor mode
  x86/resctrl: Introduce interface to display number of monitoring
    counters
  x86/resctrl: Introduce mbm_cntr_cfg to track assignable counters at
    domain
  x86/resctrl: Introduce interface to display number of free MBM
    counters
  x86/resctrl: Add data structures and definitions for ABMC assignment
  x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter
    with ABMC
  x86/resctrl: Add the functionality to assign MBM events
  x86/resctrl: Add the functionality to unassign MBM events
  x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign
    mode
  x86/resctrl: Add the support for reading ABMC counters
  x86/resctrl: Add default MBM event configurations for mbm_cntr_assign
    mode
  x86/resctrl: Add event configuration directory under info/L3_MON/
  x86/resctrl: Provide interface to update the event configurations
  x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments
  x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is
    enabled
  x86/resctrl: Introduce mbm_L3_assignments to list assignments in a
    group
  x86/resctrl: Introduce the interface to modify assignments in a group
  x86/resctrl: Introduce the interface to switch between monitor modes
  x86/resctrl: Configure mbm_cntr_assign mode if supported

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/arch/x86/resctrl.rst            | 186 ++++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   2 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   2 +
 arch/x86/kernel/cpu/resctrl/core.c            |  15 +-
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c     |  18 +-
 arch/x86/kernel/cpu/resctrl/internal.h        |  98 +++
 arch/x86/kernel/cpu/resctrl/monitor.c         | 456 +++++++++-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 809 ++++++++++++++++--
 arch/x86/kernel/cpu/scattered.c               |   1 +
 include/linux/resctrl.h                       |  72 +-
 include/linux/resctrl_types.h                 |  17 +
 13 files changed, 1559 insertions(+), 120 deletions(-)

-- 
2.34.1



* [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
From: Babu Moger @ 2025-04-04  0:18 UTC
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

If the BMEC (Bandwidth Monitoring Event Configuration) feature is
supported, the bandwidth events can be configured to track specific
events. The event configuration is domain specific. Event configurations
are not stored in resctrl but instead always read from or written to
hardware directly when prompted by user space.

Read the event configuration from the hardware during domain
initialization and store the configuration value in the rdt_hw_mon_domain
structure for later use when the user requests to display it.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Fixed the conflicts due to recent merge.
     This patch is for BMEC and there is no dependency on ABMC feature.
     Moved it earlier.

v11: Resolved minor conflicts due to code displacement. Actual code didn't
     change.

v10: Conflicts due to code displacement. Actual code didn't change.

v9: Added Reviewed-by tag. No other changes.

v8: Renamed resctrl_mbm_evt_config_init() to arch_mbm_evt_config_init()
    Minor commit message update.

v7: Fixed initializing INVALID_CONFIG_VALUE to mbm_local_cfg in case of error.

v6: Renamed resctrl_arch_mbm_evt_config -> resctrl_mbm_evt_config_init
    Initialized value to INVALID_CONFIG_VALUE if it is not configurable.
    Minor commit message update.

v5: Exported mon_event_config_index_get.
    Renamed arch_domain_mbm_evt_config to resctrl_arch_mbm_evt_config.

v4: Read the configuration information from the hardware to initialize.
    Added few commit messages.
    Fixed the tab spaces.

v3: Minor changes related to rebase in mbm_config_write_domain.

v2: No changes.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
 arch/x86/kernel/cpu/resctrl/internal.h |  9 +++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
 4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index cf29681d01e0..a28de257168f 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -558,6 +558,8 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 		return;
 	}
 
+	arch_mbm_evt_config_init(hw_dom);
+
 	list_add_tail_rcu(&d->hdr.list, add_pos);
 
 	err = resctrl_online_mon_domain(r, d);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c44c5b496355..9846153aa48f 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -32,6 +32,9 @@
  */
 #define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
 
+#define INVALID_CONFIG_VALUE		U32_MAX
+#define INVALID_CONFIG_INDEX		UINT_MAX
+
 /**
  * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
  *			        aren't marked nohz_full
@@ -335,6 +338,8 @@ struct rdt_hw_ctrl_domain {
  * @d_resctrl:	Properties exposed to the resctrl file system
  * @arch_mbm_total:	arch private state for MBM total bandwidth
  * @arch_mbm_local:	arch private state for MBM local bandwidth
+ * @mbm_total_cfg:	MBM total bandwidth configuration
+ * @mbm_local_cfg:	MBM local bandwidth configuration
  *
  * Members of this structure are accessed via helpers that provide abstraction.
  */
@@ -342,6 +347,8 @@ struct rdt_hw_mon_domain {
 	struct rdt_mon_domain		d_resctrl;
 	struct arch_mbm_state		*arch_mbm_total;
 	struct arch_mbm_state		*arch_mbm_local;
+	u32				mbm_total_cfg;
+	u32				mbm_local_cfg;
 };
 
 static inline struct rdt_hw_ctrl_domain *resctrl_to_arch_ctrl_dom(struct rdt_ctrl_domain *r)
@@ -504,6 +511,8 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
+void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
+unsigned int mon_event_config_index_get(u32 evtid);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index a93ed7d2a160..abd337fbd01d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1284,6 +1284,32 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	return 0;
 }
 
+void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom)
+{
+	unsigned int index;
+	u64 msrval;
+
+	/*
+	 * Read the configuration registers QOS_EVT_CFG_n, where <n> is
+	 * the BMEC event number (EvtID).
+	 */
+	if (mbm_total_event.configurable) {
+		index = mon_event_config_index_get(QOS_L3_MBM_TOTAL_EVENT_ID);
+		rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
+		hw_dom->mbm_total_cfg = msrval & MAX_EVT_CONFIG_BITS;
+	} else {
+		hw_dom->mbm_total_cfg = INVALID_CONFIG_VALUE;
+	}
+
+	if (mbm_local_event.configurable) {
+		index = mon_event_config_index_get(QOS_L3_MBM_LOCAL_EVENT_ID);
+		rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
+		hw_dom->mbm_local_cfg = msrval & MAX_EVT_CONFIG_BITS;
+	} else {
+		hw_dom->mbm_local_cfg = INVALID_CONFIG_VALUE;
+	}
+}
+
 void resctrl_mon_resource_exit(void)
 {
 	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index c6274d40b217..bee32eaef8ab 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1601,7 +1601,7 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
  *         1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
  *         INVALID_CONFIG_INDEX for invalid evtid
  */
-static inline unsigned int mon_event_config_index_get(u32 evtid)
+unsigned int mon_event_config_index_get(u32 evtid)
 {
 	switch (evtid) {
 	case QOS_L3_MBM_TOTAL_EVENT_ID:
-- 
2.34.1


* [PATCH v12 02/26] x86/resctrl: Remove MSR reading of event configuration value
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
  2025-04-04  0:18 ` [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 20:50   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (23 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The event configuration is domain specific and initialized during domain
initialization. The values are stored in struct rdt_hw_mon_domain.

It is not required to read the configuration register every time the user
asks for it. Use the value stored in struct rdt_hw_mon_domain instead.

Introduce resctrl_arch_mon_event_config_get() and
resctrl_arch_mon_event_config_set() to get/set architecture domain specific
mbm_total_cfg/mbm_local_cfg values.
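
A minimal user-space sketch of the get/set pairing described above,
assuming a simulated register array in place of the MSR and ignoring the
IPI needed to reach a CPU in the domain. All demo_/DEMO_ names are
hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_INVALID_CONFIG_VALUE	UINT32_MAX

enum demo_event { DEMO_MBM_TOTAL, DEMO_MBM_LOCAL, DEMO_EVENT_COUNT };

struct demo_domain {
	uint32_t cfg[DEMO_EVENT_COUNT];		/* cached configuration */
	uint64_t fake_msr[DEMO_EVENT_COUNT];	/* simulated hardware registers */
	unsigned int msr_writes;		/* counts simulated MSR writes */
};

/* Return the cached value; no register access is needed. */
static uint32_t demo_config_get(const struct demo_domain *d, enum demo_event ev)
{
	if ((unsigned int)ev >= DEMO_EVENT_COUNT)
		return DEMO_INVALID_CONFIG_VALUE;
	return d->cfg[ev];
}

/* Write the register and refresh the cache, skipping redundant writes. */
static void demo_config_set(struct demo_domain *d, enum demo_event ev,
			    uint32_t val)
{
	uint32_t cur = demo_config_get(d, ev);

	if (cur == DEMO_INVALID_CONFIG_VALUE || cur == val)
		return;

	d->fake_msr[ev] = val;
	d->msr_writes++;
	d->cfg[ev] = val;
}
```

Keeping the cache update next to the register write means the cached
value and the hardware value cannot drift apart in this sketch.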

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Removed the unnecessary initialization of mon_config_info structure.
     Switched to wrmsrl instead of wrmsr to address the comment below.
     https://lore.kernel.org/lkml/0fc8dbd4-07d8-40bd-8eec-402b48762807@zytor.com/
     Fixed a minor typo in comment.
     Added comments to resctrl_arch_mon_event_config_get() and resctrl_arch_mon_event_config_set()
     Resolved the conflicts from the recent changes.
     This patch is for BMEC and there is no dependency on the ABMC feature. Moved it earlier.

v11: Moved the mon_config_info structure definition to internal.h.
     Moved resctrl_arch_mon_event_config_get() and resctrl_arch_mon_event_config_set()
     to monitor.c file.
     Renamed local variable from val to config_val.

v10: Moved the mon_config_info structure definition to resctrl.h.

v9: Removed QOS_L3_OCCUP_EVENT_ID switch case in resctrl_arch_mon_event_config_set.
    Fixed an unnecessary space.

v8: Renamed
    resctrl_arch_event_config_get() to resctrl_arch_mon_event_config_get().
    resctrl_arch_event_config_set() to resctrl_arch_mon_event_config_set().

v7: Removed check if (val == INVALID_CONFIG_VALUE) as resctrl_arch_event_config_get
    already prints warning.
    Kept the Event config value definitions as is.

v6: Fixed inconsistency with types. Changed all the config value types to
    u32.
    Removed a few rdt_last_cmd_puts calls as they are not necessary.
    Removed unused config value definitions.
    Few more updates to commit message.

v5: Introduced resctrl_arch_event_config_get() and
    resctrl_arch_event_config_set() based on our discussion.
    https://lore.kernel.org/lkml/68e861f9-245d-4496-a72e-46fc57d19c62@amd.com/

v4: New patch.
---
 arch/x86/kernel/cpu/resctrl/monitor.c  | 52 +++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 63 +++++---------------------
 include/linux/resctrl.h                | 18 +++-----
 3 files changed, 71 insertions(+), 62 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index abd337fbd01d..b84cd48c3d95 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1330,3 +1330,55 @@ void __init intel_rdt_mbm_apply_quirk(void)
 	mbm_cf_rmidthreshold = mbm_cf_table[cf_index].rmidthreshold;
 	mbm_cf = mbm_cf_table[cf_index].cf;
 }
+
+/*
+ * May run on CPU that does not belong to domain.
+ */
+u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
+				      enum resctrl_event_id eventid)
+{
+	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+
+	switch (eventid) {
+	case QOS_L3_OCCUP_EVENT_ID:
+		break;
+	case QOS_L3_MBM_TOTAL_EVENT_ID:
+		return hw_dom->mbm_total_cfg;
+	case QOS_L3_MBM_LOCAL_EVENT_ID:
+		return hw_dom->mbm_local_cfg;
+	}
+
+	/* Never expect to get here */
+	WARN_ON_ONCE(1);
+
+	return INVALID_CONFIG_VALUE;
+}
+
+/*
+ * Runs on CPU that belongs to domain.
+ */
+void resctrl_arch_mon_event_config_set(void *info)
+{
+	struct resctrl_mon_config_info *mon_info = info;
+	struct rdt_hw_mon_domain *hw_dom;
+	unsigned int index;
+
+	index = mon_event_config_index_get(mon_info->evtid);
+	if (index == INVALID_CONFIG_INDEX)
+		return;
+
+	wrmsrl(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config);
+
+	hw_dom = resctrl_to_arch_mon_dom(mon_info->d);
+
+	switch (mon_info->evtid) {
+	case QOS_L3_MBM_TOTAL_EVENT_ID:
+		hw_dom->mbm_total_cfg = mon_info->mon_config;
+		break;
+	case QOS_L3_MBM_LOCAL_EVENT_ID:
+		hw_dom->mbm_local_cfg = mon_info->mon_config;
+		break;
+	default:
+		break;
+	}
+}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index bee32eaef8ab..b8100c89f1a6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1614,34 +1614,11 @@ unsigned int mon_event_config_index_get(u32 evtid)
 	}
 }
 
-void resctrl_arch_mon_event_config_read(void *_config_info)
-{
-	struct resctrl_mon_config_info *config_info = _config_info;
-	unsigned int index;
-	u64 msrval;
-
-	index = mon_event_config_index_get(config_info->evtid);
-	if (index == INVALID_CONFIG_INDEX) {
-		pr_warn_once("Invalid event id %d\n", config_info->evtid);
-		return;
-	}
-	rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
-
-	/* Report only the valid event configuration bits */
-	config_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
-}
-
-static void mondata_config_read(struct resctrl_mon_config_info *mon_info)
-{
-	smp_call_function_any(&mon_info->d->hdr.cpu_mask,
-			      resctrl_arch_mon_event_config_read, mon_info, 1);
-}
-
 static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
 {
-	struct resctrl_mon_config_info mon_info;
 	struct rdt_mon_domain *dom;
 	bool sep = false;
+	u32 config_val;
 
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
@@ -1650,13 +1627,8 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
 		if (sep)
 			seq_puts(s, ";");
 
-		memset(&mon_info, 0, sizeof(struct resctrl_mon_config_info));
-		mon_info.r = r;
-		mon_info.d = dom;
-		mon_info.evtid = evtid;
-		mondata_config_read(&mon_info);
-
-		seq_printf(s, "%d=0x%02x", dom->hdr.id, mon_info.mon_config);
+		config_val = resctrl_arch_mon_event_config_get(dom, evtid);
+		seq_printf(s, "%d=0x%02x", dom->hdr.id, config_val);
 		sep = true;
 	}
 	seq_puts(s, "\n");
@@ -1687,35 +1659,23 @@ static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
 	return 0;
 }
 
-void resctrl_arch_mon_event_config_write(void *_config_info)
-{
-	struct resctrl_mon_config_info *config_info = _config_info;
-	unsigned int index;
-
-	index = mon_event_config_index_get(config_info->evtid);
-	if (index == INVALID_CONFIG_INDEX) {
-		pr_warn_once("Invalid event id %d\n", config_info->evtid);
-		return;
-	}
-	wrmsr(MSR_IA32_EVT_CFG_BASE + index, config_info->mon_config, 0);
-}
-
 static void mbm_config_write_domain(struct rdt_resource *r,
 				    struct rdt_mon_domain *d, u32 evtid, u32 val)
 {
-	struct resctrl_mon_config_info mon_info = {0};
+	struct resctrl_mon_config_info mon_info;
+	u32 config_val;
 
 	/*
-	 * Read the current config value first. If both are the same then
+	 * Check the current config value first. If both are the same then
 	 * no need to write it again.
 	 */
+	config_val = resctrl_arch_mon_event_config_get(d, evtid);
+	if (config_val == INVALID_CONFIG_VALUE || config_val == val)
+		return;
+
 	mon_info.r = r;
 	mon_info.d = d;
 	mon_info.evtid = evtid;
-	mondata_config_read(&mon_info);
-	if (mon_info.mon_config == val)
-		return;
-
 	mon_info.mon_config = val;
 
 	/*
@@ -1724,7 +1684,8 @@ static void mbm_config_write_domain(struct rdt_resource *r,
 	 * are scoped at the domain level. Writing any of these MSRs
 	 * on one CPU is observed by all the CPUs in the domain.
 	 */
-	smp_call_function_any(&d->hdr.cpu_mask, resctrl_arch_mon_event_config_write,
+	smp_call_function_any(&d->hdr.cpu_mask,
+			      resctrl_arch_mon_event_config_set,
 			      &mon_info, 1);
 
 	/*
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 880351ca3dfc..afa9aabf014c 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -361,7 +361,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
 __init bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
 
 /**
- * resctrl_arch_mon_event_config_write() - Write the config for an event.
+ * resctrl_arch_mon_event_config_set() - Write the config for an event.
  * @config_info: struct resctrl_mon_config_info describing the resource, domain
  *		 and event.
  *
@@ -370,19 +370,15 @@ __init bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt);
  *
  * Called via IPI to reach a CPU that is a member of the specified domain.
  */
-void resctrl_arch_mon_event_config_write(void *config_info);
+void resctrl_arch_mon_event_config_set(void *config_info);
 
 /**
- * resctrl_arch_mon_event_config_read() - Read the config for an event.
- * @config_info: struct resctrl_mon_config_info describing the resource, domain
- *		 and event.
- *
- * Reads resource, domain and eventid from @config_info and reads the
- * hardware config value into config_info->mon_config.
- *
- * Called via IPI to reach a CPU that is a member of the specified domain.
+ * resctrl_arch_mon_event_config_get() - Get config value from the hardware domain.
+ * @d:			Monitoring domain to read config value
+ * @eventid:		enum resctrl_event_id describing type
  */
-void resctrl_arch_mon_event_config_read(void *config_info);
+u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
+				      enum resctrl_event_id eventid);
 
 /* For use by arch code to remap resctrl's smaller CDP CLOSID range */
 static inline u32 resctrl_get_config_index(u32 closid,
-- 
2.34.1


* [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
  2025-04-04  0:18 ` [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain Babu Moger
  2025-04-04  0:18 ` [PATCH v12 02/26] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 20:52   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 04/26] x86/resctrl: Add ABMC feature in the command line options Babu Moger
                   ` (22 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Users can create as many monitor groups as the number of RMIDs supported
by the hardware. However, the bandwidth monitoring feature on AMD systems
only guarantees that RMIDs currently assigned to a processor will be
tracked by hardware. The counters of any other RMIDs which are no longer
being tracked will be reset to zero. The MBM event counters return
"Unavailable" for the RMIDs that are not tracked by hardware. So, only a
limited number of groups can give guaranteed monitoring numbers. With ever
changing configurations there is no way to know definitively which of
these groups are being tracked at a given point in time. Users do not have
the option to monitor a group or set of groups for a certain period of
time without worrying about the RMID being reset in between.

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID, event pair and monitor the bandwidth as long as it is
assigned. The assigned RMID will be tracked by the hardware until the user
unassigns it manually. There is no need to worry about counters being reset
during this period. Additionally, the user can specify a bitmask
identifying the specific bandwidth types from the given source to track
with the counter.

Without ABMC enabled, monitoring will work in the current mode without
the assignment option.

The Linux resctrl subsystem provides the interface to count a maximum of
two memory bandwidth events per group, from a combination of available
total and local events. Keeping the current interface, users can enable a
maximum of 2 ABMC counters per group. Users will also have the option to
enable only one counter for the group. If the system runs out of assignable
ABMC counters, the kernel will display an error. Users need to disable an
already enabled counter to make space for new assignments.
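
The assign/unassign behavior described above can be pictured with a small
fixed-size pool. This is only a sketch under assumptions: the pool size of
4, the pool_* names, and the first-free allocation policy are all
hypothetical and are not taken from the series or from the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POOL_NUM_CNTRS	4	/* assumed pool size, for illustration only */

struct pool_slot {
	bool in_use;
	uint32_t rmid;
	int event;		/* 0 = total, 1 = local in this sketch */
};

struct cntr_pool {
	struct pool_slot slot[POOL_NUM_CNTRS];
};

/*
 * Return the counter ID assigned to the (rmid, event) pair, or -1 when
 * every counter is already in use. An existing assignment is reused.
 */
static int pool_assign(struct cntr_pool *p, uint32_t rmid, int event)
{
	int free_id = -1;

	for (int i = 0; i < POOL_NUM_CNTRS; i++) {
		struct pool_slot *s = &p->slot[i];

		if (s->in_use && s->rmid == rmid && s->event == event)
			return i;
		if (!s->in_use && free_id < 0)
			free_id = i;
	}

	if (free_id < 0)
		return -1;

	p->slot[free_id] = (struct pool_slot){ true, rmid, event };
	return free_id;
}

/* Release the counter held by the (rmid, event) pair, if any. */
static void pool_unassign(struct cntr_pool *p, uint32_t rmid, int event)
{
	for (int i = 0; i < POOL_NUM_CNTRS; i++) {
		struct pool_slot *s = &p->slot[i];

		if (s->in_use && s->rmid == rmid && s->event == event)
			s->in_use = false;
	}
}
```

With two events per group, a pool of 4 is exhausted after two groups; a
third group can only be assigned after one of the existing pairs is
unassigned.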

The feature can be detected via CPUID_Fn80000020_EBX_x00 bit 5.
Bits Description
5    ABMC (Assignable Bandwidth Monitoring Counters)
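
For reference, a user-space sketch of the same check. It assumes a GCC or
Clang toolchain providing <cpuid.h> on x86; the function and macro names
are hypothetical. The kernel itself uses the scattered.c table entry added
by this patch, not this code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define ABMC_CPUID_LEAF	0x80000020u
#define ABMC_EBX_BIT	5

/* Test bit 5 of an EBX value returned by CPUID Fn8000_0020, subleaf 0. */
static bool abmc_bit_set(uint32_t ebx)
{
	return (ebx >> ABMC_EBX_BIT) & 1u;
}

/*
 * Query the running CPU. Reports false on non-x86 builds or when the
 * leaf is not implemented.
 */
static bool cpu_reports_abmc(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(ABMC_CPUID_LEAF, 0, &eax, &ebx, &ecx, &edx))
		return false;

	return abmc_bit_set(ebx);
#else
	return false;
#endif
}
```

Splitting the bit test from the CPUID query keeps the bit position
checkable on any machine, including ones without the feature.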

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---

Note: Checkpatch checks/warnings are ignored to maintain coding style.

v12: Removed the dependency on X86_FEATURE_BMEC.
     Removed the Reviewed-by tag as patch has changed.

v11: No changes.

v10: No changes.

v9: Took care of a couple of minor merge conflicts. No other changes.

v8: No changes.

v7: Removed "" from feature flags. Not required anymore.
    https://lore.kernel.org/lkml/20240817145058.GCZsC40neU4wkPXeVR@fat_crate.local/

v6: Added Reinette's Reviewed-by. Moved the Checkpatch note below ---.

v5: Minor rebase change and subject line update.

v4: Changes because of rebase. Feature word 21 has a few more additions now.
    Changed the text to "tracked by hardware" instead of active.

v3: Change because of rebase. Actual patch did not change.

v2: Added dependency on X86_FEATURE_BMEC.
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 2 ++
 arch/x86/kernel/cpu/scattered.c    | 1 +
 3 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 8b7cf13e0acb..accc1c328672 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -479,6 +479,7 @@
 #define X86_FEATURE_AMD_FAST_CPPC	(21*32 + 5) /* Fast CPPC */
 #define X86_FEATURE_AMD_HETEROGENEOUS_CORES (21*32 + 6) /* Heterogeneous Core Topology */
 #define X86_FEATURE_AMD_WORKLOAD_CLASS	(21*32 + 7) /* Workload Classification */
+#define X86_FEATURE_ABMC		(21*32 + 8) /* Assignable Bandwidth Monitoring Counters */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index a2fbea0be535..2f54831e04e5 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -71,6 +71,8 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_TOTAL   },
 	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_LOCAL   },
+	{ X86_FEATURE_ABMC,			X86_FEATURE_CQM_MBM_TOTAL   },
+	{ X86_FEATURE_ABMC,			X86_FEATURE_CQM_MBM_LOCAL   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
 	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 16f3ca30626a..3b72b72270f1 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -49,6 +49,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_MBA,			CPUID_EBX,  6, 0x80000008, 0 },
 	{ X86_FEATURE_SMBA,			CPUID_EBX,  2, 0x80000020, 0 },
 	{ X86_FEATURE_BMEC,			CPUID_EBX,  3, 0x80000020, 0 },
+	{ X86_FEATURE_ABMC,			CPUID_EBX,  5, 0x80000020, 0 },
 	{ X86_FEATURE_AMD_WORKLOAD_CLASS,	CPUID_EAX, 22, 0x80000021, 0 },
 	{ X86_FEATURE_PERFMON_V2,		CPUID_EAX,  0, 0x80000022, 0 },
 	{ X86_FEATURE_AMD_LBR_V2,		CPUID_EAX,  1, 0x80000022, 0 },
-- 
2.34.1


* [PATCH v12 04/26] x86/resctrl: Add ABMC feature in the command line options
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (2 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 05/26] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Add the command line option to enable or disable exposing the ABMC
(Assignable Bandwidth Monitoring Counters) hardware feature to resctrl.
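
As a rough illustration of the "rdt=" list syntax (a name turns a feature
on, a leading '!' turns it off), a user-space sketch follows. The
opt_state structure and opts_apply() are hypothetical names; this is not
the kernel's parser, and letting the last mention of an option win is a
choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct opt_state {
	const char *name;
	bool force_on;
	bool force_off;
};

/*
 * Apply a comma-separated list such as "abmc,!bmec" to a table of
 * options. Unknown names and empty tokens are ignored.
 */
static void opts_apply(struct opt_state *opts, size_t nopts, const char *arg)
{
	while (*arg) {
		size_t len = strcspn(arg, ",");
		bool off = (len > 0 && arg[0] == '!');
		const char *tok = off ? arg + 1 : arg;
		size_t toklen = off ? len - 1 : len;

		for (size_t i = 0; i < nopts; i++) {
			if (strlen(opts[i].name) == toklen &&
			    strncmp(opts[i].name, tok, toklen) == 0) {
				opts[i].force_off = off;
				opts[i].force_on = !off;
				break;
			}
		}

		/* Step past this token and its trailing comma, if any. */
		arg += len;
		if (*arg == ',')
			arg++;
	}
}
```

For example, applying "abmc,!bmec" to a table containing both names marks
abmc as forced on and bmec as forced off.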

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v12: No changes.

v11: No changes.

v10: No changes.

v9: No code changes. Added Reviewed-by.

v8: Commit message update.

v7: No changes

v6: No changes

v5: No changes

v4: No changes

v3: No changes

v2: No changes
---
 Documentation/admin-guide/kernel-parameters.txt | 2 +-
 Documentation/arch/x86/resctrl.rst              | 1 +
 arch/x86/kernel/cpu/resctrl/core.c              | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3e5e41cbe3ce..c4a88e9202da 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5942,7 +5942,7 @@
 	rdt=		[HW,X86,RDT]
 			Turn on/off individual RDT features. List is:
 			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
-			mba, smba, bmec.
+			mba, smba, bmec, abmc.
 			E.g. to turn on cmt and turn off mba use:
 				rdt=cmt,!mba
 
diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 6768fc1fad16..fb90f08e564e 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -26,6 +26,7 @@ MBM (Memory Bandwidth Monitoring)		"cqm_mbm_total", "cqm_mbm_local"
 MBA (Memory Bandwidth Allocation)		"mba"
 SMBA (Slow Memory Bandwidth Allocation)         ""
 BMEC (Bandwidth Monitoring Event Configuration) ""
+ABMC (Assignable Bandwidth Monitoring Counters) ""
 ===============================================	================================
 
 Historically, new features were made visible by default in /proc/cpuinfo. This
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a28de257168f..5ac1fe79a030 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -725,6 +725,7 @@ enum {
 	RDT_FLAG_MBA,
 	RDT_FLAG_SMBA,
 	RDT_FLAG_BMEC,
+	RDT_FLAG_ABMC,
 };
 
 #define RDT_OPT(idx, n, f)	\
@@ -750,6 +751,7 @@ static struct rdt_options rdt_options[]  __initdata = {
 	RDT_OPT(RDT_FLAG_MBA,	    "mba",	X86_FEATURE_MBA),
 	RDT_OPT(RDT_FLAG_SMBA,	    "smba",	X86_FEATURE_SMBA),
 	RDT_OPT(RDT_FLAG_BMEC,	    "bmec",	X86_FEATURE_BMEC),
+	RDT_OPT(RDT_FLAG_ABMC,	    "abmc",	X86_FEATURE_ABMC),
 };
 #define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
 
-- 
2.34.1


* [PATCH v12 05/26] x86/resctrl: Consolidate monitoring related data from rdt_resource
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (3 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 04/26] x86/resctrl: Add ABMC feature in the command line options Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 06/26] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The cache allocation and memory bandwidth allocation feature properties
are consolidated into struct resctrl_cache and struct resctrl_membw
respectively.

In preparation for more monitoring properties that will clutter the
existing resource struct further, re-organize the monitoring specific
properties to also be in a separate structure.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Fixed the conflicts due to recent changes in rdt_resource data structure.
     Added new mbm_cfg_mask field to resctrl_mon.
     Removed Reviewed-by tag as patch has changed.

v11: No changes.

v10: No changes.

v9: No changes.

v8: Added Reviewed-by from Reinette. No other changes.

v7: Added kernel doc for data structure. Minor text update.

v6: Update commit message and update kernel doc for rdt_resource.

v5: Commit message update.
    Also changes related to data structure updates due to SNC support.

v4: New patch.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  4 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c  | 20 ++++++++++----------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 ++++++------
 include/linux/resctrl.h                | 22 +++++++++++++++-------
 4 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 5ac1fe79a030..16f700c2d00d 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -110,7 +110,7 @@ u32 resctrl_arch_system_num_rmid_idx(void)
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
 	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
-	return r->num_rmid;
+	return r->mon.num_rmid;
 }
 
 struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
@@ -553,7 +553,7 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 
 	arch_mon_domain_online(r, d);
 
-	if (arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
+	if (arch_domain_mbm_alloc(r->mon.num_rmid, hw_dom)) {
 		mon_domain_free(hw_dom);
 		return;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b84cd48c3d95..38970096ef3d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -222,7 +222,7 @@ static int logical_rmid_to_physical_rmid(int cpu, int lrmid)
 	if (snc_nodes_per_l3_cache == 1)
 		return lrmid;
 
-	return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid;
+	return lrmid + (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->mon.num_rmid;
 }
 
 static int __rmid_read_phys(u32 prmid, enum resctrl_event_id eventid, u64 *val)
@@ -297,11 +297,11 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
 
 	if (resctrl_arch_is_mbm_total_enabled())
 		memset(hw_dom->arch_mbm_total, 0,
-		       sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
+		       sizeof(*hw_dom->arch_mbm_total) * r->mon.num_rmid);
 
 	if (resctrl_arch_is_mbm_local_enabled())
 		memset(hw_dom->arch_mbm_local, 0,
-		       sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
+		       sizeof(*hw_dom->arch_mbm_local) * r->mon.num_rmid);
 }
 
 static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
@@ -1099,14 +1099,14 @@ static struct mon_evt mbm_local_event = {
  */
 static void l3_mon_evt_init(struct rdt_resource *r)
 {
-	INIT_LIST_HEAD(&r->evt_list);
+	INIT_LIST_HEAD(&r->mon.evt_list);
 
 	if (resctrl_arch_is_llc_occupancy_enabled())
-		list_add_tail(&llc_occupancy_event.list, &r->evt_list);
+		list_add_tail(&llc_occupancy_event.list, &r->mon.evt_list);
 	if (resctrl_arch_is_mbm_total_enabled())
-		list_add_tail(&mbm_total_event.list, &r->evt_list);
+		list_add_tail(&mbm_total_event.list, &r->mon.evt_list);
 	if (resctrl_arch_is_mbm_local_enabled())
-		list_add_tail(&mbm_local_event.list, &r->evt_list);
+		list_add_tail(&mbm_local_event.list, &r->mon.evt_list);
 }
 
 /*
@@ -1247,7 +1247,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
 	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l3_cache;
-	r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
+	r->mon.num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3_cache;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
@@ -1262,7 +1262,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	 *
 	 * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
 	 */
-	threshold = resctrl_rmid_realloc_limit / r->num_rmid;
+	threshold = resctrl_rmid_realloc_limit / r->mon.num_rmid;
 
 	/*
 	 * Because num_rmid may not be a power of two, round the value
@@ -1276,7 +1276,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
-		r->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
 	}
 
 	r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b8100c89f1a6..17de38e26f94 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1134,7 +1134,7 @@ static int rdt_num_rmids_show(struct kernfs_open_file *of,
 {
 	struct rdt_resource *r = of->kn->parent->priv;
 
-	seq_printf(seq, "%d\n", r->num_rmid);
+	seq_printf(seq, "%d\n", r->mon.num_rmid);
 
 	return 0;
 }
@@ -1145,7 +1145,7 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
 	struct rdt_resource *r = of->kn->parent->priv;
 	struct mon_evt *mevt;
 
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	list_for_each_entry(mevt, &r->mon.evt_list, list) {
 		seq_printf(seq, "%s\n", mevt->name);
 		if (mevt->configurable)
 			seq_printf(seq, "%s_config\n", mevt->name);
@@ -1728,9 +1728,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
 	}
 
 	/* Value from user cannot be more than the supported set of events */
-	if ((val & r->mbm_cfg_mask) != val) {
+	if ((val & r->mon.mbm_cfg_mask) != val) {
 		rdt_last_cmd_printf("Invalid event configuration: max valid mask is 0x%02x\n",
-				    r->mbm_cfg_mask);
+				    r->mon.mbm_cfg_mask);
 		return -EINVAL;
 	}
 
@@ -3117,13 +3117,13 @@ static int mon_add_all_files(struct kernfs_node *kn, struct rdt_mon_domain *d,
 	struct mon_evt *mevt;
 	int ret;
 
-	if (WARN_ON(list_empty(&r->evt_list)))
+	if (WARN_ON(list_empty(&r->mon.evt_list)))
 		return -EPERM;
 
 	priv.u.rid = r->rid;
 	priv.u.domid = do_sum ? d->ci->id : d->hdr.id;
 	priv.u.sum = do_sum;
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	list_for_each_entry(mevt, &r->mon.evt_list, list) {
 		priv.u.evtid = mevt->evtid;
 		ret = mon_addfile(kn, mevt->name, priv.priv);
 		if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index afa9aabf014c..f31bb48f2b1f 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -241,40 +241,48 @@ enum resctrl_schema_fmt {
 	RESCTRL_SCHEMA_RANGE,
 };
 
+/**
+ * struct resctrl_mon - Monitoring related data of a resctrl resource
+ * @num_rmid:		Number of RMIDs available
+ * @mbm_cfg_mask:	Bandwidth sources that can be tracked when bandwidth
+ *			monitoring events can be configured.
+ * @evt_list:		List of monitoring events
+ */
+struct resctrl_mon {
+	int			num_rmid;
+	unsigned int		mbm_cfg_mask;
+	struct list_head	evt_list;
+};
+
 /**
  * struct rdt_resource - attributes of a resctrl resource
  * @rid:		The index of the resource
  * @alloc_capable:	Is allocation available on this machine
  * @mon_capable:	Is monitor feature available on this machine
- * @num_rmid:		Number of RMIDs available
  * @ctrl_scope:		Scope of this resource for control functions
  * @mon_scope:		Scope of this resource for monitor functions
  * @cache:		Cache allocation related data
  * @membw:		If the component has bandwidth controls, their properties.
+ * @mon:		Monitoring related data.
  * @ctrl_domains:	RCU list of all control domains for this resource
  * @mon_domains:	RCU list of all monitor domains for this resource
  * @name:		Name to use in "schemata" file.
  * @schema_fmt:		Which format string and parser is used for this schema.
- * @evt_list:		List of monitoring events
- * @mbm_cfg_mask:	Bandwidth sources that can be tracked when bandwidth
- *			monitoring events can be configured.
  * @cdp_capable:	Is the CDP feature available on this resource
  */
 struct rdt_resource {
 	int			rid;
 	bool			alloc_capable;
 	bool			mon_capable;
-	int			num_rmid;
 	enum resctrl_scope	ctrl_scope;
 	enum resctrl_scope	mon_scope;
 	struct resctrl_cache	cache;
 	struct resctrl_membw	membw;
+	struct resctrl_mon	mon;
 	struct list_head	ctrl_domains;
 	struct list_head	mon_domains;
 	char			*name;
 	enum resctrl_schema_fmt	schema_fmt;
-	struct list_head	evt_list;
-	unsigned int		mbm_cfg_mask;
 	bool			cdp_capable;
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 06/26] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (4 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 05/26] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 07/26] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5:

  Bits   Field      Description
  15:0   MAX_ABMC   Maximum supported Assignable Bandwidth
                    Monitoring Counter ID + 1

The feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Detect the feature and number of assignable monitoring counters supported.
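
As a rough illustration of the decoding above, the standalone sketch below
extracts the counter count from an EBX value using the same arithmetic as
the patch. The helper name abmc_num_cntrs() and the mask macro are
hypothetical and exist only for this example; the patch itself reads EBX
with cpuid_count(0x80000020, 5, ...).

```c
#include <stdint.h>

/* Bits 15:0 of CPUID Fn8000_0020_EBX_x5 (MAX_ABMC in the table above). */
#define ABMC_MAX_CNTR_MASK	0xffffu

/*
 * Hypothetical helper, not part of the patch: mirror the arithmetic in
 * rdt_get_mon_l3_config(), which masks EBX[15:0] and adds one.
 */
static unsigned int abmc_num_cntrs(uint32_t ebx)
{
	return (ebx & ABMC_MAX_CNTR_MASK) + 1;
}
```

With this arithmetic, a field value of 31 yields 32 counters, and the upper
16 bits of EBX are ignored.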

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Resolved conflicts because of latest merge.
     Removed Reviewed-by as the patch has changed.

v11: No changes.

v10: No changes.

v9: Added Reviewed-by tag. No code changes

v8: Used GENMASK for the mask.

v7: Removed WARN_ON for num_mbm_cntrs. Decided to dynamically allocate the
    bitmap. WARN_ON is not required anymore.
    Removed redundant comments.

v6: Commit message update.
    Renamed abmc_capable to mbm_cntr_assignable.

v5: Name change num_cntrs to num_mbm_cntrs.
    Moved abmc_capable to resctrl_mon.

v4: Removed resctrl_arch_has_abmc(). Added all the code inline. We don't
    need to separate this as arch code.

v3: Removed changes related to mon_features.
    Moved rdt_cpu_has to core.c and added new function resctrl_arch_has_abmc.
    Also moved the fields mbm_assign_capable and mbm_assign_cntrs to
    rdt_resource. (James)

v2: Changed the field name to mbm_assign_capable from abmc_capable.
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 9 +++++++--
 include/linux/resctrl.h               | 4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 38970096ef3d..4132efd83be5 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1242,6 +1242,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	unsigned int threshold;
+	u32 eax, ebx, ecx, edx;
 
 	snc_nodes_per_l3_cache = snc_get_config();
 
@@ -1272,13 +1273,17 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	resctrl_rmid_realloc_threshold = resctrl_arch_round_mon_val(threshold);
 
 	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
-		u32 eax, ebx, ecx, edx;
-
 		/* Detect list of bandwidth sources that can be tracked */
 		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
 		r->mon.mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
 	}
 
+	if (rdt_cpu_has(X86_FEATURE_ABMC)) {
+		r->mon.mbm_cntr_assignable = true;
+		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
+		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
+	}
+
 	r->mon_capable = true;
 
 	return 0;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index f31bb48f2b1f..8247c33bbf5a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -246,11 +246,15 @@ enum resctrl_schema_fmt {
  * @num_rmid:		Number of RMIDs available
  * @mbm_cfg_mask:	Bandwidth sources that can be tracked when bandwidth
  *			monitoring events can be configured.
+ * @num_mbm_cntrs:	Number of assignable monitoring counters
+ * @mbm_cntr_assignable:Is system capable of supporting monitor assignment?
  * @evt_list:		List of monitoring events
  */
 struct resctrl_mon {
 	int			num_rmid;
 	unsigned int		mbm_cfg_mask;
+	int			num_mbm_cntrs;
+	bool			mbm_cntr_assignable;
 	struct list_head	evt_list;
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 07/26] x86/resctrl: Add support to enable/disable AMD ABMC feature
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (5 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 06/26] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Add the functionality to enable/disable AMD ABMC feature.

The AMD ABMC feature is enabled by setting the enable bit (bit 0) in the
L3_QOS_EXT_CFG MSR. When the ABMC state changes, the MSR must be updated
on all logical processors in the QOS domain.

Hardware counters will reset when ABMC state is changed.
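
The per-CPU update can be pictured with the userspace sketch below. It is
only a model: struct fake_cpu and the fake_abmc_*() helpers are
hypothetical stand-ins for the real MSR and for msr_set_bit() /
msr_clear_bit() called through on_each_cpu_mask(), and the counter reset is
not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define ABMC_ENABLE_BIT	0

/* Stand-in for the L3_QOS_EXT_CFG MSR of one CPU (illustration only). */
struct fake_cpu {
	uint64_t l3_qos_ext_cfg;
};

/* Set or clear bit 0, leaving all other bits untouched. */
static void fake_abmc_set_one(struct fake_cpu *cpu, bool enable)
{
	if (enable)
		cpu->l3_qos_ext_cfg |= (1ULL << ABMC_ENABLE_BIT);
	else
		cpu->l3_qos_ext_cfg &= ~(1ULL << ABMC_ENABLE_BIT);
}

/* Apply the same state to every CPU that belongs to one domain. */
static void fake_abmc_set_domain(struct fake_cpu *cpus, int ncpus, bool enable)
{
	for (int i = 0; i < ncpus; i++)
		fake_abmc_set_one(&cpus[i], enable);
}
```

The point of the sketch is that the update is a read-modify-write of a
single bit, repeated on every CPU of the domain so that all of them agree
on the ABMC state.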

The ABMC feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Clarified the comment on _resctrl_abmc_enable().
     Added the code to reset arch state in _resctrl_abmc_enable().
     Resolved the conflicts with latest merge.

v11: Moved the monitoring related calls to monitor.c file.
     Moved the changes from include/linux/resctrl.h to
     arch/x86/kernel/cpu/resctrl/internal.h.
     Removed the Reviewed-by tag as patch changed.
     Actual code did not change.

v10: No changes.

v9: Re-ordered the MSR and added Reviewed-by tag.

v8: Commit message update and moved around the comments about L3_QOS_EXT_CFG
    to _resctrl_abmc_enable.

v7: Renamed the function
    resctrl_arch_get_abmc_enabled() to resctrl_arch_mbm_cntr_assign_enabled().

    Merged resctrl_arch_mbm_cntr_assign_enable() and
    resctrl_arch_mbm_cntr_assign_disable() into
    resctrl_arch_mbm_cntr_assign_set().

    Moved the function definition to linux/resctrl.h.

    Passed the struct rdt_resource to these functions.
    Removed resctrl_arch_reset_rmid_all() from arch code. This will be done
    from the caller.

v6: Renamed abmc_enabled to mbm_cntr_assign_enabled.
    Used msr_set_bit and msr_clear_bit for msr updates.
    Renamed resctrl_arch_abmc_enable() to resctrl_arch_mbm_cntr_assign_enable().
    Renamed resctrl_arch_abmc_disable() to resctrl_arch_mbm_cntr_assign_disable().
    Made _resctrl_abmc_enable to return void.

v5: Renamed resctrl_abmc_enable to resctrl_arch_abmc_enable.
    Renamed resctrl_abmc_disable to resctrl_arch_abmc_disable.
    Introduced resctrl_arch_get_abmc_enabled to get abmc state from
    non-arch code.
    Renamed resctrl_abmc_set_all to _resctrl_abmc_enable().
    Modified commit log to make it clear about AMD ABMC feature.

v3: No changes.

v2: Few text changes in commit message.
---
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h | 12 ++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 38 ++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index bc6d2de109b5..cb3c0720d910 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1201,6 +1201,7 @@
 /* - AMD: */
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
+#define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 
 /* AMD-V MSRs */
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 9846153aa48f..ad4789740a33 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -35,6 +35,9 @@
 #define INVALID_CONFIG_VALUE		U32_MAX
 #define INVALID_CONFIG_INDEX		UINT_MAX
 
+/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
+#define ABMC_ENABLE_BIT			0
+
 /**
  * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
  *			        aren't marked nohz_full
@@ -388,6 +391,7 @@ struct msr_param {
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
  * @mbm_width:		Monitor width, to detect and correct for overflow.
  * @cdp_enabled:	CDP state of this resource
+ * @mbm_cntr_assign_enabled:	ABMC feature is enabled
  *
  * Members of this structure are either private to the architecture
  * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -401,6 +405,7 @@ struct rdt_hw_resource {
 	unsigned int		mon_scale;
 	unsigned int		mbm_width;
 	bool			cdp_enabled;
+	bool			mbm_cntr_assign_enabled;
 };
 
 static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
@@ -424,6 +429,13 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
 
 void arch_mon_domain_online(struct rdt_resource *r, struct rdt_mon_domain *d);
 
+static inline bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
+{
+	return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
+}
+
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
+
 /* CPUID.(EAX=10H, ECX=ResID=1).EAX */
 union cpuid_0x10_1_eax {
 	struct {
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 4132efd83be5..6ed7e51d3fdb 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1387,3 +1387,41 @@ void resctrl_arch_mon_event_config_set(void *info)
 		break;
 	}
 }
+
+static void resctrl_abmc_set_one_amd(void *arg)
+{
+	bool *enable = arg;
+
+	if (*enable)
+		msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+	else
+		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+}
+
+/*
+ * ABMC enable/disable requires update of L3_QOS_EXT_CFG MSR on all the CPUs
+ * associated with all monitor domains.
+ */
+static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
+{
+	struct rdt_mon_domain *d;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		on_each_cpu_mask(&d->hdr.cpu_mask,
+				 resctrl_abmc_set_one_amd, &enable, 1);
+		resctrl_arch_reset_rmid_all(r, d);
+	}
+}
+
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
+{
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	if (r->mon.mbm_cntr_assignable &&
+	    hw_res->mbm_cntr_assign_enabled != enable) {
+		_resctrl_abmc_enable(r, enable);
+		hw_res->mbm_cntr_assign_enabled = enable;
+	}
+
+	return 0;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (6 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 07/26] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 20:56   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters Babu Moger
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Introduce the interface file "mbm_assign_mode" to list monitor modes
supported.

The "mbm_cntr_assign" mode provides the option to assign a counter to
an RMID, event pair and monitor the bandwidth as long as it is assigned.

On AMD systems "mbm_cntr_assign" mode is backed by the ABMC (Assignable
Bandwidth Monitoring Counters) hardware feature and is enabled by default.

The "default" mode is the existing monitoring mode, which works without
explicit counter assignment. It relies instead on dynamic counter
assignment by hardware, so hardware may not dedicate a counter to an
event, in which case reads of the monitoring data return "Unavailable".

Provide an interface to display the monitor mode on the system.
$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_cntr_assign]
default

An IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check handles Arm64 platforms.
CONFIG_RESCTRL_ASSIGN_FIXED is not defined on x86 but is defined on Arm64.
As a result, for MPAM, the file shows only the active mode, either:
[default]
or
[mbm_cntr_assign]
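
The output rules described above can be summarised with the userspace model
below. It is a sketch, not the kernel function: format_assign_mode() is a
hypothetical name, and its "fixed" parameter stands in for
IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED).

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace model of resctrl_mbm_assign_mode_show().
 * @assignable: hardware supports counter assignment
 * @enabled:    mbm_cntr_assign mode is currently active
 * @fixed:      stands in for IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)
 * Returns what snprintf() returns: the length of the formatted text.
 */
static int format_assign_mode(char *buf, size_t len, bool assignable,
			      bool enabled, bool fixed)
{
	/* Without assignable counters only the default mode exists. */
	if (!assignable)
		return snprintf(buf, len, "[default]\n");

	/* A fixed mode cannot be switched, so list only the active one. */
	if (fixed)
		return snprintf(buf, len, "[%s]\n",
				enabled ? "mbm_cntr_assign" : "default");

	/* Active mode in brackets first, then the alternative. */
	return snprintf(buf, len, "[%s]\n%s\n",
			enabled ? "mbm_cntr_assign" : "default",
			enabled ? "default" : "mbm_cntr_assign");
}
```

The active mode is always printed first and in brackets, so a reader of the
file only needs to look for the bracketed entry.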

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Minor text update in change log and user documentation.
     Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
     This will be defined only in arm and not in x86.

v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
     Removed few texts in resctrl.rst about AMD specific information.
     Updated few texts.

v10: Added more text to the user documentation to clarify the default mode.

v9: Updated user documentation based on comments.

v8: Commit message update.

v7: Updated the descriptions/commit log in resctrl.rst to generic text.
    Thanks to James and Reinette.
    Rename mbm_mode to mbm_assign_mode.
    Introduced mutex lock in rdtgroup_mbm_mode_show().

v6: Added documentation for mbm_cntr_assign and legacy mode.
    Moved mbm_mode fflags initialization to static initialization.

v5: Changed interface name to mbm_mode.
    It will be always available even if ABMC feature is not supported.
    Added description in resctrl.rst about ABMC mode.
    Fixed display abmc and legacy consistently.

v4: Fixed the checks for legacy and abmc mode. Default is ABMC.

v3: New patch to display ABMC capability.

---
 Documentation/arch/x86/resctrl.rst     | 27 +++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 37 ++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index fb90f08e564e..bb96b44019fe 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -257,6 +257,33 @@ with the following files:
 	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
 	    0=0x30;1=0x30;3=0x15;4=0x15
 
+"mbm_assign_mode":
+	Reports the list of monitoring modes supported. The enclosed brackets
+	indicate which mode is enabled.
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+	  [mbm_cntr_assign]
+	  default
+
+	"mbm_cntr_assign":
+
+	In mbm_cntr_assign mode, a monitoring event can only accumulate data
+	while it is backed by a hardware counter. The user-space is able to
+	specify which of the events in CTRL_MON or MON groups should have a
+	counter assigned using the "mbm_assign_control" file. The number of
+	counters available is described in the "num_mbm_cntrs" file. Changing
+	the mode may cause all counters on the resource to reset.
+
+	"default":
+
+	In default mode, resctrl assumes there is a hardware counter for each
+	event within every CTRL_MON and MON group. On AMD platforms, it is
+	recommended to use the mbm_cntr_assign mode, if supported, to prevent
+	the hardware from resetting counters between reads. This can result in
+	misleading values or display "Unavailable" if no counter is assigned
+	to the event.
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 17de38e26f94..626be6becca7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -882,6 +882,36 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
+					struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	bool enabled;
+
+	mutex_lock(&rdtgroup_mutex);
+	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
+
+	if (r->mon.mbm_cntr_assignable) {
+		if (enabled)
+			seq_puts(s, "[mbm_cntr_assign]\n");
+		else
+			seq_puts(s, "[default]\n");
+
+		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
+			if (enabled)
+				seq_puts(s, "default\n");
+			else
+				seq_puts(s, "mbm_cntr_assign\n");
+		}
+	} else {
+		seq_puts(s, "[default]\n");
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -1908,6 +1938,13 @@ static struct rftype res_common_files[] = {
 		.seq_show	= mbm_local_bytes_config_show,
 		.write		= mbm_local_bytes_config_write,
 	},
+	{
+		.name		= "mbm_assign_mode",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_mbm_assign_mode_show,
+		.fflags		= RFTYPE_MON_INFO,
+	},
 	{
 		.name		= "cpus",
 		.mode		= 0644,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (7 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:01   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 10/26] x86/resctrl: Introduce mbm_cntr_cfg to track assignable counters at domain Babu Moger
                   ` (16 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The mbm_cntr_assign mode provides an option to the user to assign a
counter to an RMID, event pair and monitor the bandwidth as long as
the counter is assigned. The number of assignments depends on the number
of monitoring counters available.

Provide the interface to display the number of monitoring counters
supported in each domain. The resctrl file 'num_mbm_cntrs' is visible
to user space when the system supports mbm_cntr_assign mode.
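
The per-domain output format can be sketched in userspace as follows. The
function format_num_mbm_cntrs() and its parameters are hypothetical; in the
patch the domain IDs come from walking r->mon_domains and the count from
r->mon.num_mbm_cntrs.

```c
#include <stdio.h>

/*
 * Hypothetical model of resctrl_num_mbm_cntrs_show(): emit
 * "<domain id>=<count>" pairs separated by ';' and end with a newline.
 * Returns the number of characters written, or -1 if @buf is too small.
 */
static int format_num_mbm_cntrs(char *buf, size_t len, const int *dom_ids,
				int ndoms, int num_mbm_cntrs)
{
	size_t off = 0;
	int n;

	for (int i = 0; i < ndoms; i++) {
		/* No separator before the first entry. */
		n = snprintf(buf + off, len - off, "%s%d=%d",
			     i ? ";" : "", dom_ids[i], num_mbm_cntrs);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += n;
	}

	n = snprintf(buf + off, len - off, "\n");
	if (n < 0 || (size_t)n >= len - off)
		return -1;

	return (int)(off + n);
}
```

Every domain reports the same value in this sketch because, as in the
patch, the count is a property of the resource rather than of an individual
domain.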

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Changed the code to display the max supported monitoring counters in
     each domain. Also updated the documentation.
     Resolved the conflict with the latest code.

v11: Renamed rdtgroup_num_mbm_cntrs_show() to resctrl_num_mbm_cntrs_show().
     Few minor text updates.

v10: No changes.

v9: Updated user document based on the comments.
    Will add a new file available_mbm_cntrs later in the series.

v8: Commit message update and documentation update.

v7: Minor commit log text changes.

v6: No changes.

v5: Changed the display name from num_cntrs to num_mbm_cntrs.
    Updated the commit message.
    Moved the patch after mbm_mode is introduced.

v4: Changed the counter name to num_cntrs. And few text changes.

v3: Changed the field name to mbm_assign_cntrs.

v2: Changed the field name to mbm_assignable_counters from abmc_counter.
---
 Documentation/arch/x86/resctrl.rst     | 11 ++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |  3 +++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 30 ++++++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index bb96b44019fe..35d908befdfb 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -284,6 +284,17 @@ with the following files:
 	misleading values or display "Unavailable" if no counter is assigned
 	to the event.
 
+"num_mbm_cntrs":
+	The maximum number of monitoring counters (total of available and assigned
+	counters) in each domain when the system supports mbm_cntr_assign mode.
+
+	For example, on a system with maximum of 32 memory bandwidth monitoring
+	counters in each of its L3 domains:
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
+	  0=32;1=32
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 6ed7e51d3fdb..028b49878ad0 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1234,6 +1234,9 @@ int __init resctrl_mon_resource_init(void)
 	else if (resctrl_arch_is_mbm_total_enabled())
 		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
 
+	if (r->mon.mbm_cntr_assignable)
+		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
+
 	return 0;
 }
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 626be6becca7..0c9d7a702b93 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -912,6 +912,30 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
+				      struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	struct rdt_mon_domain *dom;
+	bool sep = false;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+		if (sep)
+			seq_puts(s, ";");
+
+		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
+		sep = true;
+	}
+	seq_puts(s, "\n");
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+	return 0;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -1945,6 +1969,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= resctrl_mbm_assign_mode_show,
 		.fflags		= RFTYPE_MON_INFO,
 	},
+	{
+		.name		= "num_mbm_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_num_mbm_cntrs_show,
+	},
 	{
 		.name		= "cpus",
 		.mode		= 0644,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 10/26] x86/resctrl: Introduce mbm_cntr_cfg to track assignable counters at domain
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (8 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 11/26] x86/resctrl: Introduce interface to display number of free MBM counters Babu Moger
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

In mbm_cntr_assign mode hardware counters are assigned/unassigned to an
MBM event of a monitor group. Hardware counters are assigned/unassigned
at monitoring domain level.

Manage a monitoring domain's hardware counters using a per monitoring
domain array of struct mbm_cntr_cfg that is indexed by the hardware
counter ID. A hardware counter's configuration contains the MBM event
ID and points to the monitoring group that it is assigned to, with a
NULL pointer meaning that the hardware counter is available for assignment.

There is no direct way to determine which hardware counters are assigned
to a particular monitoring group. Check every entry of every hardware
counter configuration array in every monitoring domain to query which
MBM events of a monitoring group are tracked by hardware. Such queries are
acceptable because of a very small number of assignable counters (32
to 64).
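
The linear scans described above might look like the simplified userspace
sketch below. The struct definitions are cut-down stand-ins for the kernel
types (evt_cfg is omitted), and cntr_find_free() and cntr_find_assigned()
are hypothetical helper names used only for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only what the sketch needs. */
struct rdtgroup {
	int id;
};

struct mbm_cntr_cfg {
	int		evtid;		/* valid only when rdtgrp != NULL */
	struct rdtgroup	*rdtgrp;	/* NULL means the counter is free */
};

/* Return the first free counter ID in a domain's array, or -1 if none. */
static int cntr_find_free(const struct mbm_cntr_cfg *cfg, int num_cntrs)
{
	for (int i = 0; i < num_cntrs; i++)
		if (!cfg[i].rdtgrp)
			return i;
	return -1;
}

/*
 * Return the counter ID tracking (@grp, @evtid), or -1 if not assigned.
 * @grp must not be NULL, otherwise free entries would match.
 */
static int cntr_find_assigned(const struct mbm_cntr_cfg *cfg, int num_cntrs,
			      const struct rdtgroup *grp, int evtid)
{
	for (int i = 0; i < num_cntrs; i++)
		if (cfg[i].rdtgrp == grp && cfg[i].evtid == evtid)
			return i;
	return -1;
}
```

Both lookups are O(number of counters) per domain, which is the cost the
paragraph above accepts given the small counter count.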

Suggested-by: Peter Newman <peternewman@google.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Fixed the struct mbm_cntr_cfg code documentation.
     Removed few strange characters in changelog.
     Added the counter range for better understanding.
     Moved the struct mbm_cntr_cfg definition to resctrl/internal.h as
     suggested by James.

v11: Refined the change log based on Reinette's feedback.
     Fixed few style issues.

v10: Patch changed completely to handle the counters at domain level.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     Removed Reviewed-by tag.
     Did not see the need to add cntr_id in mbm_state structure. Not used in the code.

v9: Added Reviewed-by tag. No other changes.

v8: Minor commit message changes.

v7: Added check mbm_cntr_assignable for allocating bitmap mbm_cntr_map

v6: New patch to add domain level assignment.
---
 arch/x86/kernel/cpu/resctrl/internal.h | 14 ++++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
 include/linux/resctrl.h                |  2 ++
 3 files changed, 27 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index ad4789740a33..e4b169fd6970 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -244,6 +244,20 @@ struct rdtgroup {
 	struct pseudo_lock_region	*plr;
 };
 
+/**
+ * struct mbm_cntr_cfg - assignable counter configuration
+ * @evtid:		 MBM event to which the counter is assigned. Only valid
+ *			 if @rdtgrp is not NULL.
+ * @evt_cfg:		 Event configuration value.
+ * @rdtgrp:		 resctrl group assigned to the counter. NULL if the
+ *			 counter is free.
+ */
+struct mbm_cntr_cfg {
+	enum resctrl_event_id	evtid;
+	u32			evt_cfg;
+	struct rdtgroup		*rdtgrp;
+};
+
 /* rdtgroup.flags */
 #define	RDT_DELETED		1
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0c9d7a702b93..cb7a8a2de3ff 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4127,6 +4127,7 @@ static void __init rdtgroup_setup_default(void)
 
 static void domain_destroy_mon_state(struct rdt_mon_domain *d)
 {
+	kfree(d->cntr_cfg);
 	bitmap_free(d->rmid_busy_llc);
 	kfree(d->mbm_total);
 	kfree(d->mbm_local);
@@ -4213,6 +4214,16 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
 			return -ENOMEM;
 		}
 	}
+	if (resctrl_is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
+		tsize = sizeof(*d->cntr_cfg);
+		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
+		if (!d->cntr_cfg) {
+			bitmap_free(d->rmid_busy_llc);
+			kfree(d->mbm_total);
+			kfree(d->mbm_local);
+			return -ENOMEM;
+		}
+	}
 
 	return 0;
 }
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8247c33bbf5a..294b15de664e 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -153,6 +153,7 @@ struct rdt_ctrl_domain {
  * @cqm_limbo:		worker to periodically read CQM h/w counters
  * @mbm_work_cpu:	worker CPU for MBM h/w counters
  * @cqm_work_cpu:	worker CPU for CQM h/w counters
+ * @cntr_cfg:		assignable counters configuration
  */
 struct rdt_mon_domain {
 	struct rdt_domain_hdr		hdr;
@@ -164,6 +165,7 @@ struct rdt_mon_domain {
 	struct delayed_work		cqm_limbo;
 	int				mbm_work_cpu;
 	int				cqm_work_cpu;
+	struct mbm_cntr_cfg		*cntr_cfg;
 };
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 11/26] x86/resctrl: Introduce interface to display number of free MBM counters
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (9 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 10/26] x86/resctrl: Introduce mbm_cntr_cfg to track assignable counters at domain Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Provide the interface to display the number of monitoring counters
available for assignment in each domain when mbm_cntr_assign mode is
enabled.
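
As an illustration of the output format only, here is a minimal user-space
sketch of the per-domain count. The types and helper names (struct
cntr_slot, struct mon_dom, count_free(), format_available()) are
hypothetical stand-ins, not the kernel structures; the only assumption
taken from this patch is that a counter is free when it has no owner.

```c
#include <stdio.h>
#include <stddef.h>

/* Simplified stand-in for struct mbm_cntr_cfg: only the owner matters here. */
struct cntr_slot {
	const void *owner;	/* NULL means the counter is free */
};

/* Hypothetical stand-in for a monitoring domain. */
struct mon_dom {
	int id;
	const struct cntr_slot *slots;
};

/* Count the slots in one domain that have no owner. */
static unsigned int count_free(const struct cntr_slot *slots, unsigned int num)
{
	unsigned int i, nfree = 0;

	for (i = 0; i < num; i++)
		if (!slots[i].owner)
			nfree++;

	return nfree;
}

/*
 * Write "<id>=<free>;<id>=<free>\n" into buf. Returns the number of bytes
 * written (excluding the terminating NUL), or -1 if buf is too small.
 */
static int format_available(char *buf, size_t len, const struct mon_dom *doms,
			    size_t ndoms, unsigned int num_cntrs)
{
	size_t off = 0, i;
	int n;

	for (i = 0; i < ndoms; i++) {
		n = snprintf(buf + off, len - off, "%s%d=%u", i ? ";" : "",
			     doms[i].id, count_free(doms[i].slots, num_cntrs));
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += n;
	}

	n = snprintf(buf + off, len - off, "\n");
	if (n < 0 || (size_t)n >= len - off)
		return -1;

	return (int)(off + n);
}
```

The separator is emitted before every domain except the first, which gives
the same "id=count;id=count" layout as the num_mbm_cntrs file.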

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Minor change to change log.
     Updated the documentation text with an example.
     Replaced seq_puts(s, ";") with seq_putc(s, ';');
     Added missing rdt_last_cmd_clear() in resctrl_available_mbm_cntrs_show().

v11: Rename rdtgroup_available_mbm_cntrs_show() to resctrl_available_mbm_cntrs_show().
     Few minor text changes.

v10: Patch changed to handle the counters at domain level.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     So, display logic also changed now.

v9: New patch
---
 Documentation/arch/x86/resctrl.rst     | 11 ++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |  4 ++-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 48 ++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 35d908befdfb..44128fbda4fe 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -295,6 +295,17 @@ with the following files:
 	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
 	  0=32;1=32
 
+"available_mbm_cntrs":
+	The number of monitoring counters available for assignment in each
+	domain when mbm_cntr_assign mode is enabled on the system.
+
+	For example, on a system with 30 available [hardware] monitoring counters
+	in each of its L3 domains:
+	::
+
+	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
+	  0=30;1=30
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 028b49878ad0..8a88ac29d57d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1234,8 +1234,10 @@ int __init resctrl_mon_resource_init(void)
 	else if (resctrl_arch_is_mbm_total_enabled())
 		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
 
-	if (r->mon.mbm_cntr_assignable)
+	if (r->mon.mbm_cntr_assignable) {
 		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
+		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
+	}
 
 	return 0;
 }
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index cb7a8a2de3ff..07792b45bd63 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -936,6 +936,48 @@ static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
+					    struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	struct rdt_mon_domain *dom;
+	bool sep = false;
+	u32 cntrs, i;
+	int ret = 0;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_cntr_assign mode is not enabled\n");
+		ret = -EINVAL;
+		goto unlock_cntrs_show;
+	}
+
+	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+		if (sep)
+			seq_putc(s, ';');
+
+		cntrs = 0;
+		for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
+			if (!dom->cntr_cfg[i].rdtgrp)
+				cntrs++;
+		}
+
+		seq_printf(s, "%d=%u", dom->hdr.id, cntrs);
+		sep = true;
+	}
+	seq_puts(s, "\n");
+
+unlock_cntrs_show:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -1975,6 +2017,12 @@ static struct rftype res_common_files[] = {
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= resctrl_num_mbm_cntrs_show,
 	},
+	{
+		.name		= "available_mbm_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_available_mbm_cntrs_show,
+	},
 	{
 		.name		= "cpus",
 		.mode		= 0644,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (10 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 11/26] x86/resctrl: Introduce interface to display number of free MBM counters Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:01   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
                   ` (13 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID, event pair and monitor the bandwidth as long as the
counter is assigned. The bandwidth events will be tracked by the hardware
until the user changes the configuration. Each resctrl group can configure
a maximum of two counters, one for the total event and one for the local event.

The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
Configuration is done by setting the counter id, bandwidth source (RMID)
and bandwidth configuration supported by BMEC (Bandwidth Monitoring Event
Configuration).

Attempts to read or write the MSR when ABMC is not enabled will result
in a #GP(0) exception.

Introduce the data structures and definitions for MSR L3_QOS_ABMC_CFG
(C000_03FDh):
=========================================================================
Bits 	Mnemonic	Description			Access Reset
							Type   Value
=========================================================================
63 	CfgEn 		Configuration Enable 		R/W 	0

62 	CtrEn 		Enable/disable counting		R/W 	0

61:53 	– 		Reserved 			MBZ 	0

52:48 	CtrID 		Counter Identifier		R/W	0

47 	IsCOS		BwSrc field is a CLOSID		R/W	0
			(not an RMID)

46:44 	–		Reserved			MBZ	0

43:32	BwSrc		Bandwidth Source		R/W	0
			(RMID or CLOSID)

31:0	BwType		Bandwidth configuration		R/W	0
			to track for this counter
==========================================================================
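
For readers who prefer explicit shifts over bitfields, the layout in the
table can be sketched as below. This is an illustration only: the macro
names and the helper abmc_cfg_pack() are hypothetical and are not part of
this patch, which uses a bitfield union instead.

```c
#include <stdint.h>

/* Bit positions taken from the L3_QOS_ABMC_CFG table above. */
#define ABMC_CFG_EN_SHIFT	63
#define ABMC_CTR_EN_SHIFT	62
#define ABMC_CTR_ID_SHIFT	48
#define ABMC_IS_COS_SHIFT	47
#define ABMC_BW_SRC_SHIFT	32

/* Hypothetical helper: build the 64-bit value from the individual fields. */
static uint64_t abmc_cfg_pack(int cfg_en, int ctr_en, uint32_t ctr_id,
			      int is_cos, uint32_t bw_src, uint32_t bw_type)
{
	uint64_t val = 0;

	val |= (uint64_t)(cfg_en & 0x1) << ABMC_CFG_EN_SHIFT;
	val |= (uint64_t)(ctr_en & 0x1) << ABMC_CTR_EN_SHIFT;
	val |= (uint64_t)(ctr_id & 0x1f) << ABMC_CTR_ID_SHIFT;	/* bits 52:48 */
	val |= (uint64_t)(is_cos & 0x1) << ABMC_IS_COS_SHIFT;
	val |= (uint64_t)(bw_src & 0xfff) << ABMC_BW_SRC_SHIFT;	/* bits 43:32 */
	val |= bw_type;						/* bits 31:0 */

	return val;
}
```

The reserved ranges (61:53 and 46:44) are never set by the helper, which
matches the MBZ requirement in the table.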

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
v12: No changes.

v11: No changes.

v10: No changes.

v9: Removed the references of L3_QOS_ABMC_DSC.
    Text changes about configuration in kernel doc.

v8: Update the configuration notes in kernel_doc.
    A few commit message updates.

v7: Removed the reference of L3_QOS_ABMC_DSC as it is not used anymore.
    Moved the configuration notes to kernel_doc.
    Adjusted the tabs for l3_qos_abmc_cfg and checkpatch seems happy.

v6: Removed all the fs related changes.
    Added note on CfgEn,CtrEn.
    Removed the definitions which are not used.
    Removed cntr_id initialization.

v5: Moved assignment flags here (patch 10/19 of v4).
    Added MON_CNTR_UNSET definition to initialize cntr_id's.
    More details in commit log.
    Renamed few fields in l3_qos_abmc_cfg for readability.

v4: Added more descriptions.
    Changed the name abmc_ctr_id to ctr_id.
    Added L3_QOS_ABMC_DSC. Used for reading the configuration.

v3: No changes.

v2: No changes.
---
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h | 35 ++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index cb3c0720d910..17f80eec2202 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1201,6 +1201,7 @@
 /* - AMD: */
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
+#define MSR_IA32_L3_QOS_ABMC_CFG	0xc00003fd
 #define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e4b169fd6970..0b73ec451d2c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -483,6 +483,41 @@ union cpuid_0x10_x_edx {
 	unsigned int full;
 };
 
+/*
+ * ABMC counters are configured by writing to L3_QOS_ABMC_CFG.
+ * @bw_type		: Bandwidth configuration (supported by BMEC)
+ *			  tracked by the @cntr_id.
+ * @bw_src		: Bandwidth source (RMID or CLOSID).
+ * @reserved1		: Reserved.
+ * @is_clos		: @bw_src field is a CLOSID (not an RMID).
+ * @cntr_id		: Counter identifier.
+ * @reserved		: Reserved.
+ * @cntr_en		: Counting enable bit.
+ * @cfg_en		: Configuration enable bit.
+ *
+ * Configuration and counting:
+ * Counter can be configured across multiple writes to MSR. Configuration
+ * is applied only when @cfg_en = 1. Counter @cntr_id is reset when the
+ * configuration is applied.
+ * @cfg_en = 1, @cntr_en = 0 : Apply @cntr_id configuration but do not
+ *                             count events.
+ * @cfg_en = 1, @cntr_en = 1 : Apply @cntr_id configuration and start
+ *                             counting events.
+ */
+union l3_qos_abmc_cfg {
+	struct {
+		unsigned long bw_type  :32,
+			      bw_src   :12,
+			      reserved1: 3,
+			      is_clos  : 1,
+			      cntr_id  : 5,
+			      reserved : 9,
+			      cntr_en  : 1,
+			      cfg_en   : 1;
+	} split;
+	unsigned long full;
+};
+
 void rdt_last_cmd_clear(void);
 void rdt_last_cmd_puts(const char *s);
 __printf(1, 2)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (11 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:02   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events Babu Moger
                   ` (12 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID, event pair and monitor the bandwidth as long as it
is assigned. The assigned RMID will be tracked by the hardware until the
user unassigns it manually.

Implement an architecture-specific handler to assign and unassign the
counter. Configure counters by writing to the L3_QOS_ABMC_CFG MSR,
specifying the counter ID, bandwidth source (RMID), and event
configuration.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
    Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Added the check to reset the architecture-specific state only when
     assign is requested.
     Added evt_cfg as the parameter as the user will be passing the event
     configuration from /info/L3_MON/event_configs/.

v11: Moved resctrl_arch_assign_cntr() and resctrl_abmc_config_one_amd() to
     monitor.c.
     Added the code to reset the arch state in resctrl_arch_assign_cntr().
     Also removed resctrl_arch_reset_rmid() inside IPI as the counters are
     reset from the callers.
     Re-wrote commit message.

v10: Added call resctrl_arch_reset_rmid() to reset the RMID in the domain
     inside IPI call.
     SMP and non-SMP call support is not required in resctrl_arch_config_cntr
     with new domain specific assign approach/data structure.
     Commit message update.

v9: Removed the code to reset the architectural state. It will be done
    in another patch.

v8: Rename resctrl_arch_assign_cntr to resctrl_arch_config_cntr.

v7: Separated arch and fs functions. This patch only has arch implementation.
    Added struct rdt_resource to the interface resctrl_arch_assign_cntr.
    Rename rdtgroup_abmc_cfg() to resctrl_abmc_config_one_amd().

v6: Removed mbm_cntr_alloc() from this patch to keep fs and arch code
    separate.
    Added code to update the counter assignment at domain level.

v5: Few name changes to match cntr_id.
    Changed the function names to
      rdtgroup_assign_cntr
      resctrl_arch_assign_cntr
      More comments on commit log.
      Added function summary.

v4: Commit message update.
      Used bitmap APIs where applicable.
      Changed the interfaces considering MPAM(arm).
      Added domain specific assignment.

v3: Removed the static from the prototype of rdtgroup_assign_abmc.
      The function is not called directly from user anymore. These
      changes are related to global assignment interface.

v2: Minor text changes in commit message.
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 39 +++++++++++++++++++++++++++
 include/linux/resctrl.h               | 15 +++++++++++
 2 files changed, 54 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 8a88ac29d57d..77f8662dc50b 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1430,3 +1430,42 @@ int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
 
 	return 0;
 }
+
+static void resctrl_abmc_config_one_amd(void *info)
+{
+	union l3_qos_abmc_cfg *abmc_cfg = info;
+
+	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
+}
+
+/*
+ * Send an IPI to the domain to assign the counter to RMID, event pair.
+ */
+int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			     u32 cntr_id, u32 evt_cfg, bool assign)
+{
+	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+	union l3_qos_abmc_cfg abmc_cfg = { 0 };
+	struct arch_mbm_state *am;
+
+	abmc_cfg.split.cfg_en = 1;
+	abmc_cfg.split.cntr_en = assign ? 1 : 0;
+	abmc_cfg.split.cntr_id = cntr_id;
+	abmc_cfg.split.bw_src = rmid;
+	abmc_cfg.split.bw_type = evt_cfg;
+
+	smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
+
+	/*
+	 * Reset the architectural state so that reading of hardware
+	 * counter is not considered as an overflow in next update.
+	 */
+	if (assign) {
+		am = get_arch_mbm_state(hw_dom, rmid, evtid);
+		if (am)
+			memset(am, 0, sizeof(*am));
+	}
+
+	return 0;
+}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 294b15de664e..60270606f1b8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -394,6 +394,21 @@ void resctrl_arch_mon_event_config_set(void *config_info);
 u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
 				      enum resctrl_event_id eventid);
 
+/**
+ * resctrl_arch_config_cntr() - Configure the counter on the domain
+ * @r:			resource that the counter belongs to.
+ * @d:			domain in which the counter is configured.
+ * @evtid:		event type to assign
+ * @rmid:		RMID to pair with the counter.
+ * @closid:		closid that matches the rmid.
+ * @cntr_id:		Counter ID to configure
+ * @evt_cfg:		event configuration
+ * @assign:		assign or unassign
+ */
+int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			     enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			     u32 cntr_id, u32 evt_cfg, bool assign);
+
 /* For use by arch code to remap resctrl's smaller CDP CLOSID range */
 static inline u32 resctrl_get_config_index(u32 closid,
 					   enum resctrl_conf_type type)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (12 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:04   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 15/26] x86/resctrl: Add the functionality to unassign " Babu Moger
                   ` (11 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The mbm_cntr_assign mode offers "num_mbm_cntrs" counters, each of which
can be assigned to an RMID, event pair to monitor bandwidth for as long as
it remains assigned.

Add the functionality to allocate and assign the counters to RMID, event
pair in the domain.

If all the counters are in use, the kernel will log the error message
"Unable to allocate counter in domain" in /sys/fs/resctrl/info/
last_cmd_status when a new assignment is requested. Exit on the first
failure when assigning counters across all the domains.
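
The exit-on-first-failure behaviour can be sketched in user space as
follows. This is a simplified model, not the kernel code: struct slot,
struct dom, NUM_CNTRS and assign_all() are hypothetical stand-ins, the
hardware configuration step is left out, and the group pointer is assumed
to be non-NULL.

```c
#include <errno.h>
#include <stddef.h>

#define NUM_CNTRS	2	/* assumed per-domain counter count */

/* Simplified stand-in for struct mbm_cntr_cfg. */
struct slot {
	const void *grp;	/* owning group, NULL if the counter is free */
	int evtid;		/* only meaningful when grp is not NULL */
};

/* Hypothetical stand-in for a monitoring domain. */
struct dom {
	int id;
	struct slot cntr[NUM_CNTRS];
};

/* Return the counter already held by (grp, evtid), or -ENOENT. */
static int cntr_get(const struct dom *d, const void *grp, int evtid)
{
	int i;

	for (i = 0; i < NUM_CNTRS; i++)
		if (d->cntr[i].grp == grp && d->cntr[i].evtid == evtid)
			return i;

	return -ENOENT;
}

/* Claim the first free counter for (grp, evtid), or return -ENOSPC. */
static int cntr_alloc(struct dom *d, const void *grp, int evtid)
{
	int i;

	for (i = 0; i < NUM_CNTRS; i++) {
		if (!d->cntr[i].grp) {
			d->cntr[i].grp = grp;
			d->cntr[i].evtid = evtid;
			return i;
		}
	}

	return -ENOSPC;
}

/*
 * Assign in every domain, stopping at the first failure. On failure the
 * id of the failing domain is stored in *failed_dom.
 */
static int assign_all(struct dom *doms, size_t ndoms, const void *grp,
		      int evtid, int *failed_dom)
{
	size_t i;
	int ret;

	for (i = 0; i < ndoms; i++) {
		if (cntr_get(&doms[i], grp, evtid) >= 0)
			continue;	/* already assigned in this domain */

		ret = cntr_alloc(&doms[i], grp, evtid);
		if (ret < 0) {
			*failed_dom = doms[i].id;
			return ret;
		}
	}

	return 0;
}
```

In this sketch, domains handled before the failing one keep their
assignment; the loop simply returns the first error it sees.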

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Fixed typo in the subject line.
     Replaced several counters with "num_mbm_cntrs" counters.
     Changed the check in resctrl_alloc_config_cntr() to reduce the indentation.
     Fixed the handling error on first failure.
     Added domain id and event id on failure.
     Fixed the return error override.
     Added new parameter event configuration (evt_cfg) to get the event configuration
     from user space.

v11: Patch changed again quite a bit.
     Moved the functions to monitor.c.
     Renamed rdtgroup_assign_cntr_event() to resctrl_assign_cntr_event().
     Refactored the resctrl_assign_cntr_event().
     Added functionality to exit on the first error during assignment.
     Simplified mbm_cntr_free().
     Removed the function mbm_cntr_assigned(). Will be using mbm_cntr_get() to
     figure out if the counter is assigned or not.
     Updated commit message and code comments.

v10: Patch changed completely.
     Counters are managed at the domain based on the discussion.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     Reset non-architectural MBM state.
     Commit message update.

v9: Introduced new function resctrl_config_cntr to assign the counter, update
    the bitmap and reset the architectural state.
    Taken care of error handling (freeing the counter) when assignment fails.
    Moved mbm_cntr_assigned_to_domain here as it is used in this patch.
    Minor text changes.

v8: Renamed rdtgroup_assign_cntr() to rdtgroup_assign_cntr_event().
    Added the code to return the error if rdtgroup_assign_cntr_event fails.
    Moved definition of MBM_EVENT_ARRAY_INDEX to resctrl/internal.h.
    Updated typo in the comments.

v7: New patch. Moved all the FS code here.
    Merged rdtgroup_assign_cntr and rdtgroup_alloc_cntr.
    Added new #define MBM_EVENT_ARRAY_INDEX.
---
 arch/x86/kernel/cpu/resctrl/internal.h |   2 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 124 +++++++++++++++++++++++++
 2 files changed, 126 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 0b73ec451d2c..1a8ac511241a 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -574,6 +574,8 @@ bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
 void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
 unsigned int mon_event_config_index_get(u32 evtid);
+int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
+			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 77f8662dc50b..ff55a4fe044f 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1469,3 +1469,127 @@ int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
 
 	return 0;
 }
+
+/*
+ * Configure the counter for the event, RMID pair for the domain. Reset the
+ * non-architectural state to clear all the event counters.
+ */
+static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			       enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			       u32 cntr_id, u32 evt_cfg, bool assign)
+{
+	struct mbm_state *m;
+	int ret;
+
+	ret = resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, evt_cfg, assign);
+	if (ret)
+		return ret;
+
+	m = get_mbm_state(d, closid, rmid, evtid);
+	if (m)
+		memset(m, 0, sizeof(struct mbm_state));
+
+	return ret;
+}
+
+static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
+			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
+{
+	int cntr_id;
+
+	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
+		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
+		    d->cntr_cfg[cntr_id].evtid == evtid)
+			return cntr_id;
+	}
+
+	return -ENOENT;
+}
+
+static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
+			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
+{
+	int cntr_id;
+
+	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
+		if (!d->cntr_cfg[cntr_id].rdtgrp) {
+			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
+			d->cntr_cfg[cntr_id].evtid = evtid;
+			return cntr_id;
+		}
+	}
+
+	return -ENOSPC;
+}
+
+static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
+{
+	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
+}
+
+/*
+ * Allocate a fresh counter and configure the event if not assigned already.
+ */
+static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
+				     u32 evt_cfg)
+{
+	int cntr_id, ret = 0;
+
+	/*
+	 * No need to allocate or configure if the counter is already assigned
+	 * and the event configuration is up to date.
+	 */
+	cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
+	if (cntr_id >= 0) {
+		if (d->cntr_cfg[cntr_id].evt_cfg == evt_cfg)
+			return 0;
+
+		goto cntr_configure;
+	}
+
+	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, evtid);
+	if (cntr_id < 0) {
+		rdt_last_cmd_printf("Unable to allocate counter in domain %d\n",
+				    d->hdr.id);
+		return cntr_id;
+	}
+
+cntr_configure:
+	/* Update and configure the domain with the new event configuration value */
+	d->cntr_cfg[cntr_id].evt_cfg = evt_cfg;
+
+	ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, rdtgrp->closid,
+				  cntr_id, evt_cfg, true);
+	if (ret) {
+		rdt_last_cmd_printf("Assignment of event %d failed on domain %d\n",
+				    evtid, d->hdr.id);
+		mbm_cntr_free(d, cntr_id);
+	}
+
+	return ret;
+}
+
+/*
+ * Assign a hardware counter to event @evtid of group @rdtgrp. Counter will be
+ * assigned to all the domains if @d is NULL else the counter will be assigned
+ * to @d.
+ */
+int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
+			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
+			      u32 evt_cfg)
+{
+	int ret = 0;
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
+	}
+
+	return ret;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 15/26] x86/resctrl: Add the functionality to unassign MBM events
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (13 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode Babu Moger
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The mbm_cntr_assign mode offers "num_mbm_cntrs" counters, each of which
can be assigned to an RMID, event pair to monitor bandwidth for as long as
it remains assigned. If all the counters are in use, the kernel will log
the error message "Unable to allocate counter in domain" in
/sys/fs/resctrl/info/last_cmd_status when a new assignment is requested.

To make space for a new assignment, users must unassign an already
assigned counter and retry the assignment.

Add the functionality to unassign and free the counters in the domain.
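
The unassign-then-retry flow described above can be modelled in user space
as below. This is only a sketch under stated assumptions: struct slot,
NUM_CNTRS and the cntr_*() helpers are hypothetical, a single domain is
modelled as a plain array, and the hardware write is left out.

```c
#include <errno.h>
#include <string.h>

#define NUM_CNTRS	2	/* assumed per-domain counter count */

/* Simplified stand-in for struct mbm_cntr_cfg. */
struct slot {
	const void *grp;	/* owning group, NULL if the counter is free */
	int evtid;
};

/* Return the counter held by (grp, evtid), or -ENOENT. */
static int cntr_get(const struct slot *tbl, const void *grp, int evtid)
{
	int i;

	for (i = 0; i < NUM_CNTRS; i++)
		if (tbl[i].grp == grp && tbl[i].evtid == evtid)
			return i;

	return -ENOENT;
}

/* Claim the first free counter, or return -ENOSPC when all are in use. */
static int cntr_alloc(struct slot *tbl, const void *grp, int evtid)
{
	int i;

	for (i = 0; i < NUM_CNTRS; i++) {
		if (!tbl[i].grp) {
			tbl[i].grp = grp;
			tbl[i].evtid = evtid;
			return i;
		}
	}

	return -ENOSPC;
}

/*
 * Release the counter held by (grp, evtid). Finding no counter is not
 * treated as an error, so unassigning twice is harmless.
 */
static int cntr_unassign(struct slot *tbl, const void *grp, int evtid)
{
	int id = cntr_get(tbl, grp, evtid);

	if (id < 0)
		return 0;

	memset(&tbl[id], 0, sizeof(tbl[id]));
	return 0;
}
```

Zeroing the whole slot marks it free again, so the next allocation can
reuse the same counter index.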

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Updated the commit text to make it a bit clearer.
     Replaced several counters with "num_mbm_cntrs" counters.
     Fixed typo in the subject line.
     Fixed the handling error on first failure.
     Added domain id and event id on failure.
     Added new parameter event configuration (evt_cfg) to provide the event from
     user space.

v11: Moved the functions to monitor.c.
     Renamed rdtgroup_unassign_cntr_event() to resctrl_unassign_cntr_event().
     Refactored the resctrl_unassign_cntr_event().
     Updated commit message and code comments.

v10: Patch changed again.
     Counters are managed at the domain based on the discussion.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     commit message update.

v9: Changes related to addition of new function resctrl_config_cntr().
    Removed rdtgroup_mbm_cntr_is_assigned() as it was introduced
    already.
    Text changes to address comments.

v8: Renamed rdtgroup_mbm_cntr_is_assigned to mbm_cntr_assigned_to_domain
    Added return error handling in resctrl_arch_config_cntr().

v7: Merged rdtgroup_unassign_cntr and rdtgroup_free_cntr functions.
    Renamed rdtgroup_mbm_cntr_test() to rdtgroup_mbm_cntr_is_assigned().
    Reworded the commit log a little bit.

v6: Removed mbm_cntr_free from this patch.
    Added counter test in all the domains and free if it is not assigned to
    any domains.

v5: Few name changes to match cntr_id.
    Changed the function names to rdtgroup_unassign_cntr
    More comments on commit log.

v4: Added domain specific unassign feature.
    Few name changes.

v3: Removed the static from the prototype of rdtgroup_unassign_abmc.
    The function is not called directly from user anymore. These
    changes are related to global assignment interface.

v2: No changes.
---
 arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 46 ++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 1a8ac511241a..13a2a9b064df 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -576,6 +576,8 @@ void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
 unsigned int mon_event_config_index_get(u32 evtid);
 int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
 			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
+int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
+				struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ff55a4fe044f..84dcb227f84c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1593,3 +1593,49 @@ int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
 
 	return ret;
 }
+
+/*
+ * Unassign and free the counter if assigned.
+ */
+static int resctrl_free_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+				    struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
+				    u32 evt_cfg)
+{
+	int cntr_id, ret = 0;
+
+	cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
+
+	if (cntr_id < 0)
+		return ret;
+
+	ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
+				  rdtgrp->closid, cntr_id, evt_cfg, false);
+
+	mbm_cntr_free(d, cntr_id);
+
+	return ret;
+}
+
+/*
+ * Unassign a hardware counter associated with @evtid from the domain and
+ * the group. Unassign the counters from all the domains if @d is NULL else
+ * unassign from @d.
+ */
+int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
+				struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
+				u32 evt_cfg)
+{
+	int ret = 0;
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			ret = resctrl_free_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = resctrl_free_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
+	}
+
+	return ret;
+}
-- 
2.34.1
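
The domain-selection convention of resctrl_unassign_cntr_event() in the
patch above (a NULL domain means "all domains", and a group with no counter
assigned is not treated as an error) can be sketched in user-space C. This
is only an illustration: struct demo_domain, demo_free_cntr() and
demo_unassign() are hypothetical names, and the real code walks
r->mon_domains and programs the hardware through resctrl_config_cntr().

```c
#include <stddef.h>

/* Hypothetical stand-in for a monitoring domain; not the kernel's struct. */
struct demo_domain {
	int cntr_id;	/* assigned counter ID, or -1 if none is assigned */
};

/* Free the counter in one domain. An unassigned domain is not an error. */
static int demo_free_cntr(struct demo_domain *d)
{
	if (d->cntr_id < 0)
		return 0;

	d->cntr_id = -1;
	return 0;
}

/*
 * Same convention as the patch: a NULL @d means every domain in @doms,
 * otherwise only @d is touched. The walk stops at the first failure.
 */
int demo_unassign(struct demo_domain *doms, size_t n, struct demo_domain *d)
{
	size_t i;
	int ret = 0;

	if (d)
		return demo_free_cntr(d);

	for (i = 0; i < n; i++) {
		ret = demo_free_cntr(&doms[i]);
		if (ret)
			return ret;
	}

	return ret;
}
```

Passing a specific domain leaves the other domains' counters untouched,
which is what allows per-domain unassignment through the same entry point.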



* [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (14 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 15/26] x86/resctrl: Add the functionality to unassign " Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:08   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters Babu Moger
                   ` (9 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

In mbm_cntr_assign mode, a hardware counter must be assigned before the
MBM events can be read.

Report 'Unassigned' if the user attempts to read an event that has no
counter assigned.
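
As a rough user-space sketch of the reporting rule, the mapping from a read
status to the text shown in the mon_data file could look like the block
below. demo_format_mon_result() is a hypothetical helper, not a kernel
function. The -EINVAL and -ENOENT cases follow the diff in this patch; the
-EIO case for "Error" is an assumption based on the existing code.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Format the result of a monitor read into @buf.
 * Returns the number of characters snprintf() would have written.
 */
int demo_format_mon_result(int err, unsigned long long val,
			   char *buf, size_t len)
{
	if (err == -EIO)	/* assumed: read failed outright */
		return snprintf(buf, len, "Error\n");
	if (err == -EINVAL)	/* RMID not tracked by hardware */
		return snprintf(buf, len, "Unavailable\n");
	if (err == -ENOENT)	/* no counter assigned to the event */
		return snprintf(buf, len, "Unassigned\n");

	return snprintf(buf, len, "%llu\n", val);
}
```

Using a distinct error code for the unassigned case lets user space tell
"assign a counter and retry" apart from a hardware-side "Unavailable".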

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Updated the documentation for more clarity.

v11: Domain can be NULL with SNC support so moved the unassign check in
     rdtgroup_mondata_show().

v10: Moved the code to check the assign state inside mon_event_read().
     Fixed few text comments.

v9: Used is_mbm_event() to check the event type.
    Minor user documentation update.

v8: Used MBM_EVENT_ARRAY_INDEX to get the index for the MBM event.
    Documentation update to make the text generic.

v7: Moved the documentation under "mon_data".
    Updated the text little bit.

v6: Added more explanation in the resctrl.rst
    Added checks to detect "Unassigned" before reading RMID.

v5: New patch.
---
 Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 14 ++++++++++++++
 arch/x86/kernel/cpu/resctrl/internal.h    |  3 +++
 arch/x86/kernel/cpu/resctrl/monitor.c     |  4 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
 5 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 44128fbda4fe..71ed1cfed33a 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -430,6 +430,16 @@ When monitoring is enabled all MON groups will also contain:
 	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
 	where "YY" is the node number.
 
+	The mbm_cntr_assign mode offers "num_mbm_cntrs" counters and allows
+	users to assign a counter to a (mon_hw_id, event) pair, enabling
+	bandwidth monitoring for as long as the counter remains assigned.
+	The hardware continues tracking the assigned mon_hw_id until
+	the user manually unassigns it, ensuring that counters are not reset
+	during this period. The system may run out of assignable counters
+	when all the counters are already assigned. In that case, MBM event
+	counters return 'Unassigned' when the event is read. Users must
+	manually assign a counter to read the events.
+
 "mon_hw_id":
 	Available only with debug option. The identifier used by hardware
 	for the monitor group. On x86 this is the RMID.
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 0a0ac5f6112e..2225c40b8888 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -710,6 +710,18 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 			goto out;
 		}
 		d = container_of(hdr, struct rdt_mon_domain, hdr);
+
+		/*
+		 * Report 'Unassigned' if mbm_cntr_assign mode is enabled and
+		 * counter is unassigned.
+		 */
+		if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
+		    resctrl_is_mbm_event(evtid) &&
+		    (mbm_cntr_get(r, d, rdtgrp, evtid) < 0)) {
+			rr.err = -ENOENT;
+			goto checkresult;
+		}
+
 		mon_event_read(&rr, r, d, rdtgrp, &d->hdr.cpu_mask, evtid, false);
 	}
 
@@ -719,6 +731,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 		seq_puts(m, "Error\n");
 	else if (rr.err == -EINVAL)
 		seq_puts(m, "Unavailable\n");
+	else if (rr.err == -ENOENT)
+		seq_puts(m, "Unassigned\n");
 	else
 		seq_printf(m, "%llu\n", rr.val);
 
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 13a2a9b064df..fbb045aec7e5 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -574,10 +574,13 @@ bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
 void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
 unsigned int mon_event_config_index_get(u32 evtid);
+bool resctrl_is_mbm_event(int e);
 int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
 			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
 int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
 				struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
+int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
+		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 84dcb227f84c..5e7970fd0a97 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1492,8 +1492,8 @@ static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
 	return ret;
 }
 
-static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
-			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
+int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
+		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
 {
 	int cntr_id;
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 07792b45bd63..d84f47db4e43 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -123,7 +123,7 @@ static bool resctrl_is_mbm_enabled(void)
 		resctrl_arch_is_mbm_local_enabled());
 }
 
-static bool resctrl_is_mbm_event(int e)
+bool resctrl_is_mbm_event(int e)
 {
 	return (e >= QOS_L3_MBM_TOTAL_EVENT_ID &&
 		e <= QOS_L3_MBM_LOCAL_EVENT_ID);
-- 
2.34.1



* [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (15 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:21   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode Babu Moger
                   ` (8 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Software can read the assignable counters using the QM_EVTSEL and QM_CTR
register pair.

QM_EVTSEL Register definition:
=======================================================
Bits	Mnemonic	Description
=======================================================
63:44	--		Reserved
43:32	RMID		Resource Monitoring Identifier
31	ExtEvtID	Extended Event Identifier
30:8	--		Reserved
7:0	EvtID		Event Identifier
=======================================================

The contents of a specific counter can be read by setting the following
fields in QM_EVTSEL: [ExtEvtID] = 1, [EvtID] = L3CacheABMC (=1), and
[RMID] = the desired counter ID. Reading QM_CTR will then return the
contents of the specified counter. The E bit in QM_CTR will be set if the
counter configuration was invalid, or if an invalid counter ID was set
in the QM_EVTSEL[RMID] field.
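
The field layout above can be illustrated by composing the 64-bit
QM_EVTSEL value for a given counter ID. This is a sketch only: the DEMO_*
macros and demo_abmc_evtsel() are hypothetical names, and the kernel
writes the two 32-bit halves of the MSR separately rather than building
one 64-bit value.

```c
#include <stdint.h>

#define DEMO_EVTSEL_RMID_SHIFT	32		/* RMID field starts at bit 32 */
#define DEMO_EVTSEL_RMID_MASK	0xfffULL	/* bits 43:32 are 12 bits wide */
#define DEMO_EVTSEL_EXT_EVT	(1ULL << 31)	/* ExtEvtID, bit 31 */
#define DEMO_EVTSEL_EVTID_MASK	0xffULL		/* EvtID, bits 7:0 */
#define DEMO_L3_CACHE_ABMC	1ULL		/* EvtID value for L3CacheABMC */

/*
 * Build the QM_EVTSEL value that selects assignable counter @cntr_id.
 * Bits outside the 12-bit RMID field are masked off here; a real caller
 * should validate @cntr_id against the number of counters instead.
 */
uint64_t demo_abmc_evtsel(uint32_t cntr_id)
{
	return (((uint64_t)cntr_id & DEMO_EVTSEL_RMID_MASK) << DEMO_EVTSEL_RMID_SHIFT) |
	       DEMO_EVTSEL_EXT_EVT |
	       (DEMO_L3_CACHE_ABMC & DEMO_EVTSEL_EVTID_MASK);
}
```

For counter ID 5 this yields 0x580000001: the counter ID in the upper half
and 0x80000001 (ExtEvtID plus EvtID = 1) in the lower half.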

Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch to support extended event mode when ABMC is enabled.
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  4 +-
 arch/x86/kernel/cpu/resctrl/internal.h    |  7 +++
 arch/x86/kernel/cpu/resctrl/monitor.c     | 69 ++++++++++++++++-------
 include/linux/resctrl.h                   |  9 +--
 4 files changed, 63 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 2225c40b8888..da78389c6ac7 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -636,6 +636,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 	rr->r = r;
 	rr->d = d;
 	rr->first = first;
+	rr->cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
 	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
 	if (IS_ERR(rr->arch_mon_ctx)) {
 		rr->err = -EINVAL;
@@ -661,13 +662,14 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 {
 	struct kernfs_open_file *of = m->private;
+	enum resctrl_event_id evtid;
 	struct rdt_domain_hdr *hdr;
 	struct rmid_read rr = {0};
 	struct rdt_mon_domain *d;
-	u32 resid, evtid, domid;
 	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 	union mon_data_bits md;
+	u32 resid, domid;
 	int ret = 0;
 
 	rdtgrp = rdtgroup_kn_lock_live(of->kn);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index fbb045aec7e5..b7d1a59f09f8 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -38,6 +38,12 @@
 /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
 #define ABMC_ENABLE_BIT			0
 
+/*
+ * ABMC Qos Event Identifiers.
+ */
+#define ABMC_EXTENDED_EVT_ID		BIT(31)
+#define ABMC_EVT_ID			1
+
 /**
  * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
  *			        aren't marked nohz_full
@@ -156,6 +162,7 @@ struct rmid_read {
 	struct rdt_mon_domain	*d;
 	enum resctrl_event_id	evtid;
 	bool			first;
+	int			cntr_id;
 	struct cacheinfo	*ci;
 	int			err;
 	u64			val;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 5e7970fd0a97..58476c065921 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -269,8 +269,8 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_mon_domain *hw_do
 }
 
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
-			     u32 unused, u32 rmid,
-			     enum resctrl_event_id eventid)
+			     u32 unused, u32 rmid, enum resctrl_event_id eventid,
+			     int cntr_id)
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
 	int cpu = cpumask_any(&d->hdr.cpu_mask);
@@ -281,7 +281,15 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
 	if (am) {
 		memset(am, 0, sizeof(*am));
 
-		prmid = logical_rmid_to_physical_rmid(cpu, rmid);
+		if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
+		    resctrl_is_mbm_event(eventid)) {
+			if (cntr_id < 0)
+				return;
+			prmid = cntr_id;
+			eventid = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;
+		} else {
+			prmid = logical_rmid_to_physical_rmid(cpu, rmid);
+		}
 		/* Record any initial, non-zero count value. */
 		__rmid_read_phys(prmid, eventid, &am->prev_msr);
 	}
@@ -313,12 +321,13 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
 }
 
 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
-			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
-			   u64 *val, void *ignored)
+			   u32 unused, u32 rmid, int cntr_id,
+			   enum resctrl_event_id eventid, u64 *val, void *ignored)
 {
 	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	int cpu = cpumask_any(&d->hdr.cpu_mask);
+	enum resctrl_event_id peventid;
 	struct arch_mbm_state *am;
 	u64 msr_val, chunks;
 	u32 prmid;
@@ -326,8 +335,19 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
 
 	resctrl_arch_rmid_read_context_check();
 
-	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
-	ret = __rmid_read_phys(prmid, eventid, &msr_val);
+	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
+	    resctrl_is_mbm_event(eventid)) {
+		if (cntr_id < 0)
+			return cntr_id;
+
+		prmid = cntr_id;
+		peventid = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;
+	} else {
+		prmid = logical_rmid_to_physical_rmid(cpu, rmid);
+		peventid = eventid;
+	}
+
+	ret = __rmid_read_phys(prmid, peventid, &msr_val);
 	if (ret)
 		return ret;
 
@@ -392,7 +412,7 @@ void __check_limbo(struct rdt_mon_domain *d, bool force_free)
 			break;
 
 		entry = __rmid_entry(idx);
-		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
+		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, -1,
 					   QOS_L3_OCCUP_EVENT_ID, &val,
 					   arch_mon_ctx)) {
 			rmid_dirty = true;
@@ -599,7 +619,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	u64 tval = 0;
 
 	if (rr->first) {
-		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
+		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid, rr->cntr_id);
 		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
 		if (m)
 			memset(m, 0, sizeof(struct mbm_state));
@@ -610,7 +630,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 		/* Reading a single domain, must be on a CPU in that domain. */
 		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
 			return -EINVAL;
-		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
+		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->cntr_id,
 						 rr->evtid, &tval, rr->arch_mon_ctx);
 		if (rr->err)
 			return rr->err;
@@ -635,7 +655,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
 		if (d->ci->id != rr->ci->id)
 			continue;
-		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
+		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid, rr->cntr_id,
 					     rr->evtid, &tval, rr->arch_mon_ctx);
 		if (!err) {
 			rr->val += tval;
@@ -703,8 +723,8 @@ void mon_event_count(void *info)
 
 	if (rdtgrp->type == RDTCTRL_GROUP) {
 		list_for_each_entry(entry, head, mon.crdtgrp_list) {
-			if (__mon_event_count(entry->closid, entry->mon.rmid,
-					      rr) == 0)
+			rr->cntr_id = mbm_cntr_get(rr->r, rr->d, entry, rr->evtid);
+			if (__mon_event_count(entry->closid, entry->mon.rmid, rr) == 0)
 				ret = 0;
 		}
 	}
@@ -835,13 +855,15 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 }
 
 static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
-				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
+				 u32 closid, u32 rmid, int cntr_id,
+				 enum resctrl_event_id evtid)
 {
 	struct rmid_read rr = {0};
 
 	rr.r = r;
 	rr.d = d;
 	rr.evtid = evtid;
+	rr.cntr_id = cntr_id;
 	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
 	if (IS_ERR(rr.arch_mon_ctx)) {
 		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
@@ -862,17 +884,22 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
 }
 
 static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
-		       u32 closid, u32 rmid)
+		       struct rdtgroup *rdtgrp, u32 closid, u32 rmid)
 {
+	int cntr_id;
 	/*
 	 * This is protected from concurrent reads from user as both
 	 * the user and overflow handler hold the global mutex.
 	 */
-	if (resctrl_arch_is_mbm_total_enabled())
-		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
+	if (resctrl_arch_is_mbm_total_enabled()) {
+		cntr_id = mbm_cntr_get(r, d, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
+		mbm_update_one_event(r, d, closid, rmid, cntr_id, QOS_L3_MBM_TOTAL_EVENT_ID);
+	}
 
-	if (resctrl_arch_is_mbm_local_enabled())
-		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
+	if (resctrl_arch_is_mbm_local_enabled()) {
+		cntr_id = mbm_cntr_get(r, d, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
+		mbm_update_one_event(r, d, closid, rmid, cntr_id, QOS_L3_MBM_LOCAL_EVENT_ID);
+	}
 }
 
 /*
@@ -945,11 +972,11 @@ void mbm_handle_overflow(struct work_struct *work)
 	d = container_of(work, struct rdt_mon_domain, mbm_over.work);
 
 	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
-		mbm_update(r, d, prgrp->closid, prgrp->mon.rmid);
+		mbm_update(r, d, prgrp, prgrp->closid, prgrp->mon.rmid);
 
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list)
-			mbm_update(r, d, crgrp->closid, crgrp->mon.rmid);
+			mbm_update(r, d, crgrp, crgrp->closid, crgrp->mon.rmid);
 
 		if (is_mba_sc(NULL))
 			update_mba_bw(prgrp, d);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 60270606f1b8..107cb14a0db2 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -466,8 +466,9 @@ void resctrl_offline_cpu(unsigned int cpu);
  * 0 on success, or -EIO, -EINVAL etc on error.
  */
 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
-			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
-			   u64 *val, void *arch_mon_ctx);
+			   u32 closid, u32 rmid, int cntr_id,
+			   enum resctrl_event_id eventid, u64 *val,
+			   void *arch_mon_ctx);
 
 /**
  * resctrl_arch_rmid_read_context_check()  - warn about invalid contexts
@@ -513,8 +514,8 @@ struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h, int id,
  * This can be called from any CPU.
  */
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
-			     u32 closid, u32 rmid,
-			     enum resctrl_event_id eventid);
+			     u32 closid, u32 rmid, enum resctrl_event_id eventid,
+			     int cntr_id);
 
 /**
  * resctrl_arch_reset_rmid_all() - Reset all private state associated with
-- 
2.34.1
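
The selection logic that this patch adds to resctrl_arch_rmid_read() and
resctrl_arch_reset_rmid() can be summarised with a small user-space
sketch. struct demo_read_target and demo_pick_read_target() are
hypothetical names; the sketch also skips the logical-to-physical RMID
translation that the kernel applies on the non-ABMC path.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ABMC_EXTENDED_EVT_ID	(1U << 31)
#define DEMO_ABMC_EVT_ID		1U

/* What ends up in the RMID field and the event half of QM_EVTSEL. */
struct demo_read_target {
	uint32_t id;
	uint32_t evt;
};

/*
 * In counter-assign mode an MBM event is read through its assigned
 * counter ID with the extended ABMC event; everything else is read
 * through the RMID and the plain event ID. A negative @cntr_id means
 * no counter is assigned, so there is nothing to read.
 */
int demo_pick_read_target(bool assign_mode, bool is_mbm_event,
			  uint32_t rmid, int cntr_id, uint32_t evtid,
			  struct demo_read_target *t)
{
	if (assign_mode && is_mbm_event) {
		if (cntr_id < 0)
			return cntr_id;

		t->id = (uint32_t)cntr_id;
		t->evt = DEMO_ABMC_EXTENDED_EVT_ID | DEMO_ABMC_EVT_ID;
	} else {
		t->id = rmid;
		t->evt = evtid;
	}

	return 0;
}
```

Non-MBM events such as LLC occupancy keep using the RMID path, which is
why __check_limbo() can pass -1 as the counter ID in the diff above.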



* [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (16 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 21:44   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
                   ` (7 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

By default, each resctrl group supports two MBM events: mbm_total_bytes
and mbm_local_bytes. To maintain the same level of support, two default
MBM configurations are added. These configurations will initially be used
to set up the counters upon mounting, while users will have the option to
modify them as needed.

Event configuration values:
=========================================================================
 Bits    Mnemonics       Description
=========================================================================
 6       VictimBW        Dirty Victims from all types of memory
 5       RmtSlowFill     Reads to slow memory in the non-local NUMA domain
 4       LclSlowFill     Reads to slow memory in the local NUMA domain
 3       RmtNTWr         Non-temporal writes to non-local NUMA domain
 2       LclNTWr         Non-temporal writes to local NUMA domain
 1       RmtFill         Reads to memory in the non-local NUMA domain
 0       LclFill         Reads to memory in the local NUMA domain
=========================================================================
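
To show how the two default configurations relate to the bit table, the
sketch below composes them from individual event bits. The DEMO_* names
and the two helper functions are hypothetical; the resulting values match
the 0x7f and 0x15 defaults used in the diff of this patch.

```c
#include <stdint.h>

/* One bit per event type, following the bit positions in the table. */
enum demo_mbm_evt_bit {
	DEMO_LCL_FILL		= 1 << 0,
	DEMO_RMT_FILL		= 1 << 1,
	DEMO_LCL_NT_WR		= 1 << 2,
	DEMO_RMT_NT_WR		= 1 << 3,
	DEMO_LCL_SLOW_FILL	= 1 << 4,
	DEMO_RMT_SLOW_FILL	= 1 << 5,
	DEMO_VICTIM_BW		= 1 << 6,
};

/* mbm_total_bytes default: every supported event type. */
uint32_t demo_total_cfg(void)
{
	return DEMO_LCL_FILL | DEMO_RMT_FILL | DEMO_LCL_NT_WR |
	       DEMO_RMT_NT_WR | DEMO_LCL_SLOW_FILL | DEMO_RMT_SLOW_FILL |
	       DEMO_VICTIM_BW;
}

/* mbm_local_bytes default: only the local-NUMA-domain event types. */
uint32_t demo_local_cfg(void)
{
	return DEMO_LCL_FILL | DEMO_LCL_NT_WR | DEMO_LCL_SLOW_FILL;
}
```

The local default is bits 0, 2 and 4 (1 + 4 + 16 = 0x15); the total
default sets all seven bits (0x7f).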

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch to support event configurations via new counter_configs
     method.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
 include/linux/resctrl_types.h          | 17 +++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d84f47db4e43..aba23e2096db 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -57,6 +57,21 @@ static struct kernfs_node *kn_mongrp;
 /* Kernel fs node for "mon_data" directory under root */
 static struct kernfs_node *kn_mondata;
 
+struct mbm_evt_value mbm_evt_values[NUM_MBM_EVT_VALUES] = {
+	{"local_reads", 0x1},
+	{"remote_reads", 0x2},
+	{"local_non_temporal_writes", 0x4},
+	{"remote_non_temporal_writes", 0x8},
+	{"local_reads_slow_memory", 0x10},
+	{"remote_reads_slow_memory", 0x20},
+	{"dirty_victim_writes_all", 0x40},
+};
+
+struct mbm_assign_config mbm_assign_configs[NUM_MBM_ASSIGN_CONFIGS] = {
+	{"mbm_total_bytes", QOS_L3_MBM_TOTAL_EVENT_ID, 0x7f},
+	{"mbm_local_bytes", QOS_L3_MBM_LOCAL_EVENT_ID, 0x15},
+};
+
 /*
  * Used to store the max resource name width to display the schemata names in
  * a tabular format.
diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
index f26450b3326b..3d98c7bdb459 100644
--- a/include/linux/resctrl_types.h
+++ b/include/linux/resctrl_types.h
@@ -31,6 +31,9 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
 
+#define NUM_MBM_EVT_VALUES		7
+#define NUM_MBM_ASSIGN_CONFIGS		2
+
 enum resctrl_res_level {
 	RDT_RESOURCE_L3,
 	RDT_RESOURCE_L2,
@@ -51,4 +54,18 @@ enum resctrl_event_id {
 	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
 };
 
+struct mbm_evt_value {
+	char	evt_name[32];
+	u32	evt_val;
+};
+
+/**
+ * struct mbm_assign_config - Configuration values
+ */
+struct mbm_assign_config {
+	char			name[32];
+	enum resctrl_event_id	evtid;
+	u32			val;
+};
+
 #endif /* __LINUX_RESCTRL_TYPES_H */
-- 
2.34.1



* [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (17 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 22:04   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations Babu Moger
                   ` (6 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Create the configuration directory and files for mbm_cntr_assign mode.
These configurations will be used to assign MBM events in mbm_cntr_assign
mode, with two default configurations created upon mounting.

Example:
$ cd /sys/fs/resctrl/
$ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
  local_reads, remote_reads, local_non_temporal_writes,
  remote_non_temporal_writes, local_reads_slow_memory,
  remote_reads_slow_memory, dirty_victim_writes_all

$ cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
  local_reads, local_non_temporal_writes, local_reads_slow_memory
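
The way event_filter turns a configuration value into the comma-separated
list shown above can be sketched in user-space C. demo_evt_values and
demo_event_filter() are hypothetical names; the event names and bit values
are taken from the mbm_evt_values table introduced earlier in this series,
and the kernel version writes to a seq_file rather than a buffer.

```c
#include <stdint.h>
#include <stdio.h>

struct demo_evt_value {
	const char *name;
	uint32_t val;
};

static const struct demo_evt_value demo_evt_values[] = {
	{ "local_reads",		0x01 },
	{ "remote_reads",		0x02 },
	{ "local_non_temporal_writes",	0x04 },
	{ "remote_non_temporal_writes",	0x08 },
	{ "local_reads_slow_memory",	0x10 },
	{ "remote_reads_slow_memory",	0x20 },
	{ "dirty_victim_writes_all",	0x40 },
};

/*
 * Write the names of all event bits set in @cfg into @buf, separated
 * by ", ". Output stops early if @buf is too small. Returns @buf.
 */
char *demo_event_filter(uint32_t cfg, char *buf, size_t len)
{
	size_t i, used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(demo_evt_values) / sizeof(demo_evt_values[0]); i++) {
		int n;

		if (!(cfg & demo_evt_values[i].val))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? ", " : "", demo_evt_values[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
	}

	return buf;
}
```

With the default local configuration 0x15 this produces the same three
names as the mbm_local_bytes example, on a single line.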

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
---
 Documentation/arch/x86/resctrl.rst     | 29 ++++++++++
 arch/x86/kernel/cpu/resctrl/internal.h |  2 +
 arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 77 ++++++++++++++++++++++++++
 4 files changed, 109 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 71ed1cfed33a..99f9f4b9b501 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -306,6 +306,35 @@ with the following files:
 	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
 	  0=30;1=30
 
+"counter_configs":
+	The directory for storing event configuration files, which will be used to
+	assign counters when the mbm_cntr_assign mode is enabled.
+
+	The following types of events are supported:
+
+	==== ========================== ========================================================
+	Bits Name                       Description
+	==== ========================== ========================================================
+	6    dirty_victim_writes_all    Dirty Victims from the QOS domain to all types of memory
+	5    remote_reads_slow_memory   Reads to slow memory in the non-local NUMA domain
+	4    local_reads_slow_memory    Reads to slow memory in the local NUMA domain
+	3    remote_non_temporal_writes Non-temporal writes to non-local NUMA domain
+	2    local_non_temporal_writes  Non-temporal writes to local NUMA domain
+	1    remote_reads               Reads to memory in the non-local NUMA domain
+	0    local_reads                Reads to memory in the local NUMA domain
+	==== ========================== ========================================================
+
+	Two default configurations, mbm_local_bytes and mbm_total_bytes, will be created
+	upon mounting.
+	::
+
+	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
+	    local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
+	    local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all
+
+	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
+	    local_reads, local_non_temporal_writes, local_reads_slow_memory
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index b7d1a59f09f8..a943450bf2c8 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -282,11 +282,13 @@ struct mbm_cntr_cfg {
 #define RFTYPE_RES_CACHE		BIT(8)
 #define RFTYPE_RES_MB			BIT(9)
 #define RFTYPE_DEBUG			BIT(10)
+#define RFTYPE_CONFIG			BIT(11)
 #define RFTYPE_CTRL_INFO		(RFTYPE_INFO | RFTYPE_CTRL)
 #define RFTYPE_MON_INFO			(RFTYPE_INFO | RFTYPE_MON)
 #define RFTYPE_TOP_INFO			(RFTYPE_INFO | RFTYPE_TOP)
 #define RFTYPE_CTRL_BASE		(RFTYPE_BASE | RFTYPE_CTRL)
 #define RFTYPE_MON_BASE			(RFTYPE_BASE | RFTYPE_MON)
+#define RFTYPE_MON_CONFIG		(RFTYPE_CONFIG | RFTYPE_MON)
 
 /* List of all resource groups */
 extern struct list_head rdt_all_groups;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 58476c065921..4525295b1725 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1264,6 +1264,7 @@ int __init resctrl_mon_resource_init(void)
 	if (r->mon.mbm_cntr_assignable) {
 		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
 		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
+		resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
 	}
 
 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index aba23e2096db..b2122a1dd36c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1907,6 +1907,25 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
+{
+	struct mbm_assign_config *assign_config = of->kn->parent->priv;
+	bool sep = false;
+	int i;
+
+	for (i = 0; i < NUM_MBM_EVT_VALUES; i++) {
+		if (assign_config->val & mbm_evt_values[i].evt_val) {
+			if (sep)
+				seq_puts(seq, ", ");
+			seq_printf(seq, "%s", mbm_evt_values[i].evt_name);
+			sep = true;
+		}
+	}
+	seq_puts(seq, "\n");
+
+	return 0;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -2019,6 +2038,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= mbm_local_bytes_config_show,
 		.write		= mbm_local_bytes_config_write,
 	},
+	{
+		.name		= "event_filter",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= event_filter_show,
+	},
 	{
 		.name		= "mbm_assign_mode",
 		.mode		= 0444,
@@ -2314,6 +2339,52 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
 	return ret;
 }
 
+static int resctrl_mkdir_info_configs(void *priv, char *name, unsigned long fflags)
+{
+	struct kernfs_node *l3_mon_kn, *kn_subdir, *kn_subdir2;
+	int ret, i;
+
+	l3_mon_kn = kernfs_find_and_get(kn_info, name);
+	if (!l3_mon_kn)
+		return -ENOENT;
+
+	kn_subdir = kernfs_create_dir(l3_mon_kn, "counter_configs", l3_mon_kn->mode, priv);
+	if (IS_ERR(kn_subdir)) {
+		kernfs_put(l3_mon_kn);
+		return PTR_ERR(kn_subdir);
+	}
+
+	ret = rdtgroup_kn_set_ugid(kn_subdir);
+	if (ret) {
+		kernfs_put(l3_mon_kn);
+		return ret;
+	}
+
+	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
+		kn_subdir2 = kernfs_create_dir(kn_subdir, mbm_assign_configs[i].name,
+					       kn_subdir->mode, &mbm_assign_configs[i]);
+		if (IS_ERR(kn_subdir2)) {
+			ret = PTR_ERR(kn_subdir2);
+			goto config_out;
+		}
+
+		ret = rdtgroup_kn_set_ugid(kn_subdir2);
+		if (ret)
+			goto config_out;
+
+		ret = rdtgroup_add_files(kn_subdir2, fflags);
+		if (!ret)
+			kernfs_activate(kn_subdir);
+	}
+
+config_out:
+	kernfs_put(l3_mon_kn);
+	if (ret)
+		kernfs_remove(kn_subdir);
+
+	return ret;
+}
+
 static unsigned long fflags_from_resource(struct rdt_resource *r)
 {
 	switch (r->rid) {
@@ -2360,6 +2431,12 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
 		ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
 		if (ret)
 			goto out_destroy;
+
+		if (r->mon.mbm_cntr_assignable) {
+			ret = resctrl_mkdir_info_configs(r, name, RFTYPE_MON_CONFIG);
+			if (ret)
+				goto out_destroy;
+		}
 	}
 
 	ret = rdtgroup_kn_set_ugid(kn_info);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (18 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 22:07   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments Babu Moger
                   ` (5 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Users can modify the event configuration by writing to the event_filter
interface file. The event configurations for mbm_cntr_assign mode are
located in /sys/fs/resctrl/info/L3_MON/counter_configs/.

Update the assignments of all groups when the event configuration is
modified.

Example:
$ cd /sys/fs/resctrl/
$ echo "local_reads, local_non_temporal_writes" > \
  info/L3_MON/counter_configs/mbm_total_bytes/event_filter

$ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
local_reads, local_non_temporal_writes

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch to modify event configurations.
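
Note: the comma-separated format accepted by event_filter can be sketched
in user space roughly as below. This is only an illustration, not the kernel
code from this patch: the table name evt_values, the helpers trim() and
parse_event_filter(), and the bit positions are assumptions. The bit
positions were chosen to be consistent with the 0x15 default shown for
mbm_local_bytes elsewhere in the series.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name-to-bit table; the bit positions are an assumption. */
struct evt_value {
	const char *name;
	uint32_t val;
};

static const struct evt_value evt_values[] = {
	{ "local_reads",               1u << 0 },
	{ "local_non_temporal_writes", 1u << 2 },
	{ "local_reads_slow_memory",   1u << 4 },
};

#define NUM_EVT_VALUES (sizeof(evt_values) / sizeof(evt_values[0]))

/* Trim leading and trailing whitespace in place. */
static char *trim(char *s)
{
	char *end;

	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';
	return s;
}

/*
 * Parse a comma-separated list of event names into a bitmask.
 * Returns 0 on success, -1 if any name is unknown. @buf is modified.
 */
static int parse_event_filter(char *buf, uint32_t *val)
{
	char *tok = buf, *comma, *name;
	size_t i;

	assert(buf && val);
	*val = 0;
	while (tok && *tok != '\0') {
		/* Split off the next token at the comma, if there is one. */
		comma = strchr(tok, ',');
		if (comma)
			*comma = '\0';
		name = trim(tok);
		tok = comma ? comma + 1 : NULL;

		for (i = 0; i < NUM_EVT_VALUES; i++) {
			if (!strcmp(evt_values[i].name, name)) {
				*val |= evt_values[i].val;
				break;
			}
		}
		if (i == NUM_EVT_VALUES)
			return -1;	/* unknown event name */
	}
	return 0;
}
```

As in the patch, the mask is built up by OR-ing one bit per recognised name,
and a single unknown name rejects the whole write.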
---
 Documentation/arch/x86/resctrl.rst     |  10 +++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 115 ++++++++++++++++++++++++-
 2 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 99f9f4b9b501..4e6feba6fb08 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -335,6 +335,16 @@ with the following files:
 	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
 	    local_reads, local_non_temporal_writes, local_reads_slow_memory
 
+	The event configuration can be modified by writing to the event_filter file within
+	the configuration directory.
+	::
+
+	    # echo "local_reads, local_non_temporal_writes" > \
+	      /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
+
+	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
+	    local_reads, local_non_temporal_writes
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b2122a1dd36c..7792455f0b26 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1926,6 +1926,118 @@ static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq,
 	return 0;
 }
 
+static int resctrl_group_assign(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+				enum resctrl_event_id evtid, u32 evt_cfg)
+{
+	struct rdt_mon_domain *d;
+	int cntr_id, ret;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
+		if (cntr_id >= 0 && d->cntr_cfg[cntr_id].evt_cfg != evt_cfg) {
+			d->cntr_cfg[cntr_id].evt_cfg = evt_cfg;
+			ret = resctrl_arch_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
+						       rdtgrp->closid, cntr_id, evt_cfg, true);
+			if (ret) {
+				rdt_last_cmd_printf("Assign failed event %d domain %d group %s\n",
+						    evtid, d->hdr.id, rdtgrp->kn->name);
+				return ret;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static int resctrl_update_assign(struct rdt_resource *r, enum resctrl_event_id evtid,
+				 u32 evt_cfg)
+{
+	struct rdtgroup *prgrp, *crgrp;
+	int ret;
+
+	/* Check if the cntr_id is associated to the event type updated */
+	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
+		ret = resctrl_group_assign(r, prgrp, evtid, evt_cfg);
+		if (ret)
+			return ret;
+
+		list_for_each_entry(crgrp, &prgrp->mon.crdtgrp_list, mon.crdtgrp_list) {
+			ret = resctrl_group_assign(r, crgrp, evtid, evt_cfg);
+			if (ret)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int resctrl_process_configs(char *tok, u32 *val)
+{
+	char *evt_str;
+	bool found;
+	int i;
+
+next_config:
+	if (!tok || tok[0] == '\0')
+		return 0;
+
+	/* Start processing the strings for each event type */
+	evt_str = strim(strsep(&tok, ","));
+	found = false;
+	for (i = 0; i < NUM_MBM_EVT_VALUES; i++) {
+		if (!strcmp(mbm_evt_values[i].evt_name, evt_str)) {
+			*val |= mbm_evt_values[i].evt_val;
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		rdt_last_cmd_printf("Invalid event type %s\n", evt_str);
+		return -EINVAL;
+	}
+
+	goto next_config;
+}
+
+static ssize_t event_filter_write(struct kernfs_open_file *of, char *buf,
+				  size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct mbm_assign_config *assign_config = of->kn->parent->priv;
+	u32 evt_cfg = 0;
+	int ret = 0;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_cntr_assign mode is not enabled\n");
+		ret = -EINVAL;
+		goto unlock_out;
+	}
+
+	ret = resctrl_process_configs(buf, &evt_cfg);
+	if (!ret && assign_config->val != evt_cfg) {
+		assign_config->val = evt_cfg;
+		ret = resctrl_update_assign(r, assign_config->evtid, evt_cfg);
+	}
+
+unlock_out:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -2040,9 +2152,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "event_filter",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= event_filter_show,
+		.write		= event_filter_write,
 	},
 	{
 		.name		= "mbm_assign_mode",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (19 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-11 22:08   ` Reinette Chatre
  2025-04-04  0:18 ` [PATCH v12 22/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled Babu Moger
                   ` (4 subsequent siblings)
  25 siblings, 1 reply; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The mbm_cntr_assign mode provides an option to the user to assign a
counter to an RMID, event pair and monitor the bandwidth as long as
the counter is assigned.

Introduce a configuration option to automatically assign counter IDs
when a resctrl group is created, provided the counters are available.
By default, this option is enabled at boot.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch. Added after the discussion on the list.
     https://lore.kernel.org/lkml/CALPaoCh8siZKjL_3yvOYGL4cF_n_38KpUFgHVGbQ86nD+Q2_SA@mail.gmail.com/
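
Note: a tool that wants to toggle this option could do so with a plain
file write. The sketch below is a hedged user-space illustration; the helper
names write_bool_file() and read_bool_file() are hypothetical, and the path
is passed in by the caller (on a mounted resctrl it would be
/sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Write "1\n" or "0\n" to a boolean file; returns 0 on success, -1 on error. */
static int write_bool_file(const char *path, bool value)
{
	FILE *f;
	int ok;

	assert(path);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(value ? "1\n" : "0\n", f) >= 0;
	/* fclose() flushes; a failed flush means the write did not land. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the flag back; returns 0 or 1, or -1 on error or unexpected content. */
static int read_bool_file(const char *path)
{
	unsigned int v;
	FILE *f;
	int n;

	assert(path);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fscanf(f, "%u", &v);
	fclose(f);
	return (n == 1 && v <= 1) ? (int)v : -1;
}
```

Checking the fclose() return value matters for kernfs-backed files, since a
rejected write is typically only reported when the buffered data is flushed.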
---
 Documentation/arch/x86/resctrl.rst     | 10 +++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 ++++++++++++++++++++++++++
 include/linux/resctrl.h                |  2 ++
 4 files changed, 53 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 4e6feba6fb08..a67f09323da0 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -345,6 +345,16 @@ with the following files:
 	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
 	    local_reads, local_non_temporal_writes
 
+"mbm_assign_on_mkdir":
+	Automatically assign the monitoring counters on resctrl group creation
+	if the counters are available. It is enabled by default at boot; users
+	can disable it by writing 0 to this file.
+	::
+
+	  # echo 0 > /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
+	  0
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 4525295b1725..ee31dfe2c224 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1265,6 +1265,7 @@ int __init resctrl_mon_resource_init(void)
 		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
 		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
 		resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
+		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO);
 	}
 
 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7792455f0b26..592a9dc5b404 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -993,6 +993,39 @@ static int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+static int resctrl_mbm_assign_on_mkdir_show(struct kernfs_open_file *of,
+					    struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+
+	seq_printf(s, "%u\n", r->mon.mbm_assign_on_mkdir);
+
+	return 0;
+}
+
+static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
+						 char *buf, size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	bool value;
+	int ret;
+
+	ret = kstrtobool(buf, &value);
+	if (ret)
+		return ret;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+	rdt_last_cmd_clear();
+
+	r->mon.mbm_assign_on_mkdir = value;
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2176,6 +2209,13 @@ static struct rftype res_common_files[] = {
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= resctrl_available_mbm_cntrs_show,
 	},
+	{
+		.name		= "mbm_assign_on_mkdir",
+		.mode		= 0644,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= resctrl_mbm_assign_on_mkdir_show,
+		.write		= resctrl_mbm_assign_on_mkdir_write,
+	},
 	{
 		.name		= "cpus",
 		.mode		= 0644,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 107cb14a0db2..ad3d986c4ea1 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -250,6 +250,7 @@ enum resctrl_schema_fmt {
  *			monitoring events can be configured.
  * @num_mbm_cntrs:	Number of assignable monitoring counters
  * @mbm_cntr_assignable:Is system capable of supporting monitor assignment?
+ * @mbm_assign_on_mkdir:Auto enable monitor assignment on mkdir?
  * @evt_list:		List of monitoring events
  */
 struct resctrl_mon {
@@ -257,6 +258,7 @@ struct resctrl_mon {
 	unsigned int		mbm_cfg_mask;
 	int			num_mbm_cntrs;
 	bool			mbm_cntr_assignable;
+	bool			mbm_assign_on_mkdir;
 	struct list_head	evt_list;
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 22/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (20 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 23/26] x86/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Automatically assign or unassign counters when a resctrl group is created
or deleted. By default, each group requires two counters: one for the MBM
total event and one for the MBM local event.

The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
can be assigned to an RMID, event pair and monitor the bandwidth as long
as it is assigned. If these counters are exhausted, the kernel will log
the error message "Unable to allocate counter in domain" in
/sys/fs/resctrl/info/last_cmd_status when a new group is created.

However, the creation of a group should not fail due to assignment
failures. Users have the flexibility to modify the assignments at a later
time.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Removed mbm_cntr_reset() as it is not required while removing the group.
     Update the commit text.
     Added r->mon_capable check in rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs().

v11: Moved mbm_cntr_reset() to monitor.c.
     Added code to reset non-architectural state in mbm_cntr_reset().
     Added missing rdtgroup_unassign_cntrs() calls on failure path.

v10: Assigned the counter before exposing the event files.
     Moved the call to rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
     This is called for both CTRL_MON and MON group creation.
     Call mbm_cntr_reset() when unmounted to clear all the assignments.
     Took care of a few other feedback comments.

v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
    Updated couple of rdtgroup_unassign_cntrs() calls properly.
    Updated function comments.

v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
    Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
    Fixed the problem with unassigning the child MON groups of CTRL_MON group.

v7: Reworded the commit message.
    Removed the reference of ABMC with mbm_cntr_assign.
    Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.

v6: Removed the redundant comments on all the calls of
    rdtgroup_assign_cntrs. Updated the commit message.
    Dropped printing error message on every call of rdtgroup_assign_cntrs.

v5: Removed the code to enable/disable ABMC during the mount.
    That will be another patch.
    Added arch callers to get the arch specific data.
    Renamed functions to match the other abmc function.
    Added code comments for assignment failures.

v4: Few name changes based on the upstream discussion.
    Commit message update.

v3: This is a new patch. Patch addresses the upstream comment to enable
    ABMC feature by default if the feature is available.
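
Note: the policy described in the commit message (two counters per group,
group creation never failing on counter exhaustion) can be modelled with a
small user-space sketch. All names here (cntr_pool, group_create() and so
on) are hypothetical, and NUM_CNTRS is an arbitrary small value standing in
for "num_mbm_cntrs"; the real code tracks counters per monitoring domain.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CNTRS 4	/* hypothetical; stands in for num_mbm_cntrs */

struct cntr_pool {
	bool used[NUM_CNTRS];
};

struct group {
	int total_cntr;	/* counter for the total event, -1 if unassigned */
	int local_cntr;	/* counter for the local event, -1 if unassigned */
};

/* Return a free counter ID, or -1 when the pool is exhausted. */
static int cntr_alloc(struct cntr_pool *p)
{
	int i;

	assert(p);
	for (i = 0; i < NUM_CNTRS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a counter; an ID of -1 (unassigned) is silently ignored. */
static void cntr_free(struct cntr_pool *p, int id)
{
	if (id >= 0 && id < NUM_CNTRS)
		p->used[id] = false;
}

/* Creation never fails: events without a free counter stay unassigned. */
static void group_create(struct cntr_pool *p, struct group *g)
{
	g->total_cntr = cntr_alloc(p);
	g->local_cntr = cntr_alloc(p);
}

/* Deletion returns whatever counters the group held to the pool. */
static void group_delete(struct cntr_pool *p, struct group *g)
{
	cntr_free(p, g->total_cntr);
	cntr_free(p, g->local_cntr);
	g->total_cntr = -1;
	g->local_cntr = -1;
}
```

With four counters, two groups are fully monitored, a third is created with
both events unassigned, and deleting an earlier group frees counters that a
later group can pick up.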
---
 arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 95 +++++++++++++++++++++++++-
 2 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ee31dfe2c224..4e22563dda60 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1316,6 +1316,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 		r->mon.mbm_cntr_assignable = true;
 		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
 		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
+		r->mon.mbm_assign_on_mkdir = true;
 	}
 
 	r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 592a9dc5b404..3e440ace60e0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -72,6 +72,18 @@ struct mbm_assign_config mbm_assign_configs[NUM_MBM_ASSIGN_CONFIGS] = {
 	{"mbm_local_bytes", QOS_L3_MBM_LOCAL_EVENT_ID, 0x15},
 };
 
+static struct mbm_assign_config *mbm_get_assign_config(enum resctrl_event_id evtid)
+{
+	int i;
+
+	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
+		if (mbm_assign_configs[i].evtid == evtid)
+			return &mbm_assign_configs[i];
+	}
+
+	return NULL;
+}
+
 /*
  * Used to store the max resource name width to display the schemata names in
  * a tabular format.
@@ -3043,6 +3055,67 @@ static void schemata_list_destroy(void)
 	}
 }
 
+/*
+ * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
+ * counters are automatically assigned. Each group can accommodate two counters:
+ * one for the total event and one for the local event. Assignments may fail
+ * due to the limited number of counters. However, it is not necessary to fail
+ * the group creation and thus no failure is returned. Users have the option
+ * to modify the counter assignments after the group has been created.
+ */
+static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct mbm_assign_config *assign_config;
+
+	if (!r->mon_capable)
+		return;
+
+	if (resctrl_arch_mbm_cntr_assign_enabled(r) && !r->mon.mbm_assign_on_mkdir)
+		return;
+
+	if (resctrl_arch_is_mbm_total_enabled()) {
+		assign_config = mbm_get_assign_config(QOS_L3_MBM_TOTAL_EVENT_ID);
+		if (assign_config)
+			resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID,
+						  assign_config->val);
+	}
+
+	if (resctrl_arch_is_mbm_local_enabled()) {
+		assign_config = mbm_get_assign_config(QOS_L3_MBM_LOCAL_EVENT_ID);
+		if (assign_config)
+			resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID,
+						  assign_config->val);
+	}
+}
+
+/*
+ * Called when a group is deleted. Counters are unassigned if they were in the
+ * assigned state.
+ */
+static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct mbm_assign_config *assign_config;
+
+	if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r))
+		return;
+
+	if (resctrl_arch_is_mbm_total_enabled()) {
+		assign_config = mbm_get_assign_config(QOS_L3_MBM_TOTAL_EVENT_ID);
+		if (assign_config)
+			resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID,
+						    assign_config->val);
+	}
+
+	if (resctrl_arch_is_mbm_local_enabled()) {
+		assign_config = mbm_get_assign_config(QOS_L3_MBM_LOCAL_EVENT_ID);
+		if (assign_config)
+			resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID,
+						    assign_config->val);
+	}
+}
+
 static int rdt_get_tree(struct fs_context *fc)
 {
 	struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -3097,6 +3170,8 @@ static int rdt_get_tree(struct fs_context *fc)
 		if (ret < 0)
 			goto out_info;
 
+		rdtgroup_assign_cntrs(&rdtgroup_default);
+
 		ret = mkdir_mondata_all(rdtgroup_default.kn,
 					&rdtgroup_default, &kn_mondata);
 		if (ret < 0)
@@ -3135,8 +3210,10 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mondata);
 out_mongrp:
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(&rdtgroup_default);
 		kernfs_remove(kn_mongrp);
+	}
 out_info:
 	kernfs_remove(kn_info);
 out_schemata_free:
@@ -3312,6 +3389,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
 
 	head = &rdtgrp->mon.crdtgrp_list;
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
+		rdtgroup_unassign_cntrs(sentry);
 		free_rmid(sentry->closid, sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);
 
@@ -3352,6 +3430,8 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
+		rdtgroup_unassign_cntrs(rdtgrp);
+
 		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 		kernfs_remove(rdtgrp->kn);
@@ -3384,6 +3464,7 @@ static void rdt_kill_sb(struct super_block *sb)
 		resctrl_arch_reset_all_ctrls(r);
 
 	rmdir_all_sub();
+	rdtgroup_unassign_cntrs(&rdtgroup_default);
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
@@ -3847,9 +3928,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 	}
 	rdtgrp->mon.rmid = ret;
 
+	rdtgroup_assign_cntrs(rdtgrp);
+
 	ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
 	if (ret) {
 		rdt_last_cmd_puts("kernfs subdir error\n");
+		rdtgroup_unassign_cntrs(rdtgrp);
 		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 		return ret;
 	}
@@ -3859,8 +3943,10 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 
 static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
 {
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(rgrp);
 		free_rmid(rgrp->closid, rgrp->mon.rmid);
+	}
 }
 
 static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
@@ -4128,6 +4214,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	update_closid_rmid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
+
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	/*
@@ -4175,6 +4264,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);
 
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 23/26] x86/resctrl: Introduce mbm_L3_assignments to list assignments in a group
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (21 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 22/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 24/26] x86/resctrl: Introduce the interface to modify " Babu Moger
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The interface displays the assignment states for each group when
mbm_cntr_assign mode is enabled.

The list is displayed in the following format:
<Event configuration>:<Domain id>=<Assignment type>

Event configuration: A valid event configuration listed in the
/sys/fs/resctrl/info/L3_MON/counter_configs directory.

Domain ID: A valid domain ID number.

The assignment type can be one of the following:

_ : No event configuration assigned

e : Event configuration assigned in exclusive mode

Example:
$ cd /sys/fs/resctrl
$ cat mbm_L3_assignments
mbm_total_bytes:0=e;1=e
mbm_local_bytes:0=e;1=e

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch:
     Assignment interface moved inside the group based on the discussion
     https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/#t
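
Note: the output format described above can be reproduced in user space as
follows. This is a hedged sketch, not the seq_file code in this patch;
format_assignments() and its parameters are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one line as "<config>:<id>=<e|_>;<id>=<e|_>...".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if @buf is too small.
 */
static int format_assignments(char *buf, size_t len, const char *config,
			      const int *domain_ids, const bool *assigned,
			      int ndom)
{
	size_t off;
	int i, n;

	assert(buf && config);
	n = snprintf(buf, len, "%s:", config);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (i = 0; i < ndom; i++) {
		/* Domains are separated by ';'; no separator before the first. */
		n = snprintf(buf + off, len - off, "%s%d=%c", i ? ";" : "",
			     domain_ids[i], assigned[i] ? 'e' : '_');
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

For two domains with only the first one assigned, this yields
"mbm_total_bytes:0=e;1=_", matching the layout in the example.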
---
 Documentation/arch/x86/resctrl.rst     | 27 +++++++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 48 ++++++++++++++++++++++++++
 3 files changed, 76 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index a67f09323da0..31085b4e15f6 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -503,6 +503,33 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
 	/sys/fs/resctrl/info/L3_MON/mon_features changes the input
 	event.
 
+"mbm_L3_assignments":
+	The interface displays the assignment states for each group when
+	mbm_cntr_assign mode is enabled.
+
+	The list is displayed in the following format:
+
+	<Event configuration>:<Domain id>=<Assignment type>
+
+	Event configuration: A valid event configuration listed in the
+	/sys/fs/resctrl/info/L3_MON/counter_configs directory.
+
+	Domain ID: A valid domain ID number.
+
+	Assignment types:
+
+	_ : No event configuration assigned
+
+	e : Event configuration assigned in exclusive mode
+
+	Example:
+	::
+
+	 # cd /sys/fs/resctrl
+	 # cat mbm_L3_assignments
+	 mbm_total_bytes:0=e;1=e
+	 mbm_local_bytes:0=e;1=e
+
 Resource allocation rules
 -------------------------
 
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 4e22563dda60..0c6fd5f6ec19 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1266,6 +1266,7 @@ int __init resctrl_mon_resource_init(void)
 		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
 		resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
 		resctrl_file_fflags_init("mbm_assign_on_mkdir", RFTYPE_MON_INFO);
+		resctrl_file_fflags_init("mbm_L3_assignments", RFTYPE_MON_BASE);
 	}
 
 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3e440ace60e0..ee1c949c2436 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2083,6 +2083,48 @@ static ssize_t event_filter_write(struct kernfs_open_file *of, char *buf,
 	return ret ?: nbytes;
 }
 
+static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdt_mon_domain *d;
+	struct rdtgroup *rdtgrp;
+	int i, ret = 0;
+	bool sep;
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (!rdtgrp)
+		return -ENOENT;
+
+	rdt_last_cmd_clear();
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_cntr_assign mode not enabled\n");
+		ret = -ENOENT;
+		goto assign_out;
+	}
+
+	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
+		sep = false;
+		seq_printf(s, "%s:", mbm_assign_configs[i].name);
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			if (sep)
+				seq_puts(s, ";");
+
+			if (mbm_cntr_get(r, d, rdtgrp, mbm_assign_configs[i].evtid) >= 0)
+				seq_printf(s, "%d=e", d->hdr.id);
+			else
+				seq_printf(s, "%d=_", d->hdr.id);
+
+			sep = true;
+		}
+		seq_puts(s, "\n");
+	}
+
+assign_out:
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -2202,6 +2244,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= event_filter_show,
 		.write		= event_filter_write,
 	},
+	{
+		.name		= "mbm_L3_assignments",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= mbm_L3_assignments_show,
+	},
 	{
 		.name		= "mbm_assign_mode",
 		.mode		= 0444,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 24/26] x86/resctrl: Introduce the interface to modify assignments in a group
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (22 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 23/26] x86/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 25/26] x86/resctrl: Introduce the interface to switch between monitor modes Babu Moger
  2025-04-04  0:18 ` [PATCH v12 26/26] x86/resctrl: Configure mbm_cntr_assign mode if supported Babu Moger
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Introduce an interface to modify assignments within a group.

Modifications follow this format:
<Event configuration>:<Domain id>=<Assignment type>

The assignment type can be one of the following:

_ : No event configuration assigned

e : Event configuration assigned in exclusive mode

Domain id can be any valid domain ID number or '*' to update all the
domains.

Example:
$ cd /sys/fs/resctrl
$ cat mbm_L3_assignments
mbm_total_bytes:0=e;1=e
mbm_local_bytes:0=e;1=e

To unassign the configuration of mbm_total_bytes on domain 0:

$ echo "mbm_total_bytes:0=_" > mbm_L3_assignments
$ cat mbm_L3_assignments
mbm_total_bytes:0=_;1=e
mbm_local_bytes:0=e;1=e

To unassign the mbm_total_bytes configuration on all domains:

$ echo "mbm_total_bytes:*=_" > mbm_L3_assignments
$ cat mbm_L3_assignments
mbm_total_bytes:0=_;1=_
mbm_local_bytes:0=e;1=e

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: New patch:
     Assignment interface moved inside the group based on the discussion
     https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/#t
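
Note: a request in the format above can be split into its three fields as
sketched below. This is a user-space illustration under stated assumptions,
not the parser added by this patch: struct assign_req, DOMAIN_ALL and
parse_assign_req() are hypothetical names, and the sketch handles a single
"<domain>=<type>" entry only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DOMAIN_ALL (-1)	/* hypothetical marker for the '*' wildcard */

struct assign_req {
	char config[32];	/* e.g. "mbm_total_bytes" */
	int domain;		/* domain ID, or DOMAIN_ALL */
	char type;		/* 'e' (exclusive) or '_' (unassigned) */
};

/* Parse "<config>:<domain>=<type>"; returns 0 on success, -1 if malformed. */
static int parse_assign_req(const char *s, struct assign_req *req)
{
	const char *colon, *eq;
	int consumed = 0;
	size_t n;

	assert(s && req);
	colon = strchr(s, ':');
	if (!colon)
		return -1;
	eq = strchr(colon, '=');
	if (!eq)
		return -1;

	/* Copy the configuration name, rejecting empty or oversized names. */
	n = (size_t)(colon - s);
	if (n == 0 || n >= sizeof(req->config))
		return -1;
	memcpy(req->config, s, n);
	req->config[n] = '\0';

	/* The domain is either '*' or a non-negative decimal number. */
	if (eq - colon == 2 && colon[1] == '*') {
		req->domain = DOMAIN_ALL;
	} else if (sscanf(colon + 1, "%d%n", &req->domain, &consumed) != 1 ||
		   colon + 1 + consumed != eq || req->domain < 0) {
		return -1;
	}

	/* Exactly one assignment-type character must follow '='. */
	if ((eq[1] != 'e' && eq[1] != '_') || eq[2] != '\0')
		return -1;
	req->type = eq[1];
	return 0;
}
```

A caller would strip the trailing newline first, then apply the request to
one domain or, for DOMAIN_ALL, to every domain in turn.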
---
 Documentation/arch/x86/resctrl.rst     |  29 ++++-
 arch/x86/kernel/cpu/resctrl/internal.h |   9 ++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 162 ++++++++++++++++++++++++-
 3 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 31085b4e15f6..ad35c38eed34 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -514,7 +514,7 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
 	Event configuration: A valid event configuration listed in the
 	/sys/fs/resctrl/info/L3_MON/counter_configs directory.
 
-	Domain ID: A valid domain ID number.
+	Domain ID: A valid domain ID number or '*' to update all the domains.
 
 	Assignment types:
 
@@ -530,6 +530,33 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
 	 mbm_total_bytes:0=e;1=e
 	 mbm_local_bytes:0=e;1=e
 
+	The assignments can be modified by writing to the interface.
+
+	Example:
+	To unassign the configuration of mbm_total_bytes on domain 0:
+	::
+
+	 # echo "mbm_total_bytes:0=_" > mbm_L3_assignments
+	 # cat mbm_L3_assignments
+	 mbm_total_bytes:0=_;1=e
+	 mbm_local_bytes:0=e;1=e
+
+	To unassign the mbm_total_bytes configuration on all domains:
+	::
+
+	 # echo "mbm_total_bytes:*=_" > mbm_L3_assignments
+	 # cat mbm_L3_assignments
+	 mbm_total_bytes:0=_;1=_
+	 mbm_local_bytes:0=e;1=e
+
+	To assign the mbm_total_bytes configuration on all domains in exclusive mode:
+	::
+
+	 # echo "mbm_total_bytes:*=e" > mbm_L3_assignments
+	 # cat mbm_L3_assignments
+	 mbm_total_bytes:0=e;1=e
+	 mbm_local_bytes:0=e;1=e
+
 Resource allocation rules
 -------------------------
 
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index a943450bf2c8..2020a2fe7135 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -44,6 +44,15 @@
 #define ABMC_EXTENDED_EVT_ID		BIT(31)
 #define ABMC_EVT_ID			1
 
+/*
+ * Assignment types for mbm_cntr_assign mode
+ */
+enum {
+	ASSIGN_NONE		= 0,
+	ASSIGN_EXCLUSIVE,
+	ASSIGN_INVALID,
+};
+
 /**
  * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
  *			        aren't marked nohz_full
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ee1c949c2436..5d9c4c216522 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -84,6 +84,18 @@ static struct mbm_assign_config *mbm_get_assign_config(enum resctrl_event_id evt
 	return NULL;
 }
 
+static struct mbm_assign_config *mbm_get_assign_config_by_name(char *config)
+{
+	int i;
+
+	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
+		if (!strcmp(mbm_assign_configs[i].name, config))
+			return &mbm_assign_configs[i];
+	}
+
+	return NULL;
+}
+
 /*
  * Used to store the max resource name width to display the schemata names in
  * a tabular format.
@@ -2125,6 +2137,153 @@ static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file
 	return ret;
 }
 
+static unsigned int resctrl_get_assing_type(char *assign)
+{
+	unsigned int mon_state = ASSIGN_NONE;
+	int len = strlen(assign);
+
+	if (!len || len > 1)
+		return ASSIGN_INVALID;
+
+	switch (*assign) {
+	case 'e':
+		mon_state = ASSIGN_EXCLUSIVE;
+		break;
+	case '_':
+		mon_state = ASSIGN_NONE;
+		break;
+	default:
+		mon_state = ASSIGN_INVALID;
+		break;
+	}
+
+	return mon_state;
+}
+
+static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+				  char *config, char *tok)
+{
+	struct mbm_assign_config *assign_config;
+	struct rdt_mon_domain *d;
+	char *dom_str, *id_str;
+	unsigned long dom_id = 0;
+	int assign_type;
+	char domain[10];
+	bool found;
+	int ret;
+
+	assign_config = mbm_get_assign_config_by_name(config);
+	if (!assign_config) {
+		rdt_last_cmd_printf("Invalid assign configuration %s\n", config);
+		return -ENOENT;
+	}
+
+next:
+	if (!tok || tok[0] == '\0')
+		return 0;
+
+	/* Start processing the strings for each domain */
+	dom_str = strim(strsep(&tok, ";"));
+
+	id_str = strsep(&dom_str, "=");
+
+	/* Check for domain id '*' which means all domains */
+	if (id_str && *id_str == '*') {
+		d = NULL;
+		goto check_state;
+	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
+		rdt_last_cmd_puts("Missing domain id\n");
+		return -EINVAL;
+	}
+
+	/* Verify if the dom_id is valid */
+	found = false;
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		if (d->hdr.id == dom_id) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		rdt_last_cmd_printf("Invalid domain id %lu\n", dom_id);
+		return -EINVAL;
+	}
+
+check_state:
+	assign_type = resctrl_get_assing_type(dom_str);
+
+	switch (assign_type) {
+	case ASSIGN_NONE:
+		ret = resctrl_unassign_cntr_event(r, d, rdtgrp, assign_config->evtid,
+						  assign_config->val);
+		break;
+	case ASSIGN_EXCLUSIVE:
+		ret = resctrl_assign_cntr_event(r, d, rdtgrp, assign_config->evtid,
+						assign_config->val);
+		break;
+	case ASSIGN_INVALID:
+		ret = -EINVAL;
+	}
+
+	if (ret)
+		goto out_fail;
+
+	goto next;
+
+out_fail:
+	sprintf(domain, d ? "%lu" : "*", dom_id);
+
+	rdt_last_cmd_printf("Assign operation '%s:%s=%s' failed\n", config, domain, dom_str);
+
+	return ret;
+}
+
+static ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf,
+					size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdtgroup *rdtgrp;
+	char *token, *config;
+	int ret = 0;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (!rdtgrp) {
+		rdtgroup_kn_unlock(of->kn);
+		return -ENOENT;
+	}
+	rdt_last_cmd_clear();
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		rdt_last_cmd_puts("mbm_cntr_assign mode is not enabled\n");
+		rdtgroup_kn_unlock(of->kn);
+		return -EINVAL;
+	}
+
+	while ((token = strsep(&buf, "\n")) != NULL) {
+		/*
+		 * The write command follows the following format:
+		 * "<Assign config>:<domain_id>=<assign mode>"
+		 * Extract Assign config first.
+		 */
+		config = strsep(&token, ":");
+
+		ret = resctrl_process_assign(r, rdtgrp, config, token);
+		if (ret)
+			break;
+	}
+
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret ?: nbytes;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -2246,9 +2405,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_L3_assignments",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= mbm_L3_assignments_show,
+		.write		= mbm_L3_assignments_write,
 	},
 	{
 		.name		= "mbm_assign_mode",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 25/26] x86/resctrl: Introduce the interface to switch between monitor modes
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (23 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 24/26] x86/resctrl: Introduce the interface to modify " Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  2025-04-04  0:18 ` [PATCH v12 26/26] x86/resctrl: Configure mbm_cntr_assign mode if supported Babu Moger
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

The resctrl subsystem supports two monitoring modes, "mbm_cntr_assign" and
"default". In "mbm_cntr_assign" mode, a monitoring event can only accumulate
data while it is backed by a hardware counter. In "default" mode, resctrl
assumes there is a hardware counter for each event within every CTRL_MON
and MON group.

Introduce an interface to switch between mbm_cntr_assign and default modes.

$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_cntr_assign]
default

To enable the "mbm_cntr_assign" monitoring mode:
$ echo "mbm_cntr_assign" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

To enable the "default" monitoring mode:
$ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

MBM event counters are automatically reset as part of changing the mode.
Clear both architectural and non-architectural event states to prevent
overflow conditions during the next event read.
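
For illustration only, a user-space tool could pick out the enabled mode from
the text read from mbm_assign_mode roughly as follows. This is a minimal
sketch, not code from this patch: find_enabled_mode() is a hypothetical name,
and the sketch assumes the enabled mode is the only bracketed entry, as in the
example output above.

```c
#include <string.h>

/*
 * Given the text read from mbm_assign_mode, copy the bracketed (enabled)
 * mode name into @mode. Returns 0 on success, -1 if no bracketed entry is
 * found or the name does not fit in @len bytes.
 */
static int find_enabled_mode(const char *text, char *mode, size_t len)
{
	const char *lb = strchr(text, '[');
	const char *rb = lb ? strchr(lb, ']') : NULL;
	size_t n;

	if (!rb)
		return -1;

	/* Number of characters strictly between '[' and ']' */
	n = (size_t)(rb - lb - 1);
	if (n == 0 || n >= len)
		return -1;

	memcpy(mode, lb + 1, n);
	mode[n] = '\0';
	return 0;
}
```

Because the counters are reset when the mode changes, a tool would likely
call something like this before writing a new mode and skip the write when
the requested mode is already enabled.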

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Fixed the documentation for consistency.
     Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear
     counters and non-architectural states when monitor mode is changed.
     https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/

v11: Changed the name of the function rdtgroup_mbm_assign_mode_write() to
     resctrl_mbm_assign_mode_write().
     Rewrote the commit message with context.
     Added a few more details in resctrl.rst about mbm_cntr_assign mode.
     Re-arranged the text in resctrl.rst file.

v10: The call mbm_cntr_reset() has been moved to earlier patch.
     Minor documentation update.

v9: Fixed extra spaces in user documentation.
    Fixed a problem changing the mode to mbm_cntr_assign mode when it is
    not supported. Added extra checks to detect if the system supports it.
    Used the rdtgroup_cntr_id_init to initialize cntr_id.

v8: Reset the internal counters after mbm_cntr_assign mode is changed.
    Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset()
    Updated the documentation to make text generic.

v7: Changed the interface name to mbm_assign_mode.
    Removed the references of ABMC.
    Added the changes to reset global and domain bitmaps.
    Added the changes to reset rmid.

v6: Changed the mode name to mbm_cntr_assign.
    Moved all the FS related code here.
    Added changes to reset mbm_cntr_map and resctrl group counters.

v5: Change log and mode description text correction.

v4: Minor commit text changes. Keep the default to ABMC when supported.
    Fixed comments to reflect changed interface "mbm_mode".

v3: New patch to address the review comments from upstream.
---
 Documentation/arch/x86/resctrl.rst     | 25 ++++++++++-
 arch/x86/kernel/cpu/resctrl/internal.h |  2 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 16 +++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 58 +++++++++++++++++++++++++-
 4 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index ad35c38eed34..05f1852ad8e2 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -259,7 +259,10 @@ with the following files:
 
 "mbm_assign_mode":
 	Reports the list of monitoring modes supported. The enclosed brackets
-	indicate which mode is enabled.
+	indicate which mode is enabled. The MBM events (mbm_total_bytes and/or
+	mbm_local_bytes) associated with counters may reset when "mbm_assign_mode"
+	is changed.
+
 	::
 
 	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
@@ -275,6 +278,16 @@ with the following files:
 	counters available is described in the "num_mbm_cntrs" file. Changing
 	the mode may cause all counters on the resource to reset.
 
+	Moving to mbm_cntr_assign mode requires users to assign the counters to
+	the events. Otherwise, the MBM event counters will return 'Unassigned'
+	when read.
+
+	The mode is beneficial for AMD platforms that support more CTRL_MON
+	and MON groups than available hardware counters. By default, this
+	feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth
+	Monitoring Counters) capability, ensuring counters remain assigned even
+	when the corresponding RMID is not actively used by any processor.
+
 	"default":
 
 	In default mode, resctrl assumes there is a hardware counter for each
@@ -284,6 +297,16 @@ with the following files:
 	misleading values or display "Unavailable" if no counter is assigned
 	to the event.
 
+	* To enable "mbm_cntr_assign" monitoring mode:
+	  ::
+
+	    # echo "mbm_cntr_assign" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
+	* To enable "default" monitoring mode:
+	  ::
+
+	    # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
 "num_mbm_cntrs":
 	The maximum number of monitoring counters (total of available and assigned
 	counters) in each domain when the system supports mbm_cntr_assign mode.
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2020a2fe7135..2f3a5d78d153 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -599,6 +599,8 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
 				struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
 int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
 		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d);
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 0c6fd5f6ec19..7f2e1fdfa936 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -610,6 +610,17 @@ static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 closid,
 	}
 }
 
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+
+	if (resctrl_arch_is_mbm_total_enabled())
+		memset(d->mbm_total, 0, sizeof(struct mbm_state) * idx_limit);
+
+	if (resctrl_arch_is_mbm_local_enabled())
+		memset(d->mbm_local, 0, sizeof(struct mbm_state) * idx_limit);
+}
+
 static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	int cpu = smp_processor_id();
@@ -1558,6 +1569,11 @@ static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
 	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
 }
 
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+	memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs);
+}
+
 /*
  * Allocate a fresh counter and configure the event if not assigned already.
  */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5d9c4c216522..d10cf1e5b914 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1050,6 +1050,61 @@ static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of,
+					     char *buf, size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	struct rdt_mon_domain *d;
+	int ret = 0;
+	bool enable;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!strcmp(buf, "default")) {
+		enable = 0;
+	} else if (!strcmp(buf, "mbm_cntr_assign")) {
+		if (r->mon.mbm_cntr_assignable) {
+			enable = 1;
+		} else {
+			ret = -EINVAL;
+			rdt_last_cmd_puts("mbm_cntr_assign mode is not supported\n");
+			goto write_exit;
+		}
+	} else {
+		ret = -EINVAL;
+		rdt_last_cmd_puts("Unsupported assign mode\n");
+		goto write_exit;
+	}
+
+	if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
+		if (ret)
+			goto write_exit;
+		/*
+		 * Reset all the non-architectural RMID state and assignable counters.
+		 */
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			mbm_cntr_free_all(r, d);
+			resctrl_reset_rmid_all(r, d);
+		}
+	}
+
+write_exit:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2412,9 +2467,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_assign_mode",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= resctrl_mbm_assign_mode_show,
+		.write		= resctrl_mbm_assign_mode_write,
 		.fflags		= RFTYPE_MON_INFO,
 	},
 	{
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v12 26/26] x86/resctrl: Configure mbm_cntr_assign mode if supported
  2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (24 preceding siblings ...)
  2025-04-04  0:18 ` [PATCH v12 25/26] x86/resctrl: Introduce the interface to switch between monitor modes Babu Moger
@ 2025-04-04  0:18 ` Babu Moger
  25 siblings, 0 replies; 80+ messages in thread
From: Babu Moger @ 2025-04-04  0:18 UTC (permalink / raw)
  To: tony.luck, reinette.chatre, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, babu.moger, kan.liang, xin3.li,
	ebiggers, xin, sohil.mehta, andrew.cooper3, mario.limonciello,
	linux-doc, linux-kernel, maciej.wieczor-retman, eranian

Configure mbm_cntr_assign mode on AMD platforms. On AMD platforms, it is
recommended to use the mbm_cntr_assign mode, if supported, to prevent the
hardware from resetting counters between reads. Such resets can result in
misleading values, or in "Unavailable" being displayed if no counter is
assigned to the event.

The mbm_cntr_assign mode, referred to as ABMC (Assignable Bandwidth
Monitoring Counters) on AMD, is enabled by default when supported by the
system.

Update ABMC across all logical processors within the resctrl domain to
ensure proper functionality.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v12: Moved the resctrl_arch_mbm_cntr_assign_set_one to domain_add_cpu_mon().
     Updated the commit log.

v11: Commit text in imperative tone. Added a few more details.
     Moved resctrl_arch_mbm_cntr_assign_set_one() to monitor.c.

v10: Commit text in imperative tone.

v9: Minor code change due to merge. Actual code did not change.

v8: Renamed resctrl_arch_mbm_cntr_assign_configure to
        resctrl_arch_mbm_cntr_assign_set_one.
    Added r->mon_capable check.
    Commit message update.

v7: Introduced resctrl_arch_mbm_cntr_assign_configure() to configure.
    Moved the default settings to rdt_get_mon_l3_config(). It should be
    done before the hotplug handler is called. It cannot be done at
    rdtgroup_init().

v6: Keeping the default enablement in arch init code for now.
     This may need some discussion.
     Renamed resctrl_arch_configure_abmc to resctrl_arch_mbm_cntr_assign_configure.

v5: New patch to enable ABMC by default.
---
 arch/x86/kernel/cpu/resctrl/core.c     | 7 +++++++
 arch/x86/kernel/cpu/resctrl/internal.h | 1 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 8 ++++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 16f700c2d00d..4f21196bbeb7 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -533,6 +533,9 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 		d = container_of(hdr, struct rdt_mon_domain, hdr);
 
 		cpumask_set_cpu(cpu, &d->hdr.cpu_mask);
+		/* Update the mbm_cntr_assign state for the CPU if supported */
+		if (r->mon.mbm_cntr_assignable)
+			resctrl_arch_mbm_cntr_assign_set_one(r);
 		return;
 	}
 
@@ -551,6 +554,10 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
 	}
 	cpumask_set_cpu(cpu, &d->hdr.cpu_mask);
 
+	/* Update the mbm_cntr_assign state for the CPU if supported */
+	if (r->mon.mbm_cntr_assignable)
+		resctrl_arch_mbm_cntr_assign_set_one(r);
+
 	arch_mon_domain_online(r, d);
 
 	if (arch_domain_mbm_alloc(r->mon.num_rmid, hw_dom)) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2f3a5d78d153..72b4a9334c2b 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -601,6 +601,7 @@ int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
 		 struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
 void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d);
 void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d);
+void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 7f2e1fdfa936..137c76dda875 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1329,6 +1329,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 		cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
 		r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
 		r->mon.mbm_assign_on_mkdir = true;
+		hw_res->mbm_cntr_assign_enabled = true;
 	}
 
 	r->mon_capable = true;
@@ -1445,6 +1446,13 @@ static void resctrl_abmc_set_one_amd(void *arg)
 		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
 }
 
+void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r)
+{
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	resctrl_abmc_set_one_amd(&hw_res->mbm_cntr_assign_enabled);
+}
+
 /*
  * ABMC enable/disable requires update of L3_QOS_EXT_CFG MSR on all the CPUs
  * associated with all monitor domains.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain
  2025-04-04  0:18 ` [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain Babu Moger
@ 2025-04-11 20:49   ` Reinette Chatre
  2025-04-14 15:56     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 20:49 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
> supported, the bandwidth events can be configured to track specific
> events. The event configuration is domain specific. Event configurations
> are not stored in resctrl but instead always read from or written to
> hardware directly when prompted by user space.

Why is this a problem?

> 
> Read the event configuration from the hardware during domain
> initialization and store the configuration value in the rdt_hw_mon_domain
> structure for later use when the user requests to display it.

Why is this required?

This series is about adding support for ABMC while this appears to be
an optimization for BMEC. Even more, as I see it, this optimization makes
resctrl support of ABMC and BMEC confusing (more below).

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: Fixed the conflicts due to recent merge.
>      This patch is for BMEC and there is no dependency on the ABMC feature.

Why still do it?

>      Moved it earlier.
> 
> v11: Resolved minor conflicts due to code displacement. Actual code didn't
>      change.
> 
> v10: Conflicts due to code displacement. Actual code didn't change.
> v10: Conflicts due to code displacement. Actual code didnt change.
> 
> v9: Added Reviewed-by tag. No other changes.
> 
> v8: Renamed resctrl_mbm_evt_config_init() to arch_mbm_evt_config_init()
>     Minor commit message update.
> 
> v7: Fixed initializing INVALID_CONFIG_VALUE to mbm_local_cfg in case of error.
> 
> v6: Renamed resctrl_arch_mbm_evt_config -> resctrl_mbm_evt_config_init
>     Initialized value to INVALID_CONFIG_VALUE if it is not configurable.
>     Minor commit message update.
> 
> v5: Exported mon_event_config_index_get.
>     Renamed arch_domain_mbm_evt_config to resctrl_arch_mbm_evt_config.
> 
> v4: Read the configuration information from the hardware to initialize.
>     Added few commit messages.
>     Fixed the tab spaces.
> 
> v3: Minor changes related to rebase in mbm_config_write_domain.
> 
> v2: No changes.
> ---
>  arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>  arch/x86/kernel/cpu/resctrl/internal.h |  9 +++++++++
>  arch/x86/kernel/cpu/resctrl/monitor.c  | 26 ++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
>  4 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index cf29681d01e0..a28de257168f 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -558,6 +558,8 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
>  		return;
>  	}
>  
> +	arch_mbm_evt_config_init(hw_dom);
> +
>  	list_add_tail_rcu(&d->hdr.list, add_pos);
>  
>  	err = resctrl_online_mon_domain(r, d);
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index c44c5b496355..9846153aa48f 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -32,6 +32,9 @@
>   */
>  #define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
>  
> +#define INVALID_CONFIG_VALUE		U32_MAX
> +#define INVALID_CONFIG_INDEX		UINT_MAX
> +
>  /**
>   * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
>   *			        aren't marked nohz_full
> @@ -335,6 +338,8 @@ struct rdt_hw_ctrl_domain {
>   * @d_resctrl:	Properties exposed to the resctrl file system
>   * @arch_mbm_total:	arch private state for MBM total bandwidth
>   * @arch_mbm_local:	arch private state for MBM local bandwidth
> + * @mbm_total_cfg:	MBM total bandwidth configuration
> + * @mbm_local_cfg:	MBM local bandwidth configuration
>   *
>   * Members of this structure are accessed via helpers that provide abstraction.
>   */
> @@ -342,6 +347,8 @@ struct rdt_hw_mon_domain {
>  	struct rdt_mon_domain		d_resctrl;
>  	struct arch_mbm_state		*arch_mbm_total;
>  	struct arch_mbm_state		*arch_mbm_local;
> +	u32				mbm_total_cfg;
> +	u32				mbm_local_cfg;
>  };

This introduces an architecture managed per-domain event configuration while
the rest of the series introduces a resctrl fs managed global event configuration.
I see this as the start of a source for confusion about how events are configured since
there is no further connection between this per-domain event configuration maintained
by the architecture and the global event configuration maintained by resctrl fs.

>  
>  static inline struct rdt_hw_ctrl_domain *resctrl_to_arch_ctrl_dom(struct rdt_ctrl_domain *r)
> @@ -504,6 +511,8 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
>  void rdt_staged_configs_clear(void);
>  bool closid_allocated(unsigned int closid);
>  int resctrl_find_cleanest_closid(void);
> +void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
> +unsigned int mon_event_config_index_get(u32 evtid);
>  
>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index a93ed7d2a160..abd337fbd01d 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1284,6 +1284,32 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>  	return 0;
>  }
>  
> +void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom)
> +{
> +	unsigned int index;
> +	u64 msrval;
> +
> +	/*
> +	 * Read the configuration registers QOS_EVT_CFG_n, where <n> is
> +	 * the BMEC event number (EvtID).
> +	 */
> +	if (mbm_total_event.configurable) {

Please keep an eye on where things are going in the arch/fs split.
mbm_total_event is private to resctrl fs and arch code cannot reach into it.
There is the arch helper resctrl_arch_is_evt_configurable() but I also
think that this helper needs to be reconsidered in the light of ABMC.

Overall I think this ABMC support needs to consider what already exists
for BMEC support and ensure that both are supported coherently. For example,
when a monitor domain has a "MBM local bandwidth configuration" then it should
be obvious what that means.

Reinette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 02/26] x86/resctrl: Remove MSR reading of event configuration value
  2025-04-04  0:18 ` [PATCH v12 02/26] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
@ 2025-04-11 20:50   ` Reinette Chatre
  2025-04-14 15:57     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 20:50 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> The event configuration is domain specific and initialized during domain
> initialization. The values are stored in struct rdt_hw_mon_domain.
> 
> It is not required to read the configuration register every time user asks
> for it. Use the value stored in struct rdt_hw_mon_domain instead.

Storing and maintaining the event configuration creates confusion with
the new event configurations introduced in the rest of this series. I
think that it will be simpler to keep BMEC support as-is.

Reinette


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-04  0:18 ` [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2025-04-11 20:52   ` Reinette Chatre
  2025-04-14 17:48     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 20:52 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> Users can create as many monitor groups as RMIDs supported by the hardware.
> However, bandwidth monitoring feature on AMD system only guarantees that
> RMIDs currently assigned to a processor will be tracked by hardware. The
> counters of any other RMIDs which are no longer being tracked will be reset
> to zero. The MBM event counters return "Unavailable" for the RMIDs that are
> not tracked by hardware. So, there can be only a limited number of groups
> that can give guaranteed monitoring numbers. With ever-changing
> configurations there is no way to know definitively which of these groups
> are being tracked at a given point in time. Users do not have the option to
> monitor a group or set of groups for a certain period of time without
> worrying about the RMID being reset in between.
> 
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID, event pair and monitor the bandwidth as long as it is
> assigned. The assigned RMID will be tracked by the hardware until the user
> unassigns it manually. There is no need to worry about counters being reset
> during this period. Additionally, the user can specify a bitmask
> identifying the specific bandwidth types from the given source to track
> with the counter.
> 
> Without ABMC enabled, monitoring will work in current mode without
> assignment option.
> 
> Linux resctrl subsystem provides the interface to count maximum of two
> memory bandwidth events per group, from a combination of available total
> and local events. Keeping the current interface, users can enable a maximum
> of 2 ABMC counters per group. User will also have the option to enable only
> one counter to the group. If the system runs out of assignable ABMC
> counters, kernel will display an error. Users need to disable an already
> enabled counter to make space for new assignments.

The above paragraph sounds like it is still talking about the original
global assignment of counters.

> 
> The feature can be detected via CPUID_Fn80000020_EBX_x00 bit 5.
> Bits Description
> 5    ABMC (Assignable Bandwidth Monitoring Counters)
> 
> The feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> 
> Note: Checkpatch checks/warnings are ignored to maintain coding style.
> 
> v12: Removed the dependency on X86_FEATURE_BMEC.

Considering this removal it is not clear to me how the BMEC and ABMC features
are managed on a platform. Since this dependency existed I assume platforms
that support both ABMC and BMEC exist and after previous discussion [1]
I expected to see that BMEC support will be disabled when ABMC is detected
but I do not see this done in this series. From what I can tell, looking at
patch "x86/resctrl: Detect Assignable Bandwidth Monitoring feature details"
BMEC and ABMC are both detected and enabled while I do not see any
interactions handled. For example, a user modifying the BMEC configuration
appears to have no impact on existing ABMC assigned counters. Could you please clarify
how event configuration works on platforms that support both ABMC and BMEC?

Reinette

[1] https://lore.kernel.org/all/4b66ea1c-2f76-4218-a67b-2232b2be6990@amd.com/

* Re: [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode
  2025-04-04  0:18 ` [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
@ 2025-04-11 20:56   ` Reinette Chatre
  2025-04-14 19:52     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 20:56 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> Introduce the interface file "mbm_assign_mode" to list monitor modes

Using "resctrl file" instead of "interface file" should help to make it
clear what this patch does.

> supported.
> 
> The "mbm_cntr_assign" mode provides the option to assign a counter to
> an RMID, event pair and monitor the bandwidth as long as it is assigned.
> 
> On AMD systems "mbm_cntr_assign" mode is backed by the ABMC (Assignable
> Bandwidth Monitoring Counters) hardware feature and is enabled by default.
> 
> The "default" mode is the existing monitoring mode that works without the
> explicit counter assignment, instead relying on dynamic counter assignment
> by hardware that may result in hardware not dedicating a counter resulting
> in monitoring data reads returning "Unavailable".
> 
> Provide an interface to display the monitor mode on the system.
> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> [mbm_cntr_assign]
> default
> 
> Added an IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to handle Arm64

(needs imperative)

> platforms. On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined, whereas on
> Arm64, it is. As a result, for MPAM, the file would be either:

CONFIG_RESCTRL_ASSIGN_FIXED does not yet exist anywhere, so this motivation needs
to provide stronger support for why it is used before it exists. There is a precedent
here: RESCTRL_RMID_DEPENDS_ON_CLOSID is already used while it does not yet
appear in a Kconfig file. I would propose motivating this by noting that it is
already understood how Arm supports assignable counters and that James recommended
this check to prepare for that work. Since this is user interface, the
work is done early to ensure the user interface is compatible with that upcoming
support. Also set folks at ease that IS_ENABLED() works as expected with a
non-existing config.
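
To illustrate that last point, here is a minimal user-space sketch of the
preprocessor trick that IS_ENABLED() relies on. It shows that an option which
is never defined evaluates to 0 instead of breaking the build. The demo_/DEMO_
names and CONFIG_DEMO_PRESENT are hypothetical, the module (=m) case is left
out, and the kernel's real macros in include/linux/kconfig.h differ in detail.

```c
#include <assert.h>

/*
 * User-space imitation of the "is this macro defined to 1?" trick.
 * All DEMO_/demo_ names are hypothetical and exist only for this sketch.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined(x) demo_is_defined_(x)
#define demo_is_defined_(val) demo_is_defined__(DEMO_ARG_PLACEHOLDER_##val)
/*
 * If the option expands to 1, arg1_or_junk becomes "0," and the second
 * argument is 1. Otherwise arg1_or_junk is a single junk token and the
 * second argument is 0.
 */
#define demo_is_defined__(arg1_or_junk) \
	demo_take_second_arg(arg1_or_junk 1, 0, 0)
#define DEMO_IS_ENABLED(option) demo_is_defined(option)

/* Stand-in for an option that Kconfig set to y. */
#define CONFIG_DEMO_PRESENT 1

/* CONFIG_RESCTRL_ASSIGN_FIXED is deliberately left undefined here. */
int demo_assign_fixed(void)
{
	return DEMO_IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED);
}

int demo_present(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_PRESENT);
}
```

In this sketch a check on the not-yet-existing option simply takes the
"not enabled" path at compile time, which is the behaviour the changelog
could spell out.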


> [default]
> or
> [mbm_cntr_assign]
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: Minor text update in change log and user documentation.
>      Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
>      This will be defined only in arm and not in x86.
> 
> v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
>      Removed few texts in resctrl.rst about AMD specific information.
>      Updated few texts.
> 
> v10: Added more text to the user documentation to clarify the default mode.
> 
> v9: Updated user documentation based on comments.
> 
> v8: Commit message update.
> 
> v7: Updated the descriptions/commit log in resctrl.rst to generic text.
>     Thanks to James and Reinette.
>     Rename mbm_mode to mbm_assign_mode.
>     Introduced mutex lock in rdtgroup_mbm_mode_show().
> 
> v6: Added documentation for mbm_cntr_assign and legacy mode.
>     Moved mbm_mode fflags initialization to static initialization.
> 
> v5: Changed interface name to mbm_mode.
>     It will be always available even if ABMC feature is not supported.
>     Added description in resctrl.rst about ABMC mode.
>     Fixed display abmc and legacy consistently.
> 
> v4: Fixed the checks for legacy and abmc mode. Default is ABMC.
> 
> v3: New patch to display ABMC capability.
> 
> ---
>  Documentation/arch/x86/resctrl.rst     | 27 +++++++++++++++++++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 37 ++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index fb90f08e564e..bb96b44019fe 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -257,6 +257,33 @@ with the following files:
>  	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>  	    0=0x30;1=0x30;3=0x15;4=0x15
>  
> +"mbm_assign_mode":
> +	Reports the list of monitoring modes supported. The enclosed brackets
> +	indicate which mode is enabled.
> +	::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +	  [mbm_cntr_assign]
> +	  default
> +
> +	"mbm_cntr_assign":
> +
> +	In mbm_cntr_assign mode, a monitoring event can only accumulate data
> +	while it is backed by a hardware counter. The user-space is able to
> +	specify which of the events in CTRL_MON or MON groups should have a
> +	counter assigned using the "mbm_assign_control" file. The number of

"mbm_assign_control" no longer exists.

> +	counters available is described in the "num_mbm_cntrs" file. Changing
> +	the mode may cause all counters on the resource to reset.
> +
> +	"default":
> +
> +	In default mode, resctrl assumes there is a hardware counter for each
> +	event within every CTRL_MON and MON group. On AMD platforms, it is
> +	recommended to use the mbm_cntr_assign mode, if supported, to prevent
> +	the hardware from resetting counters between reads. This can result in

"from resetting counters" -> "from re-allocating counters"?

> +	misleading values or display "Unavailable" if no counter is assigned
> +	to the event.
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 17de38e26f94..626be6becca7 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -882,6 +882,36 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
>  	return ret;
>  }
>  
> +static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
> +					struct seq_file *s, void *v)
> +{
> +	struct rdt_resource *r = of->kn->parent->priv;
> +	bool enabled;
> +
> +	mutex_lock(&rdtgroup_mutex);
> +	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
> +
> +	if (r->mon.mbm_cntr_assignable) {
> +		if (enabled)
> +			seq_puts(s, "[mbm_cntr_assign]\n");
> +		else
> +			seq_puts(s, "[default]\n");
> +
> +		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
> +			if (enabled)
> +				seq_puts(s, "default\n");
> +			else
> +				seq_puts(s, "mbm_cntr_assign\n");
> +		}
> +	} else {
> +		seq_puts(s, "[default]\n");
> +	}
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +
> +	return 0;
> +}
> +
>  #ifdef CONFIG_PROC_CPU_RESCTRL
>  
>  /*
> @@ -1908,6 +1938,13 @@ static struct rftype res_common_files[] = {
>  		.seq_show	= mbm_local_bytes_config_show,
>  		.write		= mbm_local_bytes_config_write,
>  	},
> +	{
> +		.name		= "mbm_assign_mode",
> +		.mode		= 0444,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= resctrl_mbm_assign_mode_show,
> +		.fflags		= RFTYPE_MON_INFO,

Needs a RFTYPE_RES_CACHE?

> +	},
>  	{
>  		.name		= "cpus",
>  		.mode		= 0644,

Reinette
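
The listing rules implemented by resctrl_mbm_assign_mode_show() in the quoted
hunk can be modelled in user space as below. This is only a sketch:
demo_format_assign_mode() and its parameters are hypothetical stand-ins for
r->mon.mbm_cntr_assignable, the result of
resctrl_arch_mbm_cntr_assign_enabled() and
IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED), and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of the mode listing. The active mode is printed in
 * brackets. The alternative mode is listed only when counters are
 * assignable and the mode is not fixed.
 */
int demo_format_assign_mode(char *buf, size_t len, bool assignable,
			    bool enabled, bool fixed)
{
	const char *active = (assignable && enabled) ? "mbm_cntr_assign"
						     : "default";
	const char *other = enabled ? "default" : "mbm_cntr_assign";

	if (assignable && !fixed)
		return snprintf(buf, len, "[%s]\n%s\n", active, other);

	return snprintf(buf, len, "[%s]\n", active);
}
```

With assignable=true, enabled=true and fixed=false this reproduces the
two-line x86 output from the changelog. With fixed=true only the bracketed
line is produced, matching the MPAM case described there.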

* Re: [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters
  2025-04-04  0:18 ` [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters Babu Moger
@ 2025-04-11 21:01   ` Reinette Chatre
  2025-04-14 20:12     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:01 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> The mbm_cntr_assign mode provides an option to the user to assign a
> counter to an RMID, event pair and monitor the bandwidth as long as
> the counter is assigned. Number of assignments depend on number of
> monitoring counters available.
> 
> Provide the interface to display the number of monitoring counters

An interface can also be a function. To help make this work obvious
it can be specific:

	Create 'num_mbm_cntrs' resctrl file that displays the number of
	monitoring counters supported in each domain. 'num_mbm_cntrs'
	is only visible to user space when the system supports
	mbm_cntr_assign mode.

> supported in each domain. The resctrl file 'num_mbm_cntrs' is visible
> to user space when the system supports mbm_cntr_assign mode.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...

> ---
>  Documentation/arch/x86/resctrl.rst     | 11 ++++++++++
>  arch/x86/kernel/cpu/resctrl/monitor.c  |  3 +++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 30 ++++++++++++++++++++++++++
>  3 files changed, 44 insertions(+)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index bb96b44019fe..35d908befdfb 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -284,6 +284,17 @@ with the following files:
>  	misleading values or display "Unavailable" if no counter is assigned
>  	to the event.
>  
> +"num_mbm_cntrs":
> +	The maximum number of monitoring counters (total of available and assigned
> +	counters) in each domain when the system supports mbm_cntr_assign mode.
> +
> +	For example, on a system with maximum of 32 memory bandwidth monitoring
> +	counters in each of its L3 domains:
> +	::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
> +	  0=32;1=32
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 6ed7e51d3fdb..028b49878ad0 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1234,6 +1234,9 @@ int __init resctrl_mon_resource_init(void)
>  	else if (resctrl_arch_is_mbm_total_enabled())
>  		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
>  
> +	if (r->mon.mbm_cntr_assignable)
> +		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);

Missing RFTYPE_RES_CACHE?

> +
>  	return 0;
>  }
>  
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 626be6becca7..0c9d7a702b93 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -912,6 +912,30 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>  	return 0;
>  }
>  
> +static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
> +				      struct seq_file *s, void *v)
> +{
> +	struct rdt_resource *r = of->kn->parent->priv;
> +	struct rdt_mon_domain *dom;
> +	bool sep = false;
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
> +		if (sep)
> +			seq_puts(s, ";");

seq_putc() can be used.

> +
> +		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
> +		sep = true;
> +	}
> +	seq_puts(s, "\n");

seq_putc() can be used.

> +
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +	return 0;
> +}
> +
>  #ifdef CONFIG_PROC_CPU_RESCTRL
>  
>  /*
> @@ -1945,6 +1969,12 @@ static struct rftype res_common_files[] = {
>  		.seq_show	= resctrl_mbm_assign_mode_show,
>  		.fflags		= RFTYPE_MON_INFO,
>  	},
> +	{
> +		.name		= "num_mbm_cntrs",
> +		.mode		= 0444,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= resctrl_num_mbm_cntrs_show,
> +	},
>  	{
>  		.name		= "cpus",
>  		.mode		= 0644,

Reinette
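
For reference, the "<domain id>=<count>" format that
resctrl_num_mbm_cntrs_show() emits can be sketched in user space as follows.
demo_format_num_cntrs() is a hypothetical helper that writes into a plain
buffer instead of a seq_file, and the domain IDs are passed in as an array
rather than walked from r->mon_domains.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of the num_mbm_cntrs listing: one "<id>=<count>" pair
 * per domain, separated by ';' and terminated by a newline. Returns the
 * number of characters written, or -1 if the buffer is too small.
 */
int demo_format_num_cntrs(char *buf, size_t len, const int *dom_ids,
			  int nr_doms, int num_mbm_cntrs)
{
	size_t off = 0;
	int i, n;

	for (i = 0; i < nr_doms; i++) {
		/* Separator goes before every entry except the first. */
		n = snprintf(buf + off, len - off, "%s%d=%d", i ? ";" : "",
			     dom_ids[i], num_mbm_cntrs);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, "\n");
	if (n < 0 || (size_t)n >= len - off)
		return -1;

	return (int)(off + (size_t)n);
}
```

For two domains with IDs 0 and 1 and 32 counters each, this yields the
"0=32;1=32" line shown in the documentation hunk.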

* Re: [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment
  2025-04-04  0:18 ` [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
@ 2025-04-11 21:01   ` Reinette Chatre
  2025-04-14 20:30     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:01 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID, event pair and monitor the bandwidth as long as the
> counter is assigned. The bandwidth events will be tracked by the hardware
> until the user changes the configuration. Each resctrl group can configure
> maximum two counters, one for total event and one for local event.
> 
> The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
> Configuration is done by setting the counter id, bandwidth source (RMID)
> and bandwidth configuration supported by BMEC (Bandwidth Monitoring Event
> Configuration).

Apart from the BMEC optimization in patch #1 and patch #2 this is the
first and only mention of BMEC dependency I see in this series while I do
not see implementation support for this. What am I missing?

Reinette

* Re: [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-04-04  0:18 ` [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
@ 2025-04-11 21:02   ` Reinette Chatre
  2025-04-14 20:51     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:02 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID, event pair and monitor the bandwidth as long as it
> is assigned. The assigned RMID will be tracked by the hardware until the
> user unassigns it manually.
> 
> Implement an architecture-specific handler to assign and unassign the
> counter. Configure counters by writing to the L3_QOS_ABMC_CFG MSR,
> specifying the counter ID, bandwidth source (RMID), and event
> configuration.
> 
> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>     Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>     Monitoring (ABMC).
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...

> ---
>  arch/x86/kernel/cpu/resctrl/monitor.c | 39 +++++++++++++++++++++++++++
>  include/linux/resctrl.h               | 15 +++++++++++
>  2 files changed, 54 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 8a88ac29d57d..77f8662dc50b 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1430,3 +1430,42 @@ int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
>  
>  	return 0;
>  }
> +
> +static void resctrl_abmc_config_one_amd(void *info)
> +{
> +	union l3_qos_abmc_cfg *abmc_cfg = info;
> +
> +	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
> +}
> +
> +/*
> + * Send an IPI to the domain to assign the counter to RMID, event pair.
> + */
> +int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			     enum resctrl_event_id evtid, u32 rmid, u32 closid,
> +			     u32 cntr_id, u32 evt_cfg, bool assign)
> +{
> +	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
> +	union l3_qos_abmc_cfg abmc_cfg = { 0 };
> +	struct arch_mbm_state *am;
> +
> +	abmc_cfg.split.cfg_en = 1;
> +	abmc_cfg.split.cntr_en = assign ? 1 : 0;
> +	abmc_cfg.split.cntr_id = cntr_id;
> +	abmc_cfg.split.bw_src = rmid;
> +	abmc_cfg.split.bw_type = evt_cfg;
> +
> +	smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
> +
> +	/*
> +	 * Reset the architectural state so that reading of hardware
> +	 * counter is not considered as an overflow in next update.

Please add something like: "The hardware counter is reset (because cfg_en == 1)
so there is no need to record initial non-zero counts."

> +	 */
> +	if (assign) {
> +		am = get_arch_mbm_state(hw_dom, rmid, evtid);
> +		if (am)
> +			memset(am, 0, sizeof(*am));
> +	}
> +
> +	return 0;
> +}
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 294b15de664e..60270606f1b8 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h

Please keep the declaration internal to the arch code. It can be moved when
another architecture needs it.

> @@ -394,6 +394,21 @@ void resctrl_arch_mon_event_config_set(void *config_info);
>  u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
>  				      enum resctrl_event_id eventid);
>  
> +/**
> + * resctrl_arch_config_cntr() - Configure the counter on the domain
> + * @r:			resource that the counter should be read from.
> + * @d:			domain that the counter should be read from.
> + * @evtid:		event type to assign
> + * @rmid:		rmid of the counter to read.
> + * @closid:		closid that matches the rmid.
> + * @cntr_id:		Counter ID to configure
> + * @evt_cfg:		event configuration

"event configuration" is simply an expansion of member name and does not help to
understand what the value represents.

> + * @assign:		assign or unassign

Please rework the kernel doc: consistent sentence structure (starts with upper case,
ends with period), use proper capitalization for acronyms (rmid -> RMID, etc.),
make descriptions informative.

> + */
> +int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +                             enum resctrl_event_id evtid, u32 rmid, u32 closid,
> +                             u32 cntr_id, u32 evt_cfg, bool assign);
> +
>  /* For use by arch code to remap resctrl's smaller CDP CLOSID range */
>  static inline u32 resctrl_get_config_index(u32 closid,
>  					   enum resctrl_conf_type type)

This patch does not pass checkpatch.pl.

Reinette

* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-04  0:18 ` [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events Babu Moger
@ 2025-04-11 21:04   ` Reinette Chatre
  2025-04-15 14:20     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:04 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
> can be assigned to an RMID, event pair and monitor the bandwidth as long
> as it is assigned.

Above makes it sound as though multiple counters can be assigned to
an RMID, event pair.

> 
> Add the functionality to allocate and assign the counters to RMID, event
> pair in the domain.

"assign *a* counter to an RMID, event pair"?

> 
> If all the counters are in use, the kernel will log the error message
> "Unable to allocate counter in domain" in /sys/fs/resctrl/info/
> last_cmd_status when a new assignment is requested. Exit on the first
> failure when assigning counters across all the domains.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> ---
>  arch/x86/kernel/cpu/resctrl/internal.h |   2 +
>  arch/x86/kernel/cpu/resctrl/monitor.c  | 124 +++++++++++++++++++++++++
>  2 files changed, 126 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 0b73ec451d2c..1a8ac511241a 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -574,6 +574,8 @@ bool closid_allocated(unsigned int closid);
>  int resctrl_find_cleanest_closid(void);
>  void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
>  unsigned int mon_event_config_index_get(u32 evtid);
> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);

This is internal to resctrl fs. Why is it necessary to provide both the event ID
and the event configuration? Can the event configuration not be determined from
the event ID?

>  
>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 77f8662dc50b..ff55a4fe044f 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1469,3 +1469,127 @@ int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>  
>  	return 0;
>  }
> +
> +/*
> + * Configure the counter for the event, RMID pair for the domain. Reset the
> + * non-architectural state to clear all the event counters.
> + */
> +static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			       enum resctrl_event_id evtid, u32 rmid, u32 closid,
> +			       u32 cntr_id, u32 evt_cfg, bool assign)
> +{
> +	struct mbm_state *m;
> +	int ret;
> +
> +	ret = resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, evt_cfg, assign);
> +	if (ret)
> +		return ret;

I understood previous discussion to conclude that resctrl_arch_config_cntr() cannot fail
and thus I expect it to return void and not need any error checking from caller.
By extension this will result in resctrl_config_cntr() returning void and should simplify
a few flows. For example, it will make it clear that re-configuring an existing counter
cannot result in that counter being freed.

> +
> +	m = get_mbm_state(d, closid, rmid, evtid);
> +	if (m)
> +		memset(m, 0, sizeof(struct mbm_state));
> +
> +	return ret;
> +}
> +

Could you please add comments to these mbm_cntr* functions to provide information
on how the cntr_cfg data structure is used? Please also include details on
callers since it seems to me as though these functions are called
from paths where assignable counters are not supported (mon_event_read()).

> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +{
> +	int cntr_id;
> +
> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
> +		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
> +		    d->cntr_cfg[cntr_id].evtid == evtid)
> +			return cntr_id;
> +	}
> +
> +	return -ENOENT;
> +}
> +
> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +{
> +	int cntr_id;
> +
> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
> +		if (!d->cntr_cfg[cntr_id].rdtgrp) {
> +			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
> +			d->cntr_cfg[cntr_id].evtid = evtid;
> +			return cntr_id;
> +		}
> +	}
> +
> +	return -ENOSPC;
> +}
> +
> +static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
> +{
> +	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
> +}
> +
> +/*
> + * Allocate a fresh counter and configure the event if not assigned already.
> + */
> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> +				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
> +				     u32 evt_cfg)

Same here, why are both evtid and evt_cfg provided as arguments? 

> +{
> +	int cntr_id, ret = 0;
> +
> +	/*
> +	 * No need to allocate or configure if the counter is already assigned
> +	 * and the event configuration is up to date.
> +	 */
> +	cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
> +	if (cntr_id >= 0) {
> +		if (d->cntr_cfg[cntr_id].evt_cfg == evt_cfg)
> +			return 0;
> +
> +		goto cntr_configure;
> +	}
> +
> +	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, evtid);
> +	if (cntr_id <  0) {
> +		rdt_last_cmd_printf("Unable to allocate counter in domain %d\n",
> +				    d->hdr.id);

Please print resource name also.

> +		return cntr_id;
> +	}
> +
> +cntr_configure:
> +	/* Update and configure the domain with the new event configuration value */
> +	d->cntr_cfg[cntr_id].evt_cfg = evt_cfg;
> +
> +	ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, rdtgrp->closid,
> +				  cntr_id, evt_cfg, true);
> +	if (ret) {
> +		rdt_last_cmd_printf("Assignment of event %d failed on domain %d\n",
> +				    evtid, d->hdr.id);

How is the user expected to interpret the event ID (especially when looking forward
where events can be dynamic)? This should rather be the event name.

> +		mbm_cntr_free(d, cntr_id);
> +	}
> +
> +	return ret;
> +}
> +
> +/*
> + * Assign a hardware counter to event @evtid of group @rdtgrp. Counter will be
> + * assigned to all the domains if @d is NULL else the counter will be assigned
> + * to @d.
> + */
> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
> +			      u32 evt_cfg)
> +{
> +	int ret = 0;
> +
> +	if (!d) {
> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +			ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
> +			if (ret)
> +				return ret;
> +		}
> +	} else {
> +		ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
> +	}
> +
> +	return ret;
> +}

Reinette
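
The way the quoted mbm_cntr_get(), mbm_cntr_alloc() and mbm_cntr_free()
helpers use the per-domain cntr_cfg[] array can be summarised with the
user-space model below. It is a sketch under stated assumptions: all demo_
names and DEMO_NUM_CNTRS are hypothetical, a group is represented by an opaque
pointer, the hardware write is omitted and treated as unable to fail (as the
review suggests), and demo_assign() returns the counter ID instead of 0 so
that the result is easy to inspect.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_NUM_CNTRS 2	/* hypothetical; stands in for r->mon.num_mbm_cntrs */

/* Hypothetical stand-in for one per-domain cntr_cfg[] entry. */
struct demo_cntr_cfg {
	const void *grp;	/* owning group, NULL while the counter is free */
	int evtid;
	unsigned int evt_cfg;
};

struct demo_domain {
	struct demo_cntr_cfg cntr_cfg[DEMO_NUM_CNTRS];
};

/* Return the counter already backing (grp, evtid), or -ENOENT. */
int demo_cntr_get(const struct demo_domain *d, const void *grp, int evtid)
{
	int id;

	for (id = 0; id < DEMO_NUM_CNTRS; id++) {
		if (d->cntr_cfg[id].grp == grp &&
		    d->cntr_cfg[id].evtid == evtid)
			return id;
	}

	return -ENOENT;
}

/* Claim the first free counter for (grp, evtid), or return -ENOSPC. */
int demo_cntr_alloc(struct demo_domain *d, const void *grp, int evtid)
{
	int id;

	for (id = 0; id < DEMO_NUM_CNTRS; id++) {
		if (!d->cntr_cfg[id].grp) {
			d->cntr_cfg[id].grp = grp;
			d->cntr_cfg[id].evtid = evtid;
			return id;
		}
	}

	return -ENOSPC;
}

/* Release a counter by clearing its entry. */
void demo_cntr_free(struct demo_domain *d, int id)
{
	memset(&d->cntr_cfg[id], 0, sizeof(d->cntr_cfg[id]));
}

/* Reuse the existing counter if there is one, else allocate a new one. */
int demo_assign(struct demo_domain *d, const void *grp, int evtid,
		unsigned int evt_cfg)
{
	int id = demo_cntr_get(d, grp, evtid);

	if (id < 0) {
		id = demo_cntr_alloc(d, grp, evtid);
		if (id < 0)
			return id;
	}

	/* Record the (possibly updated) event configuration. */
	d->cntr_cfg[id].evt_cfg = evt_cfg;
	return id;
}
```

Because the configuration step cannot fail in this model, re-configuring an
existing counter never reaches demo_cntr_free(), which is the simplification
the review comment on resctrl_config_cntr() points at.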

* Re: [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode
  2025-04-04  0:18 ` [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode Babu Moger
@ 2025-04-11 21:08   ` Reinette Chatre
  2025-04-15 15:00     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:08 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> In mbm_cntr_assign mode, the hardware counter should be assigned to read
> the MBM events.
> 
> Report 'Unassigned' in case the user attempts to read the events without

"the events" -> "the event"?

> assigning the counter.

"the counter" -> "a hardware counter"?

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...

> ---
>  Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 14 ++++++++++++++
>  arch/x86/kernel/cpu/resctrl/internal.h    |  3 +++
>  arch/x86/kernel/cpu/resctrl/monitor.c     |  4 ++--
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
>  5 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 44128fbda4fe..71ed1cfed33a 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -430,6 +430,16 @@ When monitoring is enabled all MON groups will also contain:
>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>  	where "YY" is the node number.
>  
> +	The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters
> +	and allows users to assign a counter to mon_hw_id, event pair enabling
> +	bandwidth monitoring for as long as the counter remains assigned.
> +	The hardware will continue tracking the assigned mon_hw_id until
> +	the user manually unassigns it, ensuring that counters are not reset
> +	during this period. System may run out of assignable counters when
> +	all the counters are already assigned. In that case, MBM event counters

Counters could be unassigned even if there are assignable counters available.

I think the "System may run ..." sentence should be dropped.
The "In that case ..." sentence could be simplified with something like:
"An MBM event returns 'Unassigned' when the event does not have a hardware
counter assigned."
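
To illustrate the simplified wording, here is a minimal user-space sketch. The names format_mbm_event() and NO_CNTR are hypothetical and not part of the patch; the sketch only assumes that a negative counter ID means the event has no hardware counter assigned.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical marker: no hardware counter is assigned to the event. */
#define NO_CNTR (-1)

/*
 * Format the value of one MBM event for user space. An event without an
 * assigned hardware counter reads as "Unassigned", whether or not free
 * counters remain in the system.
 */
static int format_mbm_event(char *buf, size_t len, int cntr_id, uint64_t val)
{
	if (cntr_id < 0)
		return snprintf(buf, len, "Unassigned\n");

	return snprintf(buf, len, "%llu\n", (unsigned long long)val);
}
```

The result depends only on whether this event has a counter, which is why the "System may run out ..." sentence adds nothing.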

> +	will return 'Unassigned' when the event is read. Users must manually
> +	assign a counter to read the events.
> +
>  "mon_hw_id":
>  	Available only with debug option. The identifier used by hardware
>  	for the monitor group. On x86 this is the RMID.

Reinette



* Re: [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters
  2025-04-04  0:18 ` [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters Babu Moger
@ 2025-04-11 21:21   ` Reinette Chatre
  2025-04-15 16:41     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:21 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> Software can read the assignable counters using the QM_EVTSEL and QM_CTR
> register pair.
> 
> QM_EVTSEL Register definition:
> =======================================================
> Bits	Mnemonic	Description
> =======================================================
> 63:44	--		Reserved
> 43:32   RMID		Resource Monitoring Identifier
> 31	ExtEvtID	Extended Event Identifier
> 30:8	--		Reserved
> 7:0	EvtID		Event Identifier
> =======================================================
> 
> The contents of a specific counter can be read by setting the following
> fields in QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID = L3CacheABMC (=1)
> and setting [RMID] to the desired counter ID. Reading QM_CTR will then
> return the contents of the specified counter. The E bit will be set if the
> counter configuration was invalid, or if an invalid counter ID was set

Would an invalid counter configuration be possible at this point? I expect
that an invalid counter configuration would not allow the counter to be
configured in the first place.
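
For reference, the selector described in the changelog can be sketched as below. This is a stand-alone illustration built only from the register table quoted above; the macro names and abmc_evtsel() are made up for the sketch and are not the kernel's definitions.

```c
#include <stdint.h>

/* Field layout taken from the QM_EVTSEL table quoted above. */
#define QM_EVTSEL_RMID_SHIFT	32
#define QM_EVTSEL_RMID_MASK	0xfffULL	/* bits 43:32, 12 bits wide */
#define QM_EVTSEL_EXT_EVT_ID	(1ULL << 31)	/* ExtEvtID */
#define QM_EVTSEL_EVT_ID_MASK	0xffULL		/* bits 7:0 */
#define L3_CACHE_ABMC_EVT_ID	1ULL		/* EvtID = L3CacheABMC */

/* Build the QM_EVTSEL value that selects ABMC counter @cntr_id. */
static uint64_t abmc_evtsel(uint32_t cntr_id)
{
	return (((uint64_t)cntr_id & QM_EVTSEL_RMID_MASK) << QM_EVTSEL_RMID_SHIFT) |
	       QM_EVTSEL_EXT_EVT_ID |
	       (L3_CACHE_ABMC_EVT_ID & QM_EVTSEL_EVT_ID_MASK);
}
```

In this sketch the counter ID simply occupies the RMID field; nothing is assumed about how the hardware reports errors on the subsequent QM_CTR read.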

> in the QM_EVTSEL[RMID] field.
> 
> Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: New patch to support extended event mode when ABMC is enabled.
> ---
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  4 +-
>  arch/x86/kernel/cpu/resctrl/internal.h    |  7 +++
>  arch/x86/kernel/cpu/resctrl/monitor.c     | 69 ++++++++++++++++-------
>  include/linux/resctrl.h                   |  9 +--
>  4 files changed, 63 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 2225c40b8888..da78389c6ac7 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -636,6 +636,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  	rr->r = r;
>  	rr->d = d;
>  	rr->first = first;
> +	rr->cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>  	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
>  	if (IS_ERR(rr->arch_mon_ctx)) {
>  		rr->err = -EINVAL;
> @@ -661,13 +662,14 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  {
>  	struct kernfs_open_file *of = m->private;
> +	enum resctrl_event_id evtid;
>  	struct rdt_domain_hdr *hdr;
>  	struct rmid_read rr = {0};
>  	struct rdt_mon_domain *d;
> -	u32 resid, evtid, domid;
>  	struct rdtgroup *rdtgrp;
>  	struct rdt_resource *r;
>  	union mon_data_bits md;
> +	u32 resid, domid;
>  	int ret = 0;
>  

Why make this change?

>  	rdtgrp = rdtgroup_kn_lock_live(of->kn);
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index fbb045aec7e5..b7d1a59f09f8 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -38,6 +38,12 @@
>  /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
>  #define ABMC_ENABLE_BIT			0
>  
> +/*
> + * ABMC Qos Event Identifiers.
> + */
> +#define ABMC_EXTENDED_EVT_ID		BIT(31)
> +#define ABMC_EVT_ID			1
> +
>  /**
>   * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
>   *			        aren't marked nohz_full
> @@ -156,6 +162,7 @@ struct rmid_read {
>  	struct rdt_mon_domain	*d;
>  	enum resctrl_event_id	evtid;
>  	bool			first;
> +	int			cntr_id;
>  	struct cacheinfo	*ci;
>  	int			err;
>  	u64			val;

This does not look necessary (more below)

> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 5e7970fd0a97..58476c065921 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -269,8 +269,8 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_mon_domain *hw_do
>  }
>  
>  void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
> -			     u32 unused, u32 rmid,
> -			     enum resctrl_event_id eventid)
> +			     u32 unused, u32 rmid, enum resctrl_event_id eventid,
> +			     int cntr_id)
>  {
>  	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>  	int cpu = cpumask_any(&d->hdr.cpu_mask);
> @@ -281,7 +281,15 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
>  	if (am) {
>  		memset(am, 0, sizeof(*am));
>  
> -		prmid = logical_rmid_to_physical_rmid(cpu, rmid);
> +		if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
> +		    resctrl_is_mbm_event(eventid)) {
> +			if (cntr_id < 0)
> +				return;
> +			prmid = cntr_id;
> +			eventid = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;

hmmm ... this is not a valid enum resctrl_event_id.

(before venturing into alternatives we need to study Tony's new RMID series
because he made some changes to the enum that may support this work)
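
Purely to illustrate the type problem, and not as a proposed resolution: the sketch below keeps the logical event in the enum and carries the raw register encoding in a plain 32-bit value. The helper evtsel_raw() and its use_abmc parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Logical event IDs, values as in the existing enum resctrl_event_id. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

#define ABMC_EXTENDED_EVT_ID	(1U << 31)
#define ABMC_EVT_ID		1U

/*
 * Hypothetical helper: translate a logical event into the raw 32-bit value
 * for the low half of QM_EVTSEL. The result is a plain 32-bit integer, so
 * the enum never has to hold a value outside its enumerators.
 */
static uint32_t evtsel_raw(enum resctrl_event_id evtid, bool use_abmc)
{
	if (use_abmc)
		return ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;

	return (uint32_t)evtid;
}
```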


> +		} else {
> +			prmid = logical_rmid_to_physical_rmid(cpu, rmid);
> +		}
>  		/* Record any initial, non-zero count value. */
>  		__rmid_read_phys(prmid, eventid, &am->prev_msr);
>  	}
> @@ -313,12 +321,13 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
>  }
>  
>  int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
> -			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
> -			   u64 *val, void *ignored)
> +			   u32 unused, u32 rmid, int cntr_id,
> +			   enum resctrl_event_id eventid, u64 *val, void *ignored)
>  {
>  	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>  	int cpu = cpumask_any(&d->hdr.cpu_mask);
> +	enum resctrl_event_id peventid;
>  	struct arch_mbm_state *am;
>  	u64 msr_val, chunks;
>  	u32 prmid;
> @@ -326,8 +335,19 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>  
>  	resctrl_arch_rmid_read_context_check();
>  
> -	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
> -	ret = __rmid_read_phys(prmid, eventid, &msr_val);
> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
> +	    resctrl_is_mbm_event(eventid)) {
> +		if (cntr_id < 0)
> +			return cntr_id;
> +
> +		prmid = cntr_id;
> +		peventid = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;

same

> +	} else {
> +		prmid = logical_rmid_to_physical_rmid(cpu, rmid);
> +		peventid = eventid;
> +	}
> +
> +	ret = __rmid_read_phys(prmid, peventid, &msr_val);
>  	if (ret)
>  		return ret;
>  
> @@ -392,7 +412,7 @@ void __check_limbo(struct rdt_mon_domain *d, bool force_free)
>  			break;
>  
>  		entry = __rmid_entry(idx);
> -		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
> +		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, -1,
>  					   QOS_L3_OCCUP_EVENT_ID, &val,
>  					   arch_mon_ctx)) {
>  			rmid_dirty = true;
> @@ -599,7 +619,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>  	u64 tval = 0;
>  
>  	if (rr->first) {
> -		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
> +		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid, rr->cntr_id);
>  		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
>  		if (m)
>  			memset(m, 0, sizeof(struct mbm_state));
> @@ -610,7 +630,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>  		/* Reading a single domain, must be on a CPU in that domain. */
>  		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
>  			return -EINVAL;
> -		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
> +		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->cntr_id,
>  						 rr->evtid, &tval, rr->arch_mon_ctx);
>  		if (rr->err)
>  			return rr->err;
> @@ -635,7 +655,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>  	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
>  		if (d->ci->id != rr->ci->id)
>  			continue;
> -		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
> +		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid, rr->cntr_id,
>  					     rr->evtid, &tval, rr->arch_mon_ctx);
>  		if (!err) {
>  			rr->val += tval;
> @@ -703,8 +723,8 @@ void mon_event_count(void *info)
>  
>  	if (rdtgrp->type == RDTCTRL_GROUP) {
>  		list_for_each_entry(entry, head, mon.crdtgrp_list) {
> -			if (__mon_event_count(entry->closid, entry->mon.rmid,
> -					      rr) == 0)
> +			rr->cntr_id = mbm_cntr_get(rr->r, rr->d, entry, rr->evtid);
> +			if (__mon_event_count(entry->closid, entry->mon.rmid, rr) == 0)
>  				ret = 0;
>  		}
>  	}
> @@ -835,13 +855,15 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>  }
>  
>  static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> -				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
> +				 u32 closid, u32 rmid, int cntr_id,
> +				 enum resctrl_event_id evtid)

Would it not be simpler to provide resource group as argument (remove closid, rmid, and
cntr_id) and determine cntr_id from known data to provide cntr_id as argument to
__mon_event_count(), removing the need for a new member in struct rmid_read?

>  {
>  	struct rmid_read rr = {0};
>  
>  	rr.r = r;
>  	rr.d = d;
>  	rr.evtid = evtid;
> +	rr.cntr_id = cntr_id;
>  	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
>  	if (IS_ERR(rr.arch_mon_ctx)) {
>  		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> @@ -862,17 +884,22 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
>  }
>  
>  static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> -		       u32 closid, u32 rmid)
> +		       struct rdtgroup *rdtgrp, u32 closid, u32 rmid)

Providing both the resource group and two of its members as parameters looks redundant.
Can this take just the resource group, with closid and rmid removed?

>  {
> +	int cntr_id;
>  	/*
>  	 * This is protected from concurrent reads from user as both
>  	 * the user and overflow handler hold the global mutex.
>  	 */
> -	if (resctrl_arch_is_mbm_total_enabled())
> -		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
> +	if (resctrl_arch_is_mbm_total_enabled()) {
> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
> +		mbm_update_one_event(r, d, closid, rmid, cntr_id, QOS_L3_MBM_TOTAL_EVENT_ID);

With a similar change to mbm_update_one_event(), where it takes the resource group as
a parameter, there is no need to compute the counter ID here.

This patch could be split. One patch can replace the closid/rmid in mbm_update()
and mbm_update_one_event() with the resource group. Following patches can build on that.
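
A rough user-space sketch of the shape I have in mind. All types and names here are stand-ins, not the kernel's: struct rdtgroup_sketch, its cntr_id[] member, cntr_get() and update_one_event() are hypothetical.

```c
#include <stdint.h>

/* Stand-in for the fields of struct rdtgroup that matter here. */
struct rdtgroup_sketch {
	uint32_t closid;
	uint32_t rmid;
	int cntr_id[2];		/* hypothetical: per-event counter ID, -1 if none */
};

enum mbm_evt_sketch { MBM_TOTAL, MBM_LOCAL };

/* Stand-in for mbm_cntr_get(): look up the counter from the group itself. */
static int cntr_get(const struct rdtgroup_sketch *grp, enum mbm_evt_sketch evt)
{
	return grp->cntr_id[evt];
}

/*
 * The group is the only identity argument; closid, rmid and the counter ID
 * are derived here instead of being passed in by every caller.
 */
static int update_one_event(const struct rdtgroup_sketch *grp,
			    enum mbm_evt_sketch evt, uint32_t *closid,
			    uint32_t *rmid)
{
	*closid = grp->closid;
	*rmid = grp->rmid;
	return cntr_get(grp, evt);
}
```

Callers then pass only the group, and the counter lookup lives in one place.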

> +	}
>  
> -	if (resctrl_arch_is_mbm_local_enabled())
> -		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
> +	if (resctrl_arch_is_mbm_local_enabled()) {
> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
> +		mbm_update_one_event(r, d, closid, rmid, cntr_id, QOS_L3_MBM_LOCAL_EVENT_ID);
> +	}
>  }
>  
>  /*
> @@ -945,11 +972,11 @@ void mbm_handle_overflow(struct work_struct *work)
>  	d = container_of(work, struct rdt_mon_domain, mbm_over.work);
>  
>  	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
> -		mbm_update(r, d, prgrp->closid, prgrp->mon.rmid);
> +		mbm_update(r, d, prgrp, prgrp->closid, prgrp->mon.rmid);

providing both the resource group and two of its members really looks
redundant.

>  
>  		head = &prgrp->mon.crdtgrp_list;
>  		list_for_each_entry(crgrp, head, mon.crdtgrp_list)
> -			mbm_update(r, d, crgrp->closid, crgrp->mon.rmid);
> +			mbm_update(r, d, crgrp, crgrp->closid, crgrp->mon.rmid);

same

>  
>  		if (is_mba_sc(NULL))
>  			update_mba_bw(prgrp, d);
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 60270606f1b8..107cb14a0db2 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -466,8 +466,9 @@ void resctrl_offline_cpu(unsigned int cpu);
>   * 0 on success, or -EIO, -EINVAL etc on error.
>   */
>  int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
> -			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
> -			   u64 *val, void *arch_mon_ctx);
> +			   u32 closid, u32 rmid, int cntr_id,
> +			   enum resctrl_event_id eventid, u64 *val,
> +			   void *arch_mon_ctx);
>  
>  /**
>   * resctrl_arch_rmid_read_context_check()  - warn about invalid contexts
> @@ -513,8 +514,8 @@ struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h, int id,
>   * This can be called from any CPU.
>   */
>  void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
> -			     u32 closid, u32 rmid,
> -			     enum resctrl_event_id eventid);
> +			     u32 closid, u32 rmid, enum resctrl_event_id eventid,
> +			     int cntr_id);
>  
>  /**
>   * resctrl_arch_reset_rmid_all() - Reset all private state associated with

When changing the interface the associated kernel doc should also be updated.

Reinette



* Re: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-04  0:18 ` [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode Babu Moger
@ 2025-04-11 21:44   ` Reinette Chatre
  2025-04-15 18:48     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 21:44 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> By default, each resctrl group supports two MBM events: mbm_total_bytes
> and mbm_local_bytes. To maintain the same level of support, two default
> MBM configurations are added. These configurations will initially be used
> to set up the counters upon mounting, while users will have the option to
> modify them as needed.

This jumps in quite fast by stating that MBM configurations are added but
there is no definition of what an MBM configuration is.

> 
> Event configuration values:
> ========================================================
>  Bits    Mnemonics       Description
> ====   ========================================================
>  6       VictimBW        Dirty Victims from all types of memory
>  5       RmtSlowFill     Reads to slow memory in the non-local NUMA domain
>  4       LclSlowFill     Reads to slow memory in the local NUMA domain
>  3       RmtNTWr         Non-temporal writes to non-local NUMA domain
>  2       LclNTWr         Non-temporal writes to local NUMA domain
>  1       mtFill          Reads to memory in the non-local NUMA domain
>  0       LclFill         Reads to memory in the local NUMA domain
> ====    ========================================================

What is the purpose of the mnemonics?

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: New patch to support event configurations via new counter_configs
>      method.
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
>  include/linux/resctrl_types.h          | 17 +++++++++++++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index d84f47db4e43..aba23e2096db 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -57,6 +57,21 @@ static struct kernfs_node *kn_mongrp;
>  /* Kernel fs node for "mon_data" directory under root */
>  static struct kernfs_node *kn_mondata;
>  
> +struct mbm_evt_value mbm_evt_values[NUM_MBM_EVT_VALUES] = {
> +	{"local_reads", 0x1},
> +	{"remote_reads", 0x2},
> +	{"local_non_temporal_writes", 0x4},
> +	{"remote_non_temporal_writes", 0x8},
> +	{"local_reads_slow_memory", 0x10},
> +	{"remote_reads_slow_memory", 0x20},
> +	{"dirty_victim_writes_all", 0x40},
> +};
> +
> +struct mbm_assign_config mbm_assign_configs[NUM_MBM_ASSIGN_CONFIGS] = {
> +	{"mbm_total_bytes", QOS_L3_MBM_TOTAL_EVENT_ID, 0x7f},
> +	{"mbm_local_bytes", QOS_L3_MBM_LOCAL_EVENT_ID, 0x15},
> +};
> +
>  /*
>   * Used to store the max resource name width to display the schemata names in
>   * a tabular format.
> diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
> index f26450b3326b..3d98c7bdb459 100644
> --- a/include/linux/resctrl_types.h
> +++ b/include/linux/resctrl_types.h

Please read changelog of f16adbaf9272 ("x86/resctrl: Move resctrl types to a separate header")
for a good explanation of what resctrl_types.h is used for.

> @@ -31,6 +31,9 @@
>  /* Max event bits supported */
>  #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
>  
> +#define NUM_MBM_EVT_VALUES		7
> +#define NUM_MBM_ASSIGN_CONFIGS		2

Please keep changes in internal header files unless a shared header is required.

> +
>  enum resctrl_res_level {
>  	RDT_RESOURCE_L3,
>  	RDT_RESOURCE_L2,
> @@ -51,4 +54,18 @@ enum resctrl_event_id {
>  	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
>  };
>  
> +struct mbm_evt_value {
> +	char	evt_name[32];
> +	u32	evt_val;
> +};

I cannot see how this belongs in resctrl_types.h.

> +
> +/**
> + * struct mbm_assign_config - Configuration values

Please include a run of scripts/kernel-doc in your patch preparation steps.

The description "Configuration values" is incredibly vague.

> + */
> +struct mbm_assign_config {
> +	char			name[32];
> +	enum resctrl_event_id	evtid;
> +	u32			val;
> +};

Why is this new struct needed? It looks to me like a duplicate of struct
mon_evt with one member added. There is also already the evt_list as part
of a monitor resource that the array introduced here seems to duplicate.

Could the event configuration be made a member of struct mon_evt instead?
This exposes the need to integrate this better with BMEC support to make
clear how the existing "configurable" member should be used and/or expanded.

There seems more and more overlap with Tony's RMID work. Did you get a
chance to look at that?
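
As a sketch of what I mean, with the caveat that this is a simplified user-space model: struct mon_evt_sketch is modelled on struct mon_evt with the list linkage left out, and the evt_cfg member is hypothetical. The default values are the ones from mbm_assign_configs[] in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

/*
 * Simplified struct mon_evt (list linkage omitted) with one hypothetical
 * extra member, evt_cfg, holding the event configuration bitmask.
 */
struct mon_evt_sketch {
	enum resctrl_event_id	evtid;
	const char		*name;
	bool			configurable;
	uint32_t		evt_cfg;	/* hypothetical new member */
};

/* Defaults copied from mbm_assign_configs[] in the quoted patch. */
static const struct mon_evt_sketch mbm_events[] = {
	{ QOS_L3_MBM_TOTAL_EVENT_ID, "mbm_total_bytes", true, 0x7f },
	{ QOS_L3_MBM_LOCAL_EVENT_ID, "mbm_local_bytes", true, 0x15 },
};
```

With the configuration stored next to the event name and ID, a separate mbm_assign_config array would not be needed.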

> +
>  #endif /* __LINUX_RESCTRL_TYPES_H */

Reinette


* Re: [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/
  2025-04-04  0:18 ` [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
@ 2025-04-11 22:04   ` Reinette Chatre
  2025-04-15 20:29     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 22:04 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> Create the configuration directory and files for mbm_cntr_assign mode.
> These configurations will be used to assign MBM events in mbm_cntr_assign
> mode, with two default configurations created upon mounting.
> 
> Example:
> $ cd /sys/fs/resctrl/
> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>   local_reads, remote_reads, local_non_temporal_writes,
>   remote_non_temporal_writes, local_reads_slow_memory,
>   remote_reads_slow_memory, dirty_victim_writes_all
> 
> $ cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>   local_reads, local_non_temporal_writes, local_reads_slow_memory
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
> ---
>  Documentation/arch/x86/resctrl.rst     | 29 ++++++++++
>  arch/x86/kernel/cpu/resctrl/internal.h |  2 +
>  arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 77 ++++++++++++++++++++++++++
>  4 files changed, 109 insertions(+)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 71ed1cfed33a..99f9f4b9b501 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -306,6 +306,35 @@ with the following files:
>  	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>  	  0=30;1=30
>  
> +"counter_configs:

(mismatched quotes)

This organization needs some extra thought ... consider that the section starts with
"If RDT monitoring is available there will be an "L3_MON" directory
with the following *files*:"


> +	The directory for storing event configuration files, which will be used to
> +	assign counters when the mbm_cntr_assign mode is enabled.

Needs more imperative tone.

> +
> +	Following types of events are supported:
> +
> +	==== ========================= ============================================================
> +	Bits Name   		         Description
> +	==== ========================= ============================================================
> +	6    dirty_victim_writes_all     Dirty Victims from the QOS domain to all types of memory
> +	5    remote_reads_slow_memory    Reads to slow memory in the non-local NUMA domain
> +	4    local_reads_slow_memory     Reads to slow memory in the local NUMA domain
> +	3    remote_non_temporal_writes  Non-temporal writes to non-local NUMA domain
> +	2    local_non_temporal_writes   Non-temporal writes to local NUMA domain
> +	1    remote_reads                Reads to memory in the non-local NUMA domain
> +	0    local_reads                 Reads to memory in the local NUMA domain
> +	==== ========================= ==========================================================
> +
> +	Two default configurations, mbm_local_bytes and mbm_total_bytes, will be created

"will be created" -> "are created" ... or maybe just:
	 There are two default configurations: mbm_local_bytes and mbm_total_bytes.

> +	upon mounting.

"upon mounting" seems unnecessary.

> +	::
> +
> +	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
> +	    local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
> +	    local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all
> +
> +	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> +	    local_reads, local_non_temporal_writes, local_reads_slow_memory
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index b7d1a59f09f8..a943450bf2c8 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -282,11 +282,13 @@ struct mbm_cntr_cfg {
>  #define RFTYPE_RES_CACHE		BIT(8)
>  #define RFTYPE_RES_MB			BIT(9)
>  #define RFTYPE_DEBUG			BIT(10)
> +#define RFTYPE_CONFIG			BIT(11)

hmmm ... these flags are becoming quite complex. Even so, RFTYPE_CONFIG would be
unique to this new feature so I think a more specific name would be appropriate.
Maybe even "RFTYPE_MBM_EVENT_CONFIG".

>  #define RFTYPE_CTRL_INFO		(RFTYPE_INFO | RFTYPE_CTRL)
>  #define RFTYPE_MON_INFO			(RFTYPE_INFO | RFTYPE_MON)
>  #define RFTYPE_TOP_INFO			(RFTYPE_INFO | RFTYPE_TOP)
>  #define RFTYPE_CTRL_BASE		(RFTYPE_BASE | RFTYPE_CTRL)
>  #define RFTYPE_MON_BASE			(RFTYPE_BASE | RFTYPE_MON)
> +#define RFTYPE_MON_CONFIG		(RFTYPE_CONFIG | RFTYPE_MON)

Why is this flag needed?

>  
>  /* List of all resource groups */
>  extern struct list_head rdt_all_groups;
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 58476c065921..4525295b1725 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1264,6 +1264,7 @@ int __init resctrl_mon_resource_init(void)
>  	if (r->mon.mbm_cntr_assignable) {
>  		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
>  		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
> +		resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
>  	}
>  
>  	return 0;
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index aba23e2096db..b2122a1dd36c 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1907,6 +1907,25 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>  	return ret ?: nbytes;
>  }
>  
> +static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
> +{
> +	struct mbm_assign_config *assign_config = of->kn->parent->priv;
> +	bool sep = false;
> +	int i;
> +
> +	for (i = 0; i < NUM_MBM_EVT_VALUES; i++) {
> +		if (assign_config->val & mbm_evt_values[i].evt_val) {
> +			if (sep)
> +				seq_puts(seq, ", ");

seq_putc()

> +			seq_printf(seq, "%s", mbm_evt_values[i].evt_name);
> +			sep = true;
> +		}
> +	}
> +	seq_puts(seq, "\n");

seq_putc()

> +
> +	return 0;
> +}
> +
>  /* rdtgroup information files for one cache resource. */
>  static struct rftype res_common_files[] = {
>  	{
> @@ -2019,6 +2038,12 @@ static struct rftype res_common_files[] = {
>  		.seq_show	= mbm_local_bytes_config_show,
>  		.write		= mbm_local_bytes_config_write,
>  	},
> +	{
> +		.name		= "event_filter",
> +		.mode		= 0444,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= event_filter_show,
> +	},
>  	{
>  		.name		= "mbm_assign_mode",
>  		.mode		= 0444,
> @@ -2314,6 +2339,52 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>  	return ret;
>  }
>  
> +static int resctrl_mkdir_info_configs(void *priv,  char *name, unsigned long fflags)

Why a void * instead of struct rdt_resource *?

Also please fix spacing.

Also, why do fflags need to be provided as parameter? These are so custom I think the
hardcoding should be contained here instead of the caller. With this the function name
can also be made specific to what it does ... perhaps "resctrl_mkdir_counter_configs()"
(please feel free to improve).


> +{
> +	struct kernfs_node *l3_mon_kn, *kn_subdir, *kn_subdir2;
> +	int ret, i;
> +
> +	l3_mon_kn = kernfs_find_and_get(kn_info, name);
> +	if (!l3_mon_kn)
> +		return -ENOENT;
> +
> +	kn_subdir = kernfs_create_dir(l3_mon_kn, "counter_configs", l3_mon_kn->mode, priv);
> +	if (IS_ERR(kn_subdir)) {
> +		kernfs_put(l3_mon_kn);
> +		return PTR_ERR(kn_subdir);
> +	}
> +
> +	ret = rdtgroup_kn_set_ugid(kn_subdir);
> +	if (ret) {
> +		kernfs_put(l3_mon_kn);
> +		return ret;
> +	}
> +
> +	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {

This can instead work through the resource's evt_list and use a flag (TBD how to
adapt "configurable") to determine if a directory should be created for it.

> +		kn_subdir2 = kernfs_create_dir(kn_subdir, mbm_assign_configs[i].name,
> +					       kn_subdir->mode, &mbm_assign_configs[i]);
> +		if (IS_ERR(kn_subdir)) {

IS_ERR(kn_subdir2)?

> +			ret = PTR_ERR(kn_subdir2);
> +			goto config_out;
> +		}
> +
> +		ret = rdtgroup_kn_set_ugid(kn_subdir2);
> +		if (ret)
> +			goto config_out;
> +
> +		ret = rdtgroup_add_files(kn_subdir2, fflags);
> +		if (!ret)
> +			kernfs_activate(kn_subdir);
> +	}
> +
> +config_out:
> +	kernfs_put(l3_mon_kn);
> +	if (ret)
> +		kernfs_remove(kn_subdir);
> +
> +	return ret;
> +}
> +
>  static unsigned long fflags_from_resource(struct rdt_resource *r)
>  {
>  	switch (r->rid) {
> @@ -2360,6 +2431,12 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>  		ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
>  		if (ret)
>  			goto out_destroy;
> +
> +		if (r->mon.mbm_cntr_assignable) {
> +			ret = resctrl_mkdir_info_configs(r, name, RFTYPE_MON_CONFIG);
> +			if (ret)
> +				goto out_destroy;
> +		}
>  	}
>  
>  	ret = rdtgroup_kn_set_ugid(kn_info);

Reinette


* Re: [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-04  0:18 ` [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations Babu Moger
@ 2025-04-11 22:07   ` Reinette Chatre
  2025-04-15 20:37     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 22:07 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> Users can modify the event configuration by writing to the event_filter
> interface file. The event configurations for mbm_cntr_assign mode are
> located in /sys/fs/resctrl/info/event_configs/.
> 
> Update the assignments of all groups when the event configuration is
> modified.
> 
> Example:
> $ cd /sys/fs/resctrl/
> $ echo "local_reads, local_non_temporal_writes" >
>   info/L3_MON/counter_configs/mbm_total_bytes/event_filter
> 
> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>  local_reads, local_non_temporal_writes
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: New patch to modify event configurations.
> ---
>  Documentation/arch/x86/resctrl.rst     |  10 +++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 115 ++++++++++++++++++++++++-
>  2 files changed, 124 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 99f9f4b9b501..4e6feba6fb08 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -335,6 +335,16 @@ with the following files:
>  	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>  	    local_reads, local_non_temporal_writes, local_reads_slow_memory
>  
> +	The event configuration can be modified by writing to the event_filter file within
> +	the configuration directory.

Please use imperative tone.

> +	::
> +
> +	    # echo "local_reads, local_non_temporal_writes" >
> +	      /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
> +
> +	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
> +	    local_reads, local_non_temporal_writes
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index b2122a1dd36c..7792455f0b26 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1926,6 +1926,118 @@ static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq,
>  	return 0;
>  }
>  

Could you please add comments to these new functions to explain what they do?

> +static int resctrl_group_assign(struct rdt_resource *r, struct rdtgroup *rdtgrp,
> +				enum resctrl_event_id evtid, u32 evt_cfg)
> +{
> +	struct rdt_mon_domain *d;
> +	int cntr_id, ret;
> +
> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
> +		if (cntr_id >= 0 && d->cntr_cfg[cntr_id].evt_cfg != evt_cfg) {
> +			d->cntr_cfg[cntr_id].evt_cfg = evt_cfg;
> +			ret = resctrl_arch_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
> +						       rdtgrp->closid, cntr_id, evt_cfg, true);
> +			if (ret) {
> +				rdt_last_cmd_printf("Assign failed event %d domain %d group %s\n",
> +						    evtid, d->hdr.id, rdtgrp->kn->name);

Please provide the actual event name to user space. The event IDs are not visible to
user space.
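
For illustration only, a minimal userspace sketch of reporting the event
by name. The helper evt_name() and the lookup table are hypothetical and
not existing resctrl code, and the numeric IDs are assumptions standing in
for the total and local MBM event IDs:

```c
#include <stdio.h>
#include <stddef.h>

/* Illustrative stand-ins; the numeric values are assumptions. */
enum evt_id {
	EVT_MBM_TOTAL = 2,
	EVT_MBM_LOCAL = 3,
};

struct evt_name_map {
	enum evt_id id;
	const char *name;
};

static const struct evt_name_map evt_names[] = {
	{ EVT_MBM_TOTAL, "mbm_total_bytes" },
	{ EVT_MBM_LOCAL, "mbm_local_bytes" },
};

/* Hypothetical helper: return the user-visible name for an event ID. */
static const char *evt_name(int id)
{
	size_t i;

	for (i = 0; i < sizeof(evt_names) / sizeof(evt_names[0]); i++) {
		if ((int)evt_names[i].id == id)
			return evt_names[i].name;
	}
	return "unknown";
}

/* Format the failure message with the event name instead of the ID. */
static int format_assign_error(char *buf, size_t len, int id,
			       int dom_id, const char *grp)
{
	return snprintf(buf, len, "Assign failed event %s domain %d group %s\n",
			evt_name(id), dom_id, grp);
}
```

With something along these lines the message names the same event that
user space sees as a file name under the group's mon_data directory.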

Reinette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments
  2025-04-04  0:18 ` [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments Babu Moger
@ 2025-04-11 22:08   ` Reinette Chatre
  2025-04-15 20:39     ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-11 22:08 UTC (permalink / raw)
  To: Babu Moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/3/25 5:18 PM, Babu Moger wrote:
> The mbm_cntr_assign mode provides an option to the user to assign a
> counter to an RMID, event pair and monitor the bandwidth as long as
> the counter is assigned.
> 
> Introduce a configuration option to automatically assign counter IDs
> when a resctrl group is created, provided the counters are available.
> By default, this option is enabled at boot.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v12: New patch. Added after the discussion on the list.
>      https://lore.kernel.org/lkml/CALPaoCh8siZKjL_3yvOYGL4cF_n_38KpUFgHVGbQ86nD+Q2_SA@mail.gmail.com/

Seems like this needs a Suggested-by for Peter.

Reinette


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain
  2025-04-11 20:49   ` Reinette Chatre
@ 2025-04-14 15:56     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 15:56 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

Thanks for the quick response to the series.

On 4/11/25 15:49, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
>> supported, the bandwidth events can be configured to track specific
>> events. The event configuration is domain specific. Event configurations
>> are not stored in resctrl but instead always read from or written to
>> hardware directly when prompted by user space.
> 
> Why is this a problem?

I mean it involves an extra MSR read every time the user asks for it.

> 
>>
>> Read the event configuration from the hardware during domain
>> initialization and store the configuration value in the rdt_hw_mon_domain
>> structure for later use when the user requests to display it.
> 
> Why is this required?

Minor optimization.

> 
> This series is about adding support for ABMC while this appears to be
> an optimization for BMEC. Even more, as I see it, this optimization makes
> resctrl support of ABMC and BMEC confusing (more below).
> 
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: Fixed the conflicts due to recent merge.
>>      This patch is for BMEC and there is no dependency on the ABMC feature.
> 
> Why still do it?

Ok. Will drop it for now.

> 
>>      Moved it earlier.
>>
>> v11: Resolved minor conflicts due to code displacement. Actual code didn't
>>      change.
>>
>> v10: Conflicts due to code displacement. Actual code didn't change.
>>
>> v9: Added Reviewed-by tag. No other changes.
>>
>> v8: Renamed resctrl_mbm_evt_config_init() to arch_mbm_evt_config_init()
>>     Minor commit message update.
>>
>> v7: Fixed initializing INVALID_CONFIG_VALUE to mbm_local_cfg in case of error.
>>
>> v6: Renamed resctrl_arch_mbm_evt_config -> resctrl_mbm_evt_config_init
>>     Initialized value to INVALID_CONFIG_VALUE if it is not configurable.
>>     Minor commit message update.
>>
>> v5: Exported mon_event_config_index_get.
>>     Renamed arch_domain_mbm_evt_config to resctrl_arch_mbm_evt_config.
>>
>> v4: Read the configuration information from the hardware to initialize.
>>     Added few commit messages.
>>     Fixed the tab spaces.
>>
>> v3: Minor changes related to rebase in mbm_config_write_domain.
>>
>> v2: No changes.
>> ---
>>  arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>>  arch/x86/kernel/cpu/resctrl/internal.h |  9 +++++++++
>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 26 ++++++++++++++++++++++++++
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
>>  4 files changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index cf29681d01e0..a28de257168f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -558,6 +558,8 @@ static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
>>  		return;
>>  	}
>>  
>> +	arch_mbm_evt_config_init(hw_dom);
>> +
>>  	list_add_tail_rcu(&d->hdr.list, add_pos);
>>  
>>  	err = resctrl_online_mon_domain(r, d);
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index c44c5b496355..9846153aa48f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -32,6 +32,9 @@
>>   */
>>  #define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
>>  
>> +#define INVALID_CONFIG_VALUE		U32_MAX
>> +#define INVALID_CONFIG_INDEX		UINT_MAX
>> +
>>  /**
>>   * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
>>   *			        aren't marked nohz_full
>> @@ -335,6 +338,8 @@ struct rdt_hw_ctrl_domain {
>>   * @d_resctrl:	Properties exposed to the resctrl file system
>>   * @arch_mbm_total:	arch private state for MBM total bandwidth
>>   * @arch_mbm_local:	arch private state for MBM local bandwidth
>> + * @mbm_total_cfg:	MBM total bandwidth configuration
>> + * @mbm_local_cfg:	MBM local bandwidth configuration
>>   *
>>   * Members of this structure are accessed via helpers that provide abstraction.
>>   */
>> @@ -342,6 +347,8 @@ struct rdt_hw_mon_domain {
>>  	struct rdt_mon_domain		d_resctrl;
>>  	struct arch_mbm_state		*arch_mbm_total;
>>  	struct arch_mbm_state		*arch_mbm_local;
>> +	u32				mbm_total_cfg;
>> +	u32				mbm_local_cfg;
>>  };
> 
> This introduces an architecture managed per-domain event configuration while
> the rest of the series introduces a resctrl fs managed global event configuration.
> I see this as the start of a source for confusion about how events are configured since
> there is no further connection between this per-domain event configuration maintained
> by the architecture and the global event configuration maintained by resctrl fs.
> 
>>  
>>  static inline struct rdt_hw_ctrl_domain *resctrl_to_arch_ctrl_dom(struct rdt_ctrl_domain *r)
>> @@ -504,6 +511,8 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
>>  void rdt_staged_configs_clear(void);
>>  bool closid_allocated(unsigned int closid);
>>  int resctrl_find_cleanest_closid(void);
>> +void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
>> +unsigned int mon_event_config_index_get(u32 evtid);
>>  
>>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index a93ed7d2a160..abd337fbd01d 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1284,6 +1284,32 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>>  	return 0;
>>  }
>>  
>> +void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom)
>> +{
>> +	unsigned int index;
>> +	u64 msrval;
>> +
>> +	/*
>> +	 * Read the configuration registers QOS_EVT_CFG_n, where <n> is
>> +	 * the BMEC event number (EvtID).
>> +	 */
>> +	if (mbm_total_event.configurable) {
> 
> Please keep an eye on where things are going in the arch/fs split.
> mbm_total_event is private to resctrl fs and arch code cannot reach into it.
> There is the arch helper resctrl_arch_is_evt_configurable() but I also
> think that this helper needs to be reconsidered in the light of ABMC.

ok

> 
> Overall I think this ABMC support needs to consider what already exists
> for BMEC support and ensure that both are supported coherently. For example,
> when a monitor domain has a "MBM local bandwidth configuration" then it should
> be obvious what that means.

ok. Agreed. Let's drop these two patches and address only ABMC in this series.

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 02/26] x86/resctrl: Remove MSR reading of event configuration value
  2025-04-11 20:50   ` Reinette Chatre
@ 2025-04-14 15:57     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 15:57 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 15:50, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> The event configuration is domain specific and initialized during domain
>> initialization. The values are stored in struct rdt_hw_mon_domain.
>>
>> It is not required to read the configuration register every time user asks
>> for it. Use the value stored in struct rdt_hw_mon_domain instead.
> 
> Storing and maintaining the event configuration creates confusion with
> the new event configurations introduced in the rest of this series. I
> think that it will be simpler to keep BMEC support as-is.

That is fine with me. I will remove first two patches from this series.
There are multiple things going on with resctrl. We can do these
optimizations later.
-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-11 20:52   ` Reinette Chatre
@ 2025-04-14 17:48     ` Moger, Babu
  2025-04-15 16:09       ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 17:48 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 15:52, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Users can create as many monitor groups as RMIDs supported by the hardware.
>> However, bandwidth monitoring feature on AMD system only guarantees that
>> RMIDs currently assigned to a processor will be tracked by hardware. The
>> counters of any other RMIDs which are no longer being tracked will be reset
>> to zero. The MBM event counters return "Unavailable" for the RMIDs that are
>> not tracked by hardware. So, there can be only limited number of groups
>> that can give guaranteed monitoring numbers. With ever changing
>> configurations there is no way to definitely know which of these groups are
>> being tracked for certain point of time. Users do not have the option to
>> monitor a group or set of groups for certain period of time without
>> worrying about RMID being reset in between.
>>
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID, event pair and monitor the bandwidth as long as it is
>> assigned. The assigned RMID will be tracked by the hardware until the user
>> unassigns it manually. There is no need to worry about counters being reset
>> during this period. Additionally, the user can specify a bitmask
>> identifying the specific bandwidth types from the given source to track
>> with the counter.
>>
>> Without ABMC enabled, monitoring will work in current mode without
>> assignment option.
>>
>> Linux resctrl subsystem provides the interface to count maximum of two
>> memory bandwidth events per group, from a combination of available total
>> and local events. Keeping the current interface, users can enable a maximum
>> of 2 ABMC counters per group. User will also have the option to enable only
>> one counter to the group. If the system runs out of assignable ABMC
>> counters, kernel will display an error. Users need to disable an already
>> enabled counter to make space for new assignments.
> 
> The above paragraph sounds like it is still talking about the original
> global assignment of counters.

Ok. Sure. Will update it.

> 
>>
>> The feature can be detected via CPUID_Fn80000020_EBX_x00 bit 5.
>> Bits Description
>> 5    ABMC (Assignable Bandwidth Monitoring Counters)
>>
>> The feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>
>> Note: Checkpatch checks/warnings are ignored to maintain coding style.
>>
>> v12: Removed the dependency on X86_FEATURE_BMEC.
> 
> Considering this removal it is not clear to me how the BMEC and ABMC features
> are managed on a platform. Since this dependency existed I assume platforms
> that support both ABMC and BMEC exist and after previous discussion [1]
> I expected to see that BMEC support will be disabled when ABMC is detected
> but I do not see this done in this series. From what I can tell, looking at
> patch "x86/resctrl: Detect Assignable Bandwidth Monitoring feature details"
> BMEC and ABMC are both detected and enabled while I do not see any
> interactions handled. For example, a user modifying the BMEC appears
> to have no impact on existing ABMC assigned counters. Could you please clarify
> how event configuration works on platforms that support both ABMC and BMEC?

They are mutually exclusive. If ABMC is enabled then BMEC should not work.

I missed handling it. Also, I was not very clear at that point on how to
handle it.

Here is my proposal to handle this case. This can be a separate patch.


diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d10cf1e5b914..772f2f77faee 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,

        list_for_each_entry(mevt, &r->mon.evt_list, list) {
                seq_printf(seq, "%s\n", mevt->name);
-               if (mevt->configurable)
+               if (mevt->configurable && !resctrl_arch_mbm_cntr_assign_enabled(r))
                        seq_printf(seq, "%s_config\n", mevt->name);
        }

@@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
        cpus_read_lock();
        mutex_lock(&rdtgroup_mutex);

+       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+               rdt_last_cmd_puts("Event configuration(BMEC) not supported with mbm_cntr_assign mode\n");
+               return -EINVAL;
+       }
+
        list_for_each_entry(dom, &r->mon_domains, hdr.list) {
                if (sep)
                        seq_puts(s, ";");
@@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid
 static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
                                       struct seq_file *seq, void *v)
 {
+       int ret;
        struct rdt_resource *r = of->kn->parent->priv;

-       mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
+       ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);

-       return 0;
+       return ret;
 }

 static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
                                       struct seq_file *seq, void *v)
 {
+       int ret;
+
        struct rdt_resource *r = of->kn->parent->priv;

-       mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
+       ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);

-       return 0;
+       return ret;
 }

 static void mbm_config_write_domain(struct rdt_resource *r,
@@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
        /* Walking r->domains, ensure it can't race with cpuhp */
        lockdep_assert_cpus_held();

+       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+               rdt_last_cmd_puts("Event configuration(BMEC) not supported with mbm_cntr_assign mode\n");
+               return -EINVAL;
+       }
+
 next:
        if (!tok || tok[0] == '\0')
                return 0;
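
The intent of the hunks above can be sketched in userspace as follows. This
is only a rough illustration: every demo_* name is hypothetical, the lock is
a plain counter standing in for rdtgroup_mutex, and the output format is
simplified. One detail the sketch makes explicit is that the early return
in the show path happens after the locks are taken, so an actual patch would
presumably need to release them on that path as well.

```c
#include <stdio.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for rdtgroup_mutex; counts lock depth so leaks are visible. */
static int demo_lock_depth;

static void demo_lock(void)   { demo_lock_depth++; }
static void demo_unlock(void) { demo_lock_depth--; }

/* Stand-in for the "is mbm_cntr_assign mode enabled" query. */
static int demo_assign_enabled;

/*
 * Show path: refuse BMEC configuration reads while counter assignment
 * is enabled, and drop the lock on every exit path.
 */
static int demo_config_show(char *buf, size_t len, unsigned int cfg)
{
	int ret = 0;

	demo_lock();

	if (demo_assign_enabled) {
		ret = -EINVAL;
		goto out_unlock;
	}

	snprintf(buf, len, "0=0x%x\n", cfg);

out_unlock:
	demo_unlock();
	return ret;
}

/* mon_features-style listing: hide "<event>_config" in assign mode. */
static int demo_list_feature(char *buf, size_t len, const char *name,
			     int configurable)
{
	if (configurable && !demo_assign_enabled)
		return snprintf(buf, len, "%s\n%s_config\n", name, name);
	return snprintf(buf, len, "%s\n", name);
}
```

With demo_assign_enabled set, the show path returns -EINVAL with the lock
depth back at zero, and the feature listing no longer advertises the
"_config" file.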





> 
> Reinette
> 
> [1] https://lore.kernel.org/all/4b66ea1c-2f76-4218-a67b-2232b2be6990@amd.com/
> 

-- 
Thanks
Babu Moger

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode
  2025-04-11 20:56   ` Reinette Chatre
@ 2025-04-14 19:52     ` Moger, Babu
  2025-04-15 16:22       ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 19:52 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 15:56, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Introduce the interface file "mbm_assign_mode" to list monitor modes
> 
> Using "resctrl file" instead of "interface file" should help to make it
> clear what this patch does.

ok. Sure.

> 
>> supported.
>>
>> The "mbm_cntr_assign" mode provides the option to assign a counter to
>> an RMID, event pair and monitor the bandwidth as long as it is assigned.
>>
>> On AMD systems "mbm_cntr_assign" mode is backed by the ABMC (Assignable
>> Bandwidth Monitoring Counters) hardware feature and is enabled by default.
>>
>> The "default" mode is the existing monitoring mode that works without the
>> explicit counter assignment, instead relying on dynamic counter assignment
>> by hardware that may result in hardware not dedicating a counter resulting
>> in monitoring data reads returning "Unavailable".
>>
>> Provide an interface to display the monitor mode on the system.
>> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> [mbm_cntr_assign]
>> default
>>
>> Added an IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to handle Arm64
> 
> (needs imperative)

Sure.

> 
>> platforms. On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined, whereas on
>> Arm64, it is. As a result, for MPAM, the file would be either:
> 
> CONFIG_RESCTRL_ASSIGN_FIXED does not yet exist anywhere so this motivation needs
> to provide stronger support for why it is used before it exists. There is a precedent
> here with RESCTRL_RMID_DEPENDS_ON_CLOSID already used while it does not yet
> appear in a Kconfig file. I would propose that this is motivated by noting
> that it is already understood how Arm supports assignable counters, and that this
> was recommended by James to prepare for that work. Since this is user interface this
> work is done early to ensure user interface is compatible with that upcoming
> support. Also set folks at ease that IS_ENABLED() works as expected with a
> non-existing config.

How about this?

Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to support Arm64.

On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined. On Arm64, it will be
defined when the "mbm_cntr_assign" mode is supported.

Add an IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check early to ensure the
user interface remains compatible with upcoming Arm64 support.
IS_ENABLED() safely evaluates to 0 when the configuration is not defined.

As a result, for MPAM, the file would be either:
[default]
or
[mbm_cntr_assign]
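
To back up the IS_ENABLED() statement, here is a minimal userspace sketch.
It is a simplified re-creation of the preprocessor trick (built-in options
only, no module handling), not the kernel header itself; all macro names,
the CONFIG_DEMO_* options and format_assign_modes() are made up for the
example. The formatting function follows the same shape as the mode listing
in this patch.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Simplified userspace re-creation of the IS_ENABLED() idea.
 * If the option is defined to 1, ARG_PLACEHOLDER_1 expands to "0," and
 * shifts a 1 into the second argument slot; otherwise the junk token
 * stays in the first slot and the second argument is 0.
 * The trailing extra 0 only keeps the variadic part non-empty.
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_ENABLED_SKETCH(option) IS_DEFINED_2(option)

/* Pretend config: one option set to 1, the other never defined. */
#define CONFIG_DEMO_PRESENT 1
/* CONFIG_DEMO_ABSENT is intentionally not defined. */

/*
 * The active mode is bracketed, and the alternative is listed only
 * when the mode is switchable (that is, when "fixed" is 0).
 */
static int format_assign_modes(char *buf, size_t len, int assignable,
			       int enabled, int fixed)
{
	if (!assignable)
		return snprintf(buf, len, "[default]\n");
	if (fixed)
		return snprintf(buf, len, "%s\n",
				enabled ? "[mbm_cntr_assign]" : "[default]");
	return snprintf(buf, len, "%s\n%s\n",
			enabled ? "[mbm_cntr_assign]" : "[default]",
			enabled ? "default" : "mbm_cntr_assign");
}
```

Passing IS_ENABLED_SKETCH(CONFIG_DEMO_ABSENT) as "fixed" gives the two-line
x86 style listing, while IS_ENABLED_SKETCH(CONFIG_DEMO_PRESENT) gives the
single bracketed line described for MPAM.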


> 
> 
>> [default]
>> or
>> [mbm_cntr_assign]
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: Minor text update in change log and user documentation.
>>      Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
>>      This will be defined only in arm and not in x86.
>>
>> v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
>>      Removed few texts in resctrl.rst about AMD specific information.
>>      Updated few texts.
>>
>> v10: Added more text to the user documentation to clarify the default mode.
>>
>> v9: Updated user documentation based on comments.
>>
>> v8: Commit message update.
>>
>> v7: Updated the descriptions/commit log in resctrl.rst to generic text.
>>     Thanks to James and Reinette.
>>     Rename mbm_mode to mbm_assign_mode.
>>     Introduced mutex lock in rdtgroup_mbm_mode_show().
>>
>> v6: Added documentation for mbm_cntr_assign and legacy mode.
>>     Moved mbm_mode fflags initialization to static initialization.
>>
>> v5: Changed interface name to mbm_mode.
>>     It will be always available even if ABMC feature is not supported.
>>     Added description in resctrl.rst about ABMC mode.
>>     Fixed display of abmc and legacy consistently.
>>
>> v4: Fixed the checks for legacy and abmc mode. Default is ABMC.
>>
>> v3: New patch to display ABMC capability.
>>
>> ---
>>  Documentation/arch/x86/resctrl.rst     | 27 +++++++++++++++++++
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 37 ++++++++++++++++++++++++++
>>  2 files changed, 64 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index fb90f08e564e..bb96b44019fe 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -257,6 +257,33 @@ with the following files:
>>  	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>  	    0=0x30;1=0x30;3=0x15;4=0x15
>>  
>> +"mbm_assign_mode":
>> +	Reports the list of monitoring modes supported. The enclosed brackets
>> +	indicate which mode is enabled.
>> +	::
>> +
>> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> +	  [mbm_cntr_assign]
>> +	  default
>> +
>> +	"mbm_cntr_assign":
>> +
>> +	In mbm_cntr_assign mode, a monitoring event can only accumulate data
>> +	while it is backed by a hardware counter. The user-space is able to
>> +	specify which of the events in CTRL_MON or MON groups should have a
>> +	counter assigned using the "mbm_assign_control" file. The number of
> 
> "mbm_assign_control" no longer exist.

The user-space is able to specify which of the events in CTRL_MON or MON
groups should have a counter assigned using the "mbm_L3_assignments"
interface file in each resctrl group.

> 
>> +	counters available is described in the "num_mbm_cntrs" file. Changing
>> +	the mode may cause all counters on the resource to reset.
>> +
>> +	"default":
>> +
>> +	In default mode, resctrl assumes there is a hardware counter for each
>> +	event within every CTRL_MON and MON group. On AMD platforms, it is
>> +	recommended to use the mbm_cntr_assign mode, if supported, to prevent
>> +	the hardware from resetting counters between reads. This can result in
> 
> "from resetting counters" -> "from re-allocating counters"?

How about?

"from resetting MBM events between reads"

> 
>> +	misleading values or display "Unavailable" if no counter is assigned
>> +	to the event.
>> +
>>  "max_threshold_occupancy":
>>  		Read/write file provides the largest value (in
>>  		bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 17de38e26f94..626be6becca7 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -882,6 +882,36 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
>>  	return ret;
>>  }
>>  
>> +static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>> +					struct seq_file *s, void *v)
>> +{
>> +	struct rdt_resource *r = of->kn->parent->priv;
>> +	bool enabled;
>> +
>> +	mutex_lock(&rdtgroup_mutex);
>> +	enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
>> +
>> +	if (r->mon.mbm_cntr_assignable) {
>> +		if (enabled)
>> +			seq_puts(s, "[mbm_cntr_assign]\n");
>> +		else
>> +			seq_puts(s, "[default]\n");
>> +
>> +		if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
>> +			if (enabled)
>> +				seq_puts(s, "default\n");
>> +			else
>> +				seq_puts(s, "mbm_cntr_assign\n");
>> +		}
>> +	} else {
>> +		seq_puts(s, "[default]\n");
>> +	}
>> +
>> +	mutex_unlock(&rdtgroup_mutex);
>> +
>> +	return 0;
>> +}
>> +
>>  #ifdef CONFIG_PROC_CPU_RESCTRL
>>  
>>  /*
>> @@ -1908,6 +1938,13 @@ static struct rftype res_common_files[] = {
>>  		.seq_show	= mbm_local_bytes_config_show,
>>  		.write		= mbm_local_bytes_config_write,
>>  	},
>> +	{
>> +		.name		= "mbm_assign_mode",
>> +		.mode		= 0444,
>> +		.kf_ops		= &rdtgroup_kf_single_ops,
>> +		.seq_show	= resctrl_mbm_assign_mode_show,
>> +		.fflags		= RFTYPE_MON_INFO,
> 
> Needs a RFTYPE_RES_CACHE?

I am not very sure about this. This flag is added to the files in info/L3.

"mbm_assign_mode" goes in info/L3_MON/

The files in L3_MON do not have this flag set (for example
mon_features, num_rmids).
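
To make the question concrete, here is a small sketch of how I understand
the flag matching. It assumes a subset-style match (a file is created when
all of its flags are present on the directory) and that the L3_MON
directory carries both the monitor-info flags and the cache resource flag.
The DEMO_* names and bit values are illustrative only and may not match the
real definitions.

```c
#include <stdbool.h>

/* Illustrative bit values; the real definitions may differ. */
#define DEMO_RFTYPE_INFO      (1u << 0)
#define DEMO_RFTYPE_MON       (1u << 5)
#define DEMO_RFTYPE_RES_CACHE (1u << 8)
#define DEMO_RFTYPE_MON_INFO  (DEMO_RFTYPE_INFO | DEMO_RFTYPE_MON)

/*
 * A file is created in a directory when every flag the file asks for is
 * also set on the directory (subset match). Files with no flags are skipped.
 */
static bool demo_file_visible(unsigned int dir_fflags, unsigned int file_fflags)
{
	return file_fflags && (dir_fflags & file_fflags) == file_fflags;
}
```

Under these assumptions a file flagged with only the monitor-info bits
appears under any monitoring resource, while adding the cache bit would
limit it to cache resources such as L3.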

> 
>> +	},
>>  	{
>>  		.name		= "cpus",
>>  		.mode		= 0644,
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters
  2025-04-11 21:01   ` Reinette Chatre
@ 2025-04-14 20:12     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 20:12 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:01, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> The mbm_cntr_assign mode provides an option to the user to assign a
>> counter to an RMID, event pair and monitor the bandwidth as long as
>> the counter is assigned. Number of assignments depend on number of
>> monitoring counters available.
>>
>> Provide the interface to display the number of monitoring counters
> 
> An interface can also be a function. To help make this work obvious
> it can be specific:
> 
> 	Create 'num_mbm_cntrs' resctrl file that displays the number of
> 	monitoring counters supported in each domain. 'num_mbm_cntrs'
> 	is only visible to user space when the system supports
> 	mbm_cntr_assign mode.

ok. Sure.

> 
>> supported in each domain. The resctrl file 'num_mbm_cntrs' is visible
>> to user space when the system supports mbm_cntr_assign mode.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
> 
>> ---
>>  Documentation/arch/x86/resctrl.rst     | 11 ++++++++++
>>  arch/x86/kernel/cpu/resctrl/monitor.c  |  3 +++
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 30 ++++++++++++++++++++++++++
>>  3 files changed, 44 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index bb96b44019fe..35d908befdfb 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -284,6 +284,17 @@ with the following files:
>>  	misleading values or display "Unavailable" if no counter is assigned
>>  	to the event.
>>  
>> +"num_mbm_cntrs":
>> +	The maximum number of monitoring counters (total of available and assigned
>> +	counters) in each domain when the system supports mbm_cntr_assign mode.
>> +
>> +	For example, on a system with maximum of 32 memory bandwidth monitoring
>> +	counters in each of its L3 domains:
>> +	::
>> +
>> +	  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
>> +	  0=32;1=32
>> +
>>  "max_threshold_occupancy":
>>  		Read/write file provides the largest value (in
>>  		bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 6ed7e51d3fdb..028b49878ad0 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1234,6 +1234,9 @@ int __init resctrl_mon_resource_init(void)
>>  	else if (resctrl_arch_is_mbm_total_enabled())
>>  		mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
>>  
>> +	if (r->mon.mbm_cntr_assignable)
>> +		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
> 
> Missing RFTYPE_RES_CACHE?


Please see my response here.

https://lore.kernel.org/lkml/4fc02936-237d-4060-86af-79efc28a72e5@amd.com/

> 
>> +
>>  	return 0;
>>  }
>>  
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 626be6becca7..0c9d7a702b93 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -912,6 +912,30 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>>  	return 0;
>>  }
>>  
>> +static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
>> +				      struct seq_file *s, void *v)
>> +{
>> +	struct rdt_resource *r = of->kn->parent->priv;
>> +	struct rdt_mon_domain *dom;
>> +	bool sep = false;
>> +
>> +	cpus_read_lock();
>> +	mutex_lock(&rdtgroup_mutex);
>> +
>> +	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>> +		if (sep)
>> +			seq_puts(s, ";");
> 
> seq_putc() can be used.
> 
Sure.

>> +
>> +		seq_printf(s, "%d=%d", dom->hdr.id, r->mon.num_mbm_cntrs);
>> +		sep = true;
>> +	}
>> +	seq_puts(s, "\n");
> 
> seq_putc() can be used.

Sure.

> 
>> +
>> +	mutex_unlock(&rdtgroup_mutex);
>> +	cpus_read_unlock();
>> +	return 0;
>> +}
>> +
>>  #ifdef CONFIG_PROC_CPU_RESCTRL
>>  
>>  /*
>> @@ -1945,6 +1969,12 @@ static struct rftype res_common_files[] = {
>>  		.seq_show	= resctrl_mbm_assign_mode_show,
>>  		.fflags		= RFTYPE_MON_INFO,
>>  	},
>> +	{
>> +		.name		= "num_mbm_cntrs",
>> +		.mode		= 0444,
>> +		.kf_ops		= &rdtgroup_kf_single_ops,
>> +		.seq_show	= resctrl_num_mbm_cntrs_show,
>> +	},
>>  	{
>>  		.name		= "cpus",
>>  		.mode		= 0644,
> 
> Reinette
> 
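
For reference, the separator handling with single-character writes can be
modelled in user space roughly as below. This is only a sketch: struct
demo_seq, demo_seq_putc() and demo_seq_printf() are hypothetical stand-ins
for struct seq_file, seq_putc() and seq_printf(), and
demo_show_num_cntrs() is not the actual handler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file: a small bounded buffer. */
struct demo_seq {
	char buf[128];
	size_t len;
};

/* Models seq_putc(): append one character if there is room. */
static void demo_seq_putc(struct demo_seq *s, char c)
{
	assert(s->len < sizeof(s->buf));
	if (s->len + 1 < sizeof(s->buf)) {
		s->buf[s->len++] = c;
		s->buf[s->len] = '\0';
	}
}

/* Models seq_printf(): append formatted text, truncating on overflow. */
static void demo_seq_printf(struct demo_seq *s, const char *fmt, ...)
{
	size_t room = sizeof(s->buf) - s->len - 1;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->len, sizeof(s->buf) - s->len, fmt, ap);
	va_end(ap);
	if (n > 0)
		s->len += (size_t)n < room ? (size_t)n : room;
}

/* Emit "<domain id>=<count>" pairs separated by ';' and end the line. */
static void demo_show_num_cntrs(struct demo_seq *s, const int *dom_ids,
				int nr_doms, int num_cntrs)
{
	int sep = 0;
	int i;

	for (i = 0; i < nr_doms; i++) {
		/* Single-character separator, none before the first domain. */
		if (sep)
			demo_seq_putc(s, ';');

		demo_seq_printf(s, "%d=%d", dom_ids[i], num_cntrs);
		sep = 1;
	}
	/* Single-character line terminator. */
	demo_seq_putc(s, '\n');
}
```

With domain ids 0 and 1 and 32 counters, the buffer holds "0=32;1=32"
followed by a newline, the same shape as the documentation example in the
patch.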

-- 
Thanks
Babu Moger


* Re: [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment
  2025-04-11 21:01   ` Reinette Chatre
@ 2025-04-14 20:30     ` Moger, Babu
  2025-04-15 16:30       ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 20:30 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:01, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID, event pair and monitor the bandwidth as long as the
>> counter is assigned. The bandwidth events will be tracked by the hardware
>> until the user changes the configuration. Each resctrl group can configure
>> maximum two counters, one for total event and one for local event.
>>
>> The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
>> Configuration is done by setting the counter id, bandwidth source (RMID)
>> and bandwidth configuration supported by BMEC (Bandwidth Monitoring Event
>> Configuration).
> 
> Apart from the BMEC optimization in patch #1 and patch #2 this is the
> first and only mention of BMEC dependency I see in this series while I do
> not see implementation support for this. What am I missing?
> 

My mistake. I should have corrected it.  How about this?

"The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
ABMC counter assignment is done by setting the counter id, bandwidth
source (RMID) and bandwidth configuration. Users will have the option to
change the bandwidth configuration using resctrl interface which will be
introduced later in the series."

-- 
Thanks
Babu Moger


* Re: [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-04-11 21:02   ` Reinette Chatre
@ 2025-04-14 20:51     ` Moger, Babu
  2025-04-15 16:38       ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-14 20:51 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:02, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID, event pair and monitor the bandwidth as long as it
>> is assigned. The assigned RMID will be tracked by the hardware until the
>> user unassigns it manually.
>>
>> Implement an architecture-specific handler to assign and unassign the
>> counter. Configure counters by writing to the L3_QOS_ABMC_CFG MSR,
>> specifying the counter ID, bandwidth source (RMID), and event
>> configuration.
>>
>> The feature details are documented in the APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>>     Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>>     Monitoring (ABMC).
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
> 
>> ---
>>  arch/x86/kernel/cpu/resctrl/monitor.c | 39 +++++++++++++++++++++++++++
>>  include/linux/resctrl.h               | 15 +++++++++++
>>  2 files changed, 54 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 8a88ac29d57d..77f8662dc50b 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1430,3 +1430,42 @@ int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
>>  
>>  	return 0;
>>  }
>> +
>> +static void resctrl_abmc_config_one_amd(void *info)
>> +{
>> +	union l3_qos_abmc_cfg *abmc_cfg = info;
>> +
>> +	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
>> +}
>> +
>> +/*
>> + * Send an IPI to the domain to assign the counter to RMID, event pair.
>> + */
>> +int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			     enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> +			     u32 cntr_id, u32 evt_cfg, bool assign)
>> +{
>> +	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>> +	union l3_qos_abmc_cfg abmc_cfg = { 0 };
>> +	struct arch_mbm_state *am;
>> +
>> +	abmc_cfg.split.cfg_en = 1;
>> +	abmc_cfg.split.cntr_en = assign ? 1 : 0;
>> +	abmc_cfg.split.cntr_id = cntr_id;
>> +	abmc_cfg.split.bw_src = rmid;
>> +	abmc_cfg.split.bw_type = evt_cfg;
>> +
>> +	smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
>> +
>> +	/*
>> +	 * Reset the architectural state so that reading of hardware
>> +	 * counter is not considered as an overflow in next update.
> 
> Please add something like: "The hardware counter is reset (because cfg_en == 1)
> so there is no need to record initial non-zero counts."

Sure.

> 
>> +	 */
>> +	if (assign) {
>> +		am = get_arch_mbm_state(hw_dom, rmid, evtid);
>> +		if (am)
>> +			memset(am, 0, sizeof(*am));
>> +	}
>> +
>> +	return 0;
>> +}
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 294b15de664e..60270606f1b8 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
> 
> Please keep the declaration internal to the arch code. It can be moved when
> other architecture needs it.

Sure.

> 
>> @@ -394,6 +394,21 @@ void resctrl_arch_mon_event_config_set(void *config_info);
>>  u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
>>  				      enum resctrl_event_id eventid);
>>  
>> +/**
>> + * resctrl_arch_config_cntr() - Configure the counter on the domain
>> + * @r:			resource that the counter should be read from.
>> + * @d:			domain that the counter should be read from.
>> + * @evtid:		event type to assign
>> + * @rmid:		rmid of the counter to read.
>> + * @closid:		closid that matches the rmid.
>> + * @cntr_id:		Counter ID to configure
>> + * @evt_cfg:		event configuration
> 
> "event configuration" is simply an expansion of member name and does not help to
> understand what the value represents.

How about?

"MBM event configuration value representing reads, writes, etc."

> 
>> + * @assign:		assign or unassign
> 
> Please rework the kernel doc: consistent sentence structure (starts with upper case,
> ends with period), use proper capitalization for acronyms (rmid -> RMID, etc.),
> make descriptions informative.

Sure.

> 
>> + */
>> +int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +                             enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> +                             u32 cntr_id, u32 evt_cfg, bool assign);
>> +
>>  /* For use by arch code to remap resctrl's smaller CDP CLOSID range */
>>  static inline u32 resctrl_get_config_index(u32 closid,
>>  					   enum resctrl_conf_type type)
> 
> This patch does not pass checkpatch.pl.
> 

Sure. Will check again.

-- 
Thanks
Babu Moger


* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-11 21:04   ` Reinette Chatre
@ 2025-04-15 14:20     ` Moger, Babu
  2025-04-15 16:53       ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 14:20 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:04, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>> can be assigned to an RMID, event pair and monitor the bandwidth as long
>> as it is assigned.
> 
> Above makes it sound as though multiple counters can be assigned to
> an RMID, event pair.
> 

Yes. Multiple counter-ids can be assigned to RMID, event pair.

>>
>> Add the functionality to allocate and assign the counters to RMID, event
>> pair in the domain.
> 
> "assign *a* counter to an RMID, event pair"?

Sure.

> 
>>
>> If all the counters are in use, the kernel will log the error message
>> "Unable to allocate counter in domain" in /sys/fs/resctrl/info/
>> last_cmd_status when a new assignment is requested. Exit on the first
>> failure when assigning counters across all the domains.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> ---
>>  arch/x86/kernel/cpu/resctrl/internal.h |   2 +
>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 124 +++++++++++++++++++++++++
>>  2 files changed, 126 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 0b73ec451d2c..1a8ac511241a 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -574,6 +574,8 @@ bool closid_allocated(unsigned int closid);
>>  int resctrl_find_cleanest_closid(void);
>>  void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
>>  unsigned int mon_event_config_index_get(u32 evtid);
>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
> 
> This is internal to resctrl fs. Why is it needed to provide both the event id
> and the event configuration? Event configuration can be determined from event ID?

Yes, it can be done, but then I would have to export functions like
mbm_get_assign_config() into monitor.c. To avoid that, I passed it from
here, which I felt was much cleaner.

> 
>>  
>>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 77f8662dc50b..ff55a4fe044f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1469,3 +1469,127 @@ int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>>  
>>  	return 0;
>>  }
>> +
>> +/*
>> + * Configure the counter for the event, RMID pair for the domain. Reset the
>> + * non-architectural state to clear all the event counters.
>> + */
>> +static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			       enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> +			       u32 cntr_id, u32 evt_cfg, bool assign)
>> +{
>> +	struct mbm_state *m;
>> +	int ret;
>> +
>> +	ret = resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, evt_cfg, assign);
>> +	if (ret)
>> +		return ret;
> 
> I understood previous discussion to conclude that resctrl_arch_config_cntr() cannot fail
> and thus I expect it to return void and not need any error checking from caller.
> By extension this will result in resctrl_config_cntr() returning void and should simplify
> a few flows. For example, it will make it clear that re-configuring an existing counter
> cannot result in that counter being freed.

Yes, I missed it. Will take care of it in the next version.

> 
>> +
>> +	m = get_mbm_state(d, closid, rmid, evtid);
>> +	if (m)
>> +		memset(m, 0, sizeof(struct mbm_state));
>> +
>> +	return ret;
>> +}
>> +
> 
> Could you please add comments to these mbm_cntr* functions to provide information
> on how the cntr_cfg data structure is used? Please also include details on
> callers since it seems to me as though these functions are called
> from paths where assignable counters are not supported (mon_event_read()).

Sure. Will add details about these functions.

> 
>> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> +	int cntr_id;
>> +
>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>> +		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
>> +		    d->cntr_cfg[cntr_id].evtid == evtid)
>> +			return cntr_id;
>> +	}
>> +
>> +	return -ENOENT;
>> +}
>> +
>> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> +	int cntr_id;
>> +
>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>> +		if (!d->cntr_cfg[cntr_id].rdtgrp) {
>> +			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
>> +			d->cntr_cfg[cntr_id].evtid = evtid;
>> +			return cntr_id;
>> +		}
>> +	}
>> +
>> +	return -ENOSPC;
>> +}
>> +
>> +static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
>> +{
>> +	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
>> +}
>> +
>> +/*
>> + * Allocate a fresh counter and configure the event if not assigned already.
>> + */
>> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
>> +				     u32 evt_cfg)
> 
> Same here, why are both evtid and evt_cfg provided as arguments? 

Yes, it can be done, but then I would have to export functions like
mbm_get_assign_config() into monitor.c. To avoid that, I passed it from
here, which I felt was much cleaner.


> 
>> +{
>> +	int cntr_id, ret = 0;
>> +
>> +	/*
>> +	 * No need to allocate or configure if the counter is already assigned
>> +	 * and the event configuration is up to date.
>> +	 */
>> +	cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>> +	if (cntr_id >= 0) {
>> +		if (d->cntr_cfg[cntr_id].evt_cfg == evt_cfg)
>> +			return 0;
>> +
>> +		goto cntr_configure;
>> +	}
>> +
>> +	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, evtid);
>> +	if (cntr_id <  0) {
>> +		rdt_last_cmd_printf("Unable to allocate counter in domain %d\n",
>> +				    d->hdr.id);
> 
> Please print resource name also.

Sure. We can print r->name.

> 
>> +		return cntr_id;
>> +	}
>> +
>> +cntr_configure:
>> +	/* Update and configure the domain with the new event configuration value */
>> +	d->cntr_cfg[cntr_id].evt_cfg = evt_cfg;
>> +
>> +	ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, rdtgrp->closid,
>> +				  cntr_id, evt_cfg, true);
>> +	if (ret) {
>> +		rdt_last_cmd_printf("Assignment of event %d failed on domain %d\n",
>> +				    evtid, d->hdr.id);
> 
> How is user expected to interpret the event ID (especially when looking forward
> where events can be dynamic)? This should rather be the event name.

Sure. We can do that.

> 
>> +		mbm_cntr_free(d, cntr_id);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +/*
>> + * Assign a hardware counter to event @evtid of group @rdtgrp. Counter will be
>> + * assigned to all the domains if @d is NULL else the counter will be assigned
>> + * to @d.
>> + */
>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
>> +			      u32 evt_cfg)
>> +{
>> +	int ret = 0;
>> +
>> +	if (!d) {
>> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +			ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
>> +			if (ret)
>> +				return ret;
>> +		}
>> +	} else {
>> +		ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid, evt_cfg);
>> +	}
>> +
>> +	return ret;
>> +}
> 
> Reinette
> 
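
To make the flow discussed above easier to follow, here is a small
user-space model of the lookup/allocate/configure sequence with a
configure step that cannot fail and returns void. It is a sketch under
assumptions: all demo_* names, the two-counter limit and the event id
values are hypothetical, and the hardware write is only simulated by a
counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NUM_CNTRS 2	/* hypothetical; the real count comes from hardware */

struct demo_group {
	int rmid;
};

struct demo_cntr_cfg {
	const struct demo_group *grp;	/* NULL means the counter is free */
	int evtid;
	unsigned int evt_cfg;
};

struct demo_domain {
	struct demo_cntr_cfg cntr_cfg[DEMO_NUM_CNTRS];
	int hw_writes;		/* number of simulated hardware configurations */
	int last_cfg_cntr;	/* counter id used by the last configuration */
};

/* Find the counter already assigned to the group, event pair. */
static int demo_cntr_get(struct demo_domain *d, const struct demo_group *g,
			 int evtid)
{
	int id;

	for (id = 0; id < DEMO_NUM_CNTRS; id++) {
		if (d->cntr_cfg[id].grp == g && d->cntr_cfg[id].evtid == evtid)
			return id;
	}
	return -ENOENT;
}

/* Claim the first free counter for the group, event pair. */
static int demo_cntr_alloc(struct demo_domain *d, const struct demo_group *g,
			   int evtid)
{
	int id;

	for (id = 0; id < DEMO_NUM_CNTRS; id++) {
		if (!d->cntr_cfg[id].grp) {
			d->cntr_cfg[id].grp = g;
			d->cntr_cfg[id].evtid = evtid;
			return id;
		}
	}
	return -ENOSPC;
}

/* Simulated hardware configuration: cannot fail, so it returns void. */
static void demo_config_cntr(struct demo_domain *d, int cntr_id)
{
	d->last_cfg_cntr = cntr_id;
	d->hw_writes++;
}

/* Allocate only when needed; reconfigure only when evt_cfg changed. */
static int demo_alloc_config_cntr(struct demo_domain *d,
				  const struct demo_group *g, int evtid,
				  unsigned int evt_cfg)
{
	int id = demo_cntr_get(d, g, evtid);

	if (id >= 0) {
		if (d->cntr_cfg[id].evt_cfg == evt_cfg)
			return 0;
	} else {
		id = demo_cntr_alloc(d, g, evtid);
		if (id < 0)
			return id;
	}

	d->cntr_cfg[id].evt_cfg = evt_cfg;
	demo_config_cntr(d, id);
	return 0;
}
```

Because the configure step has no error return in this model, the only
failure is running out of counters, and re-configuring an existing
counter never leads to that counter being freed.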

-- 
Thanks
Babu Moger


* Re: [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode
  2025-04-11 21:08   ` Reinette Chatre
@ 2025-04-15 15:00     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 15:00 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:08, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> In mbm_cntr_assign mode, the hardware counter should be assigned to read
>> the MBM events.
>>
>> Report 'Unassigned' in case the user attempts to read the events without
> 
> "the events" -> "the event"?

Sure.

> 
>> assigning the counter.
> 
> "the counter" -> "a hardware counter"?
> 

Sure.

>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
> 
>> ---
>>  Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
>>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 14 ++++++++++++++
>>  arch/x86/kernel/cpu/resctrl/internal.h    |  3 +++
>>  arch/x86/kernel/cpu/resctrl/monitor.c     |  4 ++--
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
>>  5 files changed, 30 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 44128fbda4fe..71ed1cfed33a 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -430,6 +430,16 @@ When monitoring is enabled all MON groups will also contain:
>>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>>  	where "YY" is the node number.
>>  
>> +	The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters
>> +	and allows users to assign a counter to mon_hw_id, event pair enabling
>> +	bandwidth monitoring for as long as the counter remains assigned.
>> +	The hardware will continue tracking the assigned mon_hw_id until
>> +	the user manually unassigns it, ensuring that counters are not reset
>> +	during this period. System may run out of assignable counters when
>> +	all the counters are already assigned. In that case, MBM event counters
> 
> Counters could be unassigned even if there are assignable counters available.
> 
> I think the "System may run ..." sentence should be dropped.
> The "In that case ..." sentence could be simplified with something like:
> "An MBM event returns 'Unassigned' when the event does not have a hardware
> counter assigned."
> 

Sure.

>> +	will return 'Unassigned' when the event is read. Users must manually
>> +	assign a counter to read the events.
>> +
>>  "mon_hw_id":
>>  	Available only with debug option. The identifier used by hardware
>>  	for the monitor group. On x86 this is the RMID.
> 
> Reinette
> 
> 

-- 
Thanks
Babu Moger


* Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-14 17:48     ` Moger, Babu
@ 2025-04-15 16:09       ` Reinette Chatre
  2025-04-15 19:43         ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-15 16:09 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/14/25 10:48 AM, Moger, Babu wrote:

> Here is my proposal to handle this case. This can be separate patch.
> 
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index d10cf1e5b914..772f2f77faee 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct
> kernfs_open_file *of,
> 
>         list_for_each_entry(mevt, &r->mon.evt_list, list) {
>                 seq_printf(seq, "%s\n", mevt->name);
> -               if (mevt->configurable)
> +               if (mevt->configurable &&
> !resctrl_arch_mbm_cntr_assign_enabled(r))
>                         seq_printf(seq, "%s_config\n", mevt->name);
>         }
> 
> @@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s,
> struct rdt_resource *r, u32 evtid
>         cpus_read_lock();
>         mutex_lock(&rdtgroup_mutex);
> 
> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
> with mbm_cntr_assign mode\n");
> +               return -EINVAL;
> +       }
> +
>         list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>                 if (sep)
>                         seq_puts(s, ";");
> @@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s,
> struct rdt_resource *r, u32 evtid
>  static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
>                                        struct seq_file *seq, void *v)
>  {
> +       int ret;
>         struct rdt_resource *r = of->kn->parent->priv;
> 
> -       mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
> 
> -       return 0;
> +       return ret;
>  }
> 
>  static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
>                                        struct seq_file *seq, void *v)
>  {
> +       int ret;
> +
>         struct rdt_resource *r = of->kn->parent->priv;
> 
> -       mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
> 
> -       return 0;
> +       return ret;
>  }
> 
>  static void mbm_config_write_domain(struct rdt_resource *r,
> @@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r,
> char *tok, u32 evtid)
>         /* Walking r->domains, ensure it can't race with cpuhp */
>         lockdep_assert_cpus_held();
> 
> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
> with mbm_cntr_assign mode\n");
> +               return -EINVAL;
> +       }
> +
>  next:
>         if (!tok || tok[0] == '\0')
>                 return 0;
> 

Instead of chasing every call that may involve BMEC, I think it will be simpler to
disable BMEC support during initialization when ABMC is detected. Specifically,
on systems that support both BMEC and ABMC, rdt_cpu_has(X86_FEATURE_BMEC) would
return false.
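
As a rough user-space illustration of that idea (not kernel code; struct
demo_cpu, demo_features_init() and demo_cpu_has() are hypothetical names
made up for this sketch), the decision is made once at init time and
every later feature check simply sees BMEC as absent:

```c
#include <assert.h>
#include <stdbool.h>

enum demo_feature {
	DEMO_FEATURE_BMEC,
	DEMO_FEATURE_ABMC,
	DEMO_FEATURE_MAX,
};

struct demo_cpu {
	bool hw_has[DEMO_FEATURE_MAX];	/* what the hardware enumerates */
	bool enabled[DEMO_FEATURE_MAX];	/* what the rest of the code may use */
};

/* Single decision point: hide BMEC whenever ABMC is present. */
static void demo_features_init(struct demo_cpu *c)
{
	int f;

	for (f = 0; f < DEMO_FEATURE_MAX; f++)
		c->enabled[f] = c->hw_has[f];

	if (c->hw_has[DEMO_FEATURE_ABMC])
		c->enabled[DEMO_FEATURE_BMEC] = false;
}

/* Callers only ever consult the filtered view. */
static bool demo_cpu_has(const struct demo_cpu *c, enum demo_feature f)
{
	return c->enabled[f];
}
```

With this shape, none of the show/write handlers in the model would need
their own ABMC check.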

I would also like to consider enhancing mevt->configurable to handle all the different
ways in which events can be configured. For example, mevt->configurable could become an
enum that captures how an event can be configured, instead of remaining
a boolean for BMEC support with ABMC handled completely separately. I hope this
may become clearer when using struct mon_evt for ABMC also.
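
Purely as an illustration of the enum idea, a user-space sketch could
look like the following. The enumerator names, struct demo_mon_evt and
demo_evt_has_config_file() are hypothetical and only meant to show the
shape, not a proposed definition:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: how (if at all) an event can be configured. */
enum demo_evt_cfg_type {
	DEMO_EVT_CFG_NONE,	/* event is not configurable */
	DEMO_EVT_CFG_BMEC,	/* configured via the "<name>_config" files */
	DEMO_EVT_CFG_ABMC,	/* configured as part of counter assignment */
};

struct demo_mon_evt {
	const char *name;
	enum demo_evt_cfg_type configurable;
};

/* In this model only BMEC-configured events list a "<name>_config" file. */
static bool demo_evt_has_config_file(const struct demo_mon_evt *e)
{
	return e->configurable == DEMO_EVT_CFG_BMEC;
}
```

A check such as the one in rdt_mon_features_show() would then depend on
the configuration type in the model rather than on a boolean plus a
separate mode test.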

Reinette





* Re: [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode
  2025-04-14 19:52     ` Moger, Babu
@ 2025-04-15 16:22       ` Reinette Chatre
  2025-04-16 14:05         ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-15 16:22 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/14/25 12:52 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/11/25 15:56, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/3/25 5:18 PM, Babu Moger wrote:

>>> platforms. On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined, whereas on
>>> Arm64, it is. As a result, for MPAM, the file would be either:
>>
>> CONFIG_RESCTRL_ASSIGN_FIXED does not yet exist anywhere so this motivation needs
>> to provide stronger support for why it is used before it exists. There is a precedent
>> here with RESCTRL_RMID_DEPENDS_ON_CLOSID already used while it does not yet
>> appear in a Kconfig file. I would propose that this is motivated by noting
>> how it is already understood how Arm supports assignable counters this was recommended
>> by James to prepare for that work. Since this is user interface this
>> work is done early to ensure user interface is compatible with that upcoming
>> support. Also set folks at ease that IS_ENABLED() works as expected with a
>> non-existing config.
> 
> How about this?
> 
> Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to support Arm64.
> 
> On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined. On Arm64, it will be
> defined when the "mbm_cntr_assign" mode is supported.
> 
> Add an IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check early to ensure the
> user interface remains compatible with upcoming Arm64 support.
> IS_ENABLED() safely evaluates to 0 when the configuration is not defined.
> 
> As a result, for MPAM, the file would be either:
> [default]
> or
> [mbm_cntr_assign]
> 

Sounds good to me.
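
For anybody wondering why a check against a config symbol that does not
exist still builds and evaluates to 0, here is a simplified user-space
re-creation of the preprocessor trick for built-in options. It is only a
sketch: the DEMO_* macros and CONFIG_DEMO_* symbols are hypothetical
names for this illustration, and the module (=m) case is left out.

```c
#include <assert.h>

/*
 * If the option expands to 1, the pasted placeholder expands to "0,"
 * and shifts the literal 1 into the second argument slot. If the option
 * is undefined, the pasted name is junk and the second slot holds 0.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED_STEP1(option)
#define DEMO_IS_DEFINED_STEP1(val) \
	DEMO_IS_DEFINED_STEP2(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED_STEP2(arg1_or_junk) \
	DEMO_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)

#define CONFIG_DEMO_PRESENT 1	/* stands in for an option set to y */
/* CONFIG_DEMO_ASSIGN_FIXED is deliberately left undefined. */

/* Returns 1 when both modes would be listed in this model. */
static int demo_lists_both_modes(void)
{
	/* Compiles although the option does not exist; the test is 0. */
	if (DEMO_IS_ENABLED(CONFIG_DEMO_ASSIGN_FIXED))
		return 0;
	return 1;
}
```

The undefined symbol never reaches the compiler as an identifier; it is
consumed by the preprocessor and discarded, so no Kconfig entry is needed
for the check to be safe.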

> 
>>
>>
>>> [default]
>>> or
>>> [mbm_cntr_assign]
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>> v12: Minor text update in change log and user documentation.
>>>      Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
>>>      This will be defined only in arm and not in x86.
>>>
>>> v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
>>>      Removed few texts in resctrl.rst about AMD specific information.
>>>      Updated few texts.
>>>
>>> v10: Added few more text to user documentation clarify on the default mode.
>>>
>>> v9: Updated user documentation based on comments.
>>>
>>> v8: Commit message update.
>>>
>>> v7: Updated the descriptions/commit log in resctrl.rst to generic text.
>>>     Thanks to James and Reinette.
>>>     Rename mbm_mode to mbm_assign_mode.
>>>     Introduced mutex lock in rdtgroup_mbm_mode_show().
>>>
>>> v6: Added documentation for mbm_cntr_assign and legacy mode.
>>>     Moved mbm_mode fflags initialization to static initialization.
>>>
>>> v5: Changed interface name to mbm_mode.
>>>     It will be always available even if ABMC feature is not supported.
>>>     Added description in resctrl.rst about ABMC mode.
>>>     Fixed display abmc and legacy consistantly.
>>>
>>> v4: Fixed the checks for legacy and abmc mode. Default it ABMC.
>>>
>>> v3: New patch to display ABMC capability.
>>>
>>> ???END
>>> ---
>>>  Documentation/arch/x86/resctrl.rst     | 27 +++++++++++++++++++
>>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 37 ++++++++++++++++++++++++++
>>>  2 files changed, 64 insertions(+)
>>>
>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>> index fb90f08e564e..bb96b44019fe 100644
>>> --- a/Documentation/arch/x86/resctrl.rst
>>> +++ b/Documentation/arch/x86/resctrl.rst
>>> @@ -257,6 +257,33 @@ with the following files:
>>>  	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>>  	    0=0x30;1=0x30;3=0x15;4=0x15
>>>  
>>> +"mbm_assign_mode":
>>> +	Reports the list of monitoring modes supported. The enclosed brackets
>>> +	indicate which mode is enabled.
>>> +	::
>>> +
>>> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>>> +	  [mbm_cntr_assign]
>>> +	  default
>>> +
>>> +	"mbm_cntr_assign":
>>> +
>>> +	In mbm_cntr_assign mode, a monitoring event can only accumulate data
>>> +	while it is backed by a hardware counter. The user-space is able to
>>> +	specify which of the events in CTRL_MON or MON groups should have a
>>> +	counter assigned using the "mbm_assign_control" file. The number of
>>
>> "mbm_assign_control" no longer exist.
> 
> The user-space is able to specify which of the events in CTRL_MON or MON
> groups should have a counter assigned using the "mbm_L3_assignments"
> interface file in each resctrl group.

I think it can be assumed the reader represents the user space. If doing so
this can be simplified like:

	Use "mbm_L3_assignments" found in each CTRL_MON and MON group to
	specify which of the events should have a counter assigned.

> 
>>
>>> +	counters available is described in the "num_mbm_cntrs" file. Changing
>>> +	the mode may cause all counters on the resource to reset.
>>> +
>>> +	"default":
>>> +
>>> +	In default mode, resctrl assumes there is a hardware counter for each
>>> +	event within every CTRL_MON and MON group. On AMD platforms, it is
>>> +	recommended to use the mbm_cntr_assign mode, if supported, to prevent
>>> +	the hardware from resetting counters between reads. This can result in
>>
>> "from resetting counters" -> "from re-allocating counters"?
> 
> How about?
> 
> "from resetting MBM events between reads"

With more detail, how about:

 ", to prevent reset of MBM events between reads resulting from hardware re-allocating counters"?

>>>  /*
>>> @@ -1908,6 +1938,13 @@ static struct rftype res_common_files[] = {
>>>  		.seq_show	= mbm_local_bytes_config_show,
>>>  		.write		= mbm_local_bytes_config_write,
>>>  	},
>>> +	{
>>> +		.name		= "mbm_assign_mode",
>>> +		.mode		= 0444,
>>> +		.kf_ops		= &rdtgroup_kf_single_ops,
>>> +		.seq_show	= resctrl_mbm_assign_mode_show,
>>> +		.fflags		= RFTYPE_MON_INFO,
>>
>> Needs a RFTYPE_RES_CACHE?
> 
> I am not very sure about this.  This flag is added to the files in info/L3.
> 
> "mbm_assign_mode" goes in info/L3_MON/
> 
> The files in L3_MON does not have these flags set (for example
> mon_features, num_rmids).
> 

My assumption is that mon_features and num_rmids are generic monitoring
files that should be supported by all resources that support monitoring. When
resctrl starts to handle resource specific information then it should be
clear what type of resource it applies to. I understand that this may not
seem obvious since resctrl only supports monitoring on the L3 resource.

Another view, consider existing code in resctrl_mon_resource_init() where
the MBM configuration files are made specific to RFTYPE_RES_CACHE. I see
mbm_assign_mode to be very similar to these.

Reinette



* Re: [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment
  2025-04-14 20:30     ` Moger, Babu
@ 2025-04-15 16:30       ` Reinette Chatre
  2025-04-16 15:43         ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-15 16:30 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/14/25 1:30 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/11/25 16:01, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>> The ABMC feature provides an option to the user to assign a hardware
>>> counter to an RMID, event pair and monitor the bandwidth as long as the
>>> counter is assigned. The bandwidth events will be tracked by the hardware
>>> until the user changes the configuration. Each resctrl group can configure
>>> maximum two counters, one for total event and one for local event.
>>>
>>> The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
>>> Configuration is done by setting the counter id, bandwidth source (RMID)
>>> and bandwidth configuration supported by BMEC (Bandwidth Monitoring Event
>>> Configuration).
>>
>> Apart from the BMEC optimization in patch #1 and patch #2 this is the
>> first and only mention of BMEC dependency I see in this series while I do
>> not see implementation support for this. What am I missing?
>>
> 
> My mistake. I should have corrected it.  How about this?
> 
> "The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
> ABMC counter assignment is done by setting the counter id, bandwidth
> source (RMID) and bandwidth configuration. Users will have the option to
> change the bandwidth configuration using resctrl interface which will be
> introduced later in the series."
> 

Please just stick to what this patch does. The part starting with "Users will ..."
can cause confusion. To support what bandwidth configuration means the description
can point to existing definitions in include/linux/resctrl_types.h without needing
to mention BMEC.

Reinette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-04-14 20:51     ` Moger, Babu
@ 2025-04-15 16:38       ` Reinette Chatre
  2025-04-16 15:51         ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-15 16:38 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/14/25 1:51 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/11/25 16:02, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/3/25 5:18 PM, Babu Moger wrote:

>>> @@ -394,6 +394,21 @@ void resctrl_arch_mon_event_config_set(void *config_info);
>>>  u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
>>>  				      enum resctrl_event_id eventid);
>>>  
>>> +/**
>>> + * resctrl_arch_config_cntr() - Configure the counter on the domain
>>> + * @r:			resource that the counter should be read from.
>>> + * @d:			domain that the counter should be read from.
>>> + * @evtid:		event type to assign
>>> + * @rmid:		rmid of the counter to read.
>>> + * @closid:		closid that matches the rmid.
>>> + * @cntr_id:		Counter ID to configure
>>> + * @evt_cfg:		event configuration
>>
>> "event configuration" is simply an expansion of member name and does not help to
>> understand what the value represents.
> 
> How about?
> 
> "MBM Event configuration value representing reads, writes etc.."

This is more helpful (note Event -> event). When data structures are decided it
will also be helpful to include a reference to where this data is maintained and
how it is formatted.

Reinette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters
  2025-04-11 21:21   ` Reinette Chatre
@ 2025-04-15 16:41     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 16:41 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:21, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Software can read the assignable counters using the QM_EVTSEL and QM_CTR
>> register pair.
>>
>> QM_EVTSEL Register definition:
>> =======================================================
>> Bits	Mnemonic	Description
>> =======================================================
>> 63:44	--		Reserved
>> 43:32   RMID		Resource Monitoring Identifier
>> 31	ExtEvtID	Extended Event Identifier
>> 30:8	--		Reserved
>> 7:0	EvtID		Event Identifier
>> =======================================================
>>
>> The contents of a specific counter can be read by setting the following
>> fields in QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID = L3CacheABMC (=1)
>> and setting [RMID] to the desired counter ID. Reading QM_CTR will then
>> return the contents of the specified counter. The E bit will be set if the
>> counter configuration was invalid, or if an invalid counter ID was set
> 
> Would an invalid counter configuration be possible at this point? I expect
> that an invalid counter configuration would not allow the counter to be
> configured in the first place.

Ideally that is true.  We should not hit this case. Added the text for
completeness.
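
For reference, a minimal userspace sketch of how the QM_EVTSEL value
described above could be composed, assuming the field layout in the quoted
table. The helper name and macro names are hypothetical and are not part of
the patch; the 12-bit width check follows from the 43:32 RMID field.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the QM_EVTSEL table quoted above. */
#define QM_EVTSEL_EXT_EVT_ID	(UINT64_C(1) << 31)	/* bit 31: ExtEvtID */
#define QM_EVTSEL_ABMC_EVT_ID	UINT64_C(1)		/* bits 7:0: L3CacheABMC */
#define QM_EVTSEL_RMID_SHIFT	32			/* bits 43:32: RMID */
#define QM_EVTSEL_RMID_MASK	UINT64_C(0xfff)

/* Hypothetical helper: build the QM_EVTSEL value selecting counter cntr_id. */
static uint64_t abmc_evtsel_value(uint32_t cntr_id)
{
	/* The counter ID must fit in the 12-bit RMID field. */
	assert((cntr_id & ~QM_EVTSEL_RMID_MASK) == 0);

	return ((uint64_t)cntr_id << QM_EVTSEL_RMID_SHIFT) |
	       QM_EVTSEL_EXT_EVT_ID | QM_EVTSEL_ABMC_EVT_ID;
}
```

With this layout the low 32 bits carry the extended event selection and the
high 32 bits carry the counter ID in place of the RMID.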

> 
>> in the QM_EVTSEL[RMID] field.
>>
>> Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: New patch to support extended event mode when ABMC is enabled.
>> ---
>>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  4 +-
>>  arch/x86/kernel/cpu/resctrl/internal.h    |  7 +++
>>  arch/x86/kernel/cpu/resctrl/monitor.c     | 69 ++++++++++++++++-------
>>  include/linux/resctrl.h                   |  9 +--
>>  4 files changed, 63 insertions(+), 26 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> index 2225c40b8888..da78389c6ac7 100644
>> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> @@ -636,6 +636,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>>  	rr->r = r;
>>  	rr->d = d;
>>  	rr->first = first;
>> +	rr->cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>>  	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
>>  	if (IS_ERR(rr->arch_mon_ctx)) {
>>  		rr->err = -EINVAL;
>> @@ -661,13 +662,14 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>>  int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>>  {
>>  	struct kernfs_open_file *of = m->private;
>> +	enum resctrl_event_id evtid;
>>  	struct rdt_domain_hdr *hdr;
>>  	struct rmid_read rr = {0};
>>  	struct rdt_mon_domain *d;
>> -	u32 resid, evtid, domid;
>>  	struct rdtgroup *rdtgrp;
>>  	struct rdt_resource *r;
>>  	union mon_data_bits md;
>> +	u32 resid, domid;
>>  	int ret = 0;
>>  
> 
> Why make this change?

Yes. Not required.

> 
>>  	rdtgrp = rdtgroup_kn_lock_live(of->kn);
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index fbb045aec7e5..b7d1a59f09f8 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -38,6 +38,12 @@
>>  /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
>>  #define ABMC_ENABLE_BIT			0
>>  
>> +/*
>> + * ABMC Qos Event Identifiers.
>> + */
>> +#define ABMC_EXTENDED_EVT_ID		BIT(31)
>> +#define ABMC_EVT_ID			1
>> +
>>  /**
>>   * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
>>   *			        aren't marked nohz_full
>> @@ -156,6 +162,7 @@ struct rmid_read {
>>  	struct rdt_mon_domain	*d;
>>  	enum resctrl_event_id	evtid;
>>  	bool			first;
>> +	int			cntr_id;
>>  	struct cacheinfo	*ci;
>>  	int			err;
>>  	u64			val;
> 
> This does not look necessary (more below)

ok.

> 
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 5e7970fd0a97..58476c065921 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -269,8 +269,8 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_mon_domain *hw_do
>>  }
>>  
>>  void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -			     u32 unused, u32 rmid,
>> -			     enum resctrl_event_id eventid)
>> +			     u32 unused, u32 rmid, enum resctrl_event_id eventid,
>> +			     int cntr_id)
>>  {
>>  	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>>  	int cpu = cpumask_any(&d->hdr.cpu_mask);
>> @@ -281,7 +281,15 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
>>  	if (am) {
>>  		memset(am, 0, sizeof(*am));
>>  
>> -		prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>> +		if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
>> +		    resctrl_is_mbm_event(eventid)) {
>> +			if (cntr_id < 0)
>> +				return;
>> +			prmid = cntr_id;
>> +			eventid = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;
> 
> hmmm ... this is not a valid enum resctrl_event_id.

Yes. I may have to introduce a new function, __cntr_id_read_phys(), to
address this.

> 
> (before venturing into alternatives we need to study Tony's new RMID series
> because he made some changes to the enum that may support this work)

I looked into his series a little bit.
https://lore.kernel.org/lkml/20250407234032.241215-1-tony.luck@intel.com/

I see he is refactoring the events to support the new event types that
he is adding. It feels like his changes may not drastically affect the
changes I am doing here, apart from some code conflicts between the two series.


> 
> 
>> +		} else {
>> +			prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>> +		}
>>  		/* Record any initial, non-zero count value. */
>>  		__rmid_read_phys(prmid, eventid, &am->prev_msr);
>>  	}
>> @@ -313,12 +321,13 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
>>  }
>>  
>>  int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -			   u32 unused, u32 rmid, enum resctrl_event_id eventid,
>> -			   u64 *val, void *ignored)
>> +			   u32 unused, u32 rmid, int cntr_id,
>> +			   enum resctrl_event_id eventid, u64 *val, void *ignored)
>>  {
>>  	struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>  	int cpu = cpumask_any(&d->hdr.cpu_mask);
>> +	enum resctrl_event_id peventid;
>>  	struct arch_mbm_state *am;
>>  	u64 msr_val, chunks;
>>  	u32 prmid;
>> @@ -326,8 +335,19 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>>  
>>  	resctrl_arch_rmid_read_context_check();
>>  
>> -	prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>> -	ret = __rmid_read_phys(prmid, eventid, &msr_val);
>> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) &&
>> +	    resctrl_is_mbm_event(eventid)) {
>> +		if (cntr_id < 0)
>> +			return cntr_id;
>> +
>> +		prmid = cntr_id;
>> +		peventid = ABMC_EXTENDED_EVT_ID | ABMC_EVT_ID;
> 
> same
> 

Sure. I may have to introduce a new function, __cntr_id_read_phys(), to
address this.

>> +	} else {
>> +		prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>> +		peventid = eventid;
>> +	}
>> +
>> +	ret = __rmid_read_phys(prmid, peventid, &msr_val);
>>  	if (ret)
>>  		return ret;
>>  
>> @@ -392,7 +412,7 @@ void __check_limbo(struct rdt_mon_domain *d, bool force_free)
>>  			break;
>>  
>>  		entry = __rmid_entry(idx);
>> -		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
>> +		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, -1,
>>  					   QOS_L3_OCCUP_EVENT_ID, &val,
>>  					   arch_mon_ctx)) {
>>  			rmid_dirty = true;
>> @@ -599,7 +619,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>>  	u64 tval = 0;
>>  
>>  	if (rr->first) {
>> -		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
>> +		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid, rr->cntr_id);
>>  		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
>>  		if (m)
>>  			memset(m, 0, sizeof(struct mbm_state));
>> @@ -610,7 +630,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>>  		/* Reading a single domain, must be on a CPU in that domain. */
>>  		if (!cpumask_test_cpu(cpu, &rr->d->hdr.cpu_mask))
>>  			return -EINVAL;
>> -		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid,
>> +		rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->cntr_id,
>>  						 rr->evtid, &tval, rr->arch_mon_ctx);
>>  		if (rr->err)
>>  			return rr->err;
>> @@ -635,7 +655,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
>>  	list_for_each_entry(d, &rr->r->mon_domains, hdr.list) {
>>  		if (d->ci->id != rr->ci->id)
>>  			continue;
>> -		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid,
>> +		err = resctrl_arch_rmid_read(rr->r, d, closid, rmid, rr->cntr_id,
>>  					     rr->evtid, &tval, rr->arch_mon_ctx);
>>  		if (!err) {
>>  			rr->val += tval;
>> @@ -703,8 +723,8 @@ void mon_event_count(void *info)
>>  
>>  	if (rdtgrp->type == RDTCTRL_GROUP) {
>>  		list_for_each_entry(entry, head, mon.crdtgrp_list) {
>> -			if (__mon_event_count(entry->closid, entry->mon.rmid,
>> -					      rr) == 0)
>> +			rr->cntr_id = mbm_cntr_get(rr->r, rr->d, entry, rr->evtid);
>> +			if (__mon_event_count(entry->closid, entry->mon.rmid, rr) == 0)
>>  				ret = 0;
>>  		}
>>  	}
>> @@ -835,13 +855,15 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>>  }
>>  
>>  static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
>> +				 u32 closid, u32 rmid, int cntr_id,
>> +				 enum resctrl_event_id evtid)
> 
> Would it not be simpler to provide resource group as argument (remove closid, rmid, and
> cntr_id) and determine cntr_id from known data to provide cntr_id as argument to
> __mon_event_count(), removing the need for a new member in struct rmid_read?

Yes. We can do that.

> 
>>  {
>>  	struct rmid_read rr = {0};
>>  
>>  	rr.r = r;
>>  	rr.d = d;
>>  	rr.evtid = evtid;
>> +	rr.cntr_id = cntr_id;
>>  	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
>>  	if (IS_ERR(rr.arch_mon_ctx)) {
>>  		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
>> @@ -862,17 +884,22 @@ static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *
>>  }
>>  
>>  static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -		       u32 closid, u32 rmid)
>> +		       struct rdtgroup *rdtgrp, u32 closid, u32 rmid)
> 
> This looks redundant to provide both the resource group and two of its members as parameters.
> Looks like this can just be resource group and then remove closid and rmid?

Yes. We can do that.

> 
>>  {
>> +	int cntr_id;
>>  	/*
>>  	 * This is protected from concurrent reads from user as both
>>  	 * the user and overflow handler hold the global mutex.
>>  	 */
>> -	if (resctrl_arch_is_mbm_total_enabled())
>> -		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +	if (resctrl_arch_is_mbm_total_enabled()) {
>> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +		mbm_update_one_event(r, d, closid, rmid, cntr_id, QOS_L3_MBM_TOTAL_EVENT_ID);
> 
> With similar change to mbm_update_one_event() where it takes resource group as parameter
> it is not needed to compute counter ID here.
> 
> This patch could be split. One patch can replace the closid/rmid in mbm_update()
> and mbm_update_one_event() with the resource group. Following patches can build on that.

Sure. We can do that.

> 
>> +	}
>>  
>> -	if (resctrl_arch_is_mbm_local_enabled())
>> -		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +	if (resctrl_arch_is_mbm_local_enabled()) {
>> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +		mbm_update_one_event(r, d, closid, rmid, cntr_id, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +	}
>>  }
>>  
>>  /*
>> @@ -945,11 +972,11 @@ void mbm_handle_overflow(struct work_struct *work)
>>  	d = container_of(work, struct rdt_mon_domain, mbm_over.work);
>>  
>>  	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
>> -		mbm_update(r, d, prgrp->closid, prgrp->mon.rmid);
>> +		mbm_update(r, d, prgrp, prgrp->closid, prgrp->mon.rmid);
> 
> providing both the resource group and two of its members really looks
> redundant.

Will take care of that.
> 
>>  
>>  		head = &prgrp->mon.crdtgrp_list;
>>  		list_for_each_entry(crgrp, head, mon.crdtgrp_list)
>> -			mbm_update(r, d, crgrp->closid, crgrp->mon.rmid);
>> +			mbm_update(r, d, crgrp, crgrp->closid, crgrp->mon.rmid);
> 
> same
> 
Sure.

>>  
>>  		if (is_mba_sc(NULL))
>>  			update_mba_bw(prgrp, d);
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 60270606f1b8..107cb14a0db2 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -466,8 +466,9 @@ void resctrl_offline_cpu(unsigned int cpu);
>>   * 0 on success, or -EIO, -EINVAL etc on error.
>>   */
>>  int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
>> -			   u64 *val, void *arch_mon_ctx);
>> +			   u32 closid, u32 rmid, int cntr_id,
>> +			   enum resctrl_event_id eventid, u64 *val,
>> +			   void *arch_mon_ctx);
>>  
>>  /**
>>   * resctrl_arch_rmid_read_context_check()  - warn about invalid contexts
>> @@ -513,8 +514,8 @@ struct rdt_domain_hdr *resctrl_find_domain(struct list_head *h, int id,
>>   * This can be called from any CPU.
>>   */
>>  void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain *d,
>> -			     u32 closid, u32 rmid,
>> -			     enum resctrl_event_id eventid);
>> +			     u32 closid, u32 rmid, enum resctrl_event_id eventid,
>> +			     int cntr_id);
>>  
>>  /**
>>   * resctrl_arch_reset_rmid_all() - Reset all private state associated with
> 
> When changing the interface the associated kernel doc should also be updated.
> 

Sure.

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-15 14:20     ` Moger, Babu
@ 2025-04-15 16:53       ` Reinette Chatre
  2025-04-16 17:09         ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-15 16:53 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/15/25 7:20 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/11/25 16:04, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>>> can be assigned to an RMID, event pair and monitor the bandwidth as long
>>> as it is assigned.
>>
>> Above makes it sound as though multiple counters can be assigned to
>> an RMID, event pair.
>>
> 
> Yes. Multiple counter-ids can be assigned to an RMID, event pair.

oh, are you referring to the assignments of different counters across multiple
domains?

> 
>>>
>>> Add the functionality to allocate and assign the counters to RMID, event
>>> pair in the domain.
>>
>> "assign *a* counter to an RMID, event pair"?
> 
> Sure.
> 
>>
>>>
>>> If all the counters are in use, the kernel will log the error message
>>> "Unable to allocate counter in domain" in /sys/fs/resctrl/info/
>>> last_cmd_status when a new assignment is requested. Exit on the first
>>> failure when assigning counters across all the domains.
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>
>> ...
>>
>>> ---
>>>  arch/x86/kernel/cpu/resctrl/internal.h |   2 +
>>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 124 +++++++++++++++++++++++++
>>>  2 files changed, 126 insertions(+)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>>> index 0b73ec451d2c..1a8ac511241a 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>>> @@ -574,6 +574,8 @@ bool closid_allocated(unsigned int closid);
>>>  int resctrl_find_cleanest_closid(void);
>>>  void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
>>>  unsigned int mon_event_config_index_get(u32 evtid);
>>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
>>
>> This is internal to resctrl fs. Why is it needed to provide both the event id
>> and the event configuration? Event configuration can be determined from event ID?
> 
> Yes. It can be done. Then I have to export the functions like
> mbm_get_assign_config() into monitor.c. To avoid that I passed it from
> here, which I felt was much cleaner.

From what I can tell, for example by looking at patch #22, callers of
resctrl_assign_cntr_event() now need to call mbm_get_assign_config()
every time before calling resctrl_assign_cntr_event(). Calling
mbm_get_assign_config() from within resctrl_assign_cntr_event() seems
simpler to me and that may result in mbm_get_assign_config() moving to 
monitor.c as an extra benefit.

...

>>> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>>> +			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>>> +{
>>> +	int cntr_id;
>>> +
>>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>>> +		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
>>> +		    d->cntr_cfg[cntr_id].evtid == evtid)
>>> +			return cntr_id;
>>> +	}
>>> +
>>> +	return -ENOENT;
>>> +}
>>> +
>>> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
>>> +			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>>> +{
>>> +	int cntr_id;
>>> +
>>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>>> +		if (!d->cntr_cfg[cntr_id].rdtgrp) {
>>> +			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
>>> +			d->cntr_cfg[cntr_id].evtid = evtid;
>>> +			return cntr_id;
>>> +		}
>>> +	}
>>> +
>>> +	return -ENOSPC;
>>> +}
>>> +
>>> +static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
>>> +{
>>> +	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
>>> +}
>>> +
>>> +/*
>>> + * Allocate a fresh counter and configure the event if not assigned already.
>>> + */
>>> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>>> +				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
>>> +				     u32 evt_cfg)
>>
>> Same here, why are both evtid and evt_cfg provided as arguments? 
> 
> Yes. It can be done. Then I have to export the functions like
> mbm_get_assign_config() into monitor.c. To avoid that I passed it from
> here, which I felt was much cleaner.

Maybe even resctrl_assign_cntr_event() does not need to call mbm_get_assign_config()
but only resctrl_alloc_config_cntr() needs to call mbm_get_assign_config(). Doing so
may avoid more burden on callers while reducing parameters needed throughout.
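
A rough userspace model of that shape, as a sketch only: the types, the
fixed counter count and evt_cfg_lookup() are stand-ins (evt_cfg_lookup()
plays the role of mbm_get_assign_config() and returns the defaults from
patch #18), and the real function may return something other than the
counter ID. The point is that callers pass only the event ID.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CNTRS 4	/* stand-in for r->mon.num_mbm_cntrs */

enum evt_id { EVT_TOTAL = 2, EVT_LOCAL = 3 };

struct cntr_cfg {
	const void	*grp;		/* owning group, NULL when free */
	enum evt_id	evtid;
	uint32_t	evt_cfg;
};

struct domain {
	struct cntr_cfg	cntr_cfg[NUM_CNTRS];
};

/* Stand-in for mbm_get_assign_config(): default configuration per event. */
static uint32_t evt_cfg_lookup(enum evt_id evtid)
{
	return evtid == EVT_TOTAL ? 0x7f : 0x15;
}

/* Return the counter already assigned to (grp, evtid), or -ENOENT. */
static int cntr_get(const struct domain *d, const void *grp, enum evt_id evtid)
{
	for (int id = 0; id < NUM_CNTRS; id++) {
		if (d->cntr_cfg[id].grp == grp && d->cntr_cfg[id].evtid == evtid)
			return id;
	}
	return -ENOENT;
}

/* Assign a counter if needed; the configuration is looked up here. */
static int alloc_config_cntr(struct domain *d, const void *grp, enum evt_id evtid)
{
	int id = cntr_get(d, grp, evtid);

	assert(grp != NULL);
	if (id >= 0)
		return id;	/* already assigned, nothing to do */

	for (id = 0; id < NUM_CNTRS; id++) {
		if (!d->cntr_cfg[id].grp) {
			d->cntr_cfg[id].grp = grp;
			d->cntr_cfg[id].evtid = evtid;
			d->cntr_cfg[id].evt_cfg = evt_cfg_lookup(evtid);
			return id;
		}
	}
	return -ENOSPC;	/* all counters in this domain are in use */
}
```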

Reinette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-11 21:44   ` Reinette Chatre
@ 2025-04-15 18:48     ` Moger, Babu
  2025-04-15 19:25       ` Luck, Tony
  2025-04-16 16:18       ` Reinette Chatre
  0 siblings, 2 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 18:48 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 16:44, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> By default, each resctrl group supports two MBM events: mbm_total_bytes
>> and mbm_local_bytes. To maintain the same level of support, two default
>> MBM configurations are added. These configurations will initially be used
>> to set up the counters upon mounting, while users will have the option to
>> modify them as needed.
> 
> This jumps in quite fast by stating that MBM configurations are added but
> there is no definition of what an MBM configuration is.

How about this?


By default, each resctrl group supports two MBM events: mbm_total_bytes
and mbm_local_bytes. These represent total and local memory bandwidth
monitoring, respectively. Each event corresponds to a specific MBM
configuration. Use these default configurations to set up the counters
during mount. Allow users to modify the configurations as needed after
initialization.

Initialize resctrl MBM events with default configurations.

> 
>>
>> Event configuration values:
>> ======  ==============  ==================================================
>>  Bits    Mnemonics       Description
>> ======  ==============  ==================================================
>>  6       VictimBW        Dirty Victims from all types of memory
>>  5       RmtSlowFill     Reads to slow memory in the non-local NUMA domain
>>  4       LclSlowFill     Reads to slow memory in the local NUMA domain
>>  3       RmtNTWr         Non-temporal writes to non-local NUMA domain
>>  2       LclNTWr         Non-temporal writes to local NUMA domain
>>  1       RmtFill         Reads to memory in the non-local NUMA domain
>>  0       LclFill         Reads to memory in the local NUMA domain
>> ======  ==============  ==================================================
> 
> What is the purpose of the mnemonics?

I will replace these with the full text for each.
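
As an illustration of how the bit table maps to the defaults in this patch
(0x7f for mbm_total_bytes, 0x15 for mbm_local_bytes), here is a small
userspace sketch that decodes a configuration value into the names used in
mbm_evt_values[]. The helper evt_cfg_decode() is hypothetical and only
meant to show the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_EVT_BITS 7

/* Names copied from mbm_evt_values[] in the patch; array index == bit. */
static const char *const evt_bit_names[NUM_EVT_BITS] = {
	"local_reads",
	"remote_reads",
	"local_non_temporal_writes",
	"remote_non_temporal_writes",
	"local_reads_slow_memory",
	"remote_reads_slow_memory",
	"dirty_victim_writes_all",
};

/*
 * Hypothetical helper: store the names of the bits set in cfg into out[]
 * and return how many names were stored.
 */
static size_t evt_cfg_decode(uint32_t cfg, const char *out[NUM_EVT_BITS])
{
	size_t n = 0;

	/* Only bits 6:0 are defined (MAX_EVT_CONFIG_BITS in the patch). */
	assert((cfg & ~UINT32_C(0x7f)) == 0);

	for (unsigned int bit = 0; bit < NUM_EVT_BITS; bit++) {
		if (cfg & (UINT32_C(1) << bit))
			out[n++] = evt_bit_names[bit];
	}
	return n;
}
```

Decoding 0x15 selects bits 0, 2 and 4, that is, the three local events,
while 0x7f selects all seven.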

> 
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: New patch to support event configurations via new counter_configs
>>      method.
>> ---
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
>>  include/linux/resctrl_types.h          | 17 +++++++++++++++++
>>  2 files changed, 32 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index d84f47db4e43..aba23e2096db 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -57,6 +57,21 @@ static struct kernfs_node *kn_mongrp;
>>  /* Kernel fs node for "mon_data" directory under root */
>>  static struct kernfs_node *kn_mondata;
>>  
>> +struct mbm_evt_value mbm_evt_values[NUM_MBM_EVT_VALUES] = {
>> +	{"local_reads", 0x1},
>> +	{"remote_reads", 0x2},
>> +	{"local_non_temporal_writes", 0x4},
>> +	{"remote_non_temporal_writes", 0x8},
>> +	{"local_reads_slow_memory", 0x10},
>> +	{"remote_reads_slow_memory", 0x20},
>> +	{"dirty_victim_writes_all", 0x40},
>> +};
>> +
>> +struct mbm_assign_config mbm_assign_configs[NUM_MBM_ASSIGN_CONFIGS] = {
>> +	{"mbm_total_bytes", QOS_L3_MBM_TOTAL_EVENT_ID, 0x7f},
>> +	{"mbm_local_bytes", QOS_L3_MBM_LOCAL_EVENT_ID, 0x15},
>> +};
>> +
>>  /*
>>   * Used to store the max resource name width to display the schemata names in
>>   * a tabular format.
>> diff --git a/include/linux/resctrl_types.h b/include/linux/resctrl_types.h
>> index f26450b3326b..3d98c7bdb459 100644
>> --- a/include/linux/resctrl_types.h
>> +++ b/include/linux/resctrl_types.h
> 
> Please read changelog of f16adbaf9272 ("x86/resctrl: Move resctrl types to a separate header")
> for a good explanation of what resctrl_types.h is used for.

Sure.

> 
>> @@ -31,6 +31,9 @@
>>  /* Max event bits supported */
>>  #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
>>  
>> +#define NUM_MBM_EVT_VALUES		7
>> +#define NUM_MBM_ASSIGN_CONFIGS		2
> 
> Please keep changes to internal header files unless required.

Will move these to internal header.

> 
>> +
>>  enum resctrl_res_level {
>>  	RDT_RESOURCE_L3,
>>  	RDT_RESOURCE_L2,
>> @@ -51,4 +54,18 @@ enum resctrl_event_id {
>>  	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
>>  };
>>  
>> +struct mbm_evt_value {
>> +	char	evt_name[32];
>> +	u32	evt_val;
>> +};
> 
> I cannot see how this belongs in resctrl_types.h.

Will move these to internal header.

> 
>> +
>> +/**
>> + * struct mbm_assign_config - Configuration values
> 
> Please include a run of scripts/kernel-doc in your patch preparation steps.

ok. Sure.

> 
> The description "Configuration values" is incredibly vague.

ok. Will add details.

> 
>> + */
>> +struct mbm_assign_config {
>> +	char			name[32];
>> +	enum resctrl_event_id	evtid;
>> +	u32			val;
>> +};
> 
> Why is this new struct needed? It looks to me like a duplicate of struct
> mon_evt with one member added. There is also already the evt_list as part
> of a monitor resource that the array introduced here seems to duplicate.

Yes. We can probably do that.

> 
> Could the event configuration be made a member of struct mon_evt instead?
> This exposes the need to integrate this better with BMEC support to make
> clear how existing "configurable" member should be used and/or expanded.

Sure.

> 
> There seems more and more overlap with Tony's RMID work. Did you get a
> chance to look at that?

I looked at it a little. I will take a closer look.

> 
>> +
>>  #endif /* __LINUX_RESCTRL_TYPES_H */
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-15 18:48     ` Moger, Babu
@ 2025-04-15 19:25       ` Luck, Tony
  2025-04-16 16:21         ` Reinette Chatre
  2025-04-16 16:18       ` Reinette Chatre
  1 sibling, 1 reply; 80+ messages in thread
From: Luck, Tony @ 2025-04-15 19:25 UTC (permalink / raw)
  To: babu.moger@amd.com, Chatre, Reinette, peternewman@google.com
  Cc: corbet@lwn.net, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, paulmck@kernel.org, akpm@linux-foundation.org,
	thuth@redhat.com, rostedt@goodmis.org, ardb@kernel.org,
	gregkh@linuxfoundation.org, daniel.sneddon@linux.intel.com,
	jpoimboe@kernel.org, alexandre.chartre@oracle.com,
	pawan.kumar.gupta@linux.intel.com, thomas.lendacky@amd.com,
	perry.yuan@amd.com, seanjc@google.com, Huang, Kai, Li, Xiaoyao,
	kan.liang@linux.intel.com, Li, Xin3, ebiggers@google.com,
	xin@zytor.com, Mehta, Sohil, andrew.cooper3@citrix.com,
	mario.limonciello@amd.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Wieczor-Retman, Maciej,
	Eranian, Stephane

> By default, each resctrl group supports two MBM events: mbm_total_bytes
> and mbm_local_bytes. These represent total and local memory bandwidth
> monitoring, respectively. Each event corresponds to a specific MBM
> configuration. Use these default configurations to set up the counters
> during mount. Allow users to modify the configurations as needed after
> initialization.

I think an update to this part of the resctrl.rst documentation is somewhat
overdue:

        In a MON group these files provide a read out of the current
        value of the event for all tasks in the group. In CTRL_MON groups
        these files provide the sum for all tasks in the CTRL_MON group
        and all tasks in MON groups. Please see example section for more
        details on usage.

The sentence about CTRL_MON groups providing the sum for all tasks
in the child MON groups is only true if counters are assigned to all of
those MON groups. What mon_event_count() actually does is to
return success if any of the CTRL_MON or child MON groups succeeded
with the count being the sum of all the successes.
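
As a rough illustration of that behaviour (a standalone sketch, not the
kernel code; struct group_read and sum_group_reads() are hypothetical
names made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the result of reading one group's counter. */
struct group_read {
	int err;	/* 0 on success, negative on failure (e.g. no counter assigned) */
	uint64_t bytes;	/* meaningful only when err == 0 */
};

/*
 * Sum the reads that succeeded. Return 0 if at least one read succeeded,
 * otherwise the error of the last failed read.
 */
static int sum_group_reads(const struct group_read *reads, size_t n,
			   uint64_t *total)
{
	int ret = -1;
	size_t i;

	assert(reads && total && n > 0);

	*total = 0;
	for (i = 0; i < n; i++) {
		if (reads[i].err) {
			/* Remember the failure only while nothing has succeeded. */
			if (ret)
				ret = reads[i].err;
			continue;
		}
		*total += reads[i].bytes;
		ret = 0;
	}

	return ret;
}
```

With one CTRL_MON group and two child MON groups where only two of the
three have counters, the caller sees success and a partial sum, which is
why the documented "sum for all tasks" wording can mislead.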

-Tony

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-15 16:09       ` Reinette Chatre
@ 2025-04-15 19:43         ` Moger, Babu
  2025-04-16 16:08           ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 19:43 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/15/25 11:09, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/14/25 10:48 AM, Moger, Babu wrote:
> 
>> Here is my proposal to handle this case. This can be separate patch.
>>
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index d10cf1e5b914..772f2f77faee 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct
>> kernfs_open_file *of,
>>
>>         list_for_each_entry(mevt, &r->mon.evt_list, list) {
>>                 seq_printf(seq, "%s\n", mevt->name);
>> -               if (mevt->configurable)
>> +               if (mevt->configurable &&
>> !resctrl_arch_mbm_cntr_assign_enabled(r))
>>                         seq_printf(seq, "%s_config\n", mevt->name);
>>         }
>>
>> @@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s,
>> struct rdt_resource *r, u32 evtid
>>         cpus_read_lock();
>>         mutex_lock(&rdtgroup_mutex);
>>
>> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
>> with mbm_cntr_assign mode\n");
>> +               return -EINVAL;
>> +       }
>> +
>>         list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>                 if (sep)
>>                         seq_puts(s, ";");
>> @@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s,
>> struct rdt_resource *r, u32 evtid
>>  static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
>>                                        struct seq_file *seq, void *v)
>>  {
>> +       int ret;
>>         struct rdt_resource *r = of->kn->parent->priv;
>>
>> -       mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>>
>> -       return 0;
>> +       return ret;
>>  }
>>
>>  static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
>>                                        struct seq_file *seq, void *v)
>>  {
>> +       int ret;
>> +
>>         struct rdt_resource *r = of->kn->parent->priv;
>>
>> -       mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
>>
>> -       return 0;
>> +       return ret;
>>  }
>>
>>  static void mbm_config_write_domain(struct rdt_resource *r,
>> @@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r,
>> char *tok, u32 evtid)
>>         /* Walking r->domains, ensure it can't race with cpuhp */
>>         lockdep_assert_cpus_held();
>>
>> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
>> with mbm_cntr_assign mode\n");
>> +               return -EINVAL;
>> +       }
>> +
>>  next:
>>         if (!tok || tok[0] == '\0')
>>                 return 0;
>>
> 
> Instead of chasing every call that may involve BMEC I think it will be simpler to
> disable BMEC support during initialization when ABMC is detected. Specifically,
> on systems that support both BMEC and ABMC rdt_cpu_has(X86_FEATURE_BMEC) returns
> false. 

There is one problem with this approach. Users have the option to switch
between the assignment modes. The system boots in ABMC mode by default if
it is supported, but users can switch to 'default' mode after boot. If BMEC
is disabled completely at initialization, that will not be possible.

> 
> I would also like to consider enhancing mevt->configurable to handle all different
> ways in which events can be configured. For example, making mevt->configurable an
> enum that captures how event can be configured instead of keeping mevt->configurable
> a boolean for BMEC support and handling ABMC completely separately. I hope this
> may become clearer when using struct mon_evt for ABMC also.

Sure. I can try that.
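
To make the idea concrete, one possible shape is sketched below. It is
only an assumption about how this could look: the enum, its values and
struct mon_evt_sketch are hypothetical names, and the values are written
as flags so that one event can support both methods while only one is
active at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how an event can be configured, as combinable flags. */
enum mon_evt_cfg {
	MON_EVT_CFG_NONE	= 0,
	MON_EVT_CFG_BMEC	= 1 << 0,	/* via info/L3_MON/<event>_config */
	MON_EVT_CFG_CNTR_ASSIGN	= 1 << 1,	/* via counter assignment */
};

/* Hypothetical, cut-down stand-in for struct mon_evt. */
struct mon_evt_sketch {
	const char *name;
	unsigned int configurable;	/* mask of enum mon_evt_cfg */
};

/* Should "<name>_config" be listed in mon_features right now? */
static bool show_bmec_config(const struct mon_evt_sketch *evt,
			     bool cntr_assign_enabled)
{
	assert(evt != NULL);

	/* BMEC files are hidden while mbm_cntr_assign mode is active. */
	if (cntr_assign_enabled)
		return false;

	return evt->configurable & MON_EVT_CFG_BMEC;
}
```

Keeping the capability in the event and the active mode as a separate
input would let the BMEC files come back when the user switches to
'default' mode.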


-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/
  2025-04-11 22:04   ` Reinette Chatre
@ 2025-04-15 20:29     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 20:29 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 17:04, Reinette Chatre wrote:
> Hi Babu
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Create the configuration directory and files for mbm_cntr_assign mode.
>> These configurations will be used to assign MBM events in mbm_cntr_assign
>> mode, with two default configurations created upon mounting.
>>
>> Example:
>> $ cd /sys/fs/resctrl/
>> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>   local_reads, remote_reads, local_non_temporal_writes,
>>   remote_non_temporal_writes, local_reads_slow_memory,
>>   remote_reads_slow_memory, dirty_victim_writes_all
>>
>> $ cat info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>   local_reads, local_non_temporal_writes, local_reads_slow_memory
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
>> ---
>>  Documentation/arch/x86/resctrl.rst     | 29 ++++++++++
>>  arch/x86/kernel/cpu/resctrl/internal.h |  2 +
>>  arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 77 ++++++++++++++++++++++++++
>>  4 files changed, 109 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 71ed1cfed33a..99f9f4b9b501 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -306,6 +306,35 @@ with the following files:
>>  	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>>  	  0=30;1=30
>>  
>> +"counter_configs:
> 
> (mismatch quotes)
> 

Sure.

> This organization needs some extra thought ... consider that the section starts with
> "If RDT monitoring is available there will be an "L3_MON" directory
> with the following *files*:"
> 

Sure.

> 
>> +	The directory for storing event configuration files, which will be used to
>> +	assign counters when the mbm_cntr_assign mode is enabled.
> 
> Needs more imperative tone.

Sure.
>> +
>> +	Following types of events are supported:
>> +
>> +	==== ========================== ========================================================
>> +	Bits Name                       Description
>> +	==== ========================== ========================================================
>> +	6    dirty_victim_writes_all    Dirty Victims from the QOS domain to all types of memory
>> +	5    remote_reads_slow_memory   Reads to slow memory in the non-local NUMA domain
>> +	4    local_reads_slow_memory    Reads to slow memory in the local NUMA domain
>> +	3    remote_non_temporal_writes Non-temporal writes to non-local NUMA domain
>> +	2    local_non_temporal_writes  Non-temporal writes to local NUMA domain
>> +	1    remote_reads               Reads to memory in the non-local NUMA domain
>> +	0    local_reads                Reads to memory in the local NUMA domain
>> +	==== ========================== ========================================================
>> +
>> +	Two default configurations, mbm_local_bytes and mbm_total_bytes, will be created
> 
> "will be created" -> "are created" ... or maybe just:
> 	 There are two default configurations: mbm_local_bytes and mbm_total_bytes.

Looks good.

> 
>> +	upon mounting.
> 
> "upon mounting" seems unnecessary.
> 

ok.

>> +	::
>> +
>> +	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>> +	    local_reads, remote_reads, local_non_temporal_writes, remote_non_temporal_writes,
>> +	    local_reads_slow_memory, remote_reads_slow_memory, dirty_victim_writes_all
>> +
>> +	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>> +	    local_reads, local_non_temporal_writes, local_reads_slow_memory
>> +
>>  "max_threshold_occupancy":
>>  		Read/write file provides the largest value (in
>>  		bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index b7d1a59f09f8..a943450bf2c8 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -282,11 +282,13 @@ struct mbm_cntr_cfg {
>>  #define RFTYPE_RES_CACHE		BIT(8)
>>  #define RFTYPE_RES_MB			BIT(9)
>>  #define RFTYPE_DEBUG			BIT(10)
>> +#define RFTYPE_CONFIG			BIT(11)
> 
> hmmm ... these flags are becoming quite complex. Even so, RFTYPE_CONFIG would be
> unique to this new feature so I think a more specific name would be appropriate.
> Maybe even "RFTYPE_MBM_EVENT_CONFIG".

Sure.

> 
>>  #define RFTYPE_CTRL_INFO		(RFTYPE_INFO | RFTYPE_CTRL)
>>  #define RFTYPE_MON_INFO			(RFTYPE_INFO | RFTYPE_MON)
>>  #define RFTYPE_TOP_INFO			(RFTYPE_INFO | RFTYPE_TOP)
>>  #define RFTYPE_CTRL_BASE		(RFTYPE_BASE | RFTYPE_CTRL)
>>  #define RFTYPE_MON_BASE			(RFTYPE_BASE | RFTYPE_MON)
>> +#define RFTYPE_MON_CONFIG		(RFTYPE_CONFIG | RFTYPE_MON)
> 
> Why is this flag needed?
> 

Not required. Will remove it.

>>  
>>  /* List of all resource groups */
>>  extern struct list_head rdt_all_groups;
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 58476c065921..4525295b1725 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1264,6 +1264,7 @@ int __init resctrl_mon_resource_init(void)
>>  	if (r->mon.mbm_cntr_assignable) {
>>  		resctrl_file_fflags_init("num_mbm_cntrs", RFTYPE_MON_INFO);
>>  		resctrl_file_fflags_init("available_mbm_cntrs", RFTYPE_MON_INFO);
>> +		resctrl_file_fflags_init("event_filter", RFTYPE_MON_CONFIG);
>>  	}
>>  
>>  	return 0;
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index aba23e2096db..b2122a1dd36c 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1907,6 +1907,25 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>>  	return ret ?: nbytes;
>>  }
>>  
>> +static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
>> +{
>> +	struct mbm_assign_config *assign_config = of->kn->parent->priv;
>> +	bool sep = false;
>> +	int i;
>> +
>> +	for (i = 0; i < NUM_MBM_EVT_VALUES; i++) {
>> +		if (assign_config->val & mbm_evt_values[i].evt_val) {
>> +			if (sep)
>> +				seq_puts(seq, ", ");
> 
> seq_putc()

Sure.

> 
>> +			seq_printf(seq, "%s", mbm_evt_values[i].evt_name);
>> +			sep = true;
>> +		}
>> +	}
>> +	seq_puts(seq, "\n");
> seq_putc()

Sure.

>> +
>> +	return 0;
>> +}
>> +
>>  /* rdtgroup information files for one cache resource. */
>>  static struct rftype res_common_files[] = {
>>  	{
>> @@ -2019,6 +2038,12 @@ static struct rftype res_common_files[] = {
>>  		.seq_show	= mbm_local_bytes_config_show,
>>  		.write		= mbm_local_bytes_config_write,
>>  	},
>> +	{
>> +		.name		= "event_filter",
>> +		.mode		= 0444,
>> +		.kf_ops		= &rdtgroup_kf_single_ops,
>> +		.seq_show	= event_filter_show,
>> +	},
>>  	{
>>  		.name		= "mbm_assign_mode",
>>  		.mode		= 0444,
>> @@ -2314,6 +2339,52 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>>  	return ret;
>>  }
>>  
>> +static int resctrl_mkdir_info_configs(void *priv,  char *name, unsigned long fflags)
> 
> Why a void * instead of struct rdt_resource *?

Yes. Will change it.

> 
> Also please fix spacing.

Sure.

> 
> Also, why do fflags need to be provided as parameter? These are so custom I think the
> hardcoding should be contained here instead of the caller. With this the function name

Will remove fflags as parameter.

> can also be made specific to what it does ... perhaps "resctrl_mkdir_counter_configs()"
> (please feel free to improve).

Sounds good.

> 
> 
>> +{
>> +	struct kernfs_node *l3_mon_kn, *kn_subdir, *kn_subdir2;
>> +	int ret, i;
>> +
>> +	l3_mon_kn = kernfs_find_and_get(kn_info, name);
>> +	if (!l3_mon_kn)
>> +		return -ENOENT;
>> +
>> +	kn_subdir = kernfs_create_dir(l3_mon_kn, "counter_configs", l3_mon_kn->mode, priv);
>> +	if (IS_ERR(kn_subdir)) {
>> +		kernfs_put(l3_mon_kn);
>> +		return PTR_ERR(kn_subdir);
>> +	}
>> +
>> +	ret = rdtgroup_kn_set_ugid(kn_subdir);
>> +	if (ret) {
>> +		kernfs_put(l3_mon_kn);
>> +		return ret;
>> +	}
>> +
>> +	for (i = 0; i < NUM_MBM_ASSIGN_CONFIGS; i++) {
> 
> This can instead work through the resource's evt_list and use a flag (TBD how to
> adapt "configurable") to determine if a directory should be created for it.

Yes. Will look into this.

> 
>> +		kn_subdir2 = kernfs_create_dir(kn_subdir, mbm_assign_configs[i].name,
>> +					       kn_subdir->mode, &mbm_assign_configs[i]);
>> +		if (IS_ERR(kn_subdir)) {
> 
> IS_ERR(kn_subdir2)?

Yes.

> 
>> +			ret = PTR_ERR(kn_subdir2);
>> +			goto config_out;
>> +		}
>> +
>> +		ret = rdtgroup_kn_set_ugid(kn_subdir2);
>> +		if (ret)
>> +			goto config_out;
>> +
>> +		ret = rdtgroup_add_files(kn_subdir2, fflags);
>> +		if (!ret)
>> +			kernfs_activate(kn_subdir);
>> +	}
>> +
>> +config_out:
>> +	kernfs_put(l3_mon_kn);
>> +	if (ret)
>> +		kernfs_remove(kn_subdir);
>> +
>> +	return ret;
>> +}
>> +
>>  static unsigned long fflags_from_resource(struct rdt_resource *r)
>>  {
>>  	switch (r->rid) {
>> @@ -2360,6 +2431,12 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
>>  		ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
>>  		if (ret)
>>  			goto out_destroy;
>> +
>> +		if (r->mon.mbm_cntr_assignable) {
>> +			ret = resctrl_mkdir_info_configs(r, name, RFTYPE_MON_CONFIG);
>> +			if (ret)
>> +				goto out_destroy;
>> +		}
>>  	}
>>  
>>  	ret = rdtgroup_kn_set_ugid(kn_info);
> 
> Reinette
> 
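
For reference, the event_filter output format discussed above (names of
the set bits, in ascending bit order, separated by ", ") can be sketched
in plain userspace C as below. This is not the patch's
event_filter_show(); evt_names[] and format_event_filter() are
hypothetical names, and the name table simply follows the bit table in
the documentation hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Event names indexed by bit number, following the table in the patch. */
static const char *const evt_names[] = {
	"local_reads",
	"remote_reads",
	"local_non_temporal_writes",
	"remote_non_temporal_writes",
	"local_reads_slow_memory",
	"remote_reads_slow_memory",
	"dirty_victim_writes_all",
};
#define NUM_EVT_NAMES (sizeof(evt_names) / sizeof(evt_names[0]))

/*
 * Write the names of the bits set in @val to @buf, separated by ", ".
 * Returns 0 on success or -1 if @buf is too small.
 */
static int format_event_filter(uint32_t val, char *buf, size_t size)
{
	size_t off = 0, i;

	assert(buf && size > 0);

	for (i = 0; i < NUM_EVT_NAMES; i++) {
		size_t len = strlen(evt_names[i]);
		size_t sep = off ? 2 : 0;	/* separator only after the first name */

		if (!(val & (UINT32_C(1) << i)))
			continue;
		if (off + sep + len + 1 > size)
			return -1;
		if (sep) {
			memcpy(buf + off, ", ", 2);
			off += 2;
		}
		memcpy(buf + off, evt_names[i], len);
		off += len;
	}
	buf[off] = '\0';

	return 0;
}
```

With this table, a value of 0x15 gives the mbm_local_bytes default shown
in the commit message and 0x7f gives the mbm_total_bytes default.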

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-11 22:07   ` Reinette Chatre
@ 2025-04-15 20:37     ` Moger, Babu
  2025-04-16 18:52       ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 20:37 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 17:07, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Users can modify the event configuration by writing to the event_filter
>> interface file. The event configurations for mbm_cntr_assign mode are
>> located in /sys/fs/resctrl/info/L3_MON/counter_configs/.
>>
>> Update the assignments of all groups when the event configuration is
>> modified.
>>
>> Example:
>> $ cd /sys/fs/resctrl/
>> $ echo "local_reads, local_non_temporal_writes" >
>>   info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>
>> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>  local_reads, local_non_temporal_writes
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: New patch to modify event configurations.
>> ---
>>  Documentation/arch/x86/resctrl.rst     |  10 +++
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 115 ++++++++++++++++++++++++-
>>  2 files changed, 124 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 99f9f4b9b501..4e6feba6fb08 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -335,6 +335,16 @@ with the following files:
>>  	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>  	    local_reads, local_non_temporal_writes, local_reads_slow_memory
>>  
>> +	The event configuration can be modified by writing to the event_filter file within
>> +	the configuration directory.
> 
> Please use imperative tone.

Sure.

Basic question: should the user documentation also use imperative tone? I
thought that applies only to the commit log.

> 
>> +	::
>> +
>> +	    # echo "local_reads, local_non_temporal_writes" >
>> +	      /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>> +
>> +	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>> +	    local_reads, local_non_temporal_writes
>> +
>>  "max_threshold_occupancy":
>>  		Read/write file provides the largest value (in
>>  		bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index b2122a1dd36c..7792455f0b26 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1926,6 +1926,118 @@ static int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq,
>>  	return 0;
>>  }
>>  
> 
> Could you please add comments to these new functions to explain what they do?

Sure.

> 
>> +static int resctrl_group_assign(struct rdt_resource *r, struct rdtgroup *rdtgrp,
>> +				enum resctrl_event_id evtid, u32 evt_cfg)
>> +{
>> +	struct rdt_mon_domain *d;
>> +	int cntr_id, ret;
>> +
>> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +		cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>> +		if (cntr_id >= 0 && d->cntr_cfg[cntr_id].evt_cfg != evt_cfg) {
>> +			d->cntr_cfg[cntr_id].evt_cfg = evt_cfg;
>> +			ret = resctrl_arch_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
>> +						       rdtgrp->closid, cntr_id, evt_cfg, true);
>> +			if (ret) {
>> +				rdt_last_cmd_printf("Assign failed event %d domain %d group %s\n",
>> +						    evtid, d->hdr.id, rdtgrp->kn->name);
> 
> Please provide the actual event name to user space. The event IDs are not visible to
> user space.

Sure.
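
For anyone following along, the write side of event_filter amounts to
turning a comma-separated list of names back into a bitmask. A minimal
userspace sketch, assuming the bit numbering from the documentation
table in patch 19 (evt_names[] and parse_event_filter() are hypothetical
names; this is not the parser in the patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Event names indexed by bit number (assumed from the patch 19 table). */
static const char *const evt_names[] = {
	"local_reads",
	"remote_reads",
	"local_non_temporal_writes",
	"remote_non_temporal_writes",
	"local_reads_slow_memory",
	"remote_reads_slow_memory",
	"dirty_victim_writes_all",
};
#define NUM_EVT_NAMES (sizeof(evt_names) / sizeof(evt_names[0]))

/*
 * Parse names separated by commas and/or whitespace into a bitmask.
 * Returns 0 and fills @val on success, -1 on an unknown name.
 */
static int parse_event_filter(const char *s, uint32_t *val)
{
	uint32_t mask = 0;

	assert(s && val);

	while (*s) {
		size_t len, i;

		s += strspn(s, ", \t\n");	/* skip separators */
		len = strcspn(s, ", \t\n");	/* length of the next name */
		if (!len)
			break;

		for (i = 0; i < NUM_EVT_NAMES; i++) {
			if (strlen(evt_names[i]) == len &&
			    !strncmp(s, evt_names[i], len)) {
				mask |= UINT32_C(1) << i;
				break;
			}
		}
		if (i == NUM_EVT_NAMES)
			return -1;	/* unknown event name */
		s += len;
	}

	*val = mask;
	return 0;
}
```

The example from the commit message, "local_reads,
local_non_temporal_writes", sets bits 0 and 2, so the resulting value
is 0x5.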

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments
  2025-04-11 22:08   ` Reinette Chatre
@ 2025-04-15 20:39     ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-15 20:39 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/11/25 17:08, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> The mbm_cntr_assign mode provides an option to the user to assign a
>> counter to an RMID, event pair and monitor the bandwidth as long as
>> the counter is assigned.
>>
>> Introduce a configuration option to automatically assign counter IDs
>> when a resctrl group is created, provided the counters are available.
>> By default, this option is enabled at boot.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v12: New patch. Added after the discussion on the list.
>>      https://lore.kernel.org/lkml/CALPaoCh8siZKjL_3yvOYGL4cF_n_38KpUFgHVGbQ86nD+Q2_SA@mail.gmail.com/
> 
> Seems like this needs a Suggested-by for Peter.
> 

Sure. Will add "Suggested-by: Peter Newman <peternewman@google.com>".
-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode
  2025-04-15 16:22       ` Reinette Chatre
@ 2025-04-16 14:05         ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 14:05 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/15/2025 11:22 AM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/14/25 12:52 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/11/25 15:56, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/3/25 5:18 PM, Babu Moger wrote:
> 
>>>> platforms. On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined, whereas on
>>>> Arm64, it is. As a result, for MPAM, the file would be either:
>>>
>>> CONFIG_RESCTRL_ASSIGN_FIXED does not yet exist anywhere so this motivation needs
>>> to provide stronger support for why it is used before it exists. There is a precedent
>>> here with RESCTRL_RMID_DEPENDS_ON_CLOSID already used while it does not yet
>>> appear in a Kconfig file. I would propose that this is motivated by noting
>>> that it is already understood how Arm supports assignable counters and that this
>>> was recommended by James to prepare for that work. Since this is user interface this
>>> work is done early to ensure user interface is compatible with that upcoming
>>> support. Also set folks at ease that IS_ENABLED() works as expected with a
>>> non-existing config.
>>
>> How about this?
>>
>> Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to support Arm64.
>>
>> On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined. On Arm64, it will be
>> defined when the "mbm_cntr_assign" mode is supported.
>>
>> Add an IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check early to ensure the
>> user interface remains compatible with upcoming Arm64 support.
>> IS_ENABLED() safely evaluates to 0 when the configuration is not defined.
>>
>> As a result, for MPAM, the file would be either:
>> [default]
>> or
>> [mbm_cntr_assign]
>>
> 
> Sounds good to me.
> 

Thanks
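
For readers wondering why an option that appears in no Kconfig file is
safe here: the check relies on a preprocessor trick that yields 1 only
when the symbol is defined to 1 and yields 0 otherwise. The sketch below
is modelled on include/linux/kconfig.h, but the helper names are renamed
for userspace, the module (=m) case is left out, and
CONFIG_SKETCH_OPTION is a hypothetical option added for contrast.

```c
#include <assert.h>

/* Expands to "0," only when pasted with the token 1. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
/* Defined option: (0, 1, 0, 0) -> 1. Undefined: (<junk> 1, 0, 0) -> 0. */
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED(option) IS_DEFINED_2(option)

#define CONFIG_SKETCH_OPTION 1	/* hypothetical option that is enabled */
/* CONFIG_RESCTRL_ASSIGN_FIXED is deliberately left undefined. */

static_assert(IS_DEFINED(CONFIG_SKETCH_OPTION) == 1,
	      "a defined option evaluates to 1");
static_assert(IS_DEFINED(CONFIG_RESCTRL_ASSIGN_FIXED) == 0,
	      "an undefined option evaluates to 0");

/* Usable in ordinary C conditions, so both branches are still compiled. */
static int assign_mode_is_fixed(void)
{
	return IS_DEFINED(CONFIG_RESCTRL_ASSIGN_FIXED);
}
```

Because the result is an ordinary constant expression, the compiler
still parses the guarded branch and then drops it as dead code, unlike
an #ifdef block.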

>>
>>>
>>>
>>>> [default]
>>>> or
>>>> [mbm_cntr_assign]
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>> v12: Minor text update in change log and user documentation.
>>>>       Added the check CONFIG_RESCTRL_ASSIGN_FIXED to take care of arm platforms.
>>>>       This will be defined only in arm and not in x86.
>>>>
>>>> v11: Renamed rdtgroup_mbm_assign_mode_show() to resctrl_mbm_assign_mode_show().
>>>>       Removed few texts in resctrl.rst about AMD specific information.
>>>>       Updated few texts.
>>>>
>>>> v10: Added few more text to user documentation clarify on the default mode.
>>>>
>>>> v9: Updated user documentation based on comments.
>>>>
>>>> v8: Commit message update.
>>>>
>>>> v7: Updated the descriptions/commit log in resctrl.rst to generic text.
>>>>      Thanks to James and Reinette.
>>>>      Rename mbm_mode to mbm_assign_mode.
>>>>      Introduced mutex lock in rdtgroup_mbm_mode_show().
>>>>
>>>> v6: Added documentation for mbm_cntr_assign and legacy mode.
>>>>      Moved mbm_mode fflags initialization to static initialization.
>>>>
>>>> v5: Changed interface name to mbm_mode.
>>>>      It will be always available even if ABMC feature is not supported.
>>>>      Added description in resctrl.rst about ABMC mode.
>>>>      Fixed display of abmc and legacy consistently.
>>>>
>>>> v4: Fixed the checks for legacy and abmc mode. Default is ABMC.
>>>>
>>>> v3: New patch to display ABMC capability.
>>>>
>>>> ---
>>>>   Documentation/arch/x86/resctrl.rst     | 27 +++++++++++++++++++
>>>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 37 ++++++++++++++++++++++++++
>>>>   2 files changed, 64 insertions(+)
>>>>
>>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>>> index fb90f08e564e..bb96b44019fe 100644
>>>> --- a/Documentation/arch/x86/resctrl.rst
>>>> +++ b/Documentation/arch/x86/resctrl.rst
>>>> @@ -257,6 +257,33 @@ with the following files:
>>>>   	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>>>   	    0=0x30;1=0x30;3=0x15;4=0x15
>>>>   
>>>> +"mbm_assign_mode":
>>>> +	Reports the list of monitoring modes supported. The enclosed brackets
>>>> +	indicate which mode is enabled.
>>>> +	::
>>>> +
>>>> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>>>> +	  [mbm_cntr_assign]
>>>> +	  default
>>>> +
>>>> +	"mbm_cntr_assign":
>>>> +
>>>> +	In mbm_cntr_assign mode, a monitoring event can only accumulate data
>>>> +	while it is backed by a hardware counter. The user-space is able to
>>>> +	specify which of the events in CTRL_MON or MON groups should have a
>>>> +	counter assigned using the "mbm_assign_control" file. The number of
>>>
>>> "mbm_assign_control" no longer exists.
>>
>> The user-space is able to specify which of the events in CTRL_MON or MON
>> groups should have a counter assigned using the "mbm_L3_assignments"
>> interface file in each resctrl group.
> 
> I think it can be assumed the reader represents the user space. If doing so
> this can be simplified like:
> 
> 	Use "mbm_L3_assignments" found in each CTRL_MON and MON group to
> 	specify which of the events should have a counter assigned.
> 

Sure.

>>
>>>
>>>> +	counters available is described in the "num_mbm_cntrs" file. Changing
>>>> +	the mode may cause all counters on the resource to reset.
>>>> +
>>>> +	"default":
>>>> +
>>>> +	In default mode, resctrl assumes there is a hardware counter for each
>>>> +	event within every CTRL_MON and MON group. On AMD platforms, it is
>>>> +	recommended to use the mbm_cntr_assign mode, if supported, to prevent
>>>> +	the hardware from resetting counters between reads. This can result in
>>>
>>> "from resetting counters" -> "from re-allocating counters"?
>>
>> How about?
>>
>> "from resetting MBM events between reads"
> 
> With more detail, how about:
> 
>   ", to prevent reset of MBM events between reads resulting from hardware re-allocating counters"?

Yes.

> 
>>>>   /*
>>>> @@ -1908,6 +1938,13 @@ static struct rftype res_common_files[] = {
>>>>   		.seq_show	= mbm_local_bytes_config_show,
>>>>   		.write		= mbm_local_bytes_config_write,
>>>>   	},
>>>> +	{
>>>> +		.name		= "mbm_assign_mode",
>>>> +		.mode		= 0444,
>>>> +		.kf_ops		= &rdtgroup_kf_single_ops,
>>>> +		.seq_show	= resctrl_mbm_assign_mode_show,
>>>> +		.fflags		= RFTYPE_MON_INFO,
>>>
>>> Needs a RFTYPE_RES_CACHE?
>>
>> I am not sure about this. This flag is added to the files in info/L3.
>>
>> "mbm_assign_mode" goes in info/L3_MON/.
>>
>> The files in L3_MON do not have this flag set (for example,
>> mon_features, num_rmids).
>>
> 
> My assumption is that mon_features and num_rmids are generic monitoring
> files that should be supported by all resources that support monitoring. When
> resctrl starts to handle resource specific information then it should be
> clear what type or resource it applies to. I understand that this may not
> seem obvious since resctrl only supports monitoring on L3 resource.
> 
> Another view, consider existing code in resctrl_mon_resource_init() where
> the MBM configuration files are made specific to RFTYPE_RES_CACHE. I see
> mbm_assign_mode to be very similar to these.
>

Ok. Sure. Will do.

thanks
Babu


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment
  2025-04-15 16:30       ` Reinette Chatre
@ 2025-04-16 15:43         ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 15:43 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/15/25 11:30, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/14/25 1:30 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/11/25 16:01, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>> The ABMC feature provides an option to the user to assign a hardware
>>>> counter to an RMID, event pair and monitor the bandwidth as long as the
>>>> counter is assigned. The bandwidth events will be tracked by the hardware
>>>> until the user changes the configuration. Each resctrl group can configure
>>>> maximum two counters, one for total event and one for local event.
>>>>
>>>> The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
>>>> Configuration is done by setting the counter id, bandwidth source (RMID)
>>>> and bandwidth configuration supported by BMEC (Bandwidth Monitoring Event
>>>> Configuration).
>>>
>>> Apart from the BMEC optimization in patch #1 and patch #2 this is the
>>> first and only mention of BMEC dependency I see in this series while I do
>>> not see implementation support for this. What am I missing?
>>>
>>
>> My mistake. I should have corrected it.  How about this?
>>
>> "The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh).
>> ABMC counter assignment is done by setting the counter id, bandwidth
>> source (RMID) and bandwidth configuration. Users will have the option to
>> change the bandwidth configuration using resctrl interface which will be
>> introduced later in the series."
>>
> 
> Please just stick to what this patch does. The part starting with "Users will ..."
> can cause confusion. To support what bandwidth configuration means the description
> can point to existing definitions in include/linux/resctrl_types.h without needing
> to mention BMEC.
> 
Sure. Sounds good.
-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  2025-04-15 16:38       ` Reinette Chatre
@ 2025-04-16 15:51         ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 15:51 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/15/25 11:38, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/14/25 1:51 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/11/25 16:02, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/3/25 5:18 PM, Babu Moger wrote:
> 
>>>> @@ -394,6 +394,21 @@ void resctrl_arch_mon_event_config_set(void *config_info);
>>>>  u32 resctrl_arch_mon_event_config_get(struct rdt_mon_domain *d,
>>>>  				      enum resctrl_event_id eventid);
>>>>  
>>>> +/**
>>>> + * resctrl_arch_config_cntr() - Configure the counter on the domain
>>>> + * @r:			resource that the counter should be read from.
>>>> + * @d:			domain that the counter should be read from.
>>>> + * @evtid:		event type to assign
>>>> + * @rmid:		rmid of the counter to read.
>>>> + * @closid:		closid that matches the rmid.
>>>> + * @cntr_id:		Counter ID to configure
>>>> + * @evt_cfg:		event configuration
>>>
>>> "event configuration" is simply an expansion of member name and does not help to
>>> understand what the value represents.
>>
>> How about?
>>
>> "MBM Event configuration value representing reads, writes etc.."
> 
> This is more helpful (note Event -> event). When data structures are decided it
> will also be helpful to include reference of where this data is maintained and
> how it is formatted.

Sure.
-- 
Thanks
Babu Moger
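
To make the "reads, writes etc." wording above concrete, here is a minimal
userspace sketch of an event configuration bitmask. It assumes the seven
bandwidth-source bits listed for the mbm_*_config files in resctrl.rst. The
EX_* macro names and the ex_* helpers are illustrative only, not the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; bit positions follow the mbm_*_config table in resctrl.rst. */
#define EX_READS_TO_LOCAL_MEM		(1u << 0)
#define EX_READS_TO_REMOTE_MEM		(1u << 1)
#define EX_NT_WRITES_TO_LOCAL_MEM	(1u << 2)
#define EX_NT_WRITES_TO_REMOTE_MEM	(1u << 3)
#define EX_READS_TO_LOCAL_SLOW_MEM	(1u << 4)
#define EX_READS_TO_REMOTE_SLOW_MEM	(1u << 5)
#define EX_DIRTY_VICTIMS_TO_ALL_MEM	(1u << 6)

/* All bandwidth sources known to this sketch. */
#define EX_ALL_SOURCES			0x7fu

/* Sketch-level rule (an assumption): a value is valid if it sets no unknown bits. */
static bool ex_evt_cfg_valid(uint32_t evt_cfg)
{
	return (evt_cfg & ~EX_ALL_SOURCES) == 0;
}

/* "Total" tracks every source. */
static uint32_t ex_default_total_cfg(void)
{
	return EX_ALL_SOURCES;
}

/* "Local" tracks only the sources that target local memory. */
static uint32_t ex_default_local_cfg(void)
{
	return EX_READS_TO_LOCAL_MEM | EX_NT_WRITES_TO_LOCAL_MEM |
	       EX_READS_TO_LOCAL_SLOW_MEM;
}
```

With these bits the total default works out to 0x7f and the local default to
0x15, which match the defaults documented for mbm_total_bytes_config and
mbm_local_bytes_config.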


* Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-15 19:43         ` Moger, Babu
@ 2025-04-16 16:08           ` Reinette Chatre
  2025-04-17 14:27             ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-16 16:08 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/15/25 12:43 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/15/25 11:09, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/14/25 10:48 AM, Moger, Babu wrote:
>>
>>> Here is my proposal to handle this case. This can be separate patch.
>>>
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index d10cf1e5b914..772f2f77faee 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct
>>> kernfs_open_file *of,
>>>
>>>         list_for_each_entry(mevt, &r->mon.evt_list, list) {
>>>                 seq_printf(seq, "%s\n", mevt->name);
>>> -               if (mevt->configurable)
>>> +               if (mevt->configurable &&
>>> !resctrl_arch_mbm_cntr_assign_enabled(r))
>>>                         seq_printf(seq, "%s_config\n", mevt->name);
>>>         }
>>>
>>> @@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s,
>>> struct rdt_resource *r, u32 evtid
>>>         cpus_read_lock();
>>>         mutex_lock(&rdtgroup_mutex);
>>>
>>> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
>>> with mbm_cntr_assign mode\n");
>>> +               return -EINVAL;
>>> +       }
>>> +
>>>         list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>>                 if (sep)
>>>                         seq_puts(s, ";");
>>> @@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s,
>>> struct rdt_resource *r, u32 evtid
>>>  static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
>>>                                        struct seq_file *seq, void *v)
>>>  {
>>> +       int ret;
>>>         struct rdt_resource *r = of->kn->parent->priv;
>>>
>>> -       mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>>>
>>> -       return 0;
>>> +       return ret;
>>>  }
>>>
>>>  static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
>>>                                        struct seq_file *seq, void *v)
>>>  {
>>> +       int ret;
>>> +
>>>         struct rdt_resource *r = of->kn->parent->priv;
>>>
>>> -       mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
>>>
>>> -       return 0;
>>> +       return ret;
>>>  }
>>>
>>>  static void mbm_config_write_domain(struct rdt_resource *r,
>>> @@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r,
>>> char *tok, u32 evtid)
>>>         /* Walking r->domains, ensure it can't race with cpuhp */
>>>         lockdep_assert_cpus_held();
>>>
>>> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
>>> with mbm_cntr_assign mode\n");
>>> +               return -EINVAL;
>>> +       }
>>> +
>>>  next:
>>>         if (!tok || tok[0] == '\0')
>>>                 return 0;
>>>
>>
>> Instead of chasing every call that may involve BMEC I think it will be simpler to
>> disable BMEC support during initialization when ABMC is detected. Specifically,
>> on systems that support both BMEC and ABMC rdt_cpu_has(X86_FEATURE_BMEC) returns
>> false. 
> 
> There is one problem with this approach. Users have the option to switch
> between the assignment modes. System will boot with ABMC by default if
> supported. But, users can switch to 'default' mode after the boot. By
> disabling the BMEC completely, it will not be possible to do that.

Good point. Thank you. Another option is to hide (see kernfs_show()) mbm_total_bytes_config
and mbm_local_bytes_config when ABMC is enabled. To me this seems like a clear
interface to user space: when user space changes the mode, the interface changes
to reflect the new mode.

> 
>>
>> I would also like to consider enhancing mevt->configurable to handle all different
>> ways in which events can be configured. For example, making mevt->configurable an
>> enum that captures how event can be configured instead of keeping mevt->configurable
>> a boolean for BMEC support and handling ABMC completely separately. I hope this
>> may become clearer when using struct mon_evt for ABMC also.
> 
> Sure. I can try that.

Thank you.

Reinette
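
The two ideas above (hiding the *_config files while ABMC is enabled, and
turning mevt->configurable into an enum) can be modelled in a few lines. This
is a userspace sketch of the decision only. All ex_* names are hypothetical,
and in the kernel the actual show/hide step would go through kernfs_show() as
suggested above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical replacement for a boolean "configurable" flag. */
enum ex_evt_cfg_type {
	EX_CFG_NONE,	/* event cannot be configured */
	EX_CFG_BMEC,	/* event is configured through the <event>_config files */
};

/* Hypothetical model of the two MBM assignment modes. */
enum ex_mbm_mode {
	EX_MODE_DEFAULT,
	EX_MODE_CNTR_ASSIGN,
};

struct ex_mon_evt {
	const char *name;
	enum ex_evt_cfg_type cfg;
};

/*
 * Decide whether "<name>_config" should be visible. The file exists only
 * for BMEC-configurable events and is hidden while counter assignment
 * mode is active.
 */
static bool ex_config_file_visible(const struct ex_mon_evt *evt,
				   enum ex_mbm_mode mode)
{
	if (evt->cfg != EX_CFG_BMEC)
		return false;

	return mode == EX_MODE_DEFAULT;
}
```

Because the decision depends on the current mode rather than on a boot-time
feature flag, switching back to 'default' mode makes the files visible again,
which addresses the concern about disabling BMEC completely.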



* Re: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-15 18:48     ` Moger, Babu
  2025-04-15 19:25       ` Luck, Tony
@ 2025-04-16 16:18       ` Reinette Chatre
  2025-04-16 17:27         ` Moger, Babu
  1 sibling, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-16 16:18 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/15/25 11:48 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/11/25 16:44, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>> By default, each resctrl group supports two MBM events: mbm_total_bytes
>>> and mbm_local_bytes. To maintain the same level of support, two default
>>> MBM configurations are added. These configurations will initially be used
>>> to set up the counters upon mounting, while users will have the option to
>>> modify them as needed.
>>
>> This jumps in quite fast by stating that MBM configurations are added but
>> there is no definition of what an MBM configuration is.
> 
> How about this?
> 
> 
> By default, each resctrl group supports two MBM events: mbm_total_bytes
> and mbm_local_bytes. These represent total and local memory bandwidth
> monitoring, respectively. Each event corresponds to a specific MBM
> configuration. Use these default configurations to set up the counters
> during mount. Allow users to modify the configurations as needed after
> initialization.

I am still missing a definition of "MBM configuration". How about:

"Each event corresponds to a specific MBM configuration." -> "Each event
corresponds to an MBM configuration that specifies the bandwidth sources
tracked by the event."

...

>>
>> There seems more and more overlap with Tony's RMID work. Did you get a
>> chance to look at that?
> 
> Looked little bit. Will have look bit closer again.

I'll study that series next to catch up with Tony's plans.

Reinette


* Re: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-15 19:25       ` Luck, Tony
@ 2025-04-16 16:21         ` Reinette Chatre
  2025-04-16 17:26           ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-16 16:21 UTC (permalink / raw)
  To: Luck, Tony, babu.moger@amd.com, peternewman@google.com
  Cc: corbet@lwn.net, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, paulmck@kernel.org, akpm@linux-foundation.org,
	thuth@redhat.com, rostedt@goodmis.org, ardb@kernel.org,
	gregkh@linuxfoundation.org, daniel.sneddon@linux.intel.com,
	jpoimboe@kernel.org, alexandre.chartre@oracle.com,
	pawan.kumar.gupta@linux.intel.com, thomas.lendacky@amd.com,
	perry.yuan@amd.com, seanjc@google.com, Huang, Kai, Li, Xiaoyao,
	kan.liang@linux.intel.com, Li, Xin3, ebiggers@google.com,
	xin@zytor.com, Mehta, Sohil, andrew.cooper3@citrix.com,
	mario.limonciello@amd.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Wieczor-Retman, Maciej,
	Eranian, Stephane

Hi Tony,

On 4/15/25 12:25 PM, Luck, Tony wrote:
>> By default, each resctrl group supports two MBM events: mbm_total_bytes
>> and mbm_local_bytes. These represent total and local memory bandwidth
>> monitoring, respectively. Each event corresponds to a specific MBM
>> configuration. Use these default configurations to set up the counters
>> during mount. Allow users to modify the configurations as needed after
>> initialization.
> 
> I think an update to this part of the resctrl.rst documentation is somewhat
> overdue:
> 
>         In a MON group these files provide a read out of the current
>         value of the event for all tasks in the group. In CTRL_MON groups
>         these files provide the sum for all tasks in the CTRL_MON group
>         and all tasks in MON groups. Please see example section for more
>         details on usage.
> 
> The sentence about CTRL_MON groups providing the sum for all tasks
> in the child MON groups is only true if counters are assigned to all of
> those MON groups. What mon_event_count() actually does is to
> return success if any of the CTRL_MON or child MON groups succeeded
> with the count being the sum of all the successes.

Thanks for catching this. This would be important to highlight so that
user space does not have the impression that events of a CTRL_MON group can be
used as an estimate for MON groups that do not have counters assigned.

Reinette
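
The behaviour Tony describes (succeed if any group could be read, and report
the sum of only the successful reads) can be sketched in userspace as follows.
The struct and function names are hypothetical, and returning the first error
when every read fails is an assumption of this sketch, not a statement about
mon_event_count().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of reading one group's counter: err == 0 means success. */
struct ex_group_read {
	int err;
	uint64_t val;
};

/*
 * Sum the reads of a CTRL_MON group and its child MON groups.
 * Returns 0 if at least one read succeeded; *sum then holds the sum of
 * the successful reads only. Otherwise returns the first error seen
 * (or -EINVAL for an empty list) and leaves *sum untouched.
 */
static int ex_sum_group_reads(const struct ex_group_read *reads, size_t n,
			      uint64_t *sum)
{
	uint64_t total = 0;
	int first_err = -EINVAL;
	bool seen_err = false;
	bool any_ok = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (reads[i].err) {
			if (!seen_err) {
				first_err = reads[i].err;
				seen_err = true;
			}
			continue;
		}
		total += reads[i].val;
		any_ok = true;
	}

	if (!any_ok)
		return first_err;

	*sum = total;
	return 0;
}
```

A group without an assigned counter contributes nothing and does not turn the
result into an error, so the reported sum can silently undercount. That is
the case the documentation update needs to call out.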


* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-15 16:53       ` Reinette Chatre
@ 2025-04-16 17:09         ` Moger, Babu
  2025-04-16 17:55           ` Luck, Tony
  2025-04-16 19:02           ` Reinette Chatre
  0 siblings, 2 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 17:09 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/15/25 11:53, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/15/25 7:20 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/11/25 16:04, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>>>> can be assigned to an RMID, event pair and monitor the bandwidth as long
>>>> as it is assigned.
>>>
>>> Above makes it sound as though multiple counters can be assigned to
>>> an RMID, event pair.
>>>
>>
>> Yes. Multiple counter-ids can be assigned to RMID, event pair.
> 
> oh, are you referring to the assignments of different counters across multiple
> domains?

Maybe I am confusing you here. This is what I meant.

Here is one example.

In the same group,
  Configure cntr_id 0 to count reads only (this maps to the total event).
  Configure cntr_id 1 to count writes only (this maps to the local event).
  Configure cntr_id 2 to count dirty victims.
  and so on..
  Configure cntr_id 31 to count remote reads only.

We have 32 counter ids in a domain. Basically, we can assign all the
counters in a domain to just one group if we want to.

We cannot do that right now because our data structures do not support it.
We can only configure 2 events (local and total) right now.

My understanding is that it is the same with MPAM.

> 
>>
>>>>
>>>> Add the functionality to allocate and assign the counters to RMID, event
>>>> pair in the domain.
>>>
>>> "assign *a* counter to an RMID, event pair"?
>>
>> Sure.
>>
>>>
>>>>
>>>> If all the counters are in use, the kernel will log the error message
>>>> "Unable to allocate counter in domain" in /sys/fs/resctrl/info/
>>>> last_cmd_status when a new assignment is requested. Exit on the first
>>>> failure when assigning counters across all the domains.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>
>>> ...
>>>
>>>> ---
>>>>  arch/x86/kernel/cpu/resctrl/internal.h |   2 +
>>>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 124 +++++++++++++++++++++++++
>>>>  2 files changed, 126 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>>>> index 0b73ec451d2c..1a8ac511241a 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>>>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>>>> @@ -574,6 +574,8 @@ bool closid_allocated(unsigned int closid);
>>>>  int resctrl_find_cleanest_closid(void);
>>>>  void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
>>>>  unsigned int mon_event_config_index_get(u32 evtid);
>>>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>>>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
>>>
>>> This is internal to resctrl fs. Why is it needed to provide both the event id
>>> and the event configuration? Event configuration can be determined from event ID?
>>
>> Yes. It can be done. Then I have to export the functions like
>> mbm_get_assign_config() into monitor.c. To avoid that I passed it from
>> here which I felt much more cleaner.
> 
> From what I can tell, for example by looking at patch #22, callers of
> resctrl_assign_cntr_event() now need to call mbm_get_assign_config()
> every time before calling resctrl_assign_cntr_event(). Calling
> mbm_get_assign_config() from within resctrl_assign_cntr_event() seems
> simpler to me and that may result in mbm_get_assign_config() moving to 
> monitor.c as an extra benefit.

Sure.

> 
> ...
> 
>>>> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>>>> +			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>>>> +{
>>>> +	int cntr_id;
>>>> +
>>>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>>>> +		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
>>>> +		    d->cntr_cfg[cntr_id].evtid == evtid)
>>>> +			return cntr_id;
>>>> +	}
>>>> +
>>>> +	return -ENOENT;
>>>> +}
>>>> +
>>>> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
>>>> +			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>>>> +{
>>>> +	int cntr_id;
>>>> +
>>>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>>>> +		if (!d->cntr_cfg[cntr_id].rdtgrp) {
>>>> +			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
>>>> +			d->cntr_cfg[cntr_id].evtid = evtid;
>>>> +			return cntr_id;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	return -ENOSPC;
>>>> +}
>>>> +
>>>> +static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
>>>> +{
>>>> +	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
>>>> +}
>>>> +
>>>> +/*
>>>> + * Allocate a fresh counter and configure the event if not assigned already.
>>>> + */
>>>> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>>>> +				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
>>>> +				     u32 evt_cfg)
>>>
>>> Same here, why are both evtid and evt_cfg provided as arguments? 
>>
>> Yes. It can be done. Then I have to export the functions like
>> mbm_get_assign_config() into monitor.c. To avoid that I passed it from
>> here which I felt much more cleaner.
> 
> Maybe even resctrl_assign_cntr_event() does not need to call mbm_get_assign_config()
> but only resctrl_alloc_config_cntr() needs to call mbm_get_assign_config(). Doing so
> may avoid more burden on callers while reducing parameters needed throughout.
> 

ok. Sure. Will do.

-- 
Thanks
Babu Moger
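
The direction agreed above (callers pass only the event id and the
configuration is looked up internally) can be sketched in userspace like
this. It mirrors the get/alloc/free helpers quoted earlier, but every ex_*
name, the 4-entry table, the default configuration values, and the choice to
return the counter id are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define EX_NUM_CNTRS 4	/* small table so exhaustion is easy to show */

enum ex_event_id { EX_EVT_TOTAL, EX_EVT_LOCAL };

struct ex_group { int id; };

struct ex_cntr_cfg {
	const struct ex_group *grp;	/* NULL means the counter is free */
	enum ex_event_id evtid;
	uint32_t evt_cfg;
};

struct ex_domain {
	struct ex_cntr_cfg cntr_cfg[EX_NUM_CNTRS];
};

/* Stand-in for the lookup the callers no longer have to do themselves. */
static uint32_t ex_get_assign_config(enum ex_event_id evtid)
{
	return evtid == EX_EVT_TOTAL ? 0x7f : 0x15;
}

/* Find the counter already assigned to this group, event pair. */
static int ex_cntr_get(const struct ex_domain *d, const struct ex_group *grp,
		       enum ex_event_id evtid)
{
	int cntr_id;

	for (cntr_id = 0; cntr_id < EX_NUM_CNTRS; cntr_id++) {
		if (d->cntr_cfg[cntr_id].grp == grp &&
		    d->cntr_cfg[cntr_id].evtid == evtid)
			return cntr_id;
	}
	return -ENOENT;
}

/* Claim the first free counter for this group, event pair. */
static int ex_cntr_alloc(struct ex_domain *d, const struct ex_group *grp,
			 enum ex_event_id evtid)
{
	int cntr_id;

	for (cntr_id = 0; cntr_id < EX_NUM_CNTRS; cntr_id++) {
		if (!d->cntr_cfg[cntr_id].grp) {
			d->cntr_cfg[cntr_id].grp = grp;
			d->cntr_cfg[cntr_id].evtid = evtid;
			return cntr_id;
		}
	}
	return -ENOSPC;
}

static void ex_cntr_free(struct ex_domain *d, int cntr_id)
{
	memset(&d->cntr_cfg[cntr_id], 0, sizeof(d->cntr_cfg[cntr_id]));
}

/* Reuse an existing assignment, else allocate and configure a counter. */
static int ex_assign_cntr_event(struct ex_domain *d, const struct ex_group *grp,
				enum ex_event_id evtid)
{
	int cntr_id = ex_cntr_get(d, grp, evtid);

	if (cntr_id >= 0)
		return cntr_id;

	cntr_id = ex_cntr_alloc(d, grp, evtid);
	if (cntr_id < 0)
		return cntr_id;

	d->cntr_cfg[cntr_id].evt_cfg = ex_get_assign_config(evtid);
	return cntr_id;
}
```

Keeping the lookup inside the assignment helper means no caller has to fetch
the configuration first, which is the reduction in parameters discussed above.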


* Re: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-16 16:21         ` Reinette Chatre
@ 2025-04-16 17:26           ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 17:26 UTC (permalink / raw)
  To: Reinette Chatre, Luck, Tony, peternewman@google.com
  Cc: corbet@lwn.net, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, paulmck@kernel.org, akpm@linux-foundation.org,
	thuth@redhat.com, rostedt@goodmis.org, ardb@kernel.org,
	gregkh@linuxfoundation.org, daniel.sneddon@linux.intel.com,
	jpoimboe@kernel.org, alexandre.chartre@oracle.com,
	pawan.kumar.gupta@linux.intel.com, thomas.lendacky@amd.com,
	perry.yuan@amd.com, seanjc@google.com, Huang, Kai, Li, Xiaoyao,
	kan.liang@linux.intel.com, Li, Xin3, ebiggers@google.com,
	xin@zytor.com, Mehta, Sohil, andrew.cooper3@citrix.com,
	mario.limonciello@amd.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Wieczor-Retman, Maciej,
	Eranian, Stephane

Hi Tony/Reinette,

On 4/16/25 11:21, Reinette Chatre wrote:
> Hi Tony,
> 
> On 4/15/25 12:25 PM, Luck, Tony wrote:
>>> By default, each resctrl group supports two MBM events: mbm_total_bytes
>>> and mbm_local_bytes. These represent total and local memory bandwidth
>>> monitoring, respectively. Each event corresponds to a specific MBM
>>> configuration. Use these default configurations to set up the counters
>>> during mount. Allow users to modify the configurations as needed after
>>> initialization.
>>
>> I think an update to this part of the resctrl.rst documentation is somewhat
>> overdue:
>>
>>         In a MON group these files provide a read out of the current
>>         value of the event for all tasks in the group. In CTRL_MON groups
>>         these files provide the sum for all tasks in the CTRL_MON group
>>         and all tasks in MON groups. Please see example section for more
>>         details on usage.
>>
>> The sentence about CTRL_MON groups providing the sum for all tasks
>> in the child MON groups is only true if counters are assigned to all of
>> those MON groups. What mon_event_count() actually does is to
>> return success if any of the CTRL_MON or child MON groups succeeded
>> with the count being the sum of all the successes.
> 
> Thanks for catching this. This would be important to highlight so that
> user space does not have impression that events of CTRL_MON can be
> used as estimate for MON groups that do not have counters assigned.
> 

Sure. Will add text about it.

-- 
Thanks
Babu Moger


* Re: [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode
  2025-04-16 16:18       ` Reinette Chatre
@ 2025-04-16 17:27         ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 17:27 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/16/25 11:18, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/15/25 11:48 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/11/25 16:44, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>> By default, each resctrl group supports two MBM events: mbm_total_bytes
>>>> and mbm_local_bytes. To maintain the same level of support, two default
>>>> MBM configurations are added. These configurations will initially be used
>>>> to set up the counters upon mounting, while users will have the option to
>>>> modify them as needed.
>>>
>>> This jumps in quite fast by stating that MBM configurations are added but
>>> there is no definition of what an MBM configuration is.
>>
>> How about this?
>>
>>
>> By default, each resctrl group supports two MBM events: mbm_total_bytes
>> and mbm_local_bytes. These represent total and local memory bandwidth
>> monitoring, respectively. Each event corresponds to a specific MBM
>> configuration. Use these default configurations to set up the counters
>> during mount. Allow users to modify the configurations as needed after
>> initialization.
> 
> I am still missing a definition of "MBM configuration". How about:
> 
> "Each event corresponds to a specific MBM configuration." -> "Each event
> corresponds to an MBM configuration that specifies the bandwidth sources
> tracked by the event."

Sure.

> 
> ...
> 
>>>
>>> There seems more and more overlap with Tony's RMID work. Did you get a
>>> chance to look at that?
>>
>> Looked little bit. Will have look bit closer again.
> 
> I'll study that series next to catch up with Tony's plans.
> 
> Reinette
> 

-- 
Thanks
Babu Moger


* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-16 17:09         ` Moger, Babu
@ 2025-04-16 17:55           ` Luck, Tony
  2025-04-16 18:17             ` Moger, Babu
  2025-04-16 19:02           ` Reinette Chatre
  1 sibling, 1 reply; 80+ messages in thread
From: Luck, Tony @ 2025-04-16 17:55 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Reinette Chatre, peternewman, corbet, tglx, mingo, bp,
	dave.hansen, x86, hpa, paulmck, akpm, thuth, rostedt, ardb,
	gregkh, daniel.sneddon, jpoimboe, alexandre.chartre,
	pawan.kumar.gupta, thomas.lendacky, perry.yuan, seanjc, kai.huang,
	xiaoyao.li, kan.liang, xin3.li, ebiggers, xin, sohil.mehta,
	andrew.cooper3, mario.limonciello, linux-doc, linux-kernel,
	maciej.wieczor-retman, eranian

On Wed, Apr 16, 2025 at 12:09:52PM -0500, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/15/25 11:53, Reinette Chatre wrote:
> > Hi Babu,
> > 
> > On 4/15/25 7:20 AM, Moger, Babu wrote:
> >> Hi Reinette,
> >>
> >> On 4/11/25 16:04, Reinette Chatre wrote:
> >>> Hi Babu,
> >>>
> >>> On 4/3/25 5:18 PM, Babu Moger wrote:
> >>>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
> >>>> can be assigned to an RMID, event pair and monitor the bandwidth as long
> >>>> as it is assigned.
> >>>
> >>> Above makes it sound as though multiple counters can be assigned to
> >>> an RMID, event pair.
> >>>
> >>
> >> Yes. Multiple counter-ids can be assigned to RMID, event pair.
> > 
> > oh, are you referring to the assignments of different counters across multiple
> > domains?
> 
> May be I am confusing you here. This is what I meant.
> 
> Here is one example.
> 
> In a same group,
>   Configure cntr_id 0, to count reads only (This maps to total event).
>   Configure cntr_id 1, to count write only (This maps to local event).
>   Configure cntr_id 2, to count dirty victims.
>   so on..
>   so on..
>   Configure cntr_id 31, to count remote read only.
> 
> We have 32 counter ids in a domain. Basically, we can configure all the
> counters in a domain to just one group if you want to.
> 
> We cannot do that right now because our data structures cannot do that.
> We can only configure 2 events(local and total) right now.

Not just data structures, but also user visible files in
mon_data/mon_L3*/*

You'd need to create a new file for each counter.

My patch for making it easier to add more counters:

https://lore.kernel.org/all/20250407234032.241215-3-tony.luck@intel.com/

may help ... though you have to pick the number of simultaneous counters
at compile time to size the arrays in the domain structures:

	struct mbm_state	*mbm_states[QOS_NUM_MBM_EVENTS];

and if you are dynamically adding/removing events using the
configuration files, you need to alloc/free the memory that those
arrays of pointers reference ... as well as add/remove files
from the appropriate mon_data/mon_L3* directory.

> My understanding it is same with MPAM also.
> 
> > 
> >>
> >>>>
> >>>> Add the functionality to allocate and assign the counters to RMID, event
> >>>> pair in the domain.
> >>>
> >>> "assign *a* counter to an RMID, event pair"?
> >>
> >> Sure.
> >>
> >>>
> >>>>
> >>>> If all the counters are in use, the kernel will log the error message
> >>>> "Unable to allocate counter in domain" in /sys/fs/resctrl/info/
> >>>> last_cmd_status when a new assignment is requested. Exit on the first
> >>>> failure when assigning counters across all the domains.
> >>>>
> >>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >>>> ---
> >>>
> >>> ...
> >>>
> >>>> ---
> >>>>  arch/x86/kernel/cpu/resctrl/internal.h |   2 +
> >>>>  arch/x86/kernel/cpu/resctrl/monitor.c  | 124 +++++++++++++++++++++++++
> >>>>  2 files changed, 126 insertions(+)
> >>>>
> >>>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> >>>> index 0b73ec451d2c..1a8ac511241a 100644
> >>>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> >>>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> >>>> @@ -574,6 +574,8 @@ bool closid_allocated(unsigned int closid);
> >>>>  int resctrl_find_cleanest_closid(void);
> >>>>  void arch_mbm_evt_config_init(struct rdt_hw_mon_domain *hw_dom);
> >>>>  unsigned int mon_event_config_index_get(u32 evtid);
> >>>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> >>>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
> >>>
> >>> This is internal to resctrl fs. Why is it needed to provide both the event id
> >>> and the event configuration? Event configuration can be determined from event ID?
> >>
> >> Yes. It can be done. Then I have to export the functions like
> >> mbm_get_assign_config() into monitor.c. To avoid that I passed it from
> >> here which I felt much more cleaner.
> > 
> > From what I can tell, for example by looking at patch #22, callers of
> > resctrl_assign_cntr_event() now need to call mbm_get_assign_config()
> > every time before calling resctrl_assign_cntr_event(). Calling
> > mbm_get_assign_config() from within resctrl_assign_cntr_event() seems
> > simpler to me and that may result in mbm_get_assign_config() moving to 
> > monitor.c as an extra benefit.
> 
> Sure.
> 
> > 
> > ...
> > 
> >>>> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> >>>> +			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> >>>> +{
> >>>> +	int cntr_id;
> >>>> +
> >>>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
> >>>> +		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
> >>>> +		    d->cntr_cfg[cntr_id].evtid == evtid)
> >>>> +			return cntr_id;
> >>>> +	}
> >>>> +
> >>>> +	return -ENOENT;
> >>>> +}
> >>>> +
> >>>> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
> >>>> +			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> >>>> +{
> >>>> +	int cntr_id;
> >>>> +
> >>>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
> >>>> +		if (!d->cntr_cfg[cntr_id].rdtgrp) {
> >>>> +			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
> >>>> +			d->cntr_cfg[cntr_id].evtid = evtid;
> >>>> +			return cntr_id;
> >>>> +		}
> >>>> +	}
> >>>> +
> >>>> +	return -ENOSPC;
> >>>> +}
> >>>> +
> >>>> +static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
> >>>> +{
> >>>> +	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
> >>>> +}
> >>>> +
> >>>> +/*
> >>>> + * Allocate a fresh counter and configure the event if not assigned already.
> >>>> + */
> >>>> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> >>>> +				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid,
> >>>> +				     u32 evt_cfg)
> >>>
> >>> Same here, why are both evtid and evt_cfg provided as arguments? 
> >>
> >> Yes. It can be done. Then I have to export the functions like
> >> mbm_get_assign_config() into monitor.c. To avoid that I passed it from
> >> here which I felt much more cleaner.
> > 
> > Maybe even resctrl_assign_cntr_event() does not need to call mbm_get_assign_config()
> > but only resctrl_alloc_config_cntr() needs to call mbm_get_assign_config(). Doing so
> > may avoid more burden on callers while reducing parameters needed throughout.
> > 
> 
> ok. Sure. Will do.
> 
> -- 
> Thanks
> Babu Moger

-Tony
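
The trade-off described above (an array of pointers sized at compile time,
with per-event state allocated and freed as events come and go) looks roughly
like this in userspace. The ex_* names, the field in ex_mbm_state and both
limits are hypothetical. Creating or removing the matching
mon_data/mon_L3* files is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_NUM_MBM_EVENTS 4	/* compile-time bound on simultaneous events */
#define EX_NUM_RMIDS 8		/* per-event state is indexed by RMID */

struct ex_mbm_state {
	uint64_t prev_bytes;
};

struct ex_domain {
	/* One slot per possible event; NULL while the event is not in use. */
	struct ex_mbm_state *mbm_states[EX_NUM_MBM_EVENTS];
};

/* Allocate zeroed per-RMID state when an event is added to the domain. */
static int ex_event_enable(struct ex_domain *d, unsigned int idx)
{
	if (idx >= EX_NUM_MBM_EVENTS)
		return -EINVAL;
	if (d->mbm_states[idx])
		return 0;	/* already enabled */

	d->mbm_states[idx] = calloc(EX_NUM_RMIDS, sizeof(*d->mbm_states[idx]));
	return d->mbm_states[idx] ? 0 : -ENOMEM;
}

/* Release the state when the event is removed again. */
static void ex_event_disable(struct ex_domain *d, unsigned int idx)
{
	if (idx >= EX_NUM_MBM_EVENTS)
		return;

	free(d->mbm_states[idx]);
	d->mbm_states[idx] = NULL;
}
```

The array itself never grows, so the compile-time bound caps how many events
can be live at once, while the memory behind each slot exists only while that
event is configured.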


* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-16 17:55           ` Luck, Tony
@ 2025-04-16 18:17             ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 18:17 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Reinette Chatre, peternewman, corbet, tglx, mingo, bp,
	dave.hansen, x86, hpa, paulmck, akpm, thuth, rostedt, ardb,
	gregkh, daniel.sneddon, jpoimboe, alexandre.chartre,
	pawan.kumar.gupta, thomas.lendacky, perry.yuan, seanjc, kai.huang,
	xiaoyao.li, kan.liang, xin3.li, ebiggers, xin, sohil.mehta,
	andrew.cooper3, mario.limonciello, linux-doc, linux-kernel,
	maciej.wieczor-retman, eranian

Hi Tony,

On 4/16/25 12:55, Luck, Tony wrote:
> On Wed, Apr 16, 2025 at 12:09:52PM -0500, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/15/25 11:53, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/15/25 7:20 AM, Moger, Babu wrote:
>>>> Hi Reinette,
>>>>
>>>> On 4/11/25 16:04, Reinette Chatre wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>>>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>>>>>> can be assigned to an RMID, event pair and monitor the bandwidth as long
>>>>>> as it is assigned.
>>>>>
>>>>> Above makes it sound as though multiple counters can be assigned to
>>>>> an RMID, event pair.
>>>>>
>>>>
>>>> Yes. Multiple counter-ids can be assigned to RMID, event pair.
>>>
>>> oh, are you referring to the assignments of different counters across multiple
>>> domains?
>>
>> May be I am confusing you here. This is what I meant.
>>
>> Here is one example.
>>
>> In a same group,
>>   Configure cntr_id 0, to count reads only (This maps to total event).
>>   Configure cntr_id 1, to count write only (This maps to local event).
>>   Configure cntr_id 2, to count dirty victims.
>>   so on..
>>   so on..
>>   Configure cntr_id 31, to count remote read only.
>>
>> We have 32 counter ids in a domain. Basically, we can configure all the
>> counters in a domain to just one group if you want to.
>>
>> We cannot do that right now because our data structures cannot do that.
>> We can only configure 2 events(local and total) right now.
> 
> Not just data structures, but also user visible files in
> mon_data/mon_L3*/*
> 
> You'd need to create a new file for each counter.

Yes. That is correct.

> 
> My patch for making it easier to add more counters:
> 
> https://lore.kernel.org/all/20250407234032.241215-3-tony.luck@intel.com/
> 
> may help ... though you have to pick the number of simultaneous counters
> at compile time to size the arrays in the domain structures:
> 
> 	struct mbm_state	*mbm_states[QOS_NUM_MBM_EVENTS];
> 
> and if you are dynamically adding/removing events using the
> configuration files, need to alloc/free the memory that those
> arrays of pointers reference ... as well as adding/removing files
> from the appropriate mon_data/mon_L3* directory.

Not just that. There is also the overflow handler, which has to keep all
these counters in a sane state. So it gets complicated pretty quickly. It is
probably best handled as a separate series.
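
For illustration only, a minimal user-space sketch of the layout Tony
describes above: a per-domain array of pointers sized at compile time, with
the per-event state allocated and freed as events are added or removed. The
names NUM_MBM_EVENTS, event_enable() and event_disable() and the fields of
struct mbm_state are assumptions made for this sketch, not the kernel's
definitions, and calloc()/free() stand in for the kernel allocators. File
creation under mon_data/mon_L3* and the overflow handler are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical compile-time cap, standing in for QOS_NUM_MBM_EVENTS. */
#define NUM_MBM_EVENTS 4

/* Hypothetical per-RMID state; the real struct mbm_state differs. */
struct mbm_state {
	uint64_t prev_count;
	uint64_t chunks;
};

struct mon_domain {
	/* One slot per possible event; NULL means the event is not enabled. */
	struct mbm_state *mbm_states[NUM_MBM_EVENTS];
};

/* Allocate zeroed per-RMID state for one event. Returns 0 or -1. */
int event_enable(struct mon_domain *d, unsigned int evt, size_t num_rmid)
{
	assert(d != NULL);

	if (evt >= NUM_MBM_EVENTS || num_rmid == 0)
		return -1;
	if (d->mbm_states[evt])
		return 0;	/* already enabled, keep existing state */

	d->mbm_states[evt] = calloc(num_rmid, sizeof(*d->mbm_states[evt]));
	return d->mbm_states[evt] ? 0 : -1;
}

/* Free the state of one event and mark its slot as unused. */
void event_disable(struct mon_domain *d, unsigned int evt)
{
	assert(d != NULL);

	if (evt >= NUM_MBM_EVENTS)
		return;
	free(d->mbm_states[evt]);
	d->mbm_states[evt] = NULL;
}
```

The array bound caps how many events can be live at once, which is the
compile-time limit mentioned above; only the state behind each pointer is
dynamic.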

-- 
Thanks
Babu Moger


* Re: [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-15 20:37     ` Moger, Babu
@ 2025-04-16 18:52       ` Reinette Chatre
  2025-04-17 14:34         ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-16 18:52 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/15/25 1:37 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 4/11/25 17:07, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>> Users can modify the event configuration by writing to the event_filter
>>> interface file. The event configurations for mbm_cntr_assign mode are
>>> located in /sys/fs/resctrl/info/event_configs/.
>>>
>>> Update the assignments of all groups when the event configuration is
>>> modified.
>>>
>>> Example:
>>> $ cd /sys/fs/resctrl/
>>> $ echo "local_reads, local_non_temporal_writes" >
>>>   info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>>
>>> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>>  local_reads, local_non_temporal_writes
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>> v12: New patch to modify event configurations.
>>> ---
>>>  Documentation/arch/x86/resctrl.rst     |  10 +++
>>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 115 ++++++++++++++++++++++++-
>>>  2 files changed, 124 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>> index 99f9f4b9b501..4e6feba6fb08 100644
>>> --- a/Documentation/arch/x86/resctrl.rst
>>> +++ b/Documentation/arch/x86/resctrl.rst
>>> @@ -335,6 +335,16 @@ with the following files:
>>>  	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>  	    local_reads, local_non_temporal_writes, local_reads_slow_memory
>>>  
>>> +	The event configuration can be modified by writing to the event_filter file within
>>> +	the configuration directory.
>>
>> Please use imperative tone.
> 
> Sure.
> 
> Basic question - Should the user doc also be in imperative mode? I thought
> it only applies to commit log.

I am not aware of a documented rule that user doc should be in imperative mode. I
requested imperative tone here because writing in this way helps to remove ambiguity
and fits with how the rest of the resctrl files are described.

Looking at this specific addition I realized that there is no initial description of
what "event_filter" contains and to make things more confusing the term "event" is
used for both the individual "events" being counted (remote_reads, local_reads, etc.) as
well as the (what will eventually be dynamic) name for collection of "events" being counted,
mbm_total_bytes and mbm_local_bytes. 

Since "event" has been used for mbm_total_bytes and mbm_local_bytes since the beginning, we
should try to come up with a term that describes what they are configured with.

Below is a start of trying to address this but I think more refinement is needed (other
possible terms for "transactions" could perhaps be "data sources"? ... what do you think?):

	"The read/write event_filter file contains the configuration of the event
	 that reflects which transactions(?) are being counted by it."
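
To make the distinction concrete, here is a minimal user-space sketch of
how the contents of event_filter could be turned into a bitmask of counted
memory transactions. It is only a sketch: parse_event_filter() is a
hypothetical helper, the token names are taken from the examples in this
thread plus their assumed remote/dirty-victim counterparts, and the bit
positions assume the layout documented for the existing
mbm_total_bytes_config file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed name-to-bit table; bit positions follow the BMEC bitmap. */
struct evt_name {
	const char *name;
	uint32_t bit;
};

static const struct evt_name evt_names[] = {
	{ "local_reads",                1u << 0 },
	{ "remote_reads",               1u << 1 },
	{ "local_non_temporal_writes",  1u << 2 },
	{ "remote_non_temporal_writes", 1u << 3 },
	{ "local_reads_slow_memory",    1u << 4 },
	{ "remote_reads_slow_memory",   1u << 5 },
	{ "dirty_victim_writes_all",    1u << 6 },
};

/*
 * Parse a comma and/or space separated list of transaction names.
 * On success store the combined bitmask in *mask and return 0.
 * On an unknown name return -1 and leave *mask untouched.
 */
int parse_event_filter(const char *buf, uint32_t *mask)
{
	const char *sep = ", \n";
	const char *p = buf;
	uint32_t val = 0;

	assert(buf != NULL && mask != NULL);

	while (*p) {
		size_t len, i;
		int found = 0;

		p += strspn(p, sep);		/* skip separators */
		len = strcspn(p, sep);		/* length of next token */
		if (!len)
			break;

		for (i = 0; i < sizeof(evt_names) / sizeof(evt_names[0]); i++) {
			if (strlen(evt_names[i].name) == len &&
			    !strncmp(p, evt_names[i].name, len)) {
				val |= evt_names[i].bit;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;
		p += len;
	}

	*mask = val;
	return 0;
}
```

In this model mbm_total_bytes and mbm_local_bytes are the "events", and the
names in the table are the memory transactions each event is configured to
count.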

Reinette



* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-16 17:09         ` Moger, Babu
  2025-04-16 17:55           ` Luck, Tony
@ 2025-04-16 19:02           ` Reinette Chatre
  2025-04-16 19:29             ` Moger, Babu
  1 sibling, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-16 19:02 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/16/25 10:09 AM, Moger, Babu wrote:
> On 4/15/25 11:53, Reinette Chatre wrote:
>> On 4/15/25 7:20 AM, Moger, Babu wrote:
>>> On 4/11/25 16:04, Reinette Chatre wrote:
>>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>>>>> can be assigned to an RMID, event pair and monitor the bandwidth as long
>>>>> as it is assigned.
>>>>
>>>> Above makes it sound as though multiple counters can be assigned to
>>>> an RMID, event pair.
>>>>
>>>
>>> Yes. Multiple counter-ids can be assigned to RMID, event pair.
>>
>> oh, are you referring to the assignments of different counters across multiple
>> domains?
> 
> May be I am confusing you here. This is what I meant.
> 
> Here is one example.
> 
> In a same group,

"same group" means single RMID, eg. RMID_A

>   Configure cntr_id 0, to count reads only (This maps to total event).

This will be one event, event0, so one counter assigned to RMID_A, event0 pair.

>   Configure cntr_id 1, to count write only (This maps to local event).

... event1, one counter assigned to RMID_A, event1 pair, ...

>   Configure cntr_id 2, to count dirty victims.

... event2, one counter assigned to RMID_A, event2 pair, ...

>   so on..
>   so on..
>   Configure cntr_id 31, to count remote read only.

... and event31, one counter assigned to RMID_A, event31 pair.

The example reflects that a *single* counter can be assigned to an RMID, event pair.

Considering above, perhaps changelog can start with something like:
	mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
        can be assigned to RMID, event pairs ....

> 
> We have 32 counter ids in a domain. Basically, we can configure all the
> counters in a domain to just one group if you want to.

Understood.


> 
> We cannot do that right now because our data structures cannot do that.
> We can only configure 2 events(local and total) right now.
> 
> My understanding it is same with MPAM also.

Reinette



* Re: [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events
  2025-04-16 19:02           ` Reinette Chatre
@ 2025-04-16 19:29             ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-16 19:29 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/16/25 14:02, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/16/25 10:09 AM, Moger, Babu wrote:
>> On 4/15/25 11:53, Reinette Chatre wrote:
>>> On 4/15/25 7:20 AM, Moger, Babu wrote:
>>>> On 4/11/25 16:04, Reinette Chatre wrote:
>>>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>>>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>>>>>> can be assigned to an RMID, event pair and monitor the bandwidth as long
>>>>>> as it is assigned.
>>>>>
>>>>> Above makes it sound as though multiple counters can be assigned to
>>>>> an RMID, event pair.
>>>>>
>>>>
>>>> Yes. Multiple counter-ids can be assigned to RMID, event pair.
>>>
>>> oh, are you referring to the assignments of different counters across multiple
>>> domains?
>>
>> May be I am confusing you here. This is what I meant.
>>
>> Here is one example.
>>
>> In a same group,
> 
> "same group" means single RMID, eg. RMID_A

Yes.

> 
>>   Configure cntr_id 0, to count reads only (This maps to total event).
> 
> This will be one event, event0, so one counter assigned to RMID_A, event0 pair.
> 
>>   Configure cntr_id 1, to count write only (This maps to local event).
> 
> ... event1, one counter assigned to RMID_A, event1 pair, ...
> 
>>   Configure cntr_id 2, to count dirty victims.
> 
> ... event2, one counter assigned to RMID_A, event2 pair, ...
> 
>>   so on..
>>   so on..
>>   Configure cntr_id 31, to count remote read only.
> 
> ... and event31, one counter assigned to RMID_A, event31 pair.

Yes. That is correct.

> 
> The example reflects that a *single* counter can be assigned to an RMID, event pair.
> 
> Considering above, perhaps changelog can start with something like:
> 	mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>         can be assigned to RMID, event pairs ....
> 

Sounds good.

-- 
Thanks
Babu Moger


* Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2025-04-16 16:08           ` Reinette Chatre
@ 2025-04-17 14:27             ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-17 14:27 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/16/25 11:08, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/15/25 12:43 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/15/25 11:09, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/14/25 10:48 AM, Moger, Babu wrote:
>>>
>>>> Here is my proposal to handle this case. This can be separate patch.
>>>>
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index d10cf1e5b914..772f2f77faee 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct
>>>> kernfs_open_file *of,
>>>>
>>>>         list_for_each_entry(mevt, &r->mon.evt_list, list) {
>>>>                 seq_printf(seq, "%s\n", mevt->name);
>>>> -               if (mevt->configurable)
>>>> +               if (mevt->configurable &&
>>>> !resctrl_arch_mbm_cntr_assign_enabled(r))
>>>>                         seq_printf(seq, "%s_config\n", mevt->name);
>>>>         }
>>>>
>>>> @@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s,
>>>> struct rdt_resource *r, u32 evtid
>>>>         cpus_read_lock();
>>>>         mutex_lock(&rdtgroup_mutex);
>>>>
>>>> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>>> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
>>>> with mbm_cntr_assign mode\n");
>>>> +               return -EINVAL;
>>>> +       }
>>>> +
>>>>         list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>>>                 if (sep)
>>>>                         seq_puts(s, ";");
>>>> @@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s,
>>>> struct rdt_resource *r, u32 evtid
>>>>  static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
>>>>                                        struct seq_file *seq, void *v)
>>>>  {
>>>> +       int ret;
>>>>         struct rdt_resource *r = of->kn->parent->priv;
>>>>
>>>> -       mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>>>> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>>>>
>>>> -       return 0;
>>>> +       return ret;
>>>>  }
>>>>
>>>>  static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
>>>>                                        struct seq_file *seq, void *v)
>>>>  {
>>>> +       int ret;
>>>> +
>>>>         struct rdt_resource *r = of->kn->parent->priv;
>>>>
>>>> -       mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
>>>> +       ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
>>>>
>>>> -       return 0;
>>>> +       return ret;
>>>>  }
>>>>
>>>>  static void mbm_config_write_domain(struct rdt_resource *r,
>>>> @@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r,
>>>> char *tok, u32 evtid)
>>>>         /* Walking r->domains, ensure it can't race with cpuhp */
>>>>         lockdep_assert_cpus_held();
>>>>
>>>> +       if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>>>> +               rdt_last_cmd_puts("Event configuration(BMEC) not supported
>>>> with mbm_cntr_assign mode\n");
>>>> +               return -EINVAL;
>>>> +       }
>>>> +
>>>>  next:
>>>>         if (!tok || tok[0] == '\0')
>>>>                 return 0;
>>>>
>>>
>>> Instead of chasing every call that may involve BMEC I think it will be simpler to
>>> disable BMEC support during initialization when ABMC is detected. Specifically,
>>> on systems that support both BMEC and ABMC rdt_cpu_has(X86_FEATURE_BMEC) returns
>>> false. 
>>
>> There is one problem with this approach. Users have the option to switch
>> between the assignment modes. System will boot with ABMC by default if
>> supported. But, users can switch to 'default' mode after the boot. By
>> disabling the BMEC completely, it will not be possible to do that.
> 
> Good point. Thank you. Another option is to hide (see kernfs_show()) mbm_total_bytes_config
> and mbm_local_bytes_config when ABMC is enabled. To me this seems like a clear
> interface to user space, when user interface changes the mode the interface changes
> to reflect new mode.

Sure. Will try this. Thanks for the pointer.
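
A rough user-space model of the behaviour being proposed, for illustration
only: the BMEC config files are hidden while mbm_cntr_assign mode is enabled
and shown again when the user switches back to 'default' mode. The table and
the helpers set_cntr_assign_mode() and file_visible() are hypothetical; in
the kernel the visibility change would go through kernfs_show() on the
corresponding nodes, which this sketch does not call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a few files under info/L3_MON/. */
struct info_file {
	const char *name;
	bool bmec_only;		/* only meaningful in 'default' mode */
	bool visible;
};

struct info_file l3_mon_files[] = {
	{ "mon_features",           false, true },
	{ "mbm_total_bytes_config", true,  true },
	{ "mbm_local_bytes_config", true,  true },
};

#define NUM_L3_MON_FILES (sizeof(l3_mon_files) / sizeof(l3_mon_files[0]))

/* Hide the BMEC files when counter assignment is on, show them when off. */
void set_cntr_assign_mode(bool enabled)
{
	size_t i;

	for (i = 0; i < NUM_L3_MON_FILES; i++) {
		if (l3_mon_files[i].bmec_only)
			l3_mon_files[i].visible = !enabled;
	}
}

/* Return true if the named file is currently visible to user space. */
bool file_visible(const char *name)
{
	size_t i;

	assert(name != NULL);

	for (i = 0; i < NUM_L3_MON_FILES; i++) {
		if (!strcmp(l3_mon_files[i].name, name))
			return l3_mon_files[i].visible;
	}
	return false;
}
```

Because the files are only hidden and not removed, switching modes after
boot remains possible, which was the concern with disabling BMEC during
initialization.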

-- 
Thanks
Babu Moger


* Re: [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-16 18:52       ` Reinette Chatre
@ 2025-04-17 14:34         ` Moger, Babu
  2025-04-17 15:09           ` Reinette Chatre
  0 siblings, 1 reply; 80+ messages in thread
From: Moger, Babu @ 2025-04-17 14:34 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/16/25 13:52, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/15/25 1:37 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 4/11/25 17:07, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/3/25 5:18 PM, Babu Moger wrote:
>>>> Users can modify the event configuration by writing to the event_filter
>>>> interface file. The event configurations for mbm_cntr_assign mode are
>>>> located in /sys/fs/resctrl/info/event_configs/.
>>>>
>>>> Update the assignments of all groups when the event configuration is
>>>> modified.
>>>>
>>>> Example:
>>>> $ cd /sys/fs/resctrl/
>>>> $ echo "local_reads, local_non_temporal_writes" >
>>>>   info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>>>
>>>> $ cat info/L3_MON/counter_configs/mbm_total_bytes/event_filter
>>>>  local_reads, local_non_temporal_writes
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>> v12: New patch to modify event configurations.
>>>> ---
>>>>  Documentation/arch/x86/resctrl.rst     |  10 +++
>>>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 115 ++++++++++++++++++++++++-
>>>>  2 files changed, 124 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>>> index 99f9f4b9b501..4e6feba6fb08 100644
>>>> --- a/Documentation/arch/x86/resctrl.rst
>>>> +++ b/Documentation/arch/x86/resctrl.rst
>>>> @@ -335,6 +335,16 @@ with the following files:
>>>>  	    # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
>>>>  	    local_reads, local_non_temporal_writes, local_reads_slow_memory
>>>>  
>>>> +	The event configuration can be modified by writing to the event_filter file within
>>>> +	the configuration directory.
>>>
>>> Please use imperative tone.
>>
>> Sure.
>>
>> Basic question - Should the user doc also be in imperative mode? I thought
>> it only applies to commit log.
> 
> I am not aware of a documented rule that user doc should be in imperative mode. I
> requested imperative tone here because writing in this way helps to remove ambiguity
> and fits with how the rest of the resctrl files are described.
> 
> Looking at this specific addition I realized that there is no initial description of
> what "event_filter" contains and to make things more confusing the term "event" is
> used for both the individual "events" being counted (remote_reads, local_reads, etc.) as
> well as the (what will eventually be dynamic) name for collection of "events" being counted,
> mbm_total_bytes and mbm_local_bytes. 
> 
> Since "event" have been used for mbm_total_bytes and mbm_local_bytes since beginning we
> should try to come up with term that can describe what they are configured with.
> 
> Below is a start of trying to address this but I think more refinement is needed (other
> possible terms for "transactions" could perhaps be "data sources"? ... what do you think?):
> 
> 	"The read/write event_filter file contains the configuration of the event
> 	 that reflects which transactions(?) are being counted by it."
> 

How about?

"The read/write event_filter file contains the configuration of the event
that reflects which memory transactions are being counted by it."


-- 
Thanks
Babu Moger


* Re: [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-17 14:34         ` Moger, Babu
@ 2025-04-17 15:09           ` Reinette Chatre
  2025-04-17 20:19             ` Moger, Babu
  0 siblings, 1 reply; 80+ messages in thread
From: Reinette Chatre @ 2025-04-17 15:09 UTC (permalink / raw)
  To: babu.moger, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Babu,

On 4/17/25 7:34 AM, Moger, Babu wrote:
> On 4/16/25 13:52, Reinette Chatre wrote:
>>
>> Below is a start of trying to address this but I think more refinement is needed (other
>> possible terms for "transactions" could perhaps be "data sources"? ... what do you think?):
>>
>> 	"The read/write event_filter file contains the configuration of the event
>> 	 that reflects which transactions(?) are being counted by it."
>>
> 
> How about?
> 
> "The read/write event_filter file contains the configuration of the event
> that reflects which memory transactions are being counted by it."
> 

Looks good to me. Perhaps "being" can be dropped? Thank you.

Reinette
 



* Re: [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations
  2025-04-17 15:09           ` Reinette Chatre
@ 2025-04-17 20:19             ` Moger, Babu
  0 siblings, 0 replies; 80+ messages in thread
From: Moger, Babu @ 2025-04-17 20:19 UTC (permalink / raw)
  To: Reinette Chatre, tony.luck, peternewman
  Cc: corbet, tglx, mingo, bp, dave.hansen, x86, hpa, paulmck, akpm,
	thuth, rostedt, ardb, gregkh, daniel.sneddon, jpoimboe,
	alexandre.chartre, pawan.kumar.gupta, thomas.lendacky, perry.yuan,
	seanjc, kai.huang, xiaoyao.li, kan.liang, xin3.li, ebiggers, xin,
	sohil.mehta, andrew.cooper3, mario.limonciello, linux-doc,
	linux-kernel, maciej.wieczor-retman, eranian

Hi Reinette,

On 4/17/25 10:09, Reinette Chatre wrote:
> Hi Babu,
> 
> On 4/17/25 7:34 AM, Moger, Babu wrote:
>> On 4/16/25 13:52, Reinette Chatre wrote:
>>>
>>> Below is a start of trying to address this but I think more refinement is needed (other
>>> possible terms for "transactions" could perhaps be "data sources"? ... what do you think?):
>>>
>>> 	"The read/write event_filter file contains the configuration of the event
>>> 	 that reflects which transactions(?) are being counted by it."
>>>
>>
>> How about?
>>
>> "The read/write event_filter file contains the configuration of the event
>> that reflects which memory transactions are being counted by it."
>>
> 
> Looks good to me. Perhaps "being" can be dropped? Thank you.
> 

Sure.

-- 
Thanks
Babu Moger


end of thread

Thread overview: 80+ messages
2025-04-04  0:18 [PATCH v12 00/26] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2025-04-04  0:18 ` [PATCH v12 01/26] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg in struct rdt_hw_mon_domain Babu Moger
2025-04-11 20:49   ` Reinette Chatre
2025-04-14 15:56     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 02/26] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
2025-04-11 20:50   ` Reinette Chatre
2025-04-14 15:57     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2025-04-11 20:52   ` Reinette Chatre
2025-04-14 17:48     ` Moger, Babu
2025-04-15 16:09       ` Reinette Chatre
2025-04-15 19:43         ` Moger, Babu
2025-04-16 16:08           ` Reinette Chatre
2025-04-17 14:27             ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 04/26] x86/resctrl: Add ABMC feature in the command line options Babu Moger
2025-04-04  0:18 ` [PATCH v12 05/26] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
2025-04-04  0:18 ` [PATCH v12 06/26] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
2025-04-04  0:18 ` [PATCH v12 07/26] x86/resctrl: Add support to enable/disable AMD ABMC feature Babu Moger
2025-04-04  0:18 ` [PATCH v12 08/26] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
2025-04-11 20:56   ` Reinette Chatre
2025-04-14 19:52     ` Moger, Babu
2025-04-15 16:22       ` Reinette Chatre
2025-04-16 14:05         ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 09/26] x86/resctrl: Introduce interface to display number of monitoring counters Babu Moger
2025-04-11 21:01   ` Reinette Chatre
2025-04-14 20:12     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 10/26] x86/resctrl: Introduce mbm_cntr_cfg to track assignable counters at domain Babu Moger
2025-04-04  0:18 ` [PATCH v12 11/26] x86/resctrl: Introduce interface to display number of free MBM counters Babu Moger
2025-04-04  0:18 ` [PATCH v12 12/26] x86/resctrl: Add data structures and definitions for ABMC assignment Babu Moger
2025-04-11 21:01   ` Reinette Chatre
2025-04-14 20:30     ` Moger, Babu
2025-04-15 16:30       ` Reinette Chatre
2025-04-16 15:43         ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 13/26] x86/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC Babu Moger
2025-04-11 21:02   ` Reinette Chatre
2025-04-14 20:51     ` Moger, Babu
2025-04-15 16:38       ` Reinette Chatre
2025-04-16 15:51         ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 14/26] x86/resctrl: Add the functionality to assign MBM events Babu Moger
2025-04-11 21:04   ` Reinette Chatre
2025-04-15 14:20     ` Moger, Babu
2025-04-15 16:53       ` Reinette Chatre
2025-04-16 17:09         ` Moger, Babu
2025-04-16 17:55           ` Luck, Tony
2025-04-16 18:17             ` Moger, Babu
2025-04-16 19:02           ` Reinette Chatre
2025-04-16 19:29             ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 15/26] x86/resctrl: Add the functionality to unassign " Babu Moger
2025-04-04  0:18 ` [PATCH v12 16/26] x86/resctrl: Report 'Unassigned' for MBM events in mbm_cntr_assign mode Babu Moger
2025-04-11 21:08   ` Reinette Chatre
2025-04-15 15:00     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 17/26] x86/resctrl: Add the support for reading ABMC counters Babu Moger
2025-04-11 21:21   ` Reinette Chatre
2025-04-15 16:41     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 18/26] x86/resctrl: Add default MBM event configurations for mbm_cntr_assign mode Babu Moger
2025-04-11 21:44   ` Reinette Chatre
2025-04-15 18:48     ` Moger, Babu
2025-04-15 19:25       ` Luck, Tony
2025-04-16 16:21         ` Reinette Chatre
2025-04-16 17:26           ` Moger, Babu
2025-04-16 16:18       ` Reinette Chatre
2025-04-16 17:27         ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 19/26] x86/resctrl: Add event configuration directory under info/L3_MON/ Babu Moger
2025-04-11 22:04   ` Reinette Chatre
2025-04-15 20:29     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 20/26] x86/resctrl: Provide interface to update the event configurations Babu Moger
2025-04-11 22:07   ` Reinette Chatre
2025-04-15 20:37     ` Moger, Babu
2025-04-16 18:52       ` Reinette Chatre
2025-04-17 14:34         ` Moger, Babu
2025-04-17 15:09           ` Reinette Chatre
2025-04-17 20:19             ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 21/26] x86/resctrl: Introduce mbm_assign_on_mkdir to configure assignments Babu Moger
2025-04-11 22:08   ` Reinette Chatre
2025-04-15 20:39     ` Moger, Babu
2025-04-04  0:18 ` [PATCH v12 22/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled Babu Moger
2025-04-04  0:18 ` [PATCH v12 23/26] x86/resctrl: Introduce mbm_L3_assignments to list assignments in a group Babu Moger
2025-04-04  0:18 ` [PATCH v12 24/26] x86/resctrl: Introduce the interface to modify " Babu Moger
2025-04-04  0:18 ` [PATCH v12 25/26] x86/resctrl: Introduce the interface to switch between monitor modes Babu Moger
2025-04-04  0:18 ` [PATCH v12 26/26] x86/resctrl: Configure mbm_cntr_assign mode if supported Babu Moger
