linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
@ 2024-05-24 12:23 Babu Moger
  2024-05-24 12:23 ` [PATCH v4 01/19] x86/resctrl: Add support for " Babu Moger
                   ` (19 more replies)
  0 siblings, 20 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse


This series adds the support for Assignable Bandwidth Monitoring Counters
(ABMC). It is also called QoS RMID Pinning feature

The feature details are documented in the  APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC). The documentation is available at
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

The patches are based on top of commit
commit 573c35550b2b ("Merge branch into tip/master: 'x86/percpu'")

# Introduction

Users can create as many monitor groups as RMIDs supported by the hardware.
However, bandwidth monitoring feature on AMD system only guarantees that
RMIDs currently assigned to a processor will be tracked by hardware.
The counters of any other RMIDs which are no longer being tracked will be
reset to zero. The MBM event counters return "Unavailable" for the RMIDs
that are not tracked by hardware. So, there can be only limited number of
groups that can give guaranteed monitoring numbers. With ever changing
configurations there is no way to definitely know which of these groups
are being tracked for certain point of time. Users do not have the option
to monitor a group or set of groups for certain period of time without
worrying about RMID being reset in between.
    
The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as it is assigned.
The assigned RMID will be tracked by the hardware until the user unassigns
it manually. There is no need to worry about counters being reset during
this period. Additionally, the user can specify a bitmask identifying the
specific bandwidth types from the given source to track with the counter.

Without ABMC enabled, monitoring will work in current mode without
assignment option.

# Linux Implementation

Linux resctrl subsystem provides the interface to count maximum of two
memory bandwidth events per group, from a combination of available total
and local events. Keeping the current interface, users can enable a maximum
of 2 ABMC counters per group. User will also have the option to enable only
one counter to the group. If the system runs out of assignable ABMC
counters, kernel will display an error. Users need to disable an already
enabled counter to make space for new assignments.


# Examples

a. Check if ABMC support is available
	#mount -t resctrl resctrl /sys/fs/resctrl/

	#cat /sys/fs/resctrl/info/L3_MON/mbm_assign
	[abmc] 
	mbm_legacy

	Linux kernel detected ABMC feature and it is enabled.

b. Check how many ABMC counters are available. 

	#cat /sys/fs/resctrl/info/L3_MON/num_cntrs 
	32

c. Create few resctrl groups.

	# mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
	# mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
	# mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp


d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control
   to list and modify the group's assignment states.

	The list follows the following format:

	"<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"


	Format for specific type of groups:

	* Default CTRL_MON group:
	 "//<domain_id>=<assignment_flags>"

       * Non-default CTRL_MON group:
               "<CTRL_MON group>//<domain_id>=<assignment_flags>"

       * Child MON group of default CTRL_MON group:
               "/<MON group>/<domain_id>=<assignment_flags>"

       * Child MON group of non-default CTRL_MON group:
               "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"

       Assignment flags can be one of the following:

        t  MBM total event is enabled
        l  MBM local event is enabled
        tl Both total and local MBM events are enabled
        _  None of the MBM events are enabled

	Examples:

	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control 
	non_default_ctrl_mon_grp//0=tl;1=tl;
	non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
	//0=tl;1=tl;
	/child_default_mon_grp/0=tl;1=tl;

	There are four groups and all the groups have local and total
	event enabled on domain 0 and 1.

	=tl means both total and local events are enabled.

	"//" - This is a default CONTROL MON group

	"non_default_ctrl_mon_grp//" - This is non default CONTROL MON group

	"/child_default_mon_grp/"  - This is Child MON group of the defult group

	"non_default_ctrl_mon_grp/child_non_default_mon_grp/" - This is child
	MON group of the non default group

e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control.

	The write format is similar to the above list format with addition of
	op-code for the assignment operation.
	
	* Default CTRL_MON group:
	        "//<domain_id><op-code><assignment_flags>"
	
	* Non-default CTRL_MON group:
	        "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
	
	* Child MON group of default CTRL_MON group:
	        "/<MON group>/<domain_id><op-code><assignment_flags>"
	
	* Child MON group of non-default CTRL_MON group:
	        "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"
	
	Op-code can be one of the following:
	
	= Update the assignment to match the flags
	+ Assign a new state
	- Unassign a new state
	
	
	Initial group status:
	
	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	non_default_ctrl_mon_grp//0=tl;1=tl;
	non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
	//0=tl;1=tl;
	/child_default_mon_grp/0=tl;1=tl;
	
	 To update the default group to enable only total event on domain 0:
	 # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	
	 Assignment status after the update:
	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	 non_default_ctrl_mon_grp//0=tl;1=tl;
	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
	 //0=t;1=tl;
	 /child_default_mon_grp/0=tl;1=tl;
	
	 To update the MON group child_default_mon_grp to remove total event on domain 1:
	 # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	
	 Assignment status after the update:
	 $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	 non_default_ctrl_mon_grp//0=tl;1=tl;
	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
	 //0=t;1=l;
	 /child_default_mon_grp/0=t;1=tl;
	
	 To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
	 remove both local and total events on domain 1:
	 # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
	               /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	
	 Assignment status after the update:
	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	 non_default_ctrl_mon_grp//0=tl;1=tl;
	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
	 //0=t;1=l;
	 /child_default_mon_grp/0=t;1=tl;
	
	 To update the default group to add a total event domain 1.
	 # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	
	 Assignment status after the update:
	
	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
	 non_default_ctrl_mon_grp//0=tl;1=tl;
	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
	 //0=t;1=tl;
	 /child_default_mon_grp/0=t;1=tl;
	
f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
   There is no change in reading the evetns with ABMC. If the event is unassigned
   when reading, then the read will come back as Unavailable.
	
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
	779247936
	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes 
	765207488
	
g. Users will have the option to go back to legacy_mbm mode if required.
   This can be done using the following command.

	# echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign
        abmc
        [mbm_legacy]

h. Check the bandwidth configuration for the group. Note that bandwidth
   configuration has a domain scope. Total event defaults to 0x7F (to
   count all the events) and local event defaults to 0x15 (to count all
   the local numa events). The event bitmap decoding is available at
   https://www.kernel.org/doc/Documentation/x86/resctrl.rst
   in section "mbm_total_bytes_config", "mbm_local_bytes_config":
	
	#cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config 
	0=0x7f;1=0x7f
	
	#cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config 
	0=0x15;1=0x15
	
j. Change the bandwidth source for domain 0 for the total event to count only reads.
   Note that this change effects total events on the domain 0.
	
	#echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config 
	#cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config 
	0=0x33;1=0x7F
	
k. Now read the total event again. The mbm_total_bytes should display
   only the read events.
	
	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
	314101
	
l. Unmount the resctrl
	 
	#umount /sys/fs/resctrl/

---
v4: 
  Main change is domain specific event assignment.
  Kept the ABMC feature as a default.
  Dynamcic switching between ABMC and mbm_legacy is still allowed.
  We are still not clear about mount option.
  Moved the monitoring related data in resctrl_mon structure from rdt_resource.
  Fixed the display of legacy and ABMC mode.
  Used bimap APIs when possible.
  Removed event configuration read from MSRs. We can use the
  internal saved data.(patch 12)
  Added more comments about L3_QOS_ABMC_CFG MSR.
  Added IPIs to read the assignment status for each domain (patch 18 and 19)
  More details in each patch.

v3:
   This series adds the support for global assignment mode discussed in
   the thread. https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
   Removed the individual assignment mode and included the global assignment interface.
   Added following interface files.
   a. /sys/fs/resctrl/info/L3_MON/mbm_assign
      Used for displaying the current assignment mode and switch between
      ABMC and legacy mode.
   b. /sys/fs/resctrl/info/L3_MON/mbm_assign_control
      Used for lising the groups assignment mode and modify the assignment states.
   c. Most of the changes are related to the new interface.
   d. Addressed the comments from Reinette, James and Peter.
   e. Hope I have addressed most of the major feedbacks discussed. If I missed
      something then it is not intentional. Please feel free to comment.
   f. Sending this as an RFC as per Reinette's comment. So, this is still open
      for discussion.

v2:
   a. Major change is the way ABMC is enabled. Earlier, user needed to remount
      with -o abmc to enable ABMC feature. Removed that option now.
      Now users can enable ABMC by "$echo 1 to /sys/fs/resctrl/info/L3_MON/mbm_assign_enable".
     
   b. Added new word 21 to x86/cpufeatures.h.

   c. Display unsupported if user attempts to read the events when ABMC is enabled
      and event is not assigned.

   d. Display monitor_state as "Unsupported" when ABMC is disabled.
  
   e. Text updates and rebase to latest tip tree (as of Jan 18).
 
   f. This series is still work in progress. I am yet to hear from ARM developers. 

v4:
  https://lore.kernel.org/lkml/cover.1711674410.git.babu.moger@amd.com/

v2:
  https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/

v1 :
   https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/


Babu Moger (17):
  x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters
    (ABMC)
  x86/resctrl: Add ABMC feature in the command line options
  x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  x86/resctrl: Introduce resctrl_file_fflags_init
  x86/resctrl: Introduce the interface to display the assignment state
  x86/resctrl: Introduce interface to display number of ABMC counters
  x86/resctrl: Add support to enable/disable ABMC feature
  x86/resctrl: Initialize assignable counters bitmap
  x86/resctrl: Introduce assign state for the mon group
  x86/resctrl: Add data structures for ABMC assignment
  x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  x86/resctrl: Add the functionality to assign the RMID
  x86/resctrl: Add the functionality to unassign the RMID
  x86/resctrl: Enable ABMC by default on resctrl mount
  x86/resctrl: Introduce the interface switch between ABMC and
    legacy_mbm
  x86/resctrl: Introduce interface to list assignment states of all the
    groups
  x86/resctrl: Introduce interface to modify assignment states of the
    groups

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/arch/x86/resctrl.rst            | 144 ++++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   2 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   3 +
 arch/x86/kernel/cpu/resctrl/core.c            |  25 +-
 arch/x86/kernel/cpu/resctrl/internal.h        |  56 +-
 arch/x86/kernel/cpu/resctrl/monitor.c         |  24 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 714 +++++++++++++++++-
 arch/x86/kernel/cpu/scattered.c               |   1 +
 include/linux/resctrl.h                       |  12 +
 11 files changed, 964 insertions(+), 20 deletions(-)

-- 
2.34.1


Babu Moger (17):
  x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters
    (ABMC)
  x86/resctrl: Add ABMC feature in the command line options
  x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  x86/resctrl: Introduce resctrl_file_fflags_init
  x86/resctrl: Introduce the interface to display the assignment state
  x86/resctrl: Introduce interface to display number of ABMC counters
  x86/resctrl: Add support to enable/disable ABMC feature
  x86/resctrl: Initialize assignable counters bitmap
  x86/resctrl: Introduce assign state for the mon group
  x86/resctrl: Add data structures for ABMC assignment
  x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  x86/resctrl: Add the functionality to assign the RMID
  x86/resctrl: Add the functionality to unassign the RMID
  x86/resctrl: Enable ABMC by default on resctrl mount
  x86/resctrl: Introduce the interface switch between ABMC and
    legacy_mbm
  x86/resctrl: Introduce interface to list assignment states of all the
    groups
  x86/resctrl: Introduce interface to modify assignment states of the
    groups

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/arch/x86/resctrl.rst            | 144 ++++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   2 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   3 +
 arch/x86/kernel/cpu/resctrl/core.c            |  25 +-
 arch/x86/kernel/cpu/resctrl/internal.h        |  56 +-
 arch/x86/kernel/cpu/resctrl/monitor.c         |  24 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 714 +++++++++++++++++-
 arch/x86/kernel/cpu/scattered.c               |   1 +
 include/linux/resctrl.h                       |  12 +
 11 files changed, 964 insertions(+), 20 deletions(-)

-- 
2.34.1



Babu Moger (19):
  x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters
    (ABMC)
  x86/resctrl: Add ABMC feature in the command line options
  x86/resctrl: Consolidate monitoring related data from rdt_resource
  x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags
  x86/resctrl: Introduce interface to display number of ABMC counters
  x86/resctrl: Add support to enable/disable ABMC feature
  x86/resctrl: Introduce the interface to display monitor mode
  x86/resctrl: Initialize ABMC counters bitmap
  x86/resctrl: Introduce ABMC state for the monitor group
  x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  x86/resctrl: Remove MSR reading of event configuration value
  x86/resctrl: Add data structures for ABMC assignment
  x86/resctrl: Add the interface to assign ABMC counter
  x86/resctrl: Add the interface to unassign ABMC counter
  x86/resctrl: Enable ABMC by default on resctrl mount
  x86/resctrl: Introduce the interface switch between ABMC and
    mbm_legacy
  x86/resctrl: Introduce interface to list monitor states of all the
    groups
  x86/resctrl: Introduce interface to modify assignment states of the
    groups

 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/arch/x86/resctrl.rst            | 161 ++++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   3 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   3 +
 arch/x86/kernel/cpu/resctrl/core.c            |  10 +-
 arch/x86/kernel/cpu/resctrl/internal.h        |  61 +-
 arch/x86/kernel/cpu/resctrl/monitor.c         |  60 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 830 +++++++++++++++++-
 arch/x86/kernel/cpu/scattered.c               |   1 +
 include/linux/resctrl.h                       |  20 +-
 11 files changed, 1086 insertions(+), 66 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH v4 01/19] x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:54   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 02/19] x86/resctrl: Add ABMC feature in the command line options Babu Moger
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Users can create as many monitor groups as RMIDs supported by the hardware.
However, bandwidth monitoring feature on AMD system only guarantees that
RMIDs currently assigned to a processor will be tracked by hardware.
The counters of any other RMIDs which are no longer being tracked will be
reset to zero. The MBM event counters return "Unavailable" for the RMIDs
that are not tracked by hardware. So, there can be only limited number of
groups that can give guaranteed monitoring numbers. With ever changing
configurations there is no way to definitely know which of these groups
are being tracked for certain point of time. Users do not have the option
to monitor a group or set of groups for certain period of time without
worrying about RMID being reset in between.

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as it is assigned.
The assigned RMID will be tracked by the hardware until the user unassigns
it manually. There is no need to worry about counters being reset during
this period. Additionally, the user can specify a bitmask identifying the
specific bandwidth types from the given source to track with the counter.

Without ABMC enabled, monitoring will work in current mode without
assignment option.

Linux resctrl subsystem provides the interface to count maximum of two
memory bandwidth events per group, from a combination of available total
and local events. Keeping the current interface, users can enable a maximum
of 2 ABMC counters per group. User will also have the option to enable only
one counter to the group. If the system runs out of assignable ABMC
counters, kernel will display an error. Users need to disable an already
enabled counter to make space for new assignments.

The feature can be detected via CPUID_Fn80000020_EBX_x00 bit 5.
Bits Description
5    ABMC (Assignable Bandwidth Monitoring Counters)

The feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Note: Checkpatch checks/warnings are ignored to maintain coding style.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Changes because of rebase. Feature word 21 has few more additions now.
    Changed the text to "tracked by hardware" instead of active.

v3: Change because of rebase. Actual patch did not change.

v2: Added dependency on X86_FEATURE_BMEC.
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 3 +++
 arch/x86/kernel/cpu/scattered.c    | 1 +
 3 files changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 3c7434329661..fbe118fa7f88 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -470,6 +470,7 @@
 #define X86_FEATURE_BHI_CTRL		(21*32+ 2) /* "" BHI_DIS_S HW control available */
 #define X86_FEATURE_CLEAR_BHB_HW	(21*32+ 3) /* "" BHI_DIS_S HW control enabled */
 #define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* "" Clear branch history at vmexit using SW loop */
+#define X86_FEATURE_ABMC		(21*32+ 5) /* "" Assignable Bandwidth Monitoring Counters */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index b7d9f530ae16..5227a6232e9e 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -70,6 +70,9 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_TOTAL   },
 	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_LOCAL   },
+	{ X86_FEATURE_ABMC,			X86_FEATURE_CQM_MBM_TOTAL   },
+	{ X86_FEATURE_ABMC,			X86_FEATURE_CQM_MBM_LOCAL   },
+	{ X86_FEATURE_ABMC,			X86_FEATURE_BMEC      },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
 	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index af5aa2c754c2..411b18c962bb 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -48,6 +48,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_MBA,		CPUID_EBX,  6, 0x80000008, 0 },
 	{ X86_FEATURE_SMBA,		CPUID_EBX,  2, 0x80000020, 0 },
 	{ X86_FEATURE_BMEC,		CPUID_EBX,  3, 0x80000020, 0 },
+	{ X86_FEATURE_ABMC,		CPUID_EBX,  5, 0x80000020, 0 },
 	{ X86_FEATURE_PERFMON_V2,	CPUID_EAX,  0, 0x80000022, 0 },
 	{ X86_FEATURE_AMD_LBR_V2,	CPUID_EAX,  1, 0x80000022, 0 },
 	{ X86_FEATURE_AMD_LBR_PMC_FREEZE,	CPUID_EAX,  2, 0x80000022, 0 },
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 02/19] x86/resctrl: Add ABMC feature in the command line options
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
  2024-05-24 12:23 ` [PATCH v4 01/19] x86/resctrl: Add support for " Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-05-24 12:23 ` [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Add the command line option to enable or disable the new resctrl feature
ABMC (Assignable Bandwidth Monitoring Counters).

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: No changes

v3: No changes

v2: No changes
---
 Documentation/admin-guide/kernel-parameters.txt | 2 +-
 Documentation/arch/x86/resctrl.rst              | 1 +
 arch/x86/kernel/cpu/resctrl/core.c              | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 500cfa776225..66a1be9d0ea6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5602,7 +5602,7 @@
 	rdt=		[HW,X86,RDT]
 			Turn on/off individual RDT features. List is:
 			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
-			mba, smba, bmec.
+			mba, smba, bmec, abmc.
 			E.g. to turn on cmt and turn off mba use:
 				rdt=cmt,!mba
 
diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 627e23869bca..02790efaabcc 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -26,6 +26,7 @@ MBM (Memory Bandwidth Monitoring)		"cqm_mbm_total", "cqm_mbm_local"
 MBA (Memory Bandwidth Allocation)		"mba"
 SMBA (Slow Memory Bandwidth Allocation)         ""
 BMEC (Bandwidth Monitoring Event Configuration) ""
+ABMC (Assignable Bandwidth Monitoring Counters) ""
 ===============================================	================================
 
 Historically, new features were made visible by default in /proc/cpuinfo. This
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a113d9aba553..df1b43fd9f0e 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -651,6 +651,7 @@ enum {
 	RDT_FLAG_MBA,
 	RDT_FLAG_SMBA,
 	RDT_FLAG_BMEC,
+	RDT_FLAG_ABMC,
 };
 
 #define RDT_OPT(idx, n, f)	\
@@ -676,6 +677,7 @@ static struct rdt_options rdt_options[]  __initdata = {
 	RDT_OPT(RDT_FLAG_MBA,	    "mba",	X86_FEATURE_MBA),
 	RDT_OPT(RDT_FLAG_SMBA,	    "smba",	X86_FEATURE_SMBA),
 	RDT_OPT(RDT_FLAG_BMEC,	    "bmec",	X86_FEATURE_BMEC),
+	RDT_OPT(RDT_FLAG_ABMC,	    "abmc",	X86_FEATURE_ABMC),
 };
 #define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
  2024-05-24 12:23 ` [PATCH v4 01/19] x86/resctrl: Add support for " Babu Moger
  2024-05-24 12:23 ` [PATCH v4 02/19] x86/resctrl: Add ABMC feature in the command line options Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:55   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Consolidate all the data related to monitoring into separate structure
for better readability.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: New patch.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  2 +-
 arch/x86/kernel/cpu/resctrl/monitor.c  | 16 ++++++++--------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  8 ++++----
 include/linux/resctrl.h                | 16 ++++++++++++----
 4 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index df1b43fd9f0e..6cf3c887c693 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -535,7 +535,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 		return;
 	}
 
-	if (r->mon_capable && arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
+	if (r->mon_capable && arch_domain_mbm_alloc(r->mon.num_rmid, hw_dom)) {
 		domain_free(hw_dom);
 		return;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 2345e6836593..b35d04fc761b 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -254,11 +254,11 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d)
 
 	if (is_mbm_total_enabled())
 		memset(hw_dom->arch_mbm_total, 0,
-		       sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
+		       sizeof(*hw_dom->arch_mbm_total) * r->mon.num_rmid);
 
 	if (is_mbm_local_enabled())
 		memset(hw_dom->arch_mbm_local, 0,
-		       sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
+		       sizeof(*hw_dom->arch_mbm_local) * r->mon.num_rmid);
 }
 
 static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
@@ -1004,14 +1004,14 @@ static struct mon_evt mbm_local_event = {
  */
 static void l3_mon_evt_init(struct rdt_resource *r)
 {
-	INIT_LIST_HEAD(&r->evt_list);
+	INIT_LIST_HEAD(&r->mon.evt_list);
 
 	if (is_llc_occupancy_enabled())
-		list_add_tail(&llc_occupancy_event.list, &r->evt_list);
+		list_add_tail(&llc_occupancy_event.list, &r->mon.evt_list);
 	if (is_mbm_total_enabled())
-		list_add_tail(&mbm_total_event.list, &r->evt_list);
+		list_add_tail(&mbm_total_event.list, &r->mon.evt_list);
 	if (is_mbm_local_enabled())
-		list_add_tail(&mbm_local_event.list, &r->evt_list);
+		list_add_tail(&mbm_local_event.list, &r->mon.evt_list);
 }
 
 int __init rdt_get_mon_l3_config(struct rdt_resource *r)
@@ -1023,7 +1023,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
 	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
-	r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
+	r->mon.num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
@@ -1038,7 +1038,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	 *
 	 * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
 	 */
-	threshold = resctrl_rmid_realloc_limit / r->num_rmid;
+	threshold = resctrl_rmid_realloc_limit / r->mon.num_rmid;
 
 	/*
 	 * Because num_rmid may not be a power of two, round the value
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 02f213f1c51c..7114d58ef1e3 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1098,7 +1098,7 @@ static int rdt_num_rmids_show(struct kernfs_open_file *of,
 {
 	struct rdt_resource *r = of->kn->parent->priv;
 
-	seq_printf(seq, "%d\n", r->num_rmid);
+	seq_printf(seq, "%d\n", r->mon.num_rmid);
 
 	return 0;
 }
@@ -1109,7 +1109,7 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
 	struct rdt_resource *r = of->kn->parent->priv;
 	struct mon_evt *mevt;
 
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	list_for_each_entry(mevt, &r->mon.evt_list, list) {
 		seq_printf(seq, "%s\n", mevt->name);
 		if (mevt->configurable)
 			seq_printf(seq, "%s_config\n", mevt->name);
@@ -3042,14 +3042,14 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
 	if (ret)
 		goto out_destroy;
 
-	if (WARN_ON(list_empty(&r->evt_list))) {
+	if (WARN_ON(list_empty(&r->mon.evt_list))) {
 		ret = -EPERM;
 		goto out_destroy;
 	}
 
 	priv.u.rid = r->rid;
 	priv.u.domid = d->id;
-	list_for_each_entry(mevt, &r->evt_list, list) {
+	list_for_each_entry(mevt, &r->mon.evt_list, list) {
 		priv.u.evtid = mevt->evtid;
 		ret = mon_addfile(kn, mevt->name, priv.priv);
 		if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index a365f67131ec..bf99eb9c6ce4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -150,22 +150,31 @@ struct resctrl_membw {
 struct rdt_parse_data;
 struct resctrl_schema;
 
+/**
+ * struct resctrl_mon - Monitoring related data
+ * @num_rmid:		Number of RMIDs available
+ * @evt_list:		List of monitoring events
+ */
+struct resctrl_mon {
+	int			num_rmid;
+	struct list_head	evt_list;
+};
+
 /**
  * struct rdt_resource - attributes of a resctrl resource
  * @rid:		The index of the resource
  * @alloc_capable:	Is allocation available on this machine
  * @mon_capable:	Is monitor feature available on this machine
- * @num_rmid:		Number of RMIDs available
  * @cache_level:	Which cache level defines scope of this resource
  * @cache:		Cache allocation related data
  * @membw:		If the component has bandwidth controls, their properties.
+ * @mon:		Monitoring related data.
  * @domains:		RCU list of all domains for this resource
  * @name:		Name to use in "schemata" file.
  * @data_width:		Character width of data when displaying
  * @default_ctrl:	Specifies default cache cbm or memory B/W percent.
  * @format_str:		Per resource format string to show domain value
  * @parse_ctrlval:	Per resource function pointer to parse control values
- * @evt_list:		List of monitoring events
  * @fflags:		flags to choose base and info files
  * @cdp_capable:	Is the CDP feature available on this resource
  */
@@ -173,10 +182,10 @@ struct rdt_resource {
 	int			rid;
 	bool			alloc_capable;
 	bool			mon_capable;
-	int			num_rmid;
 	int			cache_level;
 	struct resctrl_cache	cache;
 	struct resctrl_membw	membw;
+	struct resctrl_mon	mon;
 	struct list_head	domains;
 	char			*name;
 	int			data_width;
@@ -185,7 +194,6 @@ struct rdt_resource {
 	int			(*parse_ctrlval)(struct rdt_parse_data *data,
 						 struct resctrl_schema *s,
 						 struct rdt_domain *d);
-	struct list_head	evt_list;
 	unsigned long		fflags;
 	bool			cdp_capable;
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (2 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:56   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags Babu Moger
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.
Bits Description
15:0 MAX_ABMC Maximum Supported Assignable Bandwidth
     Monitoring Counter ID + 1

The feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Removed resctrl_arch_has_abmc(). Added all the code inline. We dont
    need to separate this as arch code.

v3: Removed changes related to mon_features.
    Moved rdt_cpu_has to core.c and added new function resctrl_arch_has_abmc.
    Also moved the fields mbm_assign_capable and mbm_assign_cntrs to
    rdt_resource. (James)

v2: Changed the field name to mbm_assign_capable from abmc_capable.
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 14 ++++++++++++++
 include/linux/resctrl.h               |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b35d04fc761b..1602b58ba23d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1066,6 +1066,20 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 			mbm_local_event.configurable = true;
 			mbm_config_rftype_init("mbm_local_bytes_config");
 		}
+
+		if (rdt_cpu_has(X86_FEATURE_ABMC)) {
+			r->abmc_capable = true;
+			/*
+			 * Query CPUID_Fn80000020_EBX_x05 for number of
+			 * ABMC counters
+			 */
+			cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
+			r->mon.num_cntrs = (ebx & 0xFFFF) + 1;
+			if (r->mon.num_cntrs > 64) {
+				WARN(1, "Cannot support more than 64 ABMC counters\n");
+				r->mon.num_cntrs = 64;
+			}
+		}
 	}
 
 	l3_mon_evt_init(r);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index bf99eb9c6ce4..24087e6efbb6 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -153,10 +153,12 @@ struct resctrl_schema;
 /**
  * struct resctrl_mon - Monitoring related data
  * @num_rmid:		Number of RMIDs available
+ * @num_cntrs:		Maximum number of abmc counters
  * @evt_list:		List of monitoring events
  */
 struct resctrl_mon {
 	int			num_rmid;
+	int			num_cntrs;
 	struct list_head	evt_list;
 };
 
@@ -177,6 +179,7 @@ struct resctrl_mon {
  * @parse_ctrlval:	Per resource function pointer to parse control values
  * @fflags:		flags to choose base and info files
  * @cdp_capable:	Is the CDP feature available on this resource
+ * @abmc_capable:	Is system capable of supporting monitor assignment?
  */
 struct rdt_resource {
 	int			rid;
@@ -196,6 +199,7 @@ struct rdt_resource {
 						 struct rdt_domain *d);
 	unsigned long		fflags;
 	bool			cdp_capable;
+	bool			abmc_capable;
 };
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (3 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:57   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters Babu Moger
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

The functions thread_throttle_mode_init() and mbm_config_rftype_init()
both initialize fflags for resctrl files. Adding the new files will
involve adding another function to initialize the fflags. All of this
can be simplified by adding a new function resctrl_file_fflags_init()
and passing the file name and flags to be initialized.

Consolidate multiple fflags initialization into resctrl_file_fflags_init()
and remove thread_throttle_mode_init() and mbm_config_rftype_init().

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Commit message update.

v3: New patch to display ABMC capability.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  4 +++-
 arch/x86/kernel/cpu/resctrl/internal.h |  4 ++--
 arch/x86/kernel/cpu/resctrl/monitor.c  |  6 ++++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 +++-------------
 4 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6cf3c887c693..ec93f6a50308 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -224,7 +224,9 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
 		r->membw.throttle_mode = THREAD_THROTTLE_PER_THREAD;
 	else
 		r->membw.throttle_mode = THREAD_THROTTLE_MAX;
-	thread_throttle_mode_init();
+
+	resctrl_file_fflags_init("thread_throttle_mode",
+				 RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
 
 	r->alloc_capable = true;
 
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f1d926832ec8..d566251094b2 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -603,8 +603,8 @@ void cqm_handle_limbo(struct work_struct *work);
 bool has_busy_rmid(struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
-void __init thread_throttle_mode_init(void);
-void __init mbm_config_rftype_init(const char *config);
+void __init resctrl_file_fflags_init(const char *config,
+				     unsigned long fflags);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1602b58ba23d..0db9f12debb9 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1060,11 +1060,13 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
 			mbm_total_event.configurable = true;
-			mbm_config_rftype_init("mbm_total_bytes_config");
+			resctrl_file_fflags_init("mbm_total_bytes_config",
+						 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 		}
 		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
 			mbm_local_event.configurable = true;
-			mbm_config_rftype_init("mbm_local_bytes_config");
+			resctrl_file_fflags_init("mbm_local_bytes_config",
+						 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
 		}
 
 		if (rdt_cpu_has(X86_FEATURE_ABMC)) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7114d58ef1e3..aa3eb6ea059a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2022,24 +2022,14 @@ static struct rftype *rdtgroup_get_rftype_by_name(const char *name)
 	return NULL;
 }
 
-void __init thread_throttle_mode_init(void)
-{
-	struct rftype *rft;
-
-	rft = rdtgroup_get_rftype_by_name("thread_throttle_mode");
-	if (!rft)
-		return;
-
-	rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB;
-}
-
-void __init mbm_config_rftype_init(const char *config)
+void __init resctrl_file_fflags_init(const char *config,
+				     unsigned long fflags)
 {
 	struct rftype *rft;
 
 	rft = rdtgroup_get_rftype_by_name(config);
 	if (rft)
-		rft->fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE;
+		rft->fflags = fflags;
 }
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (4 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:57   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature Babu Moger
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as the counter
is assigned. Number of assignments depend on number of ABMC counters
available.

Provide the interface to display the number of ABMC counters supported.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Changed the counter name to num_cntrs. And few text changes.

v3: Changed the field name to mbm_assign_cntrs.

v2: Changed the field name to mbm_assignable_counters from abmc_counters.
---
 Documentation/arch/x86/resctrl.rst     |  4 ++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 ++++++++++++++++
 3 files changed, 21 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 02790efaabcc..7ab8172ef208 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -257,6 +257,10 @@ with the following files:
 	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
 	    0=0x30;1=0x30;3=0x15;4=0x15
 
+"num_cntrs":
+	Available when ABMC feature is supported. The number of ABMC counters
+	available for configuration.
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 0db9f12debb9..e75a6146068b 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1081,6 +1081,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 				WARN(1, "Cannot support more than 64 ABMC counters\n");
 				r->mon.num_cntrs = 64;
 			}
+			resctrl_file_fflags_init("num_cntrs", RFTYPE_MON_INFO);
 		}
 	}
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index aa3eb6ea059a..ca692712b393 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -846,6 +846,16 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+static int rdtgroup_num_cntrs_show(struct kernfs_open_file *of,
+				   struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+
+	seq_printf(s, "%d\n", r->mon.num_cntrs);
+
+	return 0;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -1911,6 +1921,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= rdtgroup_cpus_show,
 		.fflags		= RFTYPE_BASE,
 	},
+	{
+		.name		= "num_cntrs",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= rdtgroup_num_cntrs_show,
+	},
 	{
 		.name		= "cpus_list",
 		.mode		= 0644,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (5 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:59   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Add the functionality to enable/disable ABMC feature.

ABMC feature is enabled by setting enabled bit(0) in MSR L3_QOS_EXT_CFG.
When the state of ABMC is changed, the MSR needs to be updated on all
the logical processors in the QOS Domain.

Hardware counters will reset when ABMC state is changed. Kernel internal
counters need to be reset to avoid overflow condition in the next update.

The ABMC feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---
v4: Removed resctrl_arch_set_abmc_enabled and resctrl_arch_get_abmc_enabled.
    Directly calling resctrl_abmc_enable and resctrl_abmc_disable.
    Renamed couple of functions.
    resctrl_abmc_msrwrite() -> resctrl_abmc_set_one()
    resctrl_abmc_setup() -> resctrl_abmc_set_all()
    Added rdtgroup_mutex lockdep asserts.
    Updated commit log and code comments.

v3: No changes.

v2: Few text changes in commit message.
---
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h |  8 ++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++++++++++++++++++++
 3 files changed, 71 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e022e6eb766c..5f9a0139e98c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1171,6 +1171,7 @@
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
+#define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 
 /* MSR_IA32_VMX_MISC bits */
 #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index d566251094b2..fabe40304798 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -97,6 +97,9 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
 	return cpu;
 }
 
+/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature */
+#define ABMC_ENABLE			BIT(0)
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
@@ -436,6 +439,7 @@ struct rdt_parse_data {
  * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
  *			Monitoring Event Configuration (BMEC) is supported.
  * @cdp_enabled:	CDP state of this resource
+ * @abmc_enabled:	ABMC feature is enabled
  *
  * Members of this structure are either private to the architecture
  * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -450,6 +454,7 @@ struct rdt_hw_resource {
 	unsigned int		mbm_width;
 	unsigned int		mbm_cfg_mask;
 	bool			cdp_enabled;
+	bool			abmc_enabled;
 };
 
 static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
@@ -493,6 +498,9 @@ static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
 
 int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
 
+int resctrl_abmc_enable(enum resctrl_res_level l);
+void resctrl_abmc_disable(enum resctrl_res_level l);
+
 /*
  * To return the common struct rdt_resource, which is contained in struct
  * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ca692712b393..9148d1234ede 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2416,6 +2416,68 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable)
 	return 0;
 }
 
+static void resctrl_abmc_set_one(void *arg)
+{
+	bool *enable = arg;
+	u64 msrval;
+
+	rdmsrl(MSR_IA32_L3_QOS_EXT_CFG, msrval);
+
+	if (*enable)
+		msrval |= ABMC_ENABLE;
+	else
+		msrval &= ~ABMC_ENABLE;
+
+	wrmsrl(MSR_IA32_L3_QOS_EXT_CFG, msrval);
+}
+
+static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)
+{
+	struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
+	struct rdt_domain *d;
+
+	/*
+	 * Update QOS_CFG MSR on all the CPUs associated with the resource
+	 * Hardware counters will reset after switching the monotor mode.
+	 * Reset the internal counters so that it is not considered as
+	 * an overflow in next update.
+	 */
+	list_for_each_entry(d, &r->domains, list) {
+		on_each_cpu_mask(&d->cpu_mask, resctrl_abmc_set_one, &enable, 1);
+		resctrl_arch_reset_rmid_all(r, d);
+	}
+
+	return 0;
+}
+
+int resctrl_abmc_enable(enum resctrl_res_level l)
+{
+	struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
+	int ret = 0;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	if (!hw_res->abmc_enabled) {
+		ret = resctrl_abmc_set_all(l, true);
+		if (!ret)
+			hw_res->abmc_enabled = true;
+	}
+
+	return ret;
+}
+
+void resctrl_abmc_disable(enum resctrl_res_level l)
+{
+	struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	if (hw_res->abmc_enabled) {
+		resctrl_abmc_set_all(l, false);
+		hw_res->abmc_enabled = false;
+	}
+}
+
 /*
  * We don't allow rdtgroup directories to be created anywhere
  * except the root directory. Thus when looking for the rdtgroup
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (6 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:40   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap Babu Moger
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as it is assigned.
ABMC mode is enabled by default when supported. System can be one mode
at a time (Legacy monitor mode or ABMC mode).

Provide an interface to display the monitor mode on the system.
    $cat /sys/fs/resctrl/info/L3_MON/mbm_assign
    [abmc]
    legacy

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Fixed the checks for legacy and abmc mode. Default it ABMC.

v3: New patch to display ABMC capability.
---
 Documentation/arch/x86/resctrl.rst     | 10 ++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++++++++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 7ab8172ef208..ab3cde61a124 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -261,6 +261,16 @@ with the following files:
 	Available when ABMC feature is supported. The number of ABMC counters
 	available for configuration.
 
+"mbm_assign":
+	Available when ABMC feature is supported. Reports the list of assignable
+	monitoring features supported. The enclosed brackets indicate which
+	feature is enabled.
+	::
+
+	  cat /sys/fs/resctrl/info/L3_MON/mbm_assign
+	  [abmc]
+	  mbm_legacy
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index e75a6146068b..b1d002e5e93d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1071,6 +1071,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 
 		if (rdt_cpu_has(X86_FEATURE_ABMC)) {
 			r->abmc_capable = true;
+			resctrl_file_fflags_init("mbm_assign", RFTYPE_MON_INFO);
 			/*
 			 * Query CPUID_Fn80000020_EBX_x05 for number of
 			 * ABMC counters
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9148d1234ede..3071bbb7a15e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -856,6 +856,23 @@ static int rdtgroup_num_cntrs_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int rdtgroup_mbm_assign_show(struct kernfs_open_file *of,
+				    struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	if (hw_res->abmc_enabled) {
+		seq_puts(s, "[abmc]\n");
+		seq_puts(s, "mbm_legacy\n");
+	} else {
+		seq_puts(s, "abmc\n");
+		seq_puts(s, "[mbm_legacy]\n");
+	}
+
+	return 0;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -1913,6 +1930,12 @@ static struct rftype res_common_files[] = {
 		.seq_show	= mbm_local_bytes_config_show,
 		.write		= mbm_local_bytes_config_write,
 	},
+	{
+		.name		= "mbm_assign",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= rdtgroup_mbm_assign_show,
+	},
 	{
 		.name		= "cpus",
 		.mode		= 0644,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (7 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:42   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 10/19] x86/resctrl: Introduce ABMC state for the monitor group Babu Moger
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hardware provides a set of counters when the ABMC feature is supported.
These counters are used for enabling the events in resctrl group when
the feature is supported.

Introduce num_cntrs_free_map bitmap to track available and free counters.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Changed the name to num_cntrs where applicable.
    Used bitmap apis.
    Added more comments for the globals.

v3: Changed the bitmap name to assign_cntrs_free_map. Removed abmc
    from the name.

v2: Changed the bitmap name to assignable_counter_free_map from
    abmc_counter_free_map.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3071bbb7a15e..400ae405e10e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -186,6 +186,23 @@ bool closid_allocated(unsigned int closid)
 	return !test_bit(closid, &closid_free_map);
 }
 
+/*
+ * ABMC Counter bitmap and length for tracking available counters.
+ * ABMC feature provides set of hardware counters for enabling events.
+ * Each event takes one hardware counter. Kernel needs to keep track
+ * of number of available counters.
+ */
+static unsigned long num_cntrs_free_map;
+static u32 num_cntrs_free_map_len;
+
+static void num_cntrs_init(void)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+
+	bitmap_fill(&num_cntrs_free_map, r->mon.num_cntrs);
+	num_cntrs_free_map_len = r->mon.num_cntrs;
+}
+
 /**
  * rdtgroup_mode_by_closid - Return mode of resource group with closid
  * @closid: closid if the resource group
@@ -2459,6 +2476,12 @@ static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)
 	struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
 	struct rdt_domain *d;
 
+	/*
+	 * The bitmap num_cntrs_free_map needs to be reset when switching
+	 * the monitor mode.
+	 */
+	num_cntrs_init();
+
 	/*
 	 * Update QOS_CFG MSR on all the CPUs associated with the resource
 	 * Hardware counters will reset after switching the monotor mode.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 10/19] x86/resctrl: Introduce ABMC state for the monitor group
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (8 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-05-24 12:23 ` [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg Babu Moger
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as it is assigned.
There are two events per resctrl group, mbm total and mbm local.
User have the option to assign or unassign each event individually.

Add a new field mon_state in mongroup data structure to represent the
assignment state of the group. Reset the states when switching the
monitor mode.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Changed ASSIGN_TOTAL and ASSIGN_LOCAL to use QOS_L3_MBM_TOTAL_EVENT_ID
    and QOS_L3_MBM_LOCAL_EVENT_ID.
    Few more commit text changes.

v3: Changed the field name to mon_state. Also thie state is not visible to
    users directly as part of out global assign approach.

v2: Added check to display "Unsupported" when user tries to access
    monitor state when ABMC is not enabled.
---
 arch/x86/kernel/cpu/resctrl/internal.h | 9 +++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index fabe40304798..5e7e76cd512f 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -100,6 +100,13 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
 /* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature */
 #define ABMC_ENABLE			BIT(0)
 
+/*
+ * monitor group's state when ABMC is supported
+ */
+#define ASSIGN_NONE			0
+#define ASSIGN_TOTAL			BIT(QOS_L3_MBM_TOTAL_EVENT_ID)
+#define ASSIGN_LOCAL			BIT(QOS_L3_MBM_LOCAL_EVENT_ID)
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
@@ -203,12 +210,14 @@ enum rdtgrp_mode {
  * @parent:			parent rdtgrp
  * @crdtgrp_list:		child rdtgroup node list
  * @rmid:			rmid for this rdtgroup
+ * @mon_state:			Assignment state of the group
  */
 struct mongroup {
 	struct kernfs_node	*mon_data_kn;
 	struct rdtgroup		*parent;
 	struct list_head	crdtgrp_list;
 	u32			rmid;
+	u32			mon_state;
 };
 
 /**
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 400ae405e10e..c176bacf7ba1 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2474,6 +2474,7 @@ static void resctrl_abmc_set_one(void *arg)
 static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)
 {
 	struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
+	struct rdtgroup *prgrp, *crgrp;
 	struct rdt_domain *d;
 
 	/*
@@ -2493,6 +2494,13 @@ static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)
 		resctrl_arch_reset_rmid_all(r, d);
 	}
 
+	/* Reset assign state for all the monitor groups */
+	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
+		prgrp->mon.mon_state = ASSIGN_NONE;
+		list_for_each_entry(crgrp, &prgrp->mon.crdtgrp_list, mon.crdtgrp_list)
+			crgrp->mon.mon_state = ASSIGN_NONE;
+	}
+
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (9 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 10/19] x86/resctrl: Introduce ABMC state for the monitor group Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:43   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 12/19] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

If the BMEC (Bandwidth Monitoring Event Configuration) feature is
supported, the bandwidth events can be configured to track specific
events. The event configuration is domain specific. ABMC (Assignable
Bandwidth Monitoring Counters) feature needs event configuration
information to assign hardware counter to an RMID. Event configurations
are not stored in resctrl but instead always read from or written to
hardware directly when prompted by user space.

Read the event configuration from the hardware during the domain
initialization. Save the configuration information in the rdt_hw_domain,
so it can be used for counter assignment.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Read the configuration information from the hardware to initialize.
    Added few commit messages.
    Fixed the tab spaces.

v3: Minor changes related to rebase in mbm_config_write_domain.

v2: No changes.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
 arch/x86/kernel/cpu/resctrl/internal.h |  5 +++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 21 +++++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index ec93f6a50308..856c46d12177 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -542,6 +542,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 		return;
 	}
 
+	arch_domain_mbm_evt_config(hw_dom);
+
 	list_add_tail_rcu(&d->list, add_pos);
 
 	err = resctrl_online_domain(r, d);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 5e7e76cd512f..60a1ca0a11a7 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -373,6 +373,8 @@ struct arch_mbm_state {
  * @ctrl_val:	array of cache or mem ctrl values (indexed by CLOSID)
  * @arch_mbm_total:	arch private state for MBM total bandwidth
  * @arch_mbm_local:	arch private state for MBM local bandwidth
+ * @mbm_total_cfg:	MBM total bandwidth configuration
+ * @mbm_local_cfg:	MBM local bandwidth configuration
  *
  * Members of this structure are accessed via helpers that provide abstraction.
  */
@@ -381,6 +383,8 @@ struct rdt_hw_domain {
 	u32				*ctrl_val;
 	struct arch_mbm_state		*arch_mbm_total;
 	struct arch_mbm_state		*arch_mbm_local;
+	u32				mbm_total_cfg;
+	u32				mbm_local_cfg;
 };
 
 static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
@@ -622,6 +626,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free);
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void __init resctrl_file_fflags_init(const char *config,
 				     unsigned long fflags);
+void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b1d002e5e93d..ab0f4bb49bd9 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1093,6 +1093,27 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	return 0;
 }
 
+void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom)
+{
+	u64 msrval;
+
+	/*
+	 * Read the configuration registers QOS_EVT_CFG_n, where <n> is
+	 * the BMEC event number (EvtID).
+	 * 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
+	 * 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
+	 */
+	if (mbm_total_event.configurable) {
+		rdmsrl(MSR_IA32_EVT_CFG_BASE, msrval);
+		hw_dom->mbm_total_cfg = msrval & MAX_EVT_CONFIG_BITS;
+	}
+
+	if (mbm_local_event.configurable) {
+		rdmsrl(MSR_IA32_EVT_CFG_BASE + 1, msrval);
+		hw_dom->mbm_local_cfg = msrval & MAX_EVT_CONFIG_BITS;
+	}
+}
+
 void __exit rdt_put_mon_l3_config(void)
 {
 	dom_data_exit();
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 12/19] x86/resctrl: Remove MSR reading of event configuration value
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (10 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-05-24 12:23 ` [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment Babu Moger
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

The event configuration is domain specific and initialized during domain
initialization. It is not required to read the configuration register
every time user asks for it. Use the value stored in rdt_hw_domain instead.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: New patch.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 55 ++++++++++++--------------
 1 file changed, 25 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index c176bacf7ba1..0e425c91fa46 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1618,46 +1618,30 @@ static inline unsigned int mon_event_config_index_get(u32 evtid)
 	}
 }
 
-static void mon_event_config_read(void *info)
-{
-	struct mon_config_info *mon_info = info;
-	unsigned int index;
-	u64 msrval;
-
-	index = mon_event_config_index_get(mon_info->evtid);
-	if (index == INVALID_CONFIG_INDEX) {
-		pr_warn_once("Invalid event id %d\n", mon_info->evtid);
-		return;
-	}
-	rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
-
-	/* Report only the valid event configuration bits */
-	mon_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
-}
-
-static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
-{
-	smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
-}
-
 static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
 {
-	struct mon_config_info mon_info = {0};
 	struct rdt_domain *dom;
 	bool sep = false;
+	u32 val;
 
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
 
 	list_for_each_entry(dom, &r->domains, list) {
+		struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(dom);
 		if (sep)
 			seq_puts(s, ";");
 
-		memset(&mon_info, 0, sizeof(struct mon_config_info));
-		mon_info.evtid = evtid;
-		mondata_config_read(dom, &mon_info);
+		if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
+			val = hw_dom->mbm_total_cfg;
+		} else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
+			val = hw_dom->mbm_local_cfg;
+		} else {
+			pr_warn_once("Invalid event id %d\n", evtid);
+			break;
+		}
 
-		seq_printf(s, "%d=0x%02x", dom->id, mon_info.mon_config);
+		seq_printf(s, "%d=0x%02x", dom->id, val);
 		sep = true;
 	}
 	seq_puts(s, "\n");
@@ -1704,17 +1688,28 @@ static void mon_event_config_write(void *info)
 static void mbm_config_write_domain(struct rdt_resource *r,
 				    struct rdt_domain *d, u32 evtid, u32 val)
 {
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	struct mon_config_info mon_info = {0};
 
 	/*
 	 * Read the current config value first. If both are the same then
 	 * no need to write it again.
 	 */
-	mon_info.evtid = evtid;
-	mondata_config_read(d, &mon_info);
-	if (mon_info.mon_config == val)
+	switch (evtid) {
+	case QOS_L3_MBM_TOTAL_EVENT_ID:
+		if (val == hw_dom->mbm_total_cfg)
+			return;
+		break;
+	case QOS_L3_MBM_LOCAL_EVENT_ID:
+		if (val == hw_dom->mbm_local_cfg)
+			return;
+		break;
+	default:
+		pr_warn_once("Invalid event id %d\n", evtid);
 		return;
+	}
 
+	mon_info.evtid = evtid;
 	mon_info.mon_config = val;
 
 	/*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (11 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 12/19] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:44   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter Babu Moger
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

The ABMC feature provides an option to the user to assign a hardware
counter to an RMID and monitor the bandwidth as long as the counter
is assigned. The bandwidth events will be tracked by the hardware until
the user changes the configuration. Each resctrl group can configure
maximum two counters, one for total event and one for local event.

The counters are configured by writing to MSR L3_QOS_ABMC_CFG.
Configuration is done by setting the counter id, bandwidth source (RMID),
bandwidth types. Reading L3_QOS_ABMC_DSC returns the configuration of
the counter id specified in L3_QOS_ABMC_CFG.

Attempts to read or write these MSRs when ABMC is not enabled will result
in a #GP(0) exception.

MSR L3_QOS_ABMC_CFG (0xC000_03FDh) and L3_QOS_ABMC_DSC (0xC000_03FEh)
details.
==========================================================================
Bits 	Mnemonic	Description		Access Type   Reset Value
==========================================================================
63 	CfgEn 		Configuration Enable 	R/W 		0

62 	CtrEn 		Counter Enable 		R/W 		0

61:53 	– 		Reserved 		MBZ 		0

52:48 	CtrID 		Counter Identifier	R/W		0

47 	IsCOS		BwSrc field is a COS	R/W		0
			(not an RMID)

46:44 	–		Reserved		MBZ		0

43:32	BwSrc		Bandwidth Source	R/W		0
			(RMID or COS)

31:0	BwType		Bandwidth types to	R/W		0
			track for this counter
==========================================================================

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---
v4: Added more descriptions.
    Changed the name abmc_ctr_id to ctr_id.
    Added L3_QOS_ABMC_DSC. Used for reading the configuration.

v3: No changes.

v2: No changes.
---
 arch/x86/include/asm/msr-index.h       |  2 ++
 arch/x86/kernel/cpu/resctrl/internal.h | 30 ++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 5f9a0139e98c..6d2fe39ac68f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1172,6 +1172,8 @@
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 #define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
+#define MSR_IA32_L3_QOS_ABMC_CFG	0xc00003fd
+#define MSR_IA32_L3_QOS_ABMC_DSC	0xc00003fe
 
 /* MSR_IA32_VMX_MISC bits */
 #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 60a1ca0a11a7..45ed33f4f0ff 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -107,6 +107,9 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
 #define ASSIGN_TOTAL			BIT(QOS_L3_MBM_TOTAL_EVENT_ID)
 #define ASSIGN_LOCAL			BIT(QOS_L3_MBM_LOCAL_EVENT_ID)
 
+/* Maximum assignable counters per resctrl group */
+#define MAX_CNTRS		2
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
@@ -211,6 +214,7 @@ enum rdtgrp_mode {
  * @crdtgrp_list:		child rdtgroup node list
  * @rmid:			rmid for this rdtgroup
  * @mon_state:			Assignment state of the group
+ * @ctr_id:			ABMC counter ids assigned to this group
  */
 struct mongroup {
 	struct kernfs_node	*mon_data_kn;
@@ -218,6 +222,7 @@ struct mongroup {
 	struct list_head	crdtgrp_list;
 	u32			rmid;
 	u32			mon_state;
+	u32			ctr_id[MAX_CNTRS];
 };
 
 /**
@@ -568,6 +573,31 @@ union cpuid_0x10_x_edx {
 	unsigned int full;
 };
 
+/*
+ * ABMC counters can be configured  by writing to L3_QOS_ABMC_CFG.
+ * @bw_type		: Bandwidth types to track for this counter
+ * @bw_src		: Bandwidth Source (RMID or CLOSID)
+ * @reserved1		: Reserved
+ * @is_cos		: BwSrc field is a COS (not an RMID)
+ * @ctr_id		: Counter Identifier
+ * @reserved		: Reserved
+ * @ctr_en		: Counter Enable bit
+ * @cfg_en		: Configuration Enable bit
+ */
+union l3_qos_abmc_cfg {
+	struct {
+		unsigned long  bw_type	:32,
+			       bw_src	:12,
+			       reserved1: 3,
+			       is_cos	: 1,
+			       ctr_id	: 5,
+			       reserved	: 9,
+			       ctr_en	: 1,
+			       cfg_en	: 1;
+	} split;
+	unsigned long full;
+};
+
 void rdt_last_cmd_clear(void);
 void rdt_last_cmd_puts(const char *s);
 __printf(1, 2)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (12 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:48   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 15/19] x86/resctrl: Add the interface to unassign " Babu Moger
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

ABMC feature requires to users to assign a hardware counter to an RMID
to monitor the events. Provide the interfaces to assign a counter.

Individual counters are configured by writing to L3_QOS_ABMC_CFG MSR
and specifying the counter id, bandwidth source, and bandwidth types.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
    Monitoring (ABMC).

Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---
v4: Commit message update.
    User bitmap APIs where applicable.
    Changed the interfaces considering MPAM(arm).
    Added domain specific assignment.

v3: Removed the static from the prototype of rdtgroup_assign_abmc.
    The function is not called directly from user anymore. These
    changes are related to global assignment interface.

v2: Minor text changes in commit message.
---
 arch/x86/kernel/cpu/resctrl/internal.h |   3 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 101 +++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 45ed33f4f0ff..a88c8fc5e4df 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -657,6 +657,9 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void __init resctrl_file_fflags_init(const char *config,
 				     unsigned long fflags);
 void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
+int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
+			u32 ctr_id, u32 closid, bool enable);
+int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0e425c91fa46..48df76499a04 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -203,6 +203,19 @@ static void num_cntrs_init(void)
 	num_cntrs_free_map_len = r->mon.num_cntrs;
 }
 
+static int assign_cntrs_alloc(void)
+{
+	u32 ctr_id = find_first_bit(&num_cntrs_free_map,
+				    num_cntrs_free_map_len);
+
+	if (ctr_id >= num_cntrs_free_map_len)
+		return -ENOSPC;
+
+	__clear_bit(ctr_id, &num_cntrs_free_map);
+
+	return ctr_id;
+}
+
 /**
  * rdtgroup_mode_by_closid - Return mode of resource group with closid
  * @closid: closid if the resource group
@@ -1830,6 +1843,94 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static void rdtgroup_abmc_cfg(void *info)
+{
+	u64 *msrval = info;
+
+	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *msrval);
+}
+
+int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
+			u32 ctr_id, u32 closid, bool enable)
+{
+	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+	union l3_qos_abmc_cfg abmc_cfg = { 0 };
+	struct arch_mbm_state *arch_mbm;
+
+	abmc_cfg.split.cfg_en = 1;
+	abmc_cfg.split.ctr_en = enable ? 1 : 0;
+	abmc_cfg.split.ctr_id = ctr_id;
+	abmc_cfg.split.bw_src = rmid;
+
+	/*
+	 * Read the event configuration from the domain and pass it as
+	 * bw_type.
+	 */
+	if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
+		abmc_cfg.split.bw_type = hw_dom->mbm_total_cfg;
+		arch_mbm = &hw_dom->arch_mbm_total[rmid];
+	} else {
+		abmc_cfg.split.bw_type = hw_dom->mbm_local_cfg;
+		arch_mbm = &hw_dom->arch_mbm_local[rmid];
+	}
+
+	smp_call_function_any(&d->cpu_mask, rdtgroup_abmc_cfg, &abmc_cfg, 1);
+
+	/* Reset the internal counters */
+	if (arch_mbm)
+		memset(arch_mbm, 0, sizeof(struct arch_mbm_state));
+
+	return 0;
+}
+
+int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	int ctr_id = 0, index;
+	struct rdt_domain *d;
+	u32 mon_state;
+
+	index = mon_event_config_index_get(evtid);
+	if (index == INVALID_CONFIG_INDEX) {
+		rdt_last_cmd_puts("Invalid event id\n");
+		return -EINVAL;
+	}
+
+	if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
+		mon_state = ASSIGN_TOTAL;
+	} else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
+		mon_state = ASSIGN_LOCAL;
+	} else {
+		rdt_last_cmd_puts("Invalid event id\n");
+		return -EINVAL;
+	}
+
+	/* Nothing to do if event has been assigned already */
+	if (rdtgrp->mon.mon_state & mon_state) {
+		rdt_last_cmd_puts("ABMC counter is assigned already\n");
+		return 0;
+	}
+
+	/*
+	 * Allocate a new counter and update domains
+	 */
+	ctr_id = assign_cntrs_alloc();
+	if (ctr_id < 0) {
+		rdt_last_cmd_puts("Out of ABMC counters\n");
+		return -ENOSPC;
+	}
+
+	rdtgrp->mon.ctr_id[index] = ctr_id;
+
+	list_for_each_entry(d, &r->domains, list)
+		resctrl_arch_assign(d, evtid, rdtgrp->mon.rmid, ctr_id,
+				    rdtgrp->closid, 1);
+
+	rdtgrp->mon.mon_state |= mon_state;
+
+	return 0;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 15/19] x86/resctrl: Add the interface to unassign ABMC counter
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (13 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:49   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount Babu Moger
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hardware provides a limited number of ABMC counters. Once all the
counters are exhausted, counters need to be freed for new assignments.

Provide the interface to unassign the counter.

The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
    Monitoring (ABMC).

Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---
v4: Added domain specific unassign feature.
    Few name changes.

v3: Removed the static from the prototype of rdtgroup_unassign_abmc.
    The function is not called directly from user anymore. These
    changes are related to global assignment interface.

v2: No changes.
---
 arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 ++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index a88c8fc5e4df..e16244895350 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -660,6 +660,8 @@ void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
 int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
 			u32 ctr_id, u32 closid, bool enable);
 int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid);
+int resctrl_grp_unassign(struct rdtgroup *rdtgrp, u32 evtid);
+void num_cntrs_free(u32 ctr_id);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 48df76499a04..5ea1e58c7201 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -216,6 +216,11 @@ static int assign_cntrs_alloc(void)
 	return ctr_id;
 }
 
+void num_cntrs_free(u32 ctr_id)
+{
+	__set_bit(ctr_id, &num_cntrs_free_map);
+}
+
 /**
  * rdtgroup_mode_by_closid - Return mode of resource group with closid
  * @closid: closid if the resource group
@@ -1931,6 +1936,43 @@ int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid)
 	return 0;
 }
 
+int resctrl_grp_unassign(struct rdtgroup *rdtgrp, u32 evtid)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_domain *d;
+	u32 mon_state;
+	int index;
+
+	index = mon_event_config_index_get(evtid);
+	if (index == INVALID_CONFIG_INDEX) {
+		pr_warn_once("Invalid event id %d\n", evtid);
+		return -EINVAL;
+	}
+
+	if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
+		mon_state = ASSIGN_TOTAL;
+	} else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
+		mon_state = ASSIGN_LOCAL;
+	} else {
+		rdt_last_cmd_puts("Invalid event id\n");
+		return -EINVAL;
+	}
+
+	if (rdtgrp->mon.mon_state & mon_state) {
+		list_for_each_entry(d, &r->domains, list)
+			resctrl_arch_assign(d, evtid, rdtgrp->mon.rmid,
+					    rdtgrp->mon.ctr_id[index],
+					    rdtgrp->closid, 0);
+
+		/* Update the counter bitmap */
+		num_cntrs_free(rdtgrp->mon.ctr_id[index]);
+	}
+
+	rdtgrp->mon.mon_state &= ~mon_state;
+
+	return 0;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (14 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 15/19] x86/resctrl: Add the interface to unassign " Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:50   ` Reinette Chatre
  2024-05-24 12:23 ` [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy Babu Moger
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Enable ABMC by default when feature is supported.

Counters are assigned automatically on group creation. If the counters
are exhausted, report the warnings and continue. It is not required to
fail group creation for assignment failures. Users will have the option
to modify the assignments.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Few name changes based on the upstream discussion.
    Commit message update.

v3: This is a new patch. Patch addresses the upstream comment to enable
    ABMC feature by default if the feature is available.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 69 ++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5ea1e58c7201..f452b6d9bb99 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2756,6 +2756,7 @@ static void rdt_disable_ctx(void)
 {
 	resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
 	resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
+	resctrl_abmc_disable(RDT_RESOURCE_L3);
 	set_mba_sc(false);
 
 	resctrl_debug = false;
@@ -2786,6 +2787,8 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
 	if (ctx->enable_debug)
 		resctrl_debug = true;
 
+	resctrl_abmc_enable(RDT_RESOURCE_L3);
+
 	return 0;
 
 out_cdpl3:
@@ -2882,6 +2885,41 @@ static void schemata_list_destroy(void)
 	}
 }
 
+static int resctrl_assign_events(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	int ret = 0;
+
+	if (!hw_res->abmc_enabled)
+		return 0;
+
+	if (is_mbm_total_enabled())
+		ret = resctrl_grp_assign(rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+	if (!ret && is_mbm_local_enabled())
+		ret = resctrl_grp_assign(rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
+
+	return ret;
+}
+
+static int resctrl_unassign_events(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	int ret = 0;
+
+	if (!hw_res->abmc_enabled)
+		return 0;
+
+	if (is_mbm_total_enabled())
+		ret = resctrl_grp_unassign(rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+	if (!ret && is_mbm_local_enabled())
+		ret = resctrl_grp_unassign(rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
+	return ret;
+}
+
 static int rdt_get_tree(struct fs_context *fc)
 {
 	struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -2941,6 +2979,14 @@ static int rdt_get_tree(struct fs_context *fc)
 		if (ret < 0)
 			goto out_mongrp;
 		rdtgroup_default.mon.mon_data_kn = kn_mondata;
+
+		/*
+		 * Assign the monitor counters if it is available. If it fails,
+		 * report the warnings and continue. It is not nessaccery to
+		 * fail here.
+		 */
+		if (resctrl_assign_events(&rdtgroup_default) < 0)
+			rdt_last_cmd_puts("Monitor assignment failed\n");
 	}
 
 	ret = rdt_pseudo_lock_init();
@@ -3214,6 +3260,8 @@ static void rdt_kill_sb(struct super_block *sb)
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
 
+	resctrl_unassign_events(&rdtgroup_default);
+
 	rdt_disable_ctx();
 
 	/*Put everything back to default values. */
@@ -3752,6 +3800,14 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
 		goto out_unlock;
 	}
 
+	/*
+	 * Assign the monitor counters if it is available. If it fails,
+	 * report the warnings and continue. It is not nessaccery to
+	 * fail here.
+	 */
+	if (resctrl_assign_events(rdtgrp) < 0)
+		rdt_last_cmd_puts("Monitor assignment failed\n");
+
 	kernfs_activate(rdtgrp->kn);
 
 	/*
@@ -3796,6 +3852,14 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 	if (ret)
 		goto out_closid_free;
 
+	/*
+	 * Assign the monitor counters if it is available. If it fails,
+	 * report the warnings and continue. It is not nessaccery to
+	 * fail here.
+	 */
+	if (resctrl_assign_events(rdtgrp) < 0)
+		rdt_last_cmd_puts("Monitor assignment failed\n");
+
 	kernfs_activate(rdtgrp->kn);
 
 	ret = rdtgroup_init_alloc(rdtgrp);
@@ -3891,6 +3955,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	update_closid_rmid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
+
+	resctrl_unassign_events(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	/*
@@ -3937,6 +4004,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);
 
+	resctrl_unassign_events(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (15 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  1:51   ` Reinette Chatre
  2024-06-21 14:43   ` Markus Elfring
  2024-05-24 12:23 ` [PATCH v4 18/19] x86/resctrl: Introduce interface to list monitor states of all the groups Babu Moger
                   ` (2 subsequent siblings)
  19 siblings, 2 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Introduce interface to switch between ABMC and legacy_mbm modes.

By default ABMC is enabled on resctrl mount if the feature is available.
However, user will have the option to go back to legacy_mbm if required.

$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign
[abmc]
mbm_legacy

To enable the legacy monitoring feature:
$ echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Minor commit text changes. Keep the default to ABMC when supported.

v3: New patch to address the review comments from upstream.
---
 Documentation/arch/x86/resctrl.rst     | 10 +++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 +++++++++++++++++++++++++-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index ab3cde61a124..fd050d4d22cd 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -271,6 +271,16 @@ with the following files:
 	  [abmc]
 	  mbm_legacy
 
+	* To enable ABMC feature:
+	  ::
+
+	    # echo  "abmc" > /sys/fs/resctrl/info/L3_MON/mbm_assign
+
+	* To enable the legacy monitoring feature:
+	  ::
+
+	    # echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f452b6d9bb99..d77ff059269a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -908,6 +908,43 @@ static int rdtgroup_mbm_assign_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+/*
+ * rdtgroup_mode_write - Modify the resource group's mode
+ */
+static ssize_t rdtgroup_mbm_assign_write(struct kernfs_open_file *of,
+					 char *buf, size_t nbytes,
+					 loff_t off)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	int ret = 0;
+
+	if (!r->abmc_capable)
+		return -EINVAL;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!strcmp(buf, "mbm_legacy"))
+		resctrl_abmc_disable(RDT_RESOURCE_L3);
+	else if (!strcmp(buf, "abmc"))
+		ret = resctrl_abmc_enable(RDT_RESOURCE_L3);
+	else
+		ret = -EINVAL;
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2087,9 +2124,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_assign",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= rdtgroup_mbm_assign_show,
+		.write		= rdtgroup_mbm_assign_write,
 	},
 	{
 		.name		= "cpus",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 18/19] x86/resctrl: Introduce interface to list monitor states of all the groups
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (16 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-05-24 12:23 ` [PATCH v4 19/19] x86/resctrl: Introduce interface to modify assignment states of " Babu Moger
  2024-06-14  0:54 ` [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
  19 siblings, 0 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Introduce the interface to list the monitor states of all the resctrl groups.

Example:
$cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control

List follows the following format:

"<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"

Format for specific type of groups:

- Default CTRL_MON group:
  "//<domain_id>=<assignment_flags>"

- Non-default CTRL_MON group:
  "<CTRL_MON group>//<domain_id>=<assignment_flags>"

- Child MON group of default CTRL_MON group:
  "/<MON group>/<domain_id>=<assignment_flags>"

- Child MON group of non-default CTRL_MON group:
  "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"


Assignment flags can be one of the following:
t  MBM total event is enabled
l  MBM local event is enabled
tl Both total and local MBM events are enabled
_  None of the MBM events are enabled

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Added functionality to query domain specific assigment in.
    rdtgroup_abmc_dom_state().

v3: New patch.
    Addresses the feedback to provide the global assignment interface.
    https://lore.kernel.org/lkml/c73f444b-83a1-4e9a-95d3-54c5165ee782@intel.com/
---
 Documentation/arch/x86/resctrl.rst     |  55 +++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  |   1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 129 +++++++++++++++++++++++++
 3 files changed, 185 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index fd050d4d22cd..255087de568d 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -281,6 +281,61 @@ with the following files:
 
 	    # echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign
 
+"mbm_assign_control":
+	Available when ABMC features are supported.
+	Reports the resctrl group and monitor status of each group.
+
+	List follows the following format:
+		"<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
+
+	Format for specific type of grpups:
+
+	* Default CTRL_MON group:
+		"//<domain_id>=<assignment_flags>"
+
+	* Non-default CTRL_MON group:
+		"<CTRL_MON group>//<domain_id>=<assignment_flags>"
+
+	* Child MON group of default CTRL_MON group:
+		"/<MON group>/<domain_id>=<assignment_flags>"
+
+	* Child MON group of non-default CTRL_MON group:
+		"<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
+
+	Assignment flags can be one of the following:
+	::
+
+	 t  MBM total event is enabled
+	 l  MBM local event is enabled
+	 tl Both total and local MBM events are enabled
+	 _  None of the MBM events are enabled
+
+	Examples:
+	::
+
+	 # mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
+	 # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
+	 # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp
+
+	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+	 non_default_ctrl_mon_grp//0=tl;1=tl;
+	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	 //0=tl;1=tl;
+	 /child_default_mon_grp/0=tl;1=tl;
+
+	 There are four resctrl groups. All the groups have total and local events are
+	 enabled on domain 0 and 1.
+
+	 non_default_ctrl_mon_grp// - This is a non default control mon group.
+
+	 non_default_ctrl_mon_grp/child_non_default_mon_grp/ - This is a child monitor
+	 group of the non default control mon group.
+
+	 // - This is a default control mon group.
+
+	 /child_default_mon_grp/ - This is a child monitor group of the default control
+	 mon group.
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ab0f4bb49bd9..f4dba02db9c0 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1083,6 +1083,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 				r->mon.num_cntrs = 64;
 			}
 			resctrl_file_fflags_init("num_cntrs", RFTYPE_MON_INFO);
+			resctrl_file_fflags_init("mbm_assign_control", RFTYPE_MON_INFO);
 		}
 	}
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d77ff059269a..bd3a54405402 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -945,6 +945,129 @@ static ssize_t rdtgroup_mbm_assign_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static void rdtgroup_abmc_dom_cfg(void *info)
+{
+	u64 *msrval = info;
+
+	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *msrval);
+	rdmsrl(MSR_IA32_L3_QOS_ABMC_DSC, *msrval);
+}
+
+/*
+ * Writing the counter id with CfgEn=0 on L3_QOS_ABMC_CFG and reading
+ * L3_QOS_ABMC_DSC back will return configuration of the counter
+ * specified.
+ */
+static int rdtgroup_abmc_dom_state(struct rdt_domain *d, u32 ctr_id, u32 rmid)
+{
+	union l3_qos_abmc_cfg abmc_cfg = { 0 };
+
+	abmc_cfg.split.cfg_en = 0;
+	abmc_cfg.split.ctr_id = ctr_id;
+
+	smp_call_function_any(&d->cpu_mask, rdtgroup_abmc_dom_cfg,
+			      &abmc_cfg, 1);
+
+	if (abmc_cfg.split.ctr_en && abmc_cfg.split.bw_src == rmid)
+		return 0;
+	else
+		return -1;
+}
+
+static char *mon_state_to_str(struct rdtgroup *rdtgrp,
+			      struct rdt_domain *d, char *str)
+{
+	char *tmp = str;
+	int dom_state = ASSIGN_NONE;
+
+	/*
+	 * Query the monitor state for the domain.
+	 * Index 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
+	 * Index 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
+	 */
+	if (rdtgrp->mon.mon_state & ASSIGN_TOTAL)
+		if (!rdtgroup_abmc_dom_state(d, rdtgrp->mon.ctr_id[0], rdtgrp->mon.rmid))
+			dom_state |= ASSIGN_TOTAL;
+
+	if (rdtgrp->mon.mon_state & ASSIGN_LOCAL)
+		if (!rdtgroup_abmc_dom_state(d, rdtgrp->mon.ctr_id[1], rdtgrp->mon.rmid))
+			dom_state |= ASSIGN_LOCAL;
+
+	switch (dom_state) {
+	case ASSIGN_NONE:
+		*tmp++ = '_';
+		break;
+	case (ASSIGN_TOTAL | ASSIGN_LOCAL):
+		*tmp++ = 't';
+		*tmp++ = 'l';
+		break;
+	case ASSIGN_TOTAL:
+		*tmp++ = 't';
+		break;
+	case ASSIGN_LOCAL:
+		*tmp++ = 'l';
+		break;
+	default:
+		break;
+	}
+
+	*tmp = '\0';
+	return str;
+}
+
+static int rdtgroup_mbm_assign_control_show(struct kernfs_open_file *of,
+					    struct seq_file *s, void *v)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+	struct rdt_domain *dom;
+	struct rdtgroup *rdtg;
+	int grp_default = 0;
+	char str[10];
+
+	if (!hw_res->abmc_enabled) {
+		rdt_last_cmd_puts("ABMC feature is not enabled\n");
+		return -EINVAL;
+	}
+
+	mutex_lock(&rdtgroup_mutex);
+
+	list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
+		struct rdtgroup *crg;
+
+		if (rdtg == &rdtgroup_default) {
+			grp_default = 1;
+			seq_puts(s, "//");
+		} else {
+			grp_default = 0;
+			seq_printf(s, "%s//", rdtg->kn->name);
+		}
+
+		list_for_each_entry(dom, &r->domains, list)
+			seq_printf(s, "%d=%s;", dom->id,
+				   mon_state_to_str(rdtg, dom, str));
+		seq_putc(s, '\n');
+
+		list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
+				    mon.crdtgrp_list) {
+			if (grp_default)
+				seq_printf(s, "/%s/", crg->kn->name);
+			else
+				seq_printf(s, "%s/%s/", rdtg->kn->name,
+					   crg->kn->name);
+
+			list_for_each_entry(dom, &r->domains, list)
+				seq_printf(s, "%d=%s;", dom->id,
+					   mon_state_to_str(crg, dom, str));
+			seq_putc(s, '\n');
+		}
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+
+	return 0;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2143,6 +2266,12 @@ static struct rftype res_common_files[] = {
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= rdtgroup_num_cntrs_show,
 	},
+	{
+		.name		= "mbm_assign_control",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= rdtgroup_mbm_assign_control_show,
+	},
 	{
 		.name		= "cpus_list",
 		.mode		= 0644,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v4 19/19] x86/resctrl: Introduce interface to modify assignment states of the groups
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (17 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 18/19] x86/resctrl: Introduce interface to list monitor states of all the groups Babu Moger
@ 2024-05-24 12:23 ` Babu Moger
  2024-06-14  0:54 ` [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
  19 siblings, 0 replies; 66+ messages in thread
From: Babu Moger @ 2024-05-24 12:23 UTC (permalink / raw)
  To: corbet, fenghua.yu, reinette.chatre, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, babu.moger,
	kim.phillips, lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Introduce the interface to enable events in ABMC mode.

Events can be enabled or disabled by writing to file
/sys/fs/resctrl/info/L3_MON/mbm_assign_control

Format is similar to the list format with addition of op-code for the
assignment operation.
 "<CTRL_MON group>/<MON group>/<op-code><assignment_flags>"

Format for specific type of groups:

 * Default CTRL_MON group:
         "//<domain_id><op-code><assignment_flags>"

 * Non-default CTRL_MON group:
         "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"

 * Child MON group of default CTRL_MON group:
         "/<MON group>/<domain_id><op-code><assignment_flags>"

 * Child MON group of non-default CTRL_MON group:
         "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"

Op-code can be one of the following:

 = Update the assignment to match the flags
 + enable a new state
 - disable a new state

Assignment flags can be one of the following:
 t  MBM total event is enabled
 l  MBM local event is enabled
 tl Both total and local MBM events are enabled
 _  None of the MBM events are enabled

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Added domain specific assignments. Fixed the opcode parsing.

v3: New patch.
    Addresses the feedback to provide the global assignment interface.
    https://lore.kernel.org/lkml/c73f444b-83a1-4e9a-95d3-54c5165ee782@intel.com/
---
 Documentation/arch/x86/resctrl.rst     |  81 +++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 242 ++++++++++++++++++++++++-
 2 files changed, 322 insertions(+), 1 deletion(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 255087de568d..89af91805b38 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -336,6 +336,87 @@ with the following files:
 	 /child_default_mon_grp/ - This is a child monitor group of the default control
 	 mon group.
 
+	Assignment state can be updated by writing to the interface.
+
+	Format is similar to the list format with addition of op-code for the
+	assignment operation.
+
+		"<CTRL_MON group>/<MON group>/<op-code><assignment_flags>"
+
+	Format for each type of groups:
+
+        * Default CTRL_MON group:
+                "//<domain_id><op-code><assignment_flags>"
+
+        * Non-default CTRL_MON group:
+                "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
+
+        * Child MON group of default CTRL_MON group:
+                "/<MON group>/<domain_id><op-code><assignment_flags>"
+
+        * Child MON group of non-default CTRL_MON group:
+                "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"
+
+	Op-code can be one of the following:
+	::
+
+	 = Update the assignment to match the flags
+	 + Add a new state
+	 - delete a new state
+
+	Examples:
+	::
+
+	  Initial group status:
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	  //0=tl;1=tl;
+	  /child_default_mon_grp/0=tl;1=tl;
+
+	  To update the default group to enable only total event on domain 0:
+	  # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+
+	  Assignment status after the update:
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	  //0=t;1=tl;2=tl;
+	  /child_default_mon_grp/0=tl;1=tl;
+
+	  To update the MON group child_default_mon_grp to remove total event on domain 1:
+	  # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+
+	  Assignment status after the update:
+	  $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	  //0=t;1=l;
+	  /child_default_mon_grp/0=t;1=tl;
+
+	  To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
+	  remove both local and total events on domain 1:
+	  # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
+			/sys/fs/resctrl/info/L3_MON/mbm_assign_control
+
+	  Assignment status after the update:
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
+	  //0=t;1=l;
+	  /child_default_mon_grp/0=t;1=tl;
+
+	  To update the default group to add a total event domain 1.
+	  # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+
+	  Assignment status after the update:
+
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
+	  //0=t;1=tl;
+	  /child_default_mon_grp/0=t;1=tl;
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index bd3a54405402..b74c18927bd2 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1068,6 +1068,245 @@ static int rdtgroup_mbm_assign_control_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int str_to_mon_state(char *flag)
+{
+	int i, mon_state = 0;
+
+	for (i = 0; i < strlen(flag); i++) {
+		switch (*(flag + i)) {
+		case 't':
+			mon_state |= ASSIGN_TOTAL;
+			break;
+		case 'l':
+			mon_state |= ASSIGN_LOCAL;
+			break;
+		case '_':
+			mon_state = ASSIGN_NONE;
+			break;
+		default:
+			mon_state = ASSIGN_NONE;
+			break;
+		}
+	}
+
+	return mon_state;
+}
+
+static struct rdtgroup *resctrl_get_rdtgroup(enum rdt_group_type rtype, char *p_grp, char *c_grp)
+{
+	struct rdtgroup *rdtg, *crg;
+
+	if (rtype == RDTCTRL_GROUP && *p_grp == '\0') {
+		return &rdtgroup_default;
+	} else if (rtype == RDTCTRL_GROUP) {
+		list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list)
+			if (!strcmp(p_grp, rdtg->kn->name))
+				return rdtg;
+	} else if (rtype == RDTMON_GROUP) {
+		list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
+			if (!strcmp(p_grp, rdtg->kn->name)) {
+				list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
+						    mon.crdtgrp_list) {
+					if (!strcmp(c_grp, crg->kn->name))
+						return crg;
+				}
+			}
+		}
+	}
+
+	return NULL;
+}
+
+static int resctrl_process_flags(enum rdt_group_type rtype, char *p_grp, char *c_grp, char *tok)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	int op, mon_state, assign_state, unassign_state;
+	char *dom_str, *id_str, *op_str;
+	struct rdtgroup *rdtgrp;
+	struct rdt_domain *d;
+	unsigned long dom_id;
+	int ret, found = 0;
+
+	rdtgrp = resctrl_get_rdtgroup(rtype, p_grp, c_grp);
+
+	if (!rdtgrp) {
+		rdt_last_cmd_puts("Not a valid resctrl group\n");
+		return -EINVAL;
+	}
+
+next:
+	if (!tok || tok[0] == '\0')
+		return 0;
+
+	/* Start processing the strings for each domain */
+	dom_str = strim(strsep(&tok, ";"));
+
+	op_str = strpbrk(dom_str, "=+-");
+
+	if (op_str) {
+		op = *op_str;
+	} else {
+		rdt_last_cmd_puts("Missing operation =, +, -, _ character\n");
+		return -EINVAL;
+	}
+
+	id_str = strsep(&dom_str, "=+-");
+
+	if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
+		rdt_last_cmd_puts("Missing domain id\n");
+		return -EINVAL;
+	}
+
+	/* Verify if the dom_id is valid */
+	list_for_each_entry(d, &r->domains, list) {
+		if (d->id == dom_id) {
+			found = 1;
+			break;
+		}
+	}
+	if (!found) {
+		rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
+		return -EINVAL;
+	}
+
+	mon_state = str_to_mon_state(dom_str);
+
+	assign_state = 0;
+	unassign_state = 0;
+
+	switch (op) {
+	case '+':
+		assign_state = mon_state;
+		break;
+	case '-':
+		unassign_state = mon_state;
+		break;
+	case '=':
+		assign_state = mon_state;
+		unassign_state = (ASSIGN_TOTAL | ASSIGN_LOCAL) & ~assign_state;
+		break;
+	default:
+		break;
+	}
+
+	if (assign_state & ASSIGN_TOTAL)
+		ret = resctrl_arch_assign(d, QOS_L3_MBM_TOTAL_EVENT_ID,
+					  rdtgrp->mon.rmid,
+					  rdtgrp->mon.ctr_id[0],
+					  rdtgrp->closid, 1);
+	if (ret)
+		goto out_fail;
+
+	if (assign_state & ASSIGN_LOCAL)
+		ret = resctrl_arch_assign(d, QOS_L3_MBM_LOCAL_EVENT_ID,
+					  rdtgrp->mon.rmid,
+					  rdtgrp->mon.ctr_id[1],
+					  rdtgrp->closid, 1);
+
+	if (ret)
+		goto out_fail;
+
+	if (unassign_state & ASSIGN_TOTAL)
+		ret = resctrl_arch_assign(d, QOS_L3_MBM_TOTAL_EVENT_ID,
+					  rdtgrp->mon.rmid,
+					  rdtgrp->mon.ctr_id[0],
+					  rdtgrp->closid, 0);
+
+	if (ret)
+		goto out_fail;
+
+	if (unassign_state & ASSIGN_LOCAL)
+		ret = resctrl_arch_assign(d, QOS_L3_MBM_LOCAL_EVENT_ID,
+					  rdtgrp->mon.rmid,
+					  rdtgrp->mon.ctr_id[1],
+					  rdtgrp->closid, 0);
+	if (ret)
+		goto out_fail;
+
+	goto next;
+
+out_fail:
+
+	return -EINVAL;
+}
+
+static ssize_t rdtgroup_mbm_assign_control_write(struct kernfs_open_file *of,
+						 char *buf, size_t nbytes,
+						 loff_t off)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	char *token, *cmon_grp, *mon_grp;
+	struct rdt_hw_resource *hw_res;
+	int ret;
+
+	hw_res = resctrl_to_arch_res(r);
+	if (!hw_res->abmc_enabled)
+		return -EINVAL;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+	rdt_last_cmd_clear();
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	while ((token = strsep(&buf, "\n")) != NULL) {
+		if (strstr(token, "//")) {
+			/*
+			 * The control mon group processing:
+			 * default CTRL_MON group: "//<flags>"
+			 * non-default CTRL_MON group: "<CTRL_MON group>//flags"
+			 * The CTRL_MON group will be empty string if it is a
+			 * default group.
+			 */
+			cmon_grp = strsep(&token, "//");
+
+			/*
+			 * strsep returns empty string for contiguous delimiters.
+			 * Make sure check for two consicutive delimiters and
+			 * advance the token.
+			 */
+			mon_grp = strsep(&token, "//");
+			if (*mon_grp != '\0') {
+				rdt_last_cmd_printf("Invalid CTRL_MON group format %s\n", token);
+				ret = -EINVAL;
+				break;
+			}
+
+			ret = resctrl_process_flags(RDTCTRL_GROUP, cmon_grp, mon_grp, token);
+			if (ret)
+				break;
+		} else if (strstr(token, "/")) {
+			/*
+			 * Mon group processing:
+			 * MON_GROUP inside default CTRL_MON group: "/<MON group>/<flags>"
+			 * MON_GROUP within CTRL_MON group: "<CTRL_MON group>/<MON group>/<flags>"
+			 */
+			cmon_grp = strsep(&token, "/");
+
+			/* Extract the MON_GROUP. It cannot be empty string */
+			mon_grp = strsep(&token, "/");
+			if (*mon_grp == '\0') {
+				rdt_last_cmd_printf("Invalid MON_GROUP format %s\n", token);
+				ret = -EINVAL;
+				break;
+			}
+
+			ret = resctrl_process_flags(RDTMON_GROUP, cmon_grp, mon_grp, token);
+			if (ret)
+				break;
+		}
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2268,9 +2507,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_assign_control",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= rdtgroup_mbm_assign_control_show,
+		.write		= rdtgroup_mbm_assign_control_write,
 	},
 	{
 		.name		= "cpus_list",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
                   ` (18 preceding siblings ...)
  2024-05-24 12:23 ` [PATCH v4 19/19] x86/resctrl: Introduce interface to modify assignment states of " Babu Moger
@ 2024-06-14  0:54 ` Reinette Chatre
  2024-06-18 21:02   ` Moger, Babu
  19 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:54 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> 
> 
> d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>     to list and modify the group's assignment states.

There was a lot of discussion resulting in this centralized file. At first glance this
file appears to be very complicated and I believe any reasonable person would wonder if
all of this is necessary. I recommend that you add a motivation for why this file is needed.
Some items I recall are : it makes it easier for user space to learn how counters are used (no
need to traverse resctrl and open()/close() many files), on the resctrl side it makes
it possible to support counter re-assignment with a single IPI. There may be other motivations
that I am forgetting now.

Also, could the name just be "mbm_control"? What is enabled at this time are "assignable
counters" but in the future we may want to add support for other flags that have nothing to
do with "assignable counters".

> 
> 	The list follows the following format:
> 
> 	"<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"

"assignment_flags" -> "flags" ? (throughout)

> 
> 
> 	Format for specific type of groups:
> 
> 	* Default CTRL_MON group:
> 	 "//<domain_id>=<assignment_flags>"
> 
>         * Non-default CTRL_MON group:
>                 "<CTRL_MON group>//<domain_id>=<assignment_flags>"
> 
>         * Child MON group of default CTRL_MON group:
>                 "/<MON group>/<domain_id>=<assignment_flags>"
> 
>         * Child MON group of non-default CTRL_MON group:
>                 "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
> 
>         Assignment flags can be one of the following:
> 
>          t  MBM total event is enabled
>          l  MBM local event is enabled
>          tl Both total and local MBM events are enabled
>          _  None of the MBM events are enabled
> 
> 	Examples:
> 
> 	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	non_default_ctrl_mon_grp//0=tl;1=tl;
> 	non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> 	//0=tl;1=tl;
> 	/child_default_mon_grp/0=tl;1=tl;
> 
> 	There are four groups and all the groups have local and total
> 	event enabled on domain 0 and 1.
> 
> 	=tl means both total and local events are enabled.
> 
> 	"//" - This is a default CONTROL MON group
> 
> 	"non_default_ctrl_mon_grp//" - This is non default CONTROL MON group

Be consistent with "non-default" (vs non default) as well as "CTRL_MON" (vs
CONTROL MON).

> 
> 	"/child_default_mon_grp/"  - This is Child MON group of the defult group

"Child" -> "child"
"defult" -> "default"

> 
> 	"non_default_ctrl_mon_grp/child_non_default_mon_grp/" - This is child
> 	MON group of the non default group

non-default

> 
> e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control.
> 
> 	The write format is similar to the above list format with addition of
> 	op-code for the assignment operation.
> 	
> 	* Default CTRL_MON group:
> 	        "//<domain_id><op-code><assignment_flags>"
> 	
> 	* Non-default CTRL_MON group:
> 	        "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
> 	
> 	* Child MON group of default CTRL_MON group:
> 	        "/<MON group>/<domain_id><op-code><assignment_flags>"
> 	
> 	* Child MON group of non-default CTRL_MON group:
> 	        "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"
> 	
> 	Op-code can be one of the following:
> 	
> 	= Update the assignment to match the flags
> 	+ Assign a new state
> 	- Unassign a new state

Looking here and the implementation it seems that "+_" and "-_" is supported.
I think that should be invalid. Only "=_" seems appropriate to me.
Also please take care to not have a catchall "default" that does an
unassign. Doing something like that will prevent us from ever being
able to add any flags in the future.

> 	
> 	
> 	Initial group status:
> 	
> 	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	non_default_ctrl_mon_grp//0=tl;1=tl;
> 	non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> 	//0=tl;1=tl;
> 	/child_default_mon_grp/0=tl;1=tl;
> 	
> 	 To update the default group to enable only total event on domain 0:
> 	 # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	
> 	 Assignment status after the update:
> 	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	 non_default_ctrl_mon_grp//0=tl;1=tl;
> 	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> 	 //0=t;1=tl;
> 	 /child_default_mon_grp/0=tl;1=tl;
> 	
> 	 To update the MON group child_default_mon_grp to remove total event on domain 1:
> 	 # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	
> 	 Assignment status after the update:
> 	 $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	 non_default_ctrl_mon_grp//0=tl;1=tl;
> 	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> 	 //0=t;1=l;
> 	 /child_default_mon_grp/0=t;1=tl;

This does not look right. Why did domain #1 of the default CTRL_MON group change also?

> 	
> 	 To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
> 	 remove both local and total events on domain 1:
> 	 # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
> 	               /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	
> 	 Assignment status after the update:
> 	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	 non_default_ctrl_mon_grp//0=tl;1=tl;
> 	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> 	 //0=t;1=l;
> 	 /child_default_mon_grp/0=t;1=tl;
> 	
> 	 To update the default group to add a total event domain 1.
> 	 # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	

Unclear where "t" flag was removed.

> 	 Assignment status after the update:
> 	
> 	 # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> 	 non_default_ctrl_mon_grp//0=tl;1=tl;
> 	 non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> 	 //0=t;1=tl;
> 	 /child_default_mon_grp/0=t;1=tl;
> 	
> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
>     There is no change in reading the evetns with ABMC. If the event is unassigned

"evetns" -> "events"

>     when reading, then the read will come back as Unavailable.

Should this not rather be "Unassigned"? According to the docs the counters
will return "Unavailable" right after reconfigure so it seems that there
are scenarios where an "assigned" counter returns "Unavailable". It seems more
useful to return "Unassigned" that will have a new specific meaning that
overloading existing "Unavailable" that has original meaning of "try again" ....
but in this case trying again will be futile.

> 	
> 	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 	779247936
> 	# cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> 	765207488
> 	
> g. Users will have the option to go back to legacy_mbm mode if required.
>     This can be done using the following command.
> 
> 	# echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
> 	# cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>          abmc
>          [mbm_legacy]

It is confusing for the value written by user space to be different from
the value displayed: "legacy_mbm" vs "mbm_legacy.

This is still missing information about what happens to the counters/events on
such a switch. Will events just keep counting? Will they be reset? ...?

I also think we should try to find a more generic name for this file.
"mbm_cntr_mode" or "mbm_mode" maybe?

> 
> h. Check the bandwidth configuration for the group. Note that bandwidth
>     configuration has a domain scope. Total event defaults to 0x7F (to
>     count all the events) and local event defaults to 0x15 (to count all
>     the local numa events). The event bitmap decoding is available at
>     https://www.kernel.org/doc/Documentation/x86/resctrl.rst
>     in section "mbm_total_bytes_config", "mbm_local_bytes_config":
> 	
> 	#cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 	0=0x7f;1=0x7f
> 	
> 	#cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> 	0=0x15;1=0x15
> 	
> j. Change the bandwidth source for domain 0 for the total event to count only reads.
>     Note that this change effects total events on the domain 0.
> 	
> 	#echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 	#cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 	0=0x33;1=0x7F
> 	
> k. Now read the total event again. The mbm_total_bytes should display
>     only the read events.
> 	
> 	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 	314101

According to doc, right after a BMEC change the counter will read "Unavailable"
is this not the case here?

> 	
> l. Unmount the resctrl
> 	
> 	#umount /sys/fs/resctrl/

Reinette


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 01/19] x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2024-05-24 12:23 ` [PATCH v4 01/19] x86/resctrl: Add support for " Babu Moger
@ 2024-06-14  0:54   ` Reinette Chatre
  2024-06-18 21:02     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:54 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

Could you please change shortlog prefix to be "x86/cpufeatures:" to
make it obvious where this patch is headed?

Reinette


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource
  2024-05-24 12:23 ` [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
@ 2024-06-14  0:55   ` Reinette Chatre
  2024-06-18 21:02     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:55 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> Consolidate all the data related to monitoring into separate structure
> for better readability.

This needs a better motivation because "better readability" is subjective.
You can motivate this work by noting that this is re-organization in preparation
for more monitoring specific properties that will clobber the existing resource
struct more. To support this organization the pattern of the cache allocation
and memory bandwidth allocation features that consolidate the feature specific
properties into a struct is followed.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2024-05-24 12:23 ` [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
@ 2024-06-14  0:56   ` Reinette Chatre
  2024-06-18 21:03     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:56 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.
> Bits Description
> 15:0 MAX_ABMC Maximum Supported Assignable Bandwidth
>       Monitoring Counter ID + 1
> 
> The feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Removed resctrl_arch_has_abmc(). Added all the code inline. We dont
>      need to separate this as arch code.
> 
> v3: Removed changes related to mon_features.
>      Moved rdt_cpu_has to core.c and added new function resctrl_arch_has_abmc.
>      Also moved the fields mbm_assign_capable and mbm_assign_cntrs to
>      rdt_resource. (James)
> 
> v2: Changed the field name to mbm_assign_capable from abmc_capable.
> ---
>   arch/x86/kernel/cpu/resctrl/monitor.c | 14 ++++++++++++++
>   include/linux/resctrl.h               |  4 ++++
>   2 files changed, 18 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index b35d04fc761b..1602b58ba23d 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1066,6 +1066,20 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>   			mbm_local_event.configurable = true;
>   			mbm_config_rftype_init("mbm_local_bytes_config");
>   		}
> +
> +		if (rdt_cpu_has(X86_FEATURE_ABMC)) {
> +			r->abmc_capable = true;
> +			/*
> +			 * Query CPUID_Fn80000020_EBX_x05 for number of
> +			 * ABMC counters
> +			 */
> +			cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
> +			r->mon.num_cntrs = (ebx & 0xFFFF) + 1;
> +			if (r->mon.num_cntrs > 64) {
> +				WARN(1, "Cannot support more than 64 ABMC counters\n");

if (WARN_ON(...))

> +				r->mon.num_cntrs = 64;
> +			}
> +		}
>   	}
>   
>   	l3_mon_evt_init(r);
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index bf99eb9c6ce4..24087e6efbb6 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -153,10 +153,12 @@ struct resctrl_schema;
>   /**
>    * struct resctrl_mon - Monitoring related data
>    * @num_rmid:		Number of RMIDs available
> + * @num_cntrs:		Maximum number of abmc counters

Maximum implies some limit and it is not clear what limit this applies to.
Can this just be "Number of monitoring counters".

Thinking ahead ... when switching between different "MBM modes" it
may be that one mode has a different number of counters than the other.
Since "ABMC" is just one mode it does not seem appropriate to
connect the "num_cntrs" property to ABMC.


>    * @evt_list:		List of monitoring events
>    */
>   struct resctrl_mon {
>   	int			num_rmid;
> +	int			num_cntrs;
>   	struct list_head	evt_list;
>   };
>   
> @@ -177,6 +179,7 @@ struct resctrl_mon {
>    * @parse_ctrlval:	Per resource function pointer to parse control values
>    * @fflags:		flags to choose base and info files
>    * @cdp_capable:	Is the CDP feature available on this resource
> + * @abmc_capable:	Is system capable of supporting monitor assignment?
>    */
>   struct rdt_resource {
>   	int			rid;
> @@ -196,6 +199,7 @@ struct rdt_resource {
>   						 struct rdt_domain *d);
>   	unsigned long		fflags;
>   	bool			cdp_capable;
> +	bool			abmc_capable;

Shouldn't abmc_capable be a property of the new struct resctrl_mon?

>   };
>   
>   /**

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags
  2024-05-24 12:23 ` [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags Babu Moger
@ 2024-06-14  0:57   ` Reinette Chatre
  2024-06-18 21:03     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:57 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

Shortlog: "Introduce resctrl_file_fflags_init() ..."


On 5/24/24 5:23 AM, Babu Moger wrote:
> The functions thread_throttle_mode_init() and mbm_config_rftype_init()

You can drop "The functions".

> both initialize fflags for resctrl files. Adding the new files will

"Adding the new files" -> "Adding new files"? (this should be in new
paragraph)

> involve adding another function to initialize the fflags. All of this

Solution should be in new paragraph and written in imperative mood.

> can be simplified by adding a new function resctrl_file_fflags_init()
> and passing the file name and flags to be initialized.
> 
> Consolidate multiple fflags initialization into resctrl_file_fflags_init()
> and remove thread_throttle_mode_init() and mbm_config_rftype_init().
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

Patch looks good.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters
  2024-05-24 12:23 ` [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters Babu Moger
@ 2024-06-14  0:57   ` Reinette Chatre
  2024-06-18 21:04     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:57 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID and monitor the bandwidth as long as the counter
> is assigned. Number of assignments depend on number of ABMC counters
> available.

Take care that this interface will not just be used by ABMC. I assumed that
when the user switches to "soft-RMID" then the value of "num_cntrs" will change?

> 
> Provide the interface to display the number of ABMC counters supported.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Changed the counter name to num_cntrs. And few text changes.
> 
> v3: Changed the field name to mbm_assign_cntrs.
> 
> v2: Changed the field name to mbm_assignable_counters from abmc_counters.
> ---
>   Documentation/arch/x86/resctrl.rst     |  4 ++++
>   arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 ++++++++++++++++
>   3 files changed, 21 insertions(+)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 02790efaabcc..7ab8172ef208 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -257,6 +257,10 @@ with the following files:
>   	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>   	    0=0x30;1=0x30;3=0x15;4=0x15
>   
> +"num_cntrs":
> +	Available when ABMC feature is supported. The number of ABMC counters
> +	available for configuration.

This can only be understood by folks already familiar with AMD's ABMC feature. There is
no information about what "ABMC feature" is, what an "ABMC counter" is and what
"configuration" can be done with it.

Do you think this num_cntrs will only be used by ABMC? What will happen when user
enables "soft-RMID" or some other mode?

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature
  2024-05-24 12:23 ` [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature Babu Moger
@ 2024-06-14  0:59   ` Reinette Chatre
  2024-06-19 15:37     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  0:59 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> Add the functionality to enable/disable ABMC feature.
> 
> ABMC feature is enabled by setting enabled bit(0) in MSR L3_QOS_EXT_CFG.
> When the state of ABMC is changed, the MSR needs to be updated on all
> the logical processors in the QOS Domain.
> 
> Hardware counters will reset when ABMC state is changed. Kernel internal
> counters need to be reset to avoid overflow condition in the next update.

Please note that there are two "ABMC" features introduced in this series.
First, there is the "abmc" resctrl fs feature that just happens to have
the same name as AMD's "ABMC" (which may be a good motivation to change
this name). Second, there is the architecture (AMD) specific "ABMC" feature
that is enabled in response to user's request to enable the
resctrl "abmc" feature.

Other architectures need to support resctrl fs "abmc" feature with something
entirely different.

Please consider this distinction during this series because it is often
blurred.

> 
> The ABMC feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
> v4: Removed resctrl_arch_set_abmc_enabled and resctrl_arch_get_abmc_enabled.
>      Directly calling resctrl_abmc_enable and resctrl_abmc_disable.
>      Renamed couple of functions.
>      resctrl_abmc_msrwrite() -> resctrl_abmc_set_one()
>      resctrl_abmc_setup() -> resctrl_abmc_set_all()
>      Added rdtgroup_mutex lockdep asserts.
>      Updated commit log and code comments.
> 
> v3: No changes.
> 
> v2: Few text changes in commit message.
> ---
>   arch/x86/include/asm/msr-index.h       |  1 +
>   arch/x86/kernel/cpu/resctrl/internal.h |  8 ++++
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++++++++++++++++++++
>   3 files changed, 71 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index e022e6eb766c..5f9a0139e98c 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -1171,6 +1171,7 @@
>   #define MSR_IA32_MBA_BW_BASE		0xc0000200
>   #define MSR_IA32_SMBA_BW_BASE		0xc0000280
>   #define MSR_IA32_EVT_CFG_BASE		0xc0000400
> +#define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
>   
>   /* MSR_IA32_VMX_MISC bits */
>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index d566251094b2..fabe40304798 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -97,6 +97,9 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
>   	return cpu;
>   }
>   
> +/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature */
> +#define ABMC_ENABLE			BIT(0)
> +
>   struct rdt_fs_context {
>   	struct kernfs_fs_context	kfc;
>   	bool				enable_cdpl2;
> @@ -436,6 +439,7 @@ struct rdt_parse_data {
>    * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
>    *			Monitoring Event Configuration (BMEC) is supported.
>    * @cdp_enabled:	CDP state of this resource
> + * @abmc_enabled:	ABMC feature is enabled
>    *
>    * Members of this structure are either private to the architecture
>    * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
> @@ -450,6 +454,7 @@ struct rdt_hw_resource {
>   	unsigned int		mbm_width;
>   	unsigned int		mbm_cfg_mask;
>   	bool			cdp_enabled;
> +	bool			abmc_enabled;
>   };
>   

ok, so here by making "abmc_enabled" a member of struct rdt_hw_resource this is
an architecture specific property. This is reasonable since every architecture
will look different. What is _not_ ok is that this causes the rest of the
series to change resctrl fs to reach into the architecture code. For example,
this work causes mbm_config_show() to now need to peek into struct rdt_hw_resource
to see this value. That is not ok. All of the interactions between this
field and resctrl fs needs to be via arch helpers: resctrl_arch_abmc_enable()/
resctrl_arch_abmc_disable() and resctrl_arch_get_abmc_enabled()/resctrl_arch_set_abmc_enabled().

>   static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
> @@ -493,6 +498,9 @@ static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
>   
>   int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
>   
> +int resctrl_abmc_enable(enum resctrl_res_level l);
> +void resctrl_abmc_disable(enum resctrl_res_level l);

Why do these need enum resctrl_res_level parameter?

> +
>   /*
>    * To return the common struct rdt_resource, which is contained in struct
>    * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index ca692712b393..9148d1234ede 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2416,6 +2416,68 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable)
>   	return 0;
>   }
>   
> +static void resctrl_abmc_set_one(void *arg)
> +{
> +	bool *enable = arg;
> +	u64 msrval;
> +
> +	rdmsrl(MSR_IA32_L3_QOS_EXT_CFG, msrval);
> +
> +	if (*enable)
> +		msrval |= ABMC_ENABLE;
> +	else
> +		msrval &= ~ABMC_ENABLE;
> +
> +	wrmsrl(MSR_IA32_L3_QOS_EXT_CFG, msrval);
> +}
> +
> +static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)

Should this function and resctrl_abmc_set_one() perhaps have "amd" in the
name just to enforce that this is not filesystem code at all and specific
and unique to AMD.

> +{
> +	struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
> +	struct rdt_domain *d;
> +
> +	/*
> +	 * Update QOS_CFG MSR on all the CPUs associated with the resource

end of sentence needs "."

This comment about the specific register seems more appropriate to
resctrl_abmc_set_one() though. This function is a higher level
enable/disable of the hardware feature.

> +	 * Hardware counters will reset after switching the monotor mode.

monotor -> monitor

> +	 * Reset the internal counters so that it is not considered as
> +	 * an overflow in next update.

For the first time the term "internal counters" is introduced. What does it
mean?

> +	 */
> +	list_for_each_entry(d, &r->domains, list) {
> +		on_each_cpu_mask(&d->cpu_mask, resctrl_abmc_set_one, &enable, 1);
> +		resctrl_arch_reset_rmid_all(r, d);
> +	}
> +
> +	return 0;
> +}
> +
> +int resctrl_abmc_enable(enum resctrl_res_level l)
> +{
> +	struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
> +	int ret = 0;
> +
> +	lockdep_assert_held(&rdtgroup_mutex);
> +
> +	if (!hw_res->abmc_enabled) {
> +		ret = resctrl_abmc_set_all(l, true);
> +		if (!ret)
> +			hw_res->abmc_enabled = true;

This error handling seems useless since resctrl_abmc_set_all() always returns
0 ... perhaps it should return void instead and this error handling dropped?
With that this function can never fail either and it can just return void,
but this is probably not what we want as the architecture call since other
architectures may fail.

> +	}
> +
> +	return ret;
> +}
> +
> +void resctrl_abmc_disable(enum resctrl_res_level l)
> +{
> +	struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
> +
> +	lockdep_assert_held(&rdtgroup_mutex);
> +
> +	if (hw_res->abmc_enabled) {
> +		resctrl_abmc_set_all(l, false);
> +		hw_res->abmc_enabled = false;
> +	}
> +}
> +
>   /*
>    * We don't allow rdtgroup directories to be created anywhere
>    * except the root directory. Thus when looking for the rdtgroup

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode
  2024-05-24 12:23 ` [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
@ 2024-06-14  1:40   ` Reinette Chatre
  2024-06-19 16:25     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:40 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID and monitor the bandwidth as long as it is assigned.
> ABMC mode is enabled by default when supported. System can be one mode
> at a time (Legacy monitor mode or ABMC mode).
> 
> Provide an interface to display the monitor mode on the system.
>      $cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>      [abmc]
>      legacy

Output is different from cover letter and what this patch implements.

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Fixed the checks for legacy and abmc mode. Default it ABMC.
> 
> v3: New patch to display ABMC capability.
> ---
>   Documentation/arch/x86/resctrl.rst     | 10 ++++++++++
>   arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++++++++
>   3 files changed, 34 insertions(+)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 7ab8172ef208..ab3cde61a124 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -261,6 +261,16 @@ with the following files:
>   	Available when ABMC feature is supported. The number of ABMC counters
>   	available for configuration.
>   
> +"mbm_assign":

This name is not ideal but I am having trouble finding a better one ... I have
seen you use "monitor mode" a couple of times (even in shortlog), so maybe that
could be the start of a more generic name? "mbm_mode"?

> +	Available when ABMC feature is supported. Reports the list of assignable

Why not always make this file available? At least it will display that
legacy mode is supported and it gives user space an always present file to query to
determine support.

> +	monitoring features supported. The enclosed brackets indicate which
> +	feature is enabled.
> +	::
> +
> +	  cat /sys/fs/resctrl/info/L3_MON/mbm_assign
> +	  [abmc]
> +	  mbm_legacy
> +
>   "max_threshold_occupancy":
>   		Read/write file provides the largest value (in
>   		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index e75a6146068b..b1d002e5e93d 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1071,6 +1071,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>   
>   		if (rdt_cpu_has(X86_FEATURE_ABMC)) {
>   			r->abmc_capable = true;
> +			resctrl_file_fflags_init("mbm_assign", RFTYPE_MON_INFO);
>   			/*
>   			 * Query CPUID_Fn80000020_EBX_x05 for number of
>   			 * ABMC counters
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 9148d1234ede..3071bbb7a15e 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -856,6 +856,23 @@ static int rdtgroup_num_cntrs_show(struct kernfs_open_file *of,
>   	return 0;
>   }
>   
> +static int rdtgroup_mbm_assign_show(struct kernfs_open_file *of,
> +				    struct seq_file *s, void *v)
> +{
> +	struct rdt_resource *r = of->kn->parent->priv;
> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

Please use arch helper to just get abmc state instead of fs code
digging into arch structures.

> +
> +	if (hw_res->abmc_enabled) {
> +		seq_puts(s, "[abmc]\n");
> +		seq_puts(s, "mbm_legacy\n");
> +	} else {
> +		seq_puts(s, "abmc\n");
> +		seq_puts(s, "[mbm_legacy]\n");
> +	}
> +
> +	return 0;
> +}
> +
>   #ifdef CONFIG_PROC_CPU_RESCTRL
>   
>   /*
> @@ -1913,6 +1930,12 @@ static struct rftype res_common_files[] = {
>   		.seq_show	= mbm_local_bytes_config_show,
>   		.write		= mbm_local_bytes_config_write,
>   	},
> +	{
> +		.name		= "mbm_assign",
> +		.mode		= 0444,
> +		.kf_ops		= &rdtgroup_kf_single_ops,
> +		.seq_show	= rdtgroup_mbm_assign_show,
> +	},
>   	{
>   		.name		= "cpus",
>   		.mode		= 0644,

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap
  2024-05-24 12:23 ` [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap Babu Moger
@ 2024-06-14  1:42   ` Reinette Chatre
  2024-06-19 17:03     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:42 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> Hardware provides a set of counters when the ABMC feature is supported.
> These counters are used for enabling the events in resctrl group when
> the feature is supported.
> 
> Introduce num_cntrs_free_map bitmap to track available and free counters.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Changed the name to num_cntrs where applicable.
>      Used bitmap apis.
>      Added more comments for the globals.
> 
> v3: Changed the bitmap name to assign_cntrs_free_map. Removed abmc
>      from the name.
> 
> v2: Changed the bitmap name to assignable_counter_free_map from
>      abmc_counter_free_map.
> ---
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++++++++
>   1 file changed, 23 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 3071bbb7a15e..400ae405e10e 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -186,6 +186,23 @@ bool closid_allocated(unsigned int closid)
>   	return !test_bit(closid, &closid_free_map);
>   }
>   
> +/*
> + * ABMC Counter bitmap and length for tracking available counters.
> + * ABMC feature provides set of hardware counters for enabling events.
> + * Each event takes one hardware counter. Kernel needs to keep track
> + * of number of available counters.
> + */
> +static unsigned long num_cntrs_free_map;

Why does variable have "num" in its name? That seems strange. How
about just "mon_cntrs_free_map

> +static u32 num_cntrs_free_map_len;

Same comment about "num" ... also, any special reason why u32 is needed?

> +
> +static void num_cntrs_init(void)

mon_cntrs_init() ?

> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +
> +	bitmap_fill(&num_cntrs_free_map, r->mon.num_cntrs);
> +	num_cntrs_free_map_len = r->mon.num_cntrs;
> +}
> +
>   /**
>    * rdtgroup_mode_by_closid - Return mode of resource group with closid
>    * @closid: closid if the resource group
> @@ -2459,6 +2476,12 @@ static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)

resctrl_abmc_set_all() was initially created as a complement of
resctrl_abmc_set_one() ... but with more initialization added to
resctrl_abmc_set_all() this relationship becomes vague.

>   	struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
>   	struct rdt_domain *d;
>   
> +	/*
> +	 * The bitmap num_cntrs_free_map needs to be reset when switching
> +	 * the monitor mode.

This comment puts in words what can be understood by just looking at the code.
What would be useful would be to explain _why_ this needs to be done ...
what happens to the previously assigned counters?


> +	 */
> +	num_cntrs_init();
> +
>   	/*
>   	 * Update QOS_CFG MSR on all the CPUs associated with the resource
>   	 * Hardware counters will reset after switching the monotor mode.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  2024-05-24 12:23 ` [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg Babu Moger
@ 2024-06-14  1:43   ` Reinette Chatre
  2024-06-19 18:46     ` Moger, Babu
  2024-06-27 18:51     ` Moger, Babu
  0 siblings, 2 replies; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:43 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
> supported, the bandwidth events can be configured to track specific
> events. The event configuration is domain specific. ABMC (Assignable
> Bandwidth Monitoring Counters) feature needs event configuration
> information to assign hardware counter to an RMID. Event configurations
> are not stored in resctrl but instead always read from or written to
> hardware directly when prompted by user space.
> 
> Read the event configuration from the hardware during the domain
> initialization. Save the configuration information in the rdt_hw_domain,
> so it can be used for counter assignment.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Read the configuration information from the hardware to initialize.
>      Added few commit messages.
>      Fixed the tab spaces.
> 
> v3: Minor changes related to rebase in mbm_config_write_domain.
> 
> v2: No changes.
> ---
>   arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>   arch/x86/kernel/cpu/resctrl/internal.h |  5 +++++
>   arch/x86/kernel/cpu/resctrl/monitor.c  | 21 +++++++++++++++++++++
>   3 files changed, 28 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index ec93f6a50308..856c46d12177 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -542,6 +542,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>   		return;
>   	}
>   
> +	arch_domain_mbm_evt_config(hw_dom);
> +
>   	list_add_tail_rcu(&d->list, add_pos);
>   
>   	err = resctrl_online_domain(r, d);
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 5e7e76cd512f..60a1ca0a11a7 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -373,6 +373,8 @@ struct arch_mbm_state {
>    * @ctrl_val:	array of cache or mem ctrl values (indexed by CLOSID)
>    * @arch_mbm_total:	arch private state for MBM total bandwidth
>    * @arch_mbm_local:	arch private state for MBM local bandwidth
> + * @mbm_total_cfg:	MBM total bandwidth configuration
> + * @mbm_local_cfg:	MBM local bandwidth configuration
>    *
>    * Members of this structure are accessed via helpers that provide abstraction.
>    */
> @@ -381,6 +383,8 @@ struct rdt_hw_domain {
>   	u32				*ctrl_val;
>   	struct arch_mbm_state		*arch_mbm_total;
>   	struct arch_mbm_state		*arch_mbm_local;
> +	u32				mbm_total_cfg;
> +	u32				mbm_local_cfg;
>   };

Similar to the abmc_enabled member of rdt_hw_resource, these new
members of rdt_hw_domain are architecture specific and should never be
touched directly by resctrl fs code, for example, from mbm_config_show().

>   
>   static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
> @@ -622,6 +626,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free);
>   void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>   void __init resctrl_file_fflags_init(const char *config,
>   				     unsigned long fflags);
> +void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
>   void rdt_staged_configs_clear(void);
>   bool closid_allocated(unsigned int closid);
>   int resctrl_find_cleanest_closid(void);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index b1d002e5e93d..ab0f4bb49bd9 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1093,6 +1093,27 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>   	return 0;
>   }
>   
> +void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom)
> +{
> +	u64 msrval;
> +
> +	/*
> +	 * Read the configuration registers QOS_EVT_CFG_n, where <n> is
> +	 * the BMEC event number (EvtID).
> +	 * 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
> +	 * 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID

This is what mon_event_config_index_get() is for, no?

> +	 */
> +	if (mbm_total_event.configurable) {
> +		rdmsrl(MSR_IA32_EVT_CFG_BASE, msrval);
> +		hw_dom->mbm_total_cfg = msrval & MAX_EVT_CONFIG_BITS;
> +	}
> +
> +	if (mbm_local_event.configurable) {
> +		rdmsrl(MSR_IA32_EVT_CFG_BASE + 1, msrval);
> +		hw_dom->mbm_local_cfg = msrval & MAX_EVT_CONFIG_BITS;
> +	}
> +}
> +
>   void __exit rdt_put_mon_l3_config(void)
>   {
>   	dom_data_exit();

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment
  2024-05-24 12:23 ` [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment Babu Moger
@ 2024-06-14  1:44   ` Reinette Chatre
  2024-06-19 20:10     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:44 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID and monitor the bandwidth as long as the counter
> is assigned. The bandwidth events will be tracked by the hardware until
> the user changes the configuration. Each resctrl group can configure
> maximum two counters, one for total event and one for local event.
> 
> The counters are configured by writing to MSR L3_QOS_ABMC_CFG.
> Configuration is done by setting the counter id, bandwidth source (RMID),
> bandwidth types. Reading L3_QOS_ABMC_DSC returns the configuration of

What is a "bandwidth type" here?

> the counter id specified in L3_QOS_ABMC_CFG.
> 
> Attempts to read or write these MSRs when ABMC is not enabled will result
> in a #GP(0) exception.
> 
> MSR L3_QOS_ABMC_CFG (0xC000_03FDh) and L3_QOS_ABMC_DSC (0xC000_03FEh)
> details.
> ==========================================================================
> Bits 	Mnemonic	Description		Access Type   Reset Value
> ==========================================================================
> 63 	CfgEn 		Configuration Enable 	R/W 		0
> 
> 62 	CtrEn 		Counter Enable 		R/W 		0
> 
> 61:53 	– 		Reserved 		MBZ 		0
> 
> 52:48 	CtrID 		Counter Identifier	R/W		0
> 
> 47 	IsCOS		BwSrc field is a COS	R/W		0
> 			(not an RMID)
> 
> 46:44 	–		Reserved		MBZ		0
> 
> 43:32	BwSrc		Bandwidth Source	R/W		0
> 			(RMID or COS)
> 
> 31:0	BwType		Bandwidth types to	R/W		0
> 			track for this counter
> ==========================================================================
> 

This is copy&paste from AMD spec but needs some translation to map it to
resctrl.

> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
> v4: Added more descriptions.
>      Changed the name abmc_ctr_id to ctr_id.
>      Added L3_QOS_ABMC_DSC. Used for reading the configuration.
> 
> v3: No changes.
> 
> v2: No changes.
> ---
>   arch/x86/include/asm/msr-index.h       |  2 ++
>   arch/x86/kernel/cpu/resctrl/internal.h | 30 ++++++++++++++++++++++++++
>   2 files changed, 32 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 5f9a0139e98c..6d2fe39ac68f 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -1172,6 +1172,8 @@
>   #define MSR_IA32_SMBA_BW_BASE		0xc0000280
>   #define MSR_IA32_EVT_CFG_BASE		0xc0000400
>   #define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
> +#define MSR_IA32_L3_QOS_ABMC_CFG	0xc00003fd
> +#define MSR_IA32_L3_QOS_ABMC_DSC	0xc00003fe
>   
>   /* MSR_IA32_VMX_MISC bits */
>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 60a1ca0a11a7..45ed33f4f0ff 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -107,6 +107,9 @@ cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
>   #define ASSIGN_TOTAL			BIT(QOS_L3_MBM_TOTAL_EVENT_ID)
>   #define ASSIGN_LOCAL			BIT(QOS_L3_MBM_LOCAL_EVENT_ID)
>   
> +/* Maximum assignable counters per resctrl group */
> +#define MAX_CNTRS		2
> +
>   struct rdt_fs_context {
>   	struct kernfs_fs_context	kfc;
>   	bool				enable_cdpl2;
> @@ -211,6 +214,7 @@ enum rdtgrp_mode {
>    * @crdtgrp_list:		child rdtgroup node list
>    * @rmid:			rmid for this rdtgroup
>    * @mon_state:			Assignment state of the group
> + * @ctr_id:			ABMC counter ids assigned to this group
>    */
>   struct mongroup {
>   	struct kernfs_node	*mon_data_kn;
> @@ -218,6 +222,7 @@ struct mongroup {
>   	struct list_head	crdtgrp_list;
>   	u32			rmid;
>   	u32			mon_state;
> +	u32			ctr_id[MAX_CNTRS];

To match "num_cntrs" it seems "cntr_id" may be more appropriate?

>   };
>   
>   /**
> @@ -568,6 +573,31 @@ union cpuid_0x10_x_edx {
>   	unsigned int full;
>   };
>   
> +/*
> + * ABMC counters can be configured  by writing to L3_QOS_ABMC_CFG.

extra space

> + * @bw_type		: Bandwidth types to track for this counter

What is a "Bandwidth type"? What are possible values?

> + * @bw_src		: Bandwidth Source (RMID or CLOSID)
> + * @reserved1		: Reserved
> + * @is_cos		: BwSrc field is a COS (not an RMID)

Please be consistent wrt "CLOS"/"CLOSID" in comment and member name

> + * @ctr_id		: Counter Identifier

Please be consistent with "cntr" vs "ctr". Earlier the series
introduces "num_cntrs" so cntr seems appropriate?

> + * @reserved		: Reserved
> + * @ctr_en		: Counter Enable bit
> + * @cfg_en		: Configuration Enable bit

Why are there two enable bits? How are they used?

(please match other kernel doc wrt : usage)

> + */
> +union l3_qos_abmc_cfg {
> +	struct {
> +		unsigned long  bw_type	:32,
> +			       bw_src	:12,
> +			       reserved1: 3,
> +			       is_cos	: 1,
> +			       ctr_id	: 5,
> +			       reserved	: 9,
> +			       ctr_en	: 1,
> +			       cfg_en	: 1;
> +	} split;
> +	unsigned long full;
> +};
> +
>   void rdt_last_cmd_clear(void);
>   void rdt_last_cmd_puts(const char *s);
>   __printf(1, 2)

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter
  2024-05-24 12:23 ` [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter Babu Moger
@ 2024-06-14  1:48   ` Reinette Chatre
  2024-06-19 22:38     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:48 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> ABMC feature requires to users to assign a hardware counter to an RMID

"requires to users" -> "requires users"?

> to monitor the events. Provide the interfaces to assign a counter.
> 
> Individual counters are configured by writing to L3_QOS_ABMC_CFG MSR
> and specifying the counter id, bandwidth source, and bandwidth types.

Where is discription of what this patch does and why?

This and following patches seem to introduce building blocks that
are only used later in series. These building blocks are introduced
without any information about what they do and why and that makes it
difficult to understand how this work will fall into place.

> 
> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>      Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>      Monitoring (ABMC).
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
> v4: Commit message update.
>      User bitmap APIs where applicable.
>      Changed the interfaces considering MPAM(arm).
>      Added domain specific assignment.
> 
> v3: Removed the static from the prototype of rdtgroup_assign_abmc.
>      The function is not called directly from user anymore. These
>      changes are related to global assignment interface.
> 
> v2: Minor text changes in commit message.
> ---
>   arch/x86/kernel/cpu/resctrl/internal.h |   3 +
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 101 +++++++++++++++++++++++++
>   2 files changed, 104 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 45ed33f4f0ff..a88c8fc5e4df 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -657,6 +657,9 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>   void __init resctrl_file_fflags_init(const char *config,
>   				     unsigned long fflags);
>   void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
> +int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
> +			u32 ctr_id, u32 closid, bool enable);
> +int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid);
>   void rdt_staged_configs_clear(void);
>   bool closid_allocated(unsigned int closid);
>   int resctrl_find_cleanest_closid(void);
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 0e425c91fa46..48df76499a04 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -203,6 +203,19 @@ static void num_cntrs_init(void)
>   	num_cntrs_free_map_len = r->mon.num_cntrs;
>   }
>   
> +static int assign_cntrs_alloc(void)

Only one counter is allocated and "assign" is unexpected since it
seems to imply how this counter will be used.
It can just be "mon_cntr_alloc()"?

> +{
> +	u32 ctr_id = find_first_bit(&num_cntrs_free_map,
> +				    num_cntrs_free_map_len);
> +
> +	if (ctr_id >= num_cntrs_free_map_len)
> +		return -ENOSPC;
> +
> +	__clear_bit(ctr_id, &num_cntrs_free_map);
> +
> +	return ctr_id;
> +}
> +
>   /**
>    * rdtgroup_mode_by_closid - Return mode of resource group with closid
>    * @closid: closid if the resource group
> @@ -1830,6 +1843,94 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>   	return ret ?: nbytes;
>   }
>   
> +static void rdtgroup_abmc_cfg(void *info)
> +{
> +	u64 *msrval = info;
> +
> +	wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *msrval);
> +}
> +

Please add comments to these functions to summarize what
they do.

> +int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
> +			u32 ctr_id, u32 closid, bool enable)
> +{
> +	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> +	union l3_qos_abmc_cfg abmc_cfg = { 0 };
> +	struct arch_mbm_state *arch_mbm;
> +
> +	abmc_cfg.split.cfg_en = 1;
> +	abmc_cfg.split.ctr_en = enable ? 1 : 0;
> +	abmc_cfg.split.ctr_id = ctr_id;
> +	abmc_cfg.split.bw_src = rmid;
> +
> +	/*
> +	 * Read the event configuration from the domain and pass it as
> +	 * bw_type.
> +	 */
> +	if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
> +		abmc_cfg.split.bw_type = hw_dom->mbm_total_cfg;
> +		arch_mbm = &hw_dom->arch_mbm_total[rmid];
> +	} else {
> +		abmc_cfg.split.bw_type = hw_dom->mbm_local_cfg;
> +		arch_mbm = &hw_dom->arch_mbm_local[rmid];
> +	}
> +
> +	smp_call_function_any(&d->cpu_mask, rdtgroup_abmc_cfg, &abmc_cfg, 1);
> +
> +	/* Reset the internal counters */

"internal counters"? This needs a definition ... but since this is not
a new data structure the comment can be more specific about what is done
and why.

> +	if (arch_mbm)
> +		memset(arch_mbm, 0, sizeof(struct arch_mbm_state));
> +
> +	return 0;

If this function always succeeds then it can just be void?

> +}
> +
> +int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid)

Please use consistent naming. Note how the filename is "rdtgroup.c" and this
function operates on "struct rdtgroup" and if you take a closer look at
the functions in this file and the header where you add this function you
will notice a distinct pattern of "rdtgroup_" used as prefix. There really
is no reason to start a new namespace for this.

> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +	int ctr_id = 0, index;
> +	struct rdt_domain *d;
> +	u32 mon_state;
> +
> +	index = mon_event_config_index_get(evtid);
> +	if (index == INVALID_CONFIG_INDEX) {
> +		rdt_last_cmd_puts("Invalid event id\n");
> +		return -EINVAL;
> +	}
> +
> +	if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
> +		mon_state = ASSIGN_TOTAL;
> +	} else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
> +		mon_state = ASSIGN_LOCAL;
> +	} else {
> +		rdt_last_cmd_puts("Invalid event id\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Nothing to do if event has been assigned already */
> +	if (rdtgrp->mon.mon_state & mon_state) {

Why is this mon_state needed? Is it not redundant when considering that
struct mongroup has an array of counter IDs that can already be used
to see if a counter is assigned or not?
What if there is a new:
#define MON_CNTR_UNSET U32_MAX

When cntr_id[index] == MON_CNTR_UNSET then it is not assigned.

> +		rdt_last_cmd_puts("ABMC counter is assigned already\n");
> +		return 0;
> +	}
> +
> +	/*
> +	 * Allocate a new counter and update domains
> +	 */
> +	ctr_id = assign_cntrs_alloc();
> +	if (ctr_id < 0) {
> +		rdt_last_cmd_puts("Out of ABMC counters\n");
> +		return -ENOSPC;
> +	}
> +
> +	rdtgrp->mon.ctr_id[index] = ctr_id;
> +
> +	list_for_each_entry(d, &r->domains, list)
> +		resctrl_arch_assign(d, evtid, rdtgrp->mon.rmid, ctr_id,
> +				    rdtgrp->closid, 1);
> +
> +	rdtgrp->mon.mon_state |= mon_state;
> +
> +	return 0;
> +}
> +
>   /* rdtgroup information files for one cache resource. */
>   static struct rftype res_common_files[] = {
>   	{

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 15/19] x86/resctrl: Add the interface to unassign ABMC counter
  2024-05-24 12:23 ` [PATCH v4 15/19] x86/resctrl: Add the interface to unassign " Babu Moger
@ 2024-06-14  1:49   ` Reinette Chatre
  2024-06-20 13:48     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:49 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> Hardware provides a limited number of ABMC counters. Once all the
> counters are exhausted, counters need to be freed for new assignments.
> 
> Provide the interface to unassign the counter.

Please write a proper changelog. This needs information that explains
what this patch does and why.

> 
> The feature details are documented in the APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>      Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>      Monitoring (ABMC).
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
> v4: Added domain specific unassign feature.
>      Few name changes.
> 
> v3: Removed the static from the prototype of rdtgroup_unassign_abmc.
>      The function is not called directly from user anymore. These
>      changes are related to global assignment interface.
> 
> v2: No changes.
> ---
>   arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 ++++++++++++++++++++++++++
>   2 files changed, 44 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index a88c8fc5e4df..e16244895350 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -660,6 +660,8 @@ void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
>   int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
>   			u32 ctr_id, u32 closid, bool enable);
>   int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid);
> +int resctrl_grp_unassign(struct rdtgroup *rdtgrp, u32 evtid);
> +void num_cntrs_free(u32 ctr_id);
>   void rdt_staged_configs_clear(void);
>   bool closid_allocated(unsigned int closid);
>   int resctrl_find_cleanest_closid(void);
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 48df76499a04..5ea1e58c7201 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -216,6 +216,11 @@ static int assign_cntrs_alloc(void)
>   	return ctr_id;
>   }
>   
> +void num_cntrs_free(u32 ctr_id)

The name does not reflect what it does. It neither frees the "num_cntrs" information
nor does it free "num_cntrs" of counters. How about "mon_cntr_free()"?


> +{
> +	__set_bit(ctr_id, &num_cntrs_free_map);
> +}
> +
>   /**
>    * rdtgroup_mode_by_closid - Return mode of resource group with closid
>    * @closid: closid if the resource group
> @@ -1931,6 +1936,43 @@ int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid)
>   	return 0;
>   }
>   
> +int resctrl_grp_unassign(struct rdtgroup *rdtgrp, u32 evtid)

Same comment wrt namespace. Also this function needs a description.

> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +	struct rdt_domain *d;
> +	u32 mon_state;
> +	int index;
> +
> +	index = mon_event_config_index_get(evtid);
> +	if (index == INVALID_CONFIG_INDEX) {
> +		pr_warn_once("Invalid event id %d\n", evtid);
> +		return -EINVAL;
> +	}
> +
> +	if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
> +		mon_state = ASSIGN_TOTAL;
> +	} else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
> +		mon_state = ASSIGN_LOCAL;
> +	} else {
> +		rdt_last_cmd_puts("Invalid event id\n");
> +		return -EINVAL;
> +	}
> +
> +	if (rdtgrp->mon.mon_state & mon_state) {
> +		list_for_each_entry(d, &r->domains, list)
> +			resctrl_arch_assign(d, evtid, rdtgrp->mon.rmid,
> +					    rdtgrp->mon.ctr_id[index],
> +					    rdtgrp->closid, 0);
> +
> +		/* Update the counter bitmap */
> +		num_cntrs_free(rdtgrp->mon.ctr_id[index]);
> +	}
> +
> +	rdtgrp->mon.mon_state &= ~mon_state;
> +
> +	return 0;
> +}
> +
>   /* rdtgroup information files for one cache resource. */
>   static struct rftype res_common_files[] = {
>   	{


Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount
  2024-05-24 12:23 ` [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount Babu Moger
@ 2024-06-14  1:50   ` Reinette Chatre
  2024-06-20 14:46     ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:50 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> Enable ABMC by default when feature is supported.

Why enable/disable it on every mount/unmount cycle though? Why not just
enable it at the beginning and only change if requested by user?

> 
> Counters are assigned automatically on group creation. If the counters
> are exhausted, report the warnings and continue. It is not required to
> fail group creation for assignment failures. Users will have the option
> to modify the assignments.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Few name changes based on the upstream discussion.
>      Commit message update.
> 
> v3: This is a new patch. Patch addresses the upstream comment to enable
>      ABMC feature by default if the feature is available.
> ---
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 69 ++++++++++++++++++++++++++
>   1 file changed, 69 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 5ea1e58c7201..f452b6d9bb99 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2756,6 +2756,7 @@ static void rdt_disable_ctx(void)
>   {
>   	resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
>   	resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
> +	resctrl_abmc_disable(RDT_RESOURCE_L3);
>   	set_mba_sc(false);
>   
>   	resctrl_debug = false;
> @@ -2786,6 +2787,8 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
>   	if (ctx->enable_debug)
>   		resctrl_debug = true;
>   
> +	resctrl_abmc_enable(RDT_RESOURCE_L3);
> +
>   	return 0;
>   
>   out_cdpl3:
> @@ -2882,6 +2885,41 @@ static void schemata_list_destroy(void)
>   	}
>   }
>   
> +static int resctrl_assign_events(struct rdtgroup *rdtgrp)
> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> +	int ret = 0;
> +
> +	if (!hw_res->abmc_enabled)
> +		return 0;

resctrl fs should not be peeking in arch structure

> +
> +	if (is_mbm_total_enabled())
> +		ret = resctrl_grp_assign(rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> +	if (!ret && is_mbm_local_enabled())
> +		ret = resctrl_grp_assign(rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
> +

This function needs a comment about why it is ok to return with error and
partial changes.

> +	return ret;
> +}
> +
> +static int resctrl_unassign_events(struct rdtgroup *rdtgrp)
> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> +	int ret = 0;
> +
> +	if (!hw_res->abmc_enabled)
> +		return 0;
> +
> +	if (is_mbm_total_enabled())
> +		ret = resctrl_grp_unassign(rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> +	if (!ret && is_mbm_local_enabled())
> +		ret = resctrl_grp_unassign(rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
> +	return ret;
> +}
> +
>   static int rdt_get_tree(struct fs_context *fc)
>   {
>   	struct rdt_fs_context *ctx = rdt_fc2context(fc);
> @@ -2941,6 +2979,14 @@ static int rdt_get_tree(struct fs_context *fc)
>   		if (ret < 0)
>   			goto out_mongrp;
>   		rdtgroup_default.mon.mon_data_kn = kn_mondata;
> +
> +		/*
> +		 * Assign the monitor counters if it is available. If it fails,
> +		 * report the warnings and continue. It is not nessaccery to
> +		 * fail here.
> +		 */

nessaccery -> necessary

Please elaborate why it is not necessary to fail.


> +		if (resctrl_assign_events(&rdtgroup_default) < 0)
> +			rdt_last_cmd_puts("Monitor assignment failed\n");

How will user know that there is a warning? This does not return an error
so nothing will prompt user to check the status file. Perhaps add a comment
in the docs to help users at least know that this exists. Another helpful
thing will be to have the counter return "Unassigned" if it has not been assigned
since users familiar with old interface may see "Unavailable" without
knowing that user action is required to get a working event.

>   	}
>   
>   	ret = rdt_pseudo_lock_init();
> @@ -3214,6 +3260,8 @@ static void rdt_kill_sb(struct super_block *sb)
>   	cpus_read_lock();
>   	mutex_lock(&rdtgroup_mutex);
>   
> +	resctrl_unassign_events(&rdtgroup_default);
> +
>   	rdt_disable_ctx();
>   
>   	/*Put everything back to default values. */
> @@ -3752,6 +3800,14 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
>   		goto out_unlock;
>   	}
>   
> +	/*
> +	 * Assign the monitor counters if it is available. If it fails,
> +	 * report the warnings and continue. It is not nessaccery to
> +	 * fail here.
> +	 */

Fix copy&paste here and below.

> +	if (resctrl_assign_events(rdtgrp) < 0)
> +		rdt_last_cmd_puts("Monitor assignment failed\n");
> +
>   	kernfs_activate(rdtgrp->kn);
>   
>   	/*
> @@ -3796,6 +3852,14 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>   	if (ret)
>   		goto out_closid_free;
>   
> +	/*
> +	 * Assign the monitor counters if it is available. If it fails,
> +	 * report the warnings and continue. It is not nessaccery to
> +	 * fail here.
> +	 */
> +	if (resctrl_assign_events(rdtgrp) < 0)
> +		rdt_last_cmd_puts("Monitor assignment failed\n");
> +
>   	kernfs_activate(rdtgrp->kn);
>   
>   	ret = rdtgroup_init_alloc(rdtgrp);
> @@ -3891,6 +3955,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>   	update_closid_rmid(tmpmask, NULL);
>   
>   	rdtgrp->flags = RDT_DELETED;
> +
> +	resctrl_unassign_events(rdtgrp);
> +
>   	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>   
>   	/*
> @@ -3937,6 +4004,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>   	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>   	update_closid_rmid(tmpmask, NULL);
>   
> +	resctrl_unassign_events(rdtgrp);
> +
>   	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>   	closid_free(rdtgrp->closid);
>   

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy
  2024-05-24 12:23 ` [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy Babu Moger
@ 2024-06-14  1:51   ` Reinette Chatre
  2024-06-20 14:53     ` Moger, Babu
  2024-06-21 14:43   ` Markus Elfring
  1 sibling, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-14  1:51 UTC (permalink / raw)
  To: Babu Moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 5/24/24 5:23 AM, Babu Moger wrote:
> Introduce interface to switch between ABMC and legacy_mbm modes.

shortlog and first sentence of changelog do not match: mbm_legacy vs legacy_mbm?

> 
> By default ABMC is enabled on resctrl mount if the feature is available.
> However, user will have the option to go back to legacy_mbm if required.
> 
> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign
> [abmc]
> mbm_legacy
> 
> To enable the legacy monitoring feature:
> $ echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign

Missing information about user visible impact to counters/events
and any mitigations needed in implementation.

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v4: Minor commit text changes. Keep the default to ABMC when supported.
> 
> v3: New patch to address the review comments from upstream.
> ---
>   Documentation/arch/x86/resctrl.rst     | 10 +++++++
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 +++++++++++++++++++++++++-
>   2 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index ab3cde61a124..fd050d4d22cd 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -271,6 +271,16 @@ with the following files:
>   	  [abmc]
>   	  mbm_legacy
>   
> +	* To enable ABMC feature:
> +	  ::
> +
> +	    # echo  "abmc" > /sys/fs/resctrl/info/L3_MON/mbm_assign
> +
> +	* To enable the legacy monitoring feature:
> +	  ::
> +
> +	    # echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign
> +

No information about what the features are or what will happen on such a switch.

>   "max_threshold_occupancy":
>   		Read/write file provides the largest value (in
>   		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index f452b6d9bb99..d77ff059269a 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -908,6 +908,43 @@ static int rdtgroup_mbm_assign_show(struct kernfs_open_file *of,
>   	return 0;
>   }
>   
> +/*
> + * rdtgroup_mode_write - Modify the resource group's mode

Comment does not match function name. Seems like "mode" is the generic term
to use instead of "assign".

> + */
> +static ssize_t rdtgroup_mbm_assign_write(struct kernfs_open_file *of,
> +					 char *buf, size_t nbytes,
> +					 loff_t off)
> +{
> +	struct rdt_resource *r = of->kn->parent->priv;
> +	int ret = 0;
> +
> +	if (!r->abmc_capable)
> +		return -EINVAL;
> +
> +	/* Valid input requires a trailing newline */
> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
> +		return -EINVAL;
> +
> +	buf[nbytes - 1] = '\0';
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	rdt_last_cmd_clear();
> +
> +	if (!strcmp(buf, "mbm_legacy"))
> +		resctrl_abmc_disable(RDT_RESOURCE_L3);
> +	else if (!strcmp(buf, "abmc"))
> +		ret = resctrl_abmc_enable(RDT_RESOURCE_L3);
> +	else
> +		ret = -EINVAL;
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +
> +	return ret ?: nbytes;
> +}
> +
>   #ifdef CONFIG_PROC_CPU_RESCTRL
>   
>   /*
> @@ -2087,9 +2124,10 @@ static struct rftype res_common_files[] = {
>   	},
>   	{
>   		.name		= "mbm_assign",
> -		.mode		= 0444,
> +		.mode		= 0644,
>   		.kf_ops		= &rdtgroup_kf_single_ops,
>   		.seq_show	= rdtgroup_mbm_assign_show,
> +		.write		= rdtgroup_mbm_assign_write,
>   	},
>   	{
>   		.name		= "cpus",


Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2024-06-14  0:54 ` [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
@ 2024-06-18 21:02   ` Moger, Babu
  2024-06-20 22:49     ` Reinette Chatre
  0 siblings, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-18 21:02 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

Thanks for the feedback for the series.

On 6/13/24 19:54, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>>
>>
>> d. This series adds a new interface file
>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     to list and modify the group's assignment states.
> 
> There was a lot of discussion resulting in this centralized file. At first
> glance this
> file appears to be very complicated and I believe any reasonable person
> would wonder if
> all of this is necessary. I recommend that you add a motivation for why
> this file is needed.
> Some items I recall are : it makes it easier for user space to learn how
> counters are used (no
> need to traverse resctrl and open()/close() many files), on the resctrl
> side it makes
> it possible to support counter re-assignment with a single IPI. There may
> be other motivations
> that I am forgetting now.

Sure. Will add those details.
> 
> Also, could the name just be "mbm_control"? What is enabled at this time
> are "assignable
> counters" but in the future we may want to add support for other flags
> that have nothing to
> do with "assignable counters".

Yes. Sure.

> 
>>
>>     The list follows the following format:
>>
>>     "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
> 
> "assignment_flags" -> "flags" ? (throughout)

Yes.

> 
>>
>>
>>     Format for specific type of groups:
>>
>>     * Default CTRL_MON group:
>>      "//<domain_id>=<assignment_flags>"
>>
>>         * Non-default CTRL_MON group:
>>                 "<CTRL_MON group>//<domain_id>=<assignment_flags>"
>>
>>         * Child MON group of default CTRL_MON group:
>>                 "/<MON group>/<domain_id>=<assignment_flags>"
>>
>>         * Child MON group of non-default CTRL_MON group:
>>                 "<CTRL_MON group>/<MON
>> group>/<domain_id>=<assignment_flags>"
>>
>>         Assignment flags can be one of the following:
>>
>>          t  MBM total event is enabled
>>          l  MBM local event is enabled
>>          tl Both total and local MBM events are enabled
>>          _  None of the MBM events are enabled
>>
>>     Examples:
>>
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>     //0=tl;1=tl;
>>     /child_default_mon_grp/0=tl;1=tl;
>>
>>     There are four groups and all the groups have local and total
>>     event enabled on domain 0 and 1.
>>
>>     =tl means both total and local events are enabled.
>>
>>     "//" - This is a default CONTROL MON group
>>
>>     "non_default_ctrl_mon_grp//" - This is non default CONTROL MON group
> 
> Be consistent with "non-default" (vs non default) as well as "CTRL_MON" (vs
> CONTROL MON).

Sure.

> 
>>
>>     "/child_default_mon_grp/"  - This is Child MON group of the defult
>> group
> 
> "Child" -> "child"
> "defult" -> "default"

Yes.
> 
>>
>>     "non_default_ctrl_mon_grp/child_non_default_mon_grp/" - This is child
>>     MON group of the non default group
> 
> non-default

Sure.

> 
>>
>> e. Update the group assignment states using the interface file
>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control.
>>
>>     The write format is similar to the above list format with addition of
>>     op-code for the assignment operation.
>>     
>>     * Default CTRL_MON group:
>>             "//<domain_id><op-code><assignment_flags>"
>>     
>>     * Non-default CTRL_MON group:
>>             "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
>>     
>>     * Child MON group of default CTRL_MON group:
>>             "/<MON group>/<domain_id><op-code><assignment_flags>"
>>     
>>     * Child MON group of non-default CTRL_MON group:
>>             "<CTRL_MON group>/<MON
>> group>/<domain_id><op-code><assignment_flags>"
>>     
>>     Op-code can be one of the following:
>>     
>>     = Update the assignment to match the flags
>>     + Assign a new state
>>     - Unassign a new state
> 
> Looking here and the implementation it seems that "+_" and "-_" is supported.
> I think that should be invalid. Only "=_" seems appropriate to me.
> Also please take care to not have a catchall "default" that does an
> unassign. Doing something like that will prevent us from ever being
> able to add any flags in the future.

Yes. Good catch..  Will fix it.

> 
>>     
>>     
>>     Initial group status:
>>     
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>     //0=tl;1=tl;
>>     /child_default_mon_grp/0=tl;1=tl;
>>     
>>      To update the default group to enable only total event on domain 0:
>>      # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     
>>      Assignment status after the update:
>>      # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>      non_default_ctrl_mon_grp//0=tl;1=tl;
>>      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>      //0=t;1=tl;
>>      /child_default_mon_grp/0=tl;1=tl;
>>     
>>      To update the MON group child_default_mon_grp to remove total event
>> on domain 1:
>>      # echo "/child_default_mon_grp/1-t" >
>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     
>>      Assignment status after the update:
>>      $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>      non_default_ctrl_mon_grp//0=tl;1=tl;
>>      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>      //0=t;1=l;
>>      /child_default_mon_grp/0=t;1=tl;
> 
> This does not look right. Why did domain #1 of the default CTRL_MON group
> change also?

Will correct  it.

> 
>>     
>>      To update the MON group
>> non_default_ctrl_mon_grp/child_non_default_mon_grp to
>>      remove both local and total events on domain 1:
>>      # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
>>                    /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     
>>      Assignment status after the update:
>>      # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>      non_default_ctrl_mon_grp//0=tl;1=tl;
>>      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>>      //0=t;1=l;
>>      /child_default_mon_grp/0=t;1=tl;
>>     
>>      To update the default group to add a total event domain 1.
>>      # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>     
> 
> Unclear where "t" flag was removed.

Yes. Will correct.

> 
>>      Assignment status after the update:
>>     
>>      # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>      non_default_ctrl_mon_grp//0=tl;1=tl;
>>      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>>      //0=t;1=tl;
>>      /child_default_mon_grp/0=t;1=tl;
>>     
>> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
>>     There is no change in reading the evetns with ABMC. If the event is
>> unassigned
> 
> "evetns" -> "events"

Sure.

> 
>>     when reading, then the read will come back as Unavailable.
> 
> Should this not rather be "Unassigned"? According to the docs the counters
> will return "Unavailable" right after reconfigure so it seems that there
> are scenarios where an "assigned" counter returns "Unavailable". It seems
> more
> useful to return "Unassigned" that will have a new specific meaning that
> overloading existing "Unavailable" that has original meaning of "try
> again" ....
> but in this case trying again will be futile.

Hardware returns "Unavailable" in both the cases. So, thought of 
reporting the same without any interpretation.

> 
>>     
>>     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>     779247936
>>     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>     765207488
>>     
>> g. Users will have the option to go back to legacy_mbm mode if required.
>>     This can be done using the following command.
>>
>>     # echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>>          abmc
>>          [mbm_legacy]
> 
> It is confusing for the value written by user space to be different from
> the value displayed: "legacy_mbm" vs "mbm_legacy.

My bad. Both should have been "legacy_mbm"

> 
> This is still missing information about what happens to the
> counters/events on
> such a switch. Will events just keep counting? Will they be reset? ...?

It will all reset.

> 
> I also think we should try to find a more generic name for this file.
> "mbm_cntr_mode" or "mbm_mode" maybe?

"mbm_mode" looks better.  Then I will change "legacy_mbm" to "mbm_legacy".

> 
>>
>> h. Check the bandwidth configuration for the group. Note that bandwidth
>>     configuration has a domain scope. Total event defaults to 0x7F (to
>>     count all the events) and local event defaults to 0x15 (to count all
>>     the local numa events). The event bitmap decoding is available at
>>     https://www.kernel.org/doc/Documentation/x86/resctrl.rst
>>     in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>>     
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>>     0=0x7f;1=0x7f
>>     
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>     0=0x15;1=0x15
>>     
>> j. Change the bandwidth source for domain 0 for the total event to count
>> only reads.
>>     Note that this change effects total events on the domain 0.
>>     
>>     #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>>     0=0x33;1=0x7F
>>     
>> k. Now read the total event again. The mbm_total_bytes should display
>>     only the read events.
>>     
>>     #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>     314101
> 
> According to doc, right after a BMEC change the counter will read
> "Unavailable"
> is this not the case here?

Yes. First read will come back with "Unavailable". Will have add one 
line about that here.

> 
>>     
>> l. Unmount the resctrl
>>     
>>     #umount /sys/fs/resctrl/
> 
> Reinette
> 
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 01/19] x86/resctrl: Add support for Assignable Bandwidth Monitoring Counters (ABMC)
  2024-06-14  0:54   ` Reinette Chatre
@ 2024-06-18 21:02     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-18 21:02 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse



On 6/13/24 19:54, Reinette Chatre wrote:
> Hi Babu,
> 
> Could you please change shortlog prefix to be "x86/cpufeatures:" to
> make it obvious where this patch is headed?

Sure.
-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource
  2024-06-14  0:55   ` Reinette Chatre
@ 2024-06-18 21:02     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-18 21:02 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 19:55, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> Consolidate all the data related to monitoring into separate structure
>> for better readability.
> 
> This needs a better motivation because "better readability" is subjective.
> You can motivate this work by noting that this is re-organization in
> preparation
> for more monitoring specific properties that will clobber the existing
> resource
> struct more. To support this organization the pattern of the cache allocation
> and memory bandwidth allocation features that consolidate the feature
> specific
> properties into a struct is followed.
> 

Sure. Sounds good.
-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details
  2024-06-14  0:56   ` Reinette Chatre
@ 2024-06-18 21:03     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-18 21:03 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 19:56, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.
>> Bits Description
>> 15:0 MAX_ABMC Maximum Supported Assignable Bandwidth
>>       Monitoring Counter ID + 1
>>
>> The feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Removed resctrl_arch_has_abmc(). Added all the code inline. We dont
>>      need to separate this as arch code.
>>
>> v3: Removed changes related to mon_features.
>>      Moved rdt_cpu_has to core.c and added new function
>> resctrl_arch_has_abmc.
>>      Also moved the fields mbm_assign_capable and mbm_assign_cntrs to
>>      rdt_resource. (James)
>>
>> v2: Changed the field name to mbm_assign_capable from abmc_capable.
>> ---
>>   arch/x86/kernel/cpu/resctrl/monitor.c | 14 ++++++++++++++
>>   include/linux/resctrl.h               |  4 ++++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c
>> b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index b35d04fc761b..1602b58ba23d 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1066,6 +1066,20 @@ int __init rdt_get_mon_l3_config(struct
>> rdt_resource *r)
>>               mbm_local_event.configurable = true;
>>               mbm_config_rftype_init("mbm_local_bytes_config");
>>           }
>> +
>> +        if (rdt_cpu_has(X86_FEATURE_ABMC)) {
>> +            r->abmc_capable = true;
>> +            /*
>> +             * Query CPUID_Fn80000020_EBX_x05 for number of
>> +             * ABMC counters
>> +             */
>> +            cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
>> +            r->mon.num_cntrs = (ebx & 0xFFFF) + 1;
>> +            if (r->mon.num_cntrs > 64) {
>> +                WARN(1, "Cannot support more than 64 ABMC counters\n");
> 
> if (WARN_ON(...))

Sure..

> 
>> +                r->mon.num_cntrs = 64;
>> +            }
>> +        }
>>       }
>>         l3_mon_evt_init(r);
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index bf99eb9c6ce4..24087e6efbb6 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -153,10 +153,12 @@ struct resctrl_schema;
>>   /**
>>    * struct resctrl_mon - Monitoring related data
>>    * @num_rmid:        Number of RMIDs available
>> + * @num_cntrs:        Maximum number of abmc counters
> 
> Maximum implies some limit and it is not clear what limit this applies to.
> Can this just be "Number of monitoring counters".

Sure.
> 
> Thinking ahead ... when switching between different "MBM modes" it
> may be that one mode has a different number of counters than the other.
> Since "ABMC" is just one mode it does not seem appropriate to
> connect the "num_cntrs" property to ABMC.

Ok

> 
> 
>>    * @evt_list:        List of monitoring events
>>    */
>>   struct resctrl_mon {
>>       int            num_rmid;
>> +    int            num_cntrs;
>>       struct list_head    evt_list;
>>   };
>>   @@ -177,6 +179,7 @@ struct resctrl_mon {
>>    * @parse_ctrlval:    Per resource function pointer to parse control
>> values
>>    * @fflags:        flags to choose base and info files
>>    * @cdp_capable:    Is the CDP feature available on this resource
>> + * @abmc_capable:    Is system capable of supporting monitor assignment?
>>    */
>>   struct rdt_resource {
>>       int            rid;
>> @@ -196,6 +199,7 @@ struct rdt_resource {
>>                            struct rdt_domain *d);
>>       unsigned long        fflags;
>>       bool            cdp_capable;
>> +    bool            abmc_capable;
> 
> Shouldn't abmc_capable be a property of the new struct resctrl_mon?

Sure. We can move it to resctrl_mon.

> 
>>   };
>>     /**
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags
  2024-06-14  0:57   ` Reinette Chatre
@ 2024-06-18 21:03     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-18 21:03 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 19:57, Reinette Chatre wrote:
> Hi Babu,
> 
> Shortlog: "Introduce resctrl_file_fflags_init() ..."

Sure.

> 
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> The functions thread_throttle_mode_init() and mbm_config_rftype_init()
> 
> You can drop "The functions".

Sure.

> 
>> both initialize fflags for resctrl files. Adding the new files will
> 
> "Adding the new files" -> "Adding new files"? (this should be in new
> paragraph)

Sure.

> 
>> involve adding another function to initialize the fflags. All of this
> 
> Solution should be in new paragraph and written in imperative mood.

ok

> 
>> can be simplified by adding a new function resctrl_file_fflags_init()
>> and passing the file name and flags to be initialized.
>>
>> Consolidate multiple fflags initialization into resctrl_file_fflags_init()
>> and remove thread_throttle_mode_init() and mbm_config_rftype_init().
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> Patch looks good.

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters
  2024-06-14  0:57   ` Reinette Chatre
@ 2024-06-18 21:04     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-18 21:04 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 19:57, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID and monitor the bandwidth as long as the counter
>> is assigned. Number of assignments depend on number of ABMC counters
>> available.
> 
> Take care that this interface will not just be used by ABMC. I assumed that
> when the user switches to "soft-RMID" then the value of "num_cntrs" will
> change?

It can be used by both ABMC and "soft-RMID". Will update the information
to be bit generic.

> 
>>
>> Provide the interface to display the number of ABMC counters supported.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Changed the counter name to num_cntrs. And few text changes.
>>
>> v3: Changed the field name to mbm_assign_cntrs.
>>
>> v2: Changed the field name to mbm_assignable_counters from abmc_counters.
>> ---
>>   Documentation/arch/x86/resctrl.rst     |  4 ++++
>>   arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 16 ++++++++++++++++
>>   3 files changed, 21 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst
>> b/Documentation/arch/x86/resctrl.rst
>> index 02790efaabcc..7ab8172ef208 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -257,6 +257,10 @@ with the following files:
>>           # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>           0=0x30;1=0x30;3=0x15;4=0x15
>>   +"num_cntrs":
>> +    Available when ABMC feature is supported. The number of ABMC counters
>> +    available for configuration.
> 
> This can only be understood by folks already familiar with AMD's ABMC
> feature. There is
> no information about what "ABMC feature" is, what an "ABMC counter" is and
> what
> "configuration" can be done with it.

I can move this patch after 8/19. I will add details on ABMC feature in
Patch 8/19. Also, keep the details bit generic.

> 
> Do you think this num_cntrs will only be used by ABMC? What will happen
> when user
> enables "soft-RMID" or some other mode?

This can be used by both the features. But we don't know how soft-RMID
will be implemented. But, I can make this explanation bit more generic.
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature
  2024-06-14  0:59   ` Reinette Chatre
@ 2024-06-19 15:37     ` Moger, Babu
  2024-06-20 22:02       ` Reinette Chatre
  0 siblings, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-19 15:37 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 19:59, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> Add the functionality to enable/disable ABMC feature.
>>
>> ABMC feature is enabled by setting enabled bit(0) in MSR L3_QOS_EXT_CFG.
>> When the state of ABMC is changed, the MSR needs to be updated on all
>> the logical processors in the QOS Domain.
>>
>> Hardware counters will reset when ABMC state is changed. Kernel internal
>> counters need to be reset to avoid overflow condition in the next update.
> 
> Please note that there are two "ABMC" features introduced in this series.
> First, there is the "abmc" resctrl fs feature that just happens to have
> the same name as AMD's "ABMC" (which may be a good motivation to change
> this name). Second, there is the architecture (AMD) specific "ABMC" feature
> that is enabled in response to user's request to enable the
> resctrl "abmc" feature.
> 
> Other architectures need to support resctrl fs "abmc" feature with something
> entirely different.
> 
> Please consider this distinction during this series because it is often
> blurred.

I was explainaing the enablement of  AMD's ABMC feature here. I can add
text about it.

> 
>>
>> The ABMC feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> ---
>> v4: Removed resctrl_arch_set_abmc_enabled and
>> resctrl_arch_get_abmc_enabled.
>>      Directly calling resctrl_abmc_enable and resctrl_abmc_disable.
>>      Renamed couple of functions.
>>      resctrl_abmc_msrwrite() -> resctrl_abmc_set_one()
>>      resctrl_abmc_setup() -> resctrl_abmc_set_all()
>>      Added rdtgroup_mutex lockdep asserts.
>>      Updated commit log and code comments.
>>
>> v3: No changes.
>>
>> v2: Few text changes in commit message.
>> ---
>>   arch/x86/include/asm/msr-index.h       |  1 +
>>   arch/x86/kernel/cpu/resctrl/internal.h |  8 ++++
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++++++++++++++++++++
>>   3 files changed, 71 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h
>> b/arch/x86/include/asm/msr-index.h
>> index e022e6eb766c..5f9a0139e98c 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -1171,6 +1171,7 @@
>>   #define MSR_IA32_MBA_BW_BASE        0xc0000200
>>   #define MSR_IA32_SMBA_BW_BASE        0xc0000280
>>   #define MSR_IA32_EVT_CFG_BASE        0xc0000400
>> +#define MSR_IA32_L3_QOS_EXT_CFG        0xc00003ff
>>     /* MSR_IA32_VMX_MISC bits */
>>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index d566251094b2..fabe40304798 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -97,6 +97,9 @@ cpumask_any_housekeeping(const struct cpumask *mask,
>> int exclude_cpu)
>>       return cpu;
>>   }
>>   +/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature */
>> +#define ABMC_ENABLE            BIT(0)
>> +
>>   struct rdt_fs_context {
>>       struct kernfs_fs_context    kfc;
>>       bool                enable_cdpl2;
>> @@ -436,6 +439,7 @@ struct rdt_parse_data {
>>    * @mbm_cfg_mask:    Bandwidth sources that can be tracked when Bandwidth
>>    *            Monitoring Event Configuration (BMEC) is supported.
>>    * @cdp_enabled:    CDP state of this resource
>> + * @abmc_enabled:    ABMC feature is enabled
>>    *
>>    * Members of this structure are either private to the architecture
>>    * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
>> @@ -450,6 +454,7 @@ struct rdt_hw_resource {
>>       unsigned int        mbm_width;
>>       unsigned int        mbm_cfg_mask;
>>       bool            cdp_enabled;
>> +    bool            abmc_enabled;
>>   };
>>   
> 
> ok, so here by making "abmc_enabled" a member of struct rdt_hw_resource
> this is
> an architecture specific property. This is reasonable since every
> architecture
> will look different. What is _not_ ok is that this causes the rest of the
> series to change resctrl fs to reach into the architecture code. For example,
> this work causes mbm_config_show() to now need to peek into struct
> rdt_hw_resource
> to see this value. That is not ok. All of the interactions between this
> field and resctrl fs needs to be via arch helpers:
> resctrl_arch_abmc_enable()/
> resctrl_arch_abmc_disable() and
> resctrl_arch_get_abmc_enabled()/resctrl_arch_set_abmc_enabled().

Sure. Will change it.

> 
>>   static inline struct rdt_hw_resource *resctrl_to_arch_res(struct
>> rdt_resource *r)
>> @@ -493,6 +498,9 @@ static inline bool resctrl_arch_get_cdp_enabled(enum
>> resctrl_res_level l)
>>     int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool
>> enable);
>>   +int resctrl_abmc_enable(enum resctrl_res_level l);
>> +void resctrl_abmc_disable(enum resctrl_res_level l);
> 
> Why do these need enum resctrl_res_level parameter?

It is really not required. It is always "RDT_RESOURCE_L3".
I can remove it.

> 
>> +
>>   /*
>>    * To return the common struct rdt_resource, which is contained in struct
>>    * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index ca692712b393..9148d1234ede 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2416,6 +2416,68 @@ int resctrl_arch_set_cdp_enabled(enum
>> resctrl_res_level l, bool enable)
>>       return 0;
>>   }
>>   +static void resctrl_abmc_set_one(void *arg)
>> +{
>> +    bool *enable = arg;
>> +    u64 msrval;
>> +
>> +    rdmsrl(MSR_IA32_L3_QOS_EXT_CFG, msrval);
>> +
>> +    if (*enable)
>> +        msrval |= ABMC_ENABLE;
>> +    else
>> +        msrval &= ~ABMC_ENABLE;
>> +
>> +    wrmsrl(MSR_IA32_L3_QOS_EXT_CFG, msrval);
>> +}
>> +
>> +static int resctrl_abmc_set_all(enum resctrl_res_level l, bool enable)
> 
> Should this function and resctrl_abmc_set_one() perhaps have "amd" in the
> name just to enforce that this is not filesystem code at all and specific
> and unique to AMD.

Sure.

> 
>> +{
>> +    struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
>> +    struct rdt_domain *d;
>> +
>> +    /*
>> +     * Update QOS_CFG MSR on all the CPUs associated with the resource
> 
> end of sentence needs "."

Sure.

> 
> This comment about the specific register seems more appropriate to
> resctrl_abmc_set_one() though. This function is a higher level
> enable/disable of the hardware feature.

Ok.

> 
>> +     * Hardware counters will reset after switching the monotor mode.
> 
> monotor -> monitor

Sure.

> 
>> +     * Reset the internal counters so that it is not considered as
>> +     * an overflow in next update.
> 
> For the first time the term "internal counters" is introduced. What does it
> mean?

How about "mbm counters"?

> 
>> +     */
>> +    list_for_each_entry(d, &r->domains, list) {
>> +        on_each_cpu_mask(&d->cpu_mask, resctrl_abmc_set_one, &enable, 1);
>> +        resctrl_arch_reset_rmid_all(r, d);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +int resctrl_abmc_enable(enum resctrl_res_level l)
>> +{
>> +    struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
>> +    int ret = 0;
>> +
>> +    lockdep_assert_held(&rdtgroup_mutex);
>> +
>> +    if (!hw_res->abmc_enabled) {
>> +        ret = resctrl_abmc_set_all(l, true);
>> +        if (!ret)
>> +            hw_res->abmc_enabled = true;
> 
> This error handling seems useless since resctrl_abmc_set_all() always returns
> 0 ... perhaps it should return void instead and this error handling dropped?
> With that this function can never fail either and it can just return void,
> but this is probably not what we want as the architecture call since other
> architectures may fail.

Understood.

> 
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +void resctrl_abmc_disable(enum resctrl_res_level l)
>> +{
>> +    struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
>> +
>> +    lockdep_assert_held(&rdtgroup_mutex);
>> +
>> +    if (hw_res->abmc_enabled) {
>> +        resctrl_abmc_set_all(l, false);
>> +        hw_res->abmc_enabled = false;
>> +    }
>> +}
>> +
>>   /*
>>    * We don't allow rdtgroup directories to be created anywhere
>>    * except the root directory. Thus when looking for the rdtgroup
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode
  2024-06-14  1:40   ` Reinette Chatre
@ 2024-06-19 16:25     ` Moger, Babu
  2024-06-20 22:05       ` Reinette Chatre
  0 siblings, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-19 16:25 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:40, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID and monitor the bandwidth as long as it is assigned.
>> ABMC mode is enabled by default when supported. System can be one mode
>> at a time (Legacy monitor mode or ABMC mode).
>>
>> Provide an interface to display the monitor mode on the system.
>>      $cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>>      [abmc]
>>      legacy
> 
> Output is different from cover letter and what this patch implements.

My bad. Will fix it,

> 
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Fixed the checks for legacy and abmc mode. Default it ABMC.
>>
>> v3: New patch to display ABMC capability.
>> ---
>>   Documentation/arch/x86/resctrl.rst     | 10 ++++++++++
>>   arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++++++++
>>   3 files changed, 34 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst
>> b/Documentation/arch/x86/resctrl.rst
>> index 7ab8172ef208..ab3cde61a124 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -261,6 +261,16 @@ with the following files:
>>       Available when ABMC feature is supported. The number of ABMC counters
>>       available for configuration.
>>   +"mbm_assign":
> 
> This name is not ideal but I am having trouble finding a better one ... I
> have
> seen you use "monitor mode" a couple of times (even in shortlog), so maybe
> that
> could be the start of a more generic name? "mbm_mode"?

mbm_mode sounds good. Like this.

$cat /sys/fs/resctrl/info/L3_MON/mbm_mode
[abmc]
legacy

Keeping just "legacy" vs mbm_legacy.


> 
>> +    Available when ABMC feature is supported. Reports the list of
>> assignable
> 
> Why not always make this file available? At least it will display that
> legacy mode is supported and it gives user space an always present file to
> query to
> determine support.

Ok. Sure.

> 
>> +    monitoring features supported. The enclosed brackets indicate which
>> +    feature is enabled.
>> +    ::
>> +
>> +      cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>> +      [abmc]
>> +      mbm_legacy
>> +
>>   "max_threshold_occupancy":
>>           Read/write file provides the largest value (in
>>           bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c
>> b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index e75a6146068b..b1d002e5e93d 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1071,6 +1071,7 @@ int __init rdt_get_mon_l3_config(struct
>> rdt_resource *r)
>>             if (rdt_cpu_has(X86_FEATURE_ABMC)) {
>>               r->abmc_capable = true;
>> +            resctrl_file_fflags_init("mbm_assign", RFTYPE_MON_INFO);
>>               /*
>>                * Query CPUID_Fn80000020_EBX_x05 for number of
>>                * ABMC counters
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 9148d1234ede..3071bbb7a15e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -856,6 +856,23 @@ static int rdtgroup_num_cntrs_show(struct
>> kernfs_open_file *of,
>>       return 0;
>>   }
>>   +static int rdtgroup_mbm_assign_show(struct kernfs_open_file *of,
>> +                    struct seq_file *s, void *v)
>> +{
>> +    struct rdt_resource *r = of->kn->parent->priv;
>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> 
> Please use arch helper to just get abmc state instead of fs code
> digging into arch structures.

Ok. I will use "resctrl_arch_get_abmc_enabled()" to get abmc state.

> 
>> +
>> +    if (hw_res->abmc_enabled) {
>> +        seq_puts(s, "[abmc]\n");
>> +        seq_puts(s, "mbm_legacy\n");
>> +    } else {
>> +        seq_puts(s, "abmc\n");
>> +        seq_puts(s, "[mbm_legacy]\n");
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   #ifdef CONFIG_PROC_CPU_RESCTRL
>>     /*
>> @@ -1913,6 +1930,12 @@ static struct rftype res_common_files[] = {
>>           .seq_show    = mbm_local_bytes_config_show,
>>           .write        = mbm_local_bytes_config_write,
>>       },
>> +    {
>> +        .name        = "mbm_assign",
>> +        .mode        = 0444,
>> +        .kf_ops        = &rdtgroup_kf_single_ops,
>> +        .seq_show    = rdtgroup_mbm_assign_show,
>> +    },
>>       {
>>           .name        = "cpus",
>>           .mode        = 0644,
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap
  2024-06-14  1:42   ` Reinette Chatre
@ 2024-06-19 17:03     ` Moger, Babu
  2024-06-20 22:20       ` Reinette Chatre
  0 siblings, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-19 17:03 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:42, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> Hardware provides a set of counters when the ABMC feature is supported.
>> These counters are used for enabling the events in resctrl group when
>> the feature is supported.
>>
>> Introduce num_cntrs_free_map bitmap to track available and free counters.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Changed the name to num_cntrs where applicable.
>>      Used bitmap apis.
>>      Added more comments for the globals.
>>
>> v3: Changed the bitmap name to assign_cntrs_free_map. Removed abmc
>>      from the name.
>>
>> v2: Changed the bitmap name to assignable_counter_free_map from
>>      abmc_counter_free_map.
>> ---
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++++++++
>>   1 file changed, 23 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 3071bbb7a15e..400ae405e10e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -186,6 +186,23 @@ bool closid_allocated(unsigned int closid)
>>       return !test_bit(closid, &closid_free_map);
>>   }
>>   +/*
>> + * ABMC Counter bitmap and length for tracking available counters.
>> + * ABMC feature provides set of hardware counters for enabling events.
>> + * Each event takes one hardware counter. Kernel needs to keep track
>> + * of number of available counters.
>> + */
>> +static unsigned long num_cntrs_free_map;
> 
> Why does variable have "num" in its name? That seems strange. How
> about just "mon_cntrs_free_map

It came from patch 4/19.

struct resctrl_mon {
        int                     num_rmid;
+       int                     num_cntrs;
        struct list_head        evt_list;
 };

num_cntrs_free_map is a bitmap representing num_cntrs. Kept the matching
name for better understanding. Renaming it will loose that connection.
If I rename then I will have to rename both.

How about mbm_cntrs and mbm_cntrs_free_map?


> 
>> +static u32 num_cntrs_free_map_len;
> 
> Same comment about "num" ... also, any special reason why u32 is needed?

Only reason is, it is supposed to be unsigned.  I can change it "unsigned
int".

> 
>> +
>> +static void num_cntrs_init(void)
> 
> mon_cntrs_init() ?

mbm_cntrs_init?

> 
>> +{
>> +    struct rdt_resource *r =
>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> +    bitmap_fill(&num_cntrs_free_map, r->mon.num_cntrs);
>> +    num_cntrs_free_map_len = r->mon.num_cntrs;
>> +}
>> +
>>   /**
>>    * rdtgroup_mode_by_closid - Return mode of resource group with closid
>>    * @closid: closid if the resource group
>> @@ -2459,6 +2476,12 @@ static int resctrl_abmc_set_all(enum
>> resctrl_res_level l, bool enable)
> 
> resctrl_abmc_set_all() was initially created as a complement of
> resctrl_abmc_set_one() ... but with more initialization added to
> resctrl_abmc_set_all() this relationship becomes vague.

Yes. Understood. Let me know if want me to change anything here.

> 
>>       struct rdt_resource *r = &rdt_resources_all[l].r_resctrl;
>>       struct rdt_domain *d;
>>   +    /*
>> +     * The bitmap num_cntrs_free_map needs to be reset when switching
>> +     * the monitor mode.
> 
> This comment puts in words what can be understood by just looking at the
> code.
> What would be useful would be to explain _why_ this needs to be done ...
> what happens to the previously assigned counters?

Sure. Will add more details.

> 
> 
>> +     */
>> +    num_cntrs_init();
>> +
>>       /*
>>        * Update QOS_CFG MSR on all the CPUs associated with the resource
>>        * Hardware counters will reset after switching the monotor mode.
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  2024-06-14  1:43   ` Reinette Chatre
@ 2024-06-19 18:46     ` Moger, Babu
  2024-06-27 18:51     ` Moger, Babu
  1 sibling, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-19 18:46 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:43, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
>> supported, the bandwidth events can be configured to track specific
>> events. The event configuration is domain specific. ABMC (Assignable
>> Bandwidth Monitoring Counters) feature needs event configuration
>> information to assign hardware counter to an RMID. Event configurations
>> are not stored in resctrl but instead always read from or written to
>> hardware directly when prompted by user space.
>>
>> Read the event configuration from the hardware during the domain
>> initialization. Save the configuration information in the rdt_hw_domain,
>> so it can be used for counter assignment.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Read the configuration information from the hardware to initialize.
>>      Added few commit messages.
>>      Fixed the tab spaces.
>>
>> v3: Minor changes related to rebase in mbm_config_write_domain.
>>
>> v2: No changes.
>> ---
>>   arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>>   arch/x86/kernel/cpu/resctrl/internal.h |  5 +++++
>>   arch/x86/kernel/cpu/resctrl/monitor.c  | 21 +++++++++++++++++++++
>>   3 files changed, 28 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
>> b/arch/x86/kernel/cpu/resctrl/core.c
>> index ec93f6a50308..856c46d12177 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -542,6 +542,8 @@ static void domain_add_cpu(int cpu, struct
>> rdt_resource *r)
>>           return;
>>       }
>>   +    arch_domain_mbm_evt_config(hw_dom);
>> +
>>       list_add_tail_rcu(&d->list, add_pos);
>>         err = resctrl_online_domain(r, d);
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 5e7e76cd512f..60a1ca0a11a7 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -373,6 +373,8 @@ struct arch_mbm_state {
>>    * @ctrl_val:    array of cache or mem ctrl values (indexed by CLOSID)
>>    * @arch_mbm_total:    arch private state for MBM total bandwidth
>>    * @arch_mbm_local:    arch private state for MBM local bandwidth
>> + * @mbm_total_cfg:    MBM total bandwidth configuration
>> + * @mbm_local_cfg:    MBM local bandwidth configuration
>>    *
>>    * Members of this structure are accessed via helpers that provide
>> abstraction.
>>    */
>> @@ -381,6 +383,8 @@ struct rdt_hw_domain {
>>       u32                *ctrl_val;
>>       struct arch_mbm_state        *arch_mbm_total;
>>       struct arch_mbm_state        *arch_mbm_local;
>> +    u32                mbm_total_cfg;
>> +    u32                mbm_local_cfg;
>>   };
> 
> Similar to the abmc_enabled member of rdt_hw_resource, these new
> members of rdt_hw_domain are architecture specific and should never be
> touched directly by resctrl fs code, for example, from mbm_config_show().

Yes. Understood. Will check all the accesses.

> 
>>     static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct
>> rdt_domain *r)
>> @@ -622,6 +626,7 @@ void __check_limbo(struct rdt_domain *d, bool
>> force_free);
>>   void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>>   void __init resctrl_file_fflags_init(const char *config,
>>                        unsigned long fflags);
>> +void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
>>   void rdt_staged_configs_clear(void);
>>   bool closid_allocated(unsigned int closid);
>>   int resctrl_find_cleanest_closid(void);
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c
>> b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index b1d002e5e93d..ab0f4bb49bd9 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1093,6 +1093,27 @@ int __init rdt_get_mon_l3_config(struct
>> rdt_resource *r)
>>       return 0;
>>   }
>>   +void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom)
>> +{
>> +    u64 msrval;
>> +
>> +    /*
>> +     * Read the configuration registers QOS_EVT_CFG_n, where <n> is
>> +     * the BMEC event number (EvtID).
>> +     * 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
>> +     * 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
> 
> This is what mon_event_config_index_get() is for, no?

Yes, it is. I can change it.

> 
>> +     */
>> +    if (mbm_total_event.configurable) {
>> +        rdmsrl(MSR_IA32_EVT_CFG_BASE, msrval);
>> +        hw_dom->mbm_total_cfg = msrval & MAX_EVT_CONFIG_BITS;
>> +    }
>> +
>> +    if (mbm_local_event.configurable) {
>> +        rdmsrl(MSR_IA32_EVT_CFG_BASE + 1, msrval);
>> +        hw_dom->mbm_local_cfg = msrval & MAX_EVT_CONFIG_BITS;
>> +    }
>> +}
>> +
>>   void __exit rdt_put_mon_l3_config(void)
>>   {
>>       dom_data_exit();
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment
  2024-06-14  1:44   ` Reinette Chatre
@ 2024-06-19 20:10     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-19 20:10 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:44, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID and monitor the bandwidth as long as the counter
>> is assigned. The bandwidth events will be tracked by the hardware until
>> the user changes the configuration. Each resctrl group can configure
>> maximum two counters, one for total event and one for local event.
>>
>> The counters are configured by writing to MSR L3_QOS_ABMC_CFG.
>> Configuration is done by setting the counter id, bandwidth source (RMID),
>> bandwidth types. Reading L3_QOS_ABMC_DSC returns the configuration of
> 
> What is a "bandwidth type" here?

Its bandwidth configuration(reads, writes etc). It is
mbm_total_bytes_config", "mbm_local_bytes_config" in
https://www.kernel.org/doc/Documentation/x86/resctrl.rst

I will add that text.

> 
>> the counter id specified in L3_QOS_ABMC_CFG.
>>
>> Attempts to read or write these MSRs when ABMC is not enabled will result
>> in a #GP(0) exception.
>>
>> MSR L3_QOS_ABMC_CFG (0xC000_03FDh) and L3_QOS_ABMC_DSC (0xC000_03FEh)
>> details.
>> ==========================================================================
>> Bits     Mnemonic    Description        Access Type   Reset Value
>> ==========================================================================
>> 63     CfgEn         Configuration Enable     R/W         0
>>
>> 62     CtrEn         Counter Enable         R/W         0
>>
>> 61:53     –         Reserved         MBZ         0
>>
>> 52:48     CtrID         Counter Identifier    R/W        0
>>
>> 47     IsCOS        BwSrc field is a COS    R/W        0
>>             (not an RMID)
>>
>> 46:44     –        Reserved        MBZ        0
>>
>> 43:32    BwSrc        Bandwidth Source    R/W        0
>>             (RMID or COS)
>>
>> 31:0    BwType        Bandwidth types to    R/W        0
>>             track for this counter
>> ==========================================================================
>>
> 
> This is copy&paste from AMD spec but needs some translation to map it to
> resctrl.

Yes. Sure. Will elaborate.

> 
>> The feature details are documented in the APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> ---
>> v4: Added more descriptions.
>>      Changed the name abmc_ctr_id to ctr_id.
>>      Added L3_QOS_ABMC_DSC. Used for reading the configuration.
>>
>> v3: No changes.
>>
>> v2: No changes.
>> ---
>>   arch/x86/include/asm/msr-index.h       |  2 ++
>>   arch/x86/kernel/cpu/resctrl/internal.h | 30 ++++++++++++++++++++++++++
>>   2 files changed, 32 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h
>> b/arch/x86/include/asm/msr-index.h
>> index 5f9a0139e98c..6d2fe39ac68f 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -1172,6 +1172,8 @@
>>   #define MSR_IA32_SMBA_BW_BASE        0xc0000280
>>   #define MSR_IA32_EVT_CFG_BASE        0xc0000400
>>   #define MSR_IA32_L3_QOS_EXT_CFG        0xc00003ff
>> +#define MSR_IA32_L3_QOS_ABMC_CFG    0xc00003fd
>> +#define MSR_IA32_L3_QOS_ABMC_DSC    0xc00003fe
>>     /* MSR_IA32_VMX_MISC bits */
>>   #define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 60a1ca0a11a7..45ed33f4f0ff 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -107,6 +107,9 @@ cpumask_any_housekeeping(const struct cpumask *mask,
>> int exclude_cpu)
>>   #define ASSIGN_TOTAL            BIT(QOS_L3_MBM_TOTAL_EVENT_ID)
>>   #define ASSIGN_LOCAL            BIT(QOS_L3_MBM_LOCAL_EVENT_ID)
>>   +/* Maximum assignable counters per resctrl group */
>> +#define MAX_CNTRS        2
>> +
>>   struct rdt_fs_context {
>>       struct kernfs_fs_context    kfc;
>>       bool                enable_cdpl2;
>> @@ -211,6 +214,7 @@ enum rdtgrp_mode {
>>    * @crdtgrp_list:        child rdtgroup node list
>>    * @rmid:            rmid for this rdtgroup
>>    * @mon_state:            Assignment state of the group
>> + * @ctr_id:            ABMC counter ids assigned to this group
>>    */
>>   struct mongroup {
>>       struct kernfs_node    *mon_data_kn;
>> @@ -218,6 +222,7 @@ struct mongroup {
>>       struct list_head    crdtgrp_list;
>>       u32            rmid;
>>       u32            mon_state;
>> +    u32            ctr_id[MAX_CNTRS];
> 
> To match "num_cntrs" it seems "cntr_id" may be more appropriate?

Yes. Will change appropriately.

> 
>>   };
>>     /**
>> @@ -568,6 +573,31 @@ union cpuid_0x10_x_edx {
>>       unsigned int full;
>>   };
>>   +/*
>> + * ABMC counters can be configured  by writing to L3_QOS_ABMC_CFG.
> 
> extra space

ok

> 
>> + * @bw_type        : Bandwidth types to track for this counter
> 
> What is a "Bandwidth type"? What are possible values?

Its bandwidth configuration(reads, writes etc). It is
mbm_total_bytes_config", "mbm_local_bytes_config" in
https://www.kernel.org/doc/Documentation/x86/resctrl.rst
It is a bitmap. Will clarify  that.


> 
>> + * @bw_src        : Bandwidth Source (RMID or CLOSID)
>> + * @reserved1        : Reserved
>> + * @is_cos        : BwSrc field is a COS (not an RMID)
> 
> Please be consistent wrt "CLOS"/"CLOSID" in comment and member name

Sure. Will change it to CLOSID.

> 
>> + * @ctr_id        : Counter Identifier
> 
> Please be consistent with "cntr" vs "ctr". Earlier the series
> introduces "num_cntrs" so cntr seems appropriate?

Sure.

> 
>> + * @reserved        : Reserved
>> + * @ctr_en        : Counter Enable bit
>> + * @cfg_en        : Configuration Enable bit
> 
> Why are there two enable bits? How are they used?

I am setting both these bits when configuring, I need some clarifications
on these bits with h/w team. I need to know how different combinations
changes the behavior. I will confirm about this.

> 
> (please match other kernel doc wrt : usage)
> 
>> + */
>> +union l3_qos_abmc_cfg {
>> +    struct {
>> +        unsigned long  bw_type    :32,
>> +                   bw_src    :12,
>> +                   reserved1: 3,
>> +                   is_cos    : 1,
>> +                   ctr_id    : 5,
>> +                   reserved    : 9,
>> +                   ctr_en    : 1,
>> +                   cfg_en    : 1;
>> +    } split;
>> +    unsigned long full;
>> +};
>> +
>>   void rdt_last_cmd_clear(void);
>>   void rdt_last_cmd_puts(const char *s);
>>   __printf(1, 2)
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter
  2024-06-14  1:48   ` Reinette Chatre
@ 2024-06-19 22:38     ` Moger, Babu
  2024-06-20 22:50       ` Reinette Chatre
  0 siblings, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-19 22:38 UTC (permalink / raw)
  To: Reinette Chatre, Babu Moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/2024 8:48 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> ABMC feature requires to users to assign a hardware counter to an RMID
> 
> "requires to users" -> "requires users"?

Sure.

> 
>> to monitor the events. Provide the interfaces to assign a counter.
>>
>> Individual counters are configured by writing to L3_QOS_ABMC_CFG MSR
>> and specifying the counter id, bandwidth source, and bandwidth types.
> 
> Where is discription of what this patch does and why?
> 
> This and following patches seem to introduce building blocks that
> are only used later in series. These building blocks are introduced
> without any information about what they do and why and that makes it
> difficult to understand how this work will fall into place.

Ok. Will elaborate this.


> 
>>
>> The feature details are documented in the APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>>      Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable 
>> Bandwidth
>>      Monitoring (ABMC).
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> ---
>> v4: Commit message update.
>>      User bitmap APIs where applicable.
>>      Changed the interfaces considering MPAM(arm).
>>      Added domain specific assignment.
>>
>> v3: Removed the static from the prototype of rdtgroup_assign_abmc.
>>      The function is not called directly from user anymore. These
>>      changes are related to global assignment interface.
>>
>> v2: Minor text changes in commit message.
>> ---
>>   arch/x86/kernel/cpu/resctrl/internal.h |   3 +
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 101 +++++++++++++++++++++++++
>>   2 files changed, 104 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h 
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 45ed33f4f0ff..a88c8fc5e4df 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -657,6 +657,9 @@ void rdt_domain_reconfigure_cdp(struct 
>> rdt_resource *r);
>>   void __init resctrl_file_fflags_init(const char *config,
>>                        unsigned long fflags);
>>   void arch_domain_mbm_evt_config(struct rdt_hw_domain *hw_dom);
>> +int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
>> +            u32 ctr_id, u32 closid, bool enable);
>> +int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid);
>>   void rdt_staged_configs_clear(void);
>>   bool closid_allocated(unsigned int closid);
>>   int resctrl_find_cleanest_closid(void);
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c 
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 0e425c91fa46..48df76499a04 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -203,6 +203,19 @@ static void num_cntrs_init(void)
>>       num_cntrs_free_map_len = r->mon.num_cntrs;
>>   }
>> +static int assign_cntrs_alloc(void)
> 
> Only one counter is allocated and "assign" is unexpected since it
> seems to imply how this counter will be used.
> It can just be "mon_cntr_alloc()"?

Sure.

> 
>> +{
>> +    u32 ctr_id = find_first_bit(&num_cntrs_free_map,
>> +                    num_cntrs_free_map_len);
>> +
>> +    if (ctr_id >= num_cntrs_free_map_len)
>> +        return -ENOSPC;
>> +
>> +    __clear_bit(ctr_id, &num_cntrs_free_map);
>> +
>> +    return ctr_id;
>> +}
>> +
>>   /**
>>    * rdtgroup_mode_by_closid - Return mode of resource group with closid
>>    * @closid: closid if the resource group
>> @@ -1830,6 +1843,94 @@ static ssize_t 
>> mbm_local_bytes_config_write(struct kernfs_open_file *of,
>>       return ret ?: nbytes;
>>   }
>> +static void rdtgroup_abmc_cfg(void *info)
>> +{
>> +    u64 *msrval = info;
>> +
>> +    wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *msrval);
>> +}
>> +
> 
> Please add comments to these functions to summarize what
> they do.

Sure.

> 
>> +int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
>> +            u32 ctr_id, u32 closid, bool enable)
>> +{
>> +    struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
>> +    union l3_qos_abmc_cfg abmc_cfg = { 0 };
>> +    struct arch_mbm_state *arch_mbm;
>> +
>> +    abmc_cfg.split.cfg_en = 1;
>> +    abmc_cfg.split.ctr_en = enable ? 1 : 0;
>> +    abmc_cfg.split.ctr_id = ctr_id;
>> +    abmc_cfg.split.bw_src = rmid;
>> +
>> +    /*
>> +     * Read the event configuration from the domain and pass it as
>> +     * bw_type.
>> +     */
>> +    if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
>> +        abmc_cfg.split.bw_type = hw_dom->mbm_total_cfg;
>> +        arch_mbm = &hw_dom->arch_mbm_total[rmid];
>> +    } else {
>> +        abmc_cfg.split.bw_type = hw_dom->mbm_local_cfg;
>> +        arch_mbm = &hw_dom->arch_mbm_local[rmid];
>> +    }
>> +
>> +    smp_call_function_any(&d->cpu_mask, rdtgroup_abmc_cfg, &abmc_cfg, 
>> 1);
>> +
>> +    /* Reset the internal counters */
> 
> "internal counters"? This needs a definition ... but since this is not
> a new data structure the comment can be more specific about what is done
> and why.

How about?
/* Reset internal mbm event counters. */

> 
>> +    if (arch_mbm)
>> +        memset(arch_mbm, 0, sizeof(struct arch_mbm_state));
>> +
>> +    return 0;
> 
> If this function always succeeds then it can just be void?

Sure. We can do that.

> 
>> +}
>> +
>> +int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid)
> 
> Please use consistent naming. Note how the filename is "rdtgroup.c" and 
> this
> function operates on "struct rdtgroup" and if you take a closer look at
> the functions in this file and the header where you add this function you
> will notice a distinct pattern of "rdtgroup_" used as prefix. There really
> is no reason to start a new namespace for this.

Sure.

> 
>> +{
>> +    struct rdt_resource *r = 
>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +    int ctr_id = 0, index;
>> +    struct rdt_domain *d;
>> +    u32 mon_state;
>> +
>> +    index = mon_event_config_index_get(evtid);
>> +    if (index == INVALID_CONFIG_INDEX) {
>> +        rdt_last_cmd_puts("Invalid event id\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
>> +        mon_state = ASSIGN_TOTAL;
>> +    } else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
>> +        mon_state = ASSIGN_LOCAL;
>> +    } else {
>> +        rdt_last_cmd_puts("Invalid event id\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* Nothing to do if event has been assigned already */
>> +    if (rdtgrp->mon.mon_state & mon_state) {
> 
> Why is this mon_state needed? Is it not redundant when considering that
> struct mongroup has an array of counter IDs that can already be used
> to see if a counter is assigned or not?
> What if there is a new:
> #define MON_CNTR_UNSET U32_MAX
> 
> When cntr_id[index] == MON_CNTR_UNSET then it is not assigned.

Yes. That can be done. That case mon_state is not required. We just need 
the index. Also, we need to initialize the ctrl_id when the group is 
created. Hope that should work.

> 
>> +        rdt_last_cmd_puts("ABMC counter is assigned already\n");
>> +        return 0;
>> +    }
>> +
>> +    /*
>> +     * Allocate a new counter and update domains
>> +     */
>> +    ctr_id = assign_cntrs_alloc();
>> +    if (ctr_id < 0) {
>> +        rdt_last_cmd_puts("Out of ABMC counters\n");
>> +        return -ENOSPC;
>> +    }
>> +
>> +    rdtgrp->mon.ctr_id[index] = ctr_id;
>> +
>> +    list_for_each_entry(d, &r->domains, list)
>> +        resctrl_arch_assign(d, evtid, rdtgrp->mon.rmid, ctr_id,
>> +                    rdtgrp->closid, 1);
>> +
>> +    rdtgrp->mon.mon_state |= mon_state;
>> +
>> +    return 0;
>> +}
>> +
>>   /* rdtgroup information files for one cache resource. */
>>   static struct rftype res_common_files[] = {
>>       {
> 
> Reinette
> 

-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 15/19] x86/resctrl: Add the interface to unassign ABMC counter
  2024-06-14  1:49   ` Reinette Chatre
@ 2024-06-20 13:48     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-20 13:48 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:49, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> Hardware provides a limited number of ABMC counters. Once all the
>> counters are exhausted, counters need to be freed for new assignments.
>>
>> Provide the interface to unassign the counter.
> 
> Please write a proper changelog. This needs information that explains
> what this patch does and why.

Sure. Will elaborate on this.
> 
>>
>> The feature details are documented in the APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>>      Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable
>> Bandwidth
>>      Monitoring (ABMC).
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> ---
>> v4: Added domain specific unassign feature.
>>      Few name changes.
>>
>> v3: Removed the static from the prototype of rdtgroup_unassign_abmc.
>>      The function is not called directly from user anymore. These
>>      changes are related to global assignment interface.
>>
>> v2: No changes.
>> ---
>>   arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 ++++++++++++++++++++++++++
>>   2 files changed, 44 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index a88c8fc5e4df..e16244895350 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -660,6 +660,8 @@ void arch_domain_mbm_evt_config(struct rdt_hw_domain
>> *hw_dom);
>>   int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
>>               u32 ctr_id, u32 closid, bool enable);
>>   int resctrl_grp_assign(struct rdtgroup *rdtgrp, u32 evtid);
>> +int resctrl_grp_unassign(struct rdtgroup *rdtgrp, u32 evtid);
>> +void num_cntrs_free(u32 ctr_id);
>>   void rdt_staged_configs_clear(void);
>>   bool closid_allocated(unsigned int closid);
>>   int resctrl_find_cleanest_closid(void);
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 48df76499a04..5ea1e58c7201 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -216,6 +216,11 @@ static int assign_cntrs_alloc(void)
>>       return ctr_id;
>>   }
>>   +void num_cntrs_free(u32 ctr_id)
> 
> The name does not reflect what it does. It neither frees the "num_cntrs"
> information
> nor does it free "num_cntrs" of counters. How about "mon_cntr_free()"?

Sure. If we change the name of num_cntrs to mbm_cntrs, then we can change
to mbm_cntr_free().
Here is the comments on name change.
https://lore.kernel.org/lkml/62fe683f-3a4c-4280-8770-d2aaff478d33@amd.com/


> 
> 
>> +{
>> +    __set_bit(ctr_id, &num_cntrs_free_map);
>> +}
>> +
>>   /**
>>    * rdtgroup_mode_by_closid - Return mode of resource group with closid
>>    * @closid: closid if the resource group
>> @@ -1931,6 +1936,43 @@ int resctrl_grp_assign(struct rdtgroup *rdtgrp,
>> u32 evtid)
>>       return 0;
>>   }
>>   +int resctrl_grp_unassign(struct rdtgroup *rdtgrp, u32 evtid)
> 
> Same comment wrt namespace. Also this function needs a description.

Sure.

> 
>> +{
>> +    struct rdt_resource *r =
>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +    struct rdt_domain *d;
>> +    u32 mon_state;
>> +    int index;
>> +
>> +    index = mon_event_config_index_get(evtid);
>> +    if (index == INVALID_CONFIG_INDEX) {
>> +        pr_warn_once("Invalid event id %d\n", evtid);
>> +        return -EINVAL;
>> +    }
>> +
>> +    if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
>> +        mon_state = ASSIGN_TOTAL;
>> +    } else if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
>> +        mon_state = ASSIGN_LOCAL;
>> +    } else {
>> +        rdt_last_cmd_puts("Invalid event id\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if (rdtgrp->mon.mon_state & mon_state) {
>> +        list_for_each_entry(d, &r->domains, list)
>> +            resctrl_arch_assign(d, evtid, rdtgrp->mon.rmid,
>> +                        rdtgrp->mon.ctr_id[index],
>> +                        rdtgrp->closid, 0);
>> +
>> +        /* Update the counter bitmap */
>> +        num_cntrs_free(rdtgrp->mon.ctr_id[index]);
>> +    }
>> +
>> +    rdtgrp->mon.mon_state &= ~mon_state;
>> +
>> +    return 0;
>> +}
>> +
>>   /* rdtgroup information files for one cache resource. */
>>   static struct rftype res_common_files[] = {
>>       {
> 
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount
  2024-06-14  1:50   ` Reinette Chatre
@ 2024-06-20 14:46     ` Moger, Babu
  2024-06-20 22:49       ` Reinette Chatre
  0 siblings, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-20 14:46 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:50, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> Enable ABMC by default when feature is supported.
> 
> Why enable/disable it on every mount/unmount cycle though? Why not just
> enable it at the beginning and only change if requested by user?

I think we can do that. This needs to be done at __init
rdt_get_mon_l3_config(). Will let you if i see any issues with that.

> 
>>
>> Counters are assigned automatically on group creation. If the counters
>> are exhausted, report the warnings and continue. It is not required to
>> fail group creation for assignment failures. Users will have the option
>> to modify the assignments.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Few name changes based on the upstream discussion.
>>      Commit message update.
>>
>> v3: This is a new patch. Patch addresses the upstream comment to enable
>>      ABMC feature by default if the feature is available.
>> ---
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 69 ++++++++++++++++++++++++++
>>   1 file changed, 69 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 5ea1e58c7201..f452b6d9bb99 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2756,6 +2756,7 @@ static void rdt_disable_ctx(void)
>>   {
>>       resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
>>       resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
>> +    resctrl_abmc_disable(RDT_RESOURCE_L3);
>>       set_mba_sc(false);
>>         resctrl_debug = false;
>> @@ -2786,6 +2787,8 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
>>       if (ctx->enable_debug)
>>           resctrl_debug = true;
>>   +    resctrl_abmc_enable(RDT_RESOURCE_L3);
>> +
>>       return 0;
>>     out_cdpl3:
>> @@ -2882,6 +2885,41 @@ static void schemata_list_destroy(void)
>>       }
>>   }
>>   +static int resctrl_assign_events(struct rdtgroup *rdtgrp)
>> +{
>> +    struct rdt_resource *r =
>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> +    int ret = 0;
>> +
>> +    if (!hw_res->abmc_enabled)
>> +        return 0;
> 
> resctrl fs should not be peeking in arch structure

Sure.

> 
>> +
>> +    if (is_mbm_total_enabled())
>> +        ret = resctrl_grp_assign(rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> +    if (!ret && is_mbm_local_enabled())
>> +        ret = resctrl_grp_assign(rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +
> 
> This function needs a comment about why it is ok to return with error and
> partial changes.

Sure.

> 
>> +    return ret;
>> +}
>> +
>> +static int resctrl_unassign_events(struct rdtgroup *rdtgrp)
>> +{
>> +    struct rdt_resource *r =
>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> +    int ret = 0;
>> +
>> +    if (!hw_res->abmc_enabled)
>> +        return 0;
>> +
>> +    if (is_mbm_total_enabled())
>> +        ret = resctrl_grp_unassign(rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> +    if (!ret && is_mbm_local_enabled())
>> +        ret = resctrl_grp_unassign(rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +    return ret;
>> +}
>> +
>>   static int rdt_get_tree(struct fs_context *fc)
>>   {
>>       struct rdt_fs_context *ctx = rdt_fc2context(fc);
>> @@ -2941,6 +2979,14 @@ static int rdt_get_tree(struct fs_context *fc)
>>           if (ret < 0)
>>               goto out_mongrp;
>>           rdtgroup_default.mon.mon_data_kn = kn_mondata;
>> +
>> +        /*
>> +         * Assign the monitor counters if it is available. If it fails,
>> +         * report the warnings and continue. It is not nessaccery to
>> +         * fail here.
>> +         */
> 
> nessaccery -> necessary

Sure.
> 
> Please elaborate why it is not necessary to fail.

Sure.

> 
> 
>> +        if (resctrl_assign_events(&rdtgroup_default) < 0)
>> +            rdt_last_cmd_puts("Monitor assignment failed\n");
> 
> How will user know that there is a warning? This does not return an error
> so nothing will prompt user to check the status file. Perhaps add a comment
> in the docs to help users at least know that this exists. Another helpful

Sure. Will add text about this in the documentation.

> thing will be to have the counter return "Unassigned" if it has not been
> assigned
> since users familiar with old interface may see "Unavailable" without
> knowing that user action is required to get a working event.

Not so sure about interpreting hardware reported error.

> 
>>       }
>>         ret = rdt_pseudo_lock_init();
>> @@ -3214,6 +3260,8 @@ static void rdt_kill_sb(struct super_block *sb)
>>       cpus_read_lock();
>>       mutex_lock(&rdtgroup_mutex);
>>   +    resctrl_unassign_events(&rdtgroup_default);
>> +
>>       rdt_disable_ctx();
>>         /*Put everything back to default values. */
>> @@ -3752,6 +3800,14 @@ static int rdtgroup_mkdir_mon(struct kernfs_node
>> *parent_kn,
>>           goto out_unlock;
>>       }
>>   +    /*
>> +     * Assign the monitor counters if it is available. If it fails,
>> +     * report the warnings and continue. It is not nessaccery to
>> +     * fail here.
>> +     */
> 
> Fix copy&paste here and below.

Sure.

> 
>> +    if (resctrl_assign_events(rdtgrp) < 0)
>> +        rdt_last_cmd_puts("Monitor assignment failed\n");
>> +
>>       kernfs_activate(rdtgrp->kn);
>>         /*
>> @@ -3796,6 +3852,14 @@ static int rdtgroup_mkdir_ctrl_mon(struct
>> kernfs_node *parent_kn,
>>       if (ret)
>>           goto out_closid_free;
>>   +    /*
>> +     * Assign the monitor counters if it is available. If it fails,
>> +     * report the warnings and continue. It is not nessaccery to
>> +     * fail here.
>> +     */
>> +    if (resctrl_assign_events(rdtgrp) < 0)
>> +        rdt_last_cmd_puts("Monitor assignment failed\n");
>> +
>>       kernfs_activate(rdtgrp->kn);
>>         ret = rdtgroup_init_alloc(rdtgrp);
>> @@ -3891,6 +3955,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup
>> *rdtgrp, cpumask_var_t tmpmask)
>>       update_closid_rmid(tmpmask, NULL);
>>         rdtgrp->flags = RDT_DELETED;
>> +
>> +    resctrl_unassign_events(rdtgrp);
>> +
>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>         /*
>> @@ -3937,6 +4004,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup
>> *rdtgrp, cpumask_var_t tmpmask)
>>       cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>>       update_closid_rmid(tmpmask, NULL);
>>   +    resctrl_unassign_events(rdtgrp);
>> +
>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>       closid_free(rdtgrp->closid);
>>   
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy
  2024-06-14  1:51   ` Reinette Chatre
@ 2024-06-20 14:53     ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-20 14:53 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:51, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> Introduce interface to switch between ABMC and legacy_mbm modes.
> 
> shortlog and first sentence of changelog do not match: mbm_legacy vs
> legacy_mbm?

Will correct it.

> 
>>
>> By default ABMC is enabled on resctrl mount if the feature is available.
>> However, user will have the option to go back to legacy_mbm if required.
>>
>> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>> [abmc]
>> mbm_legacy
>>
>> To enable the legacy monitoring feature:
>> $ echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign
> 
> Missing information about user visible impact to counters/events
> and any mitigations needed in implementation.

Sure. Will add the details.

> 
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Minor commit text changes. Keep the default to ABMC when supported.
>>
>> v3: New patch to address the review comments from upstream.
>> ---
>>   Documentation/arch/x86/resctrl.rst     | 10 +++++++
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 +++++++++++++++++++++++++-
>>   2 files changed, 49 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst
>> b/Documentation/arch/x86/resctrl.rst
>> index ab3cde61a124..fd050d4d22cd 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -271,6 +271,16 @@ with the following files:
>>         [abmc]
>>         mbm_legacy
>>   +    * To enable ABMC feature:
>> +      ::
>> +
>> +        # echo  "abmc" > /sys/fs/resctrl/info/L3_MON/mbm_assign
>> +
>> +    * To enable the legacy monitoring feature:
>> +      ::
>> +
>> +        # echo  "mbm_legacy" > /sys/fs/resctrl/info/L3_MON/mbm_assign
>> +
> 
> No information about what the features are or what will happen on such a
> switch.

Ok. Sure.  Will add more documentation.

> 
>>   "max_threshold_occupancy":
>>           Read/write file provides the largest value (in
>>           bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index f452b6d9bb99..d77ff059269a 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -908,6 +908,43 @@ static int rdtgroup_mbm_assign_show(struct
>> kernfs_open_file *of,
>>       return 0;
>>   }
>>   +/*
>> + * rdtgroup_mode_write - Modify the resource group's mode
> 
> Comment does not match function name. Seems like "mode" is the generic term
> to use instead of "assign".

Sure. Will fix.

> 
>> + */
>> +static ssize_t rdtgroup_mbm_assign_write(struct kernfs_open_file *of,
>> +                     char *buf, size_t nbytes,
>> +                     loff_t off)
>> +{
>> +    struct rdt_resource *r = of->kn->parent->priv;
>> +    int ret = 0;
>> +
>> +    if (!r->abmc_capable)
>> +        return -EINVAL;
>> +
>> +    /* Valid input requires a trailing newline */
>> +    if (nbytes == 0 || buf[nbytes - 1] != '\n')
>> +        return -EINVAL;
>> +
>> +    buf[nbytes - 1] = '\0';
>> +
>> +    cpus_read_lock();
>> +    mutex_lock(&rdtgroup_mutex);
>> +
>> +    rdt_last_cmd_clear();
>> +
>> +    if (!strcmp(buf, "mbm_legacy"))
>> +        resctrl_abmc_disable(RDT_RESOURCE_L3);
>> +    else if (!strcmp(buf, "abmc"))
>> +        ret = resctrl_abmc_enable(RDT_RESOURCE_L3);
>> +    else
>> +        ret = -EINVAL;
>> +
>> +    mutex_unlock(&rdtgroup_mutex);
>> +    cpus_read_unlock();
>> +
>> +    return ret ?: nbytes;
>> +}
>> +
>>   #ifdef CONFIG_PROC_CPU_RESCTRL
>>     /*
>> @@ -2087,9 +2124,10 @@ static struct rftype res_common_files[] = {
>>       },
>>       {
>>           .name        = "mbm_assign",
>> -        .mode        = 0444,
>> +        .mode        = 0644,
>>           .kf_ops        = &rdtgroup_kf_single_ops,
>>           .seq_show    = rdtgroup_mbm_assign_show,
>> +        .write        = rdtgroup_mbm_assign_write,
>>       },
>>       {
>>           .name        = "cpus",
> 
> 
> Reinette
> 

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature
  2024-06-19 15:37     ` Moger, Babu
@ 2024-06-20 22:02       ` Reinette Chatre
  2024-06-21 15:44         ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-20 22:02 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/19/24 8:37 AM, Moger, Babu wrote:
> On 6/13/24 19:59, Reinette Chatre wrote:
>> On 5/24/24 5:23 AM, Babu Moger wrote:

>>> +     * Reset the internal counters so that it is not considered as
>>> +     * an overflow in next update.
>>
>> For the first time the term "internal counters" is introduced. What does it
>> mean?
> 
> How about "mbm counters"?

I think it refers to the state resctrl maintains to be able to catch overflows,
so how about just "architectural state" with matching change of something like:

	Reset the architectural state so that reading of hardware counter
	is not considered as an overflow in next update.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode
  2024-06-19 16:25     ` Moger, Babu
@ 2024-06-20 22:05       ` Reinette Chatre
  2024-06-21 15:47         ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-20 22:05 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/19/24 9:25 AM, Moger, Babu wrote:
> On 6/13/24 20:40, Reinette Chatre wrote:
>> On 5/24/24 5:23 AM, Babu Moger wrote:

>>> --- a/Documentation/arch/x86/resctrl.rst
>>> +++ b/Documentation/arch/x86/resctrl.rst
>>> @@ -261,6 +261,16 @@ with the following files:
>>>        Available when ABMC feature is supported. The number of ABMC counters
>>>        available for configuration.
>>>    +"mbm_assign":
>>
>> This name is not ideal but I am having trouble finding a better one ... I
>> have
>> seen you use "monitor mode" a couple of times (even in shortlog), so maybe
>> that
>> could be the start of a more generic name? "mbm_mode"?
> 
> mbm_mode sounds good. Like this.
> 
> $cat /sys/fs/resctrl/info/L3_MON/mbm_mode
> [abmc]
> legacy
> 
> Keeping just "legacy" vs mbm_legacy.
> 

Looks good to me. This sounds generic enough to build on.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap
  2024-06-19 17:03     ` Moger, Babu
@ 2024-06-20 22:20       ` Reinette Chatre
  2024-06-21 16:01         ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-20 22:20 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/19/24 10:03 AM, Moger, Babu wrote:
> On 6/13/24 20:42, Reinette Chatre wrote:
>> On 5/24/24 5:23 AM, Babu Moger wrote:

...

>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 3071bbb7a15e..400ae405e10e 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -186,6 +186,23 @@ bool closid_allocated(unsigned int closid)
>>>        return !test_bit(closid, &closid_free_map);
>>>    }
>>>    +/*
>>> + * ABMC Counter bitmap and length for tracking available counters.
>>> + * ABMC feature provides set of hardware counters for enabling events.
>>> + * Each event takes one hardware counter. Kernel needs to keep track
>>> + * of number of available counters.
>>> + */
>>> +static unsigned long num_cntrs_free_map;
>>
>> Why does variable have "num" in its name? That seems strange. How
>> about just "mon_cntrs_free_map
> 
> It came from patch 4/19.
> 
> struct resctrl_mon {
>          int                     num_rmid;
> +       int                     num_cntrs;
>          struct list_head        evt_list;
>   };
> 
> num_cntrs_free_map is a bitmap representing num_cntrs. Kept the matching
> name for better understanding. Renaming it will loose that connection.

I disagree. The "num" in "num_cntrs" indicates that this variable stores
the _number_ of a particular entity. In this case "cntrs" or ... counters.
This is just like how resctrl uses "num_closid" to indicate how many closid
are available and then have a separate "closid_free_map" to actually track
now closids are used ... it is _not_ "num_closid_free_map". Similarly,
"num_rmid" indicates how many RMID are available and the "rmid_free_lru"
tracks how RMID are used ... it is _not_ "num_rmid_free_lru".

> If I rename then I will have to rename both.

No, you do not.

> 
> How about mbm_cntrs and mbm_cntrs_free_map?

"mbm_cntrs" does sound good. It is more specific than "cntrs". I would suggest that
use "num_mbm_cntrs" to match with "num_rmid" and "num_closid" and then
you can introduce "mbm_cntrs_free_map".

>>> +static u32 num_cntrs_free_map_len;
>>
>> Same comment about "num" ... also, any special reason why u32 is needed?
> 
> Only reason is, it is supposed to be unsigned.  I can change it "unsigned
> int".
> 
>>
>>> +
>>> +static void num_cntrs_init(void)
>>
>> mon_cntrs_init() ?
> 
> mbm_cntrs_init?

Sounds good.

> 
>>
>>> +{
>>> +    struct rdt_resource *r =
>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>> +
>>> +    bitmap_fill(&num_cntrs_free_map, r->mon.num_cntrs);
>>> +    num_cntrs_free_map_len = r->mon.num_cntrs;
>>> +}
>>> +
>>>    /**
>>>     * rdtgroup_mode_by_closid - Return mode of resource group with closid
>>>     * @closid: closid if the resource group
>>> @@ -2459,6 +2476,12 @@ static int resctrl_abmc_set_all(enum
>>> resctrl_res_level l, bool enable)
>>
>> resctrl_abmc_set_all() was initially created as a complement of
>> resctrl_abmc_set_one() ... but with more initialization added to
>> resctrl_abmc_set_all() this relationship becomes vague.
> 
> Yes. Understood. Let me know if want me to change anything here.

How about renaming resctrl_abmc_set_all() to _resctrl_abmc_enable()?

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2024-06-18 21:02   ` Moger, Babu
@ 2024-06-20 22:49     ` Reinette Chatre
  2024-06-21 16:41       ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-20 22:49 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/18/24 2:02 PM, Moger, Babu wrote:
> On 6/13/24 19:54, Reinette Chatre wrote:
>> On 5/24/24 5:23 AM, Babu Moger wrote:
>>
>>>     when reading, then the read will come back as Unavailable.
>>
>> Should this not rather be "Unassigned"? According to the docs the counters
>> will return "Unavailable" right after reconfigure so it seems that there
>> are scenarios where an "assigned" counter returns "Unavailable". It seems
>> more
>> useful to return "Unassigned" that will have a new specific meaning that
>> overloading existing "Unavailable" that has original meaning of "try
>> again" ....
>> but in this case trying again will be futile.
> 
> Hardware returns "Unavailable" in both the cases. So, thought of reporting the same without any interpretation.
> 

I do not see these as the same. When a counter is assigned and its
read returns "Unavailable" then the user reasonably expects that
retry will work.
When a counter is not assigned then no amount of retries will result
in data. How is user space expected to distinguish between the two
scenarios that return the same error with such significantly different
meaning? rdtgroup_mondata_shows() can just peek into the monitor
group state and see if a counter is assigned and immediately
return "Unassigned" if no counter is assigned, no? I see no need
for it to IPI another CPU and try to read a counter that it already knows
will be futile. This seems unnecessary and the generic "Unavailable" is
not helpful to user space.

Reinette


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount
  2024-06-20 14:46     ` Moger, Babu
@ 2024-06-20 22:49       ` Reinette Chatre
  2024-06-21 16:29         ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-20 22:49 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/20/24 7:46 AM, Moger, Babu wrote:
> On 6/13/24 20:50, Reinette Chatre wrote:
>> On 5/24/24 5:23 AM, Babu Moger wrote:


>>> +        if (resctrl_assign_events(&rdtgroup_default) < 0)
>>> +            rdt_last_cmd_puts("Monitor assignment failed\n");
>>
>> How will user know that there is a warning? This does not return an error
>> so nothing will prompt user to check the status file. Perhaps add a comment
>> in the docs to help users at least know that this exists. Another helpful
> 
> Sure. Will add text about this in the documentation.
> 
>> thing will be to have the counter return "Unassigned" if it has not been
>> assigned
>> since users familiar with old interface may see "Unavailable" without
>> knowing that user action is required to get a working event.
> 
> Not so sure about interpreting hardware reported error.
> 

I only see "Unavailable" as a hardware error and that is already
supported. "Unassigned" is not a hardware error but instead indicates
to user space that no hardware counter is backing the queried event.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter
  2024-06-19 22:38     ` Moger, Babu
@ 2024-06-20 22:50       ` Reinette Chatre
  2024-06-21 16:07         ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-20 22:50 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/19/24 3:38 PM, Moger, Babu wrote:
> On 6/13/2024 8:48 PM, Reinette Chatre wrote:
>> On 5/24/24 5:23 AM, Babu Moger wrote:

>>> +int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
>>> +            u32 ctr_id, u32 closid, bool enable)
>>> +{
>>> +    struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
>>> +    union l3_qos_abmc_cfg abmc_cfg = { 0 };
>>> +    struct arch_mbm_state *arch_mbm;
>>> +
>>> +    abmc_cfg.split.cfg_en = 1;
>>> +    abmc_cfg.split.ctr_en = enable ? 1 : 0;
>>> +    abmc_cfg.split.ctr_id = ctr_id;
>>> +    abmc_cfg.split.bw_src = rmid;
>>> +
>>> +    /*
>>> +     * Read the event configuration from the domain and pass it as
>>> +     * bw_type.
>>> +     */
>>> +    if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
>>> +        abmc_cfg.split.bw_type = hw_dom->mbm_total_cfg;
>>> +        arch_mbm = &hw_dom->arch_mbm_total[rmid];
>>> +    } else {
>>> +        abmc_cfg.split.bw_type = hw_dom->mbm_local_cfg;
>>> +        arch_mbm = &hw_dom->arch_mbm_local[rmid];
>>> +    }
>>> +
>>> +    smp_call_function_any(&d->cpu_mask, rdtgroup_abmc_cfg, &abmc_cfg, 1);
>>> +
>>> +    /* Reset the internal counters */
>>
>> "internal counters"? This needs a definition ... but since this is not
>> a new data structure the comment can be more specific about what is done
>> and why.
> 
> How about?
> /* Reset internal mbm event counters. */

I think it is potentially confusing to introduce a new concept of
"internal counters". This is just the architectural state helping to
detect overflows. This commit has no mention of this but I assume that
the re-configuration of the counter will reset it, which explains
why it is required to reset the state maintained by software.
A similar comment to what can be found in mbm_config_write_domain()
would be useful. This does not have to be so elaborate, maybe just something
like (please feel free to improve):

	Configuring a hardware counter resets it. Reset architectural
	state to match.

Reinette

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy
  2024-05-24 12:23 ` [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy Babu Moger
  2024-06-14  1:51   ` Reinette Chatre
@ 2024-06-21 14:43   ` Markus Elfring
  1 sibling, 0 replies; 66+ messages in thread
From: Markus Elfring @ 2024-06-21 14:43 UTC (permalink / raw)
  To: Babu Moger, x86, Borislav Petkov, Dave Hansen, Fenghua Yu,
	Ingo Molnar, Jonathan Corbet, Reinette Chatre, Thomas Gleixner
  Cc: LKML, linux-doc, Breno Leitao, Daniel Sneddon, H. Peter Anvin,
	Ilpo Järvinen, James Morse, Jim Mattson, Jithu Joseph,
	Josh Poimboeuf, Julia Lawall, Kai Huang, Kan Liang, Kim Phillips,
	Lukas Bulwahn, Maciej Wieczor-Retman, Paolo Bonzini,
	Paul E. McKenney, Peter Newman, Peter Zijlstra, Randy Dunlap,
	Rick Edgecombe, Sandipan Das, Sean Christopherson,
	Stephane Eranian, Tejun Heo, Yan-Jie Wang

…
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +static ssize_t rdtgroup_mbm_assign_write(struct kernfs_open_file *of,
> +					 char *buf, size_t nbytes,
> +					 loff_t off)
> +{
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	rdt_last_cmd_clear();
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +
> +	return ret ?: nbytes;
> +}
…

Would you become interested to apply statements like the following?

* guard(cpus_read_lock)();
  https://elixir.bootlin.com/linux/v6.10-rc4/source/include/linux/cleanup.h#L133

* guard(mutex)(&rdtgroup_mutex);
  https://elixir.bootlin.com/linux/v6.10-rc4/source/include/linux/mutex.h#L196

Regards,
Markus

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature
  2024-06-20 22:02       ` Reinette Chatre
@ 2024-06-21 15:44         ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-21 15:44 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/20/2024 5:02 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/19/24 8:37 AM, Moger, Babu wrote:
>> On 6/13/24 19:59, Reinette Chatre wrote:
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
> 
>>>> +     * Reset the internal counters so that it is not considered as
>>>> +     * an overflow in next update.
>>>
>>> For the first time the term "internal counters" is introduced. What 
>>> does it
>>> mean?
>>
>> How about "mbm counters"?
> 
> I think it refers to the state resctrl maintains to be able to catch 
> overflows,
> so how about just "architectural state" with matching change of 
> something like:
> 
>      Reset the architectural state so that reading of hardware counter
>      is not considered as an overflow in next update.

Sure. Looks good. thanks

-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode
  2024-06-20 22:05       ` Reinette Chatre
@ 2024-06-21 15:47         ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-21 15:47 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse



On 6/20/2024 5:05 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/19/24 9:25 AM, Moger, Babu wrote:
>> On 6/13/24 20:40, Reinette Chatre wrote:
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
> 
>>>> --- a/Documentation/arch/x86/resctrl.rst
>>>> +++ b/Documentation/arch/x86/resctrl.rst
>>>> @@ -261,6 +261,16 @@ with the following files:
>>>>        Available when ABMC feature is supported. The number of ABMC 
>>>> counters
>>>>        available for configuration.
>>>>    +"mbm_assign":
>>>
>>> This name is not ideal but I am having trouble finding a better one 
>>> ... I
>>> have
>>> seen you use "monitor mode" a couple of times (even in shortlog), so 
>>> maybe
>>> that
>>> could be the start of a more generic name? "mbm_mode"?
>>
>> mbm_mode sounds good. Like this.
>>
>> $cat /sys/fs/resctrl/info/L3_MON/mbm_mode
>> [abmc]
>> legacy
>>
>> Keeping just "legacy" vs mbm_legacy.
>>
> 
> Looks good to me. This sounds generic enough to build on.

Thanks
-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap
  2024-06-20 22:20       ` Reinette Chatre
@ 2024-06-21 16:01         ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-21 16:01 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/20/2024 5:20 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/19/24 10:03 AM, Moger, Babu wrote:
>> On 6/13/24 20:42, Reinette Chatre wrote:
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
> 
> ...
> 
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index 3071bbb7a15e..400ae405e10e 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -186,6 +186,23 @@ bool closid_allocated(unsigned int closid)
>>>>        return !test_bit(closid, &closid_free_map);
>>>>    }
>>>>    +/*
>>>> + * ABMC Counter bitmap and length for tracking available counters.
>>>> + * ABMC feature provides set of hardware counters for enabling events.
>>>> + * Each event takes one hardware counter. Kernel needs to keep track
>>>> + * of number of available counters.
>>>> + */
>>>> +static unsigned long num_cntrs_free_map;
>>>
>>> Why does variable have "num" in its name? That seems strange. How
>>> about just "mon_cntrs_free_map
>>
>> It came from patch 4/19.
>>
>> struct resctrl_mon {
>>          int                     num_rmid;
>> +       int                     num_cntrs;
>>          struct list_head        evt_list;
>>   };
>>
>> num_cntrs_free_map is a bitmap representing num_cntrs. Kept the matching
>> name for better understanding. Renaming it will loose that connection.
> 
> I disagree. The "num" in "num_cntrs" indicates that this variable stores
> the _number_ of a particular entity. In this case "cntrs" or ... counters.
> This is just like how resctrl uses "num_closid" to indicate how many closid
> are available and then have a separate "closid_free_map" to actually track
> now closids are used ... it is _not_ "num_closid_free_map". Similarly,
> "num_rmid" indicates how many RMID are available and the "rmid_free_lru"
> tracks how RMID are used ... it is _not_ "num_rmid_free_lru".
> 
>> If I rename then I will have to rename both.
> 
> No, you do not.
> 
>>
>> How about mbm_cntrs and mbm_cntrs_free_map?
> 
> "mbm_cntrs" does sound good. It is more specific than "cntrs". I would 
> suggest that
> use "num_mbm_cntrs" to match with "num_rmid" and "num_closid" and then
> you can introduce "mbm_cntrs_free_map".

Sure.

> 
>>>> +static u32 num_cntrs_free_map_len;
>>>
>>> Same comment about "num" ... also, any special reason why u32 is needed?
>>
>> Only reason is, it is supposed to be unsigned.  I can change it "unsigned
>> int".
>>
>>>
>>>> +
>>>> +static void num_cntrs_init(void)
>>>
>>> mon_cntrs_init() ?
>>
>> mbm_cntrs_init?
> 
> Sounds good.
ok

> 
>>
>>>
>>>> +{
>>>> +    struct rdt_resource *r =
>>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>>> +
>>>> +    bitmap_fill(&num_cntrs_free_map, r->mon.num_cntrs);
>>>> +    num_cntrs_free_map_len = r->mon.num_cntrs;
>>>> +}
>>>> +
>>>>    /**
>>>>     * rdtgroup_mode_by_closid - Return mode of resource group with 
>>>> closid
>>>>     * @closid: closid if the resource group
>>>> @@ -2459,6 +2476,12 @@ static int resctrl_abmc_set_all(enum
>>>> resctrl_res_level l, bool enable)
>>>
>>> resctrl_abmc_set_all() was initially created as a complement of
>>> resctrl_abmc_set_one() ... but with more initialization added to
>>> resctrl_abmc_set_all() this relationship becomes vague.
>>
>> Yes. Understood. Let me know if want me to change anything here.
> 
> How about renaming resctrl_abmc_set_all() to _resctrl_abmc_enable()?

Sure.
-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter
  2024-06-20 22:50       ` Reinette Chatre
@ 2024-06-21 16:07         ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-21 16:07 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/20/2024 5:50 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/19/24 3:38 PM, Moger, Babu wrote:
>> On 6/13/2024 8:48 PM, Reinette Chatre wrote:
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
> 
>>>> +int resctrl_arch_assign(struct rdt_domain *d, u32 evtid, u32 rmid,
>>>> +            u32 ctr_id, u32 closid, bool enable)
>>>> +{
>>>> +    struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
>>>> +    union l3_qos_abmc_cfg abmc_cfg = { 0 };
>>>> +    struct arch_mbm_state *arch_mbm;
>>>> +
>>>> +    abmc_cfg.split.cfg_en = 1;
>>>> +    abmc_cfg.split.ctr_en = enable ? 1 : 0;
>>>> +    abmc_cfg.split.ctr_id = ctr_id;
>>>> +    abmc_cfg.split.bw_src = rmid;
>>>> +
>>>> +    /*
>>>> +     * Read the event configuration from the domain and pass it as
>>>> +     * bw_type.
>>>> +     */
>>>> +    if (evtid == QOS_L3_MBM_TOTAL_EVENT_ID) {
>>>> +        abmc_cfg.split.bw_type = hw_dom->mbm_total_cfg;
>>>> +        arch_mbm = &hw_dom->arch_mbm_total[rmid];
>>>> +    } else {
>>>> +        abmc_cfg.split.bw_type = hw_dom->mbm_local_cfg;
>>>> +        arch_mbm = &hw_dom->arch_mbm_local[rmid];
>>>> +    }
>>>> +
>>>> +    smp_call_function_any(&d->cpu_mask, rdtgroup_abmc_cfg, 
>>>> &abmc_cfg, 1);
>>>> +
>>>> +    /* Reset the internal counters */
>>>
>>> "internal counters"? This needs a definition ... but since this is not
>>> a new data structure the comment can be more specific about what is done
>>> and why.
>>
>> How about?
>> /* Reset internal mbm event counters. */
> 
> I think it is potentially confusing to introduce a new concept of
> "internal counters". This is just the architectural state helping to
> detect overflows. This commit has no mention of this but I assume that
> the re-configuration of the counter will reset it, which explains
> why it is required to reset the state maintained by software.
> A similar comment to what can be found in mbm_config_write_domain()
> would be useful. This does not have to be so elaborate, maybe just 
> something
> like (please feel free to improve):
> 
>      Configuring a hardware counter resets it. Reset architectural
>      state to match.
> 

Sure. Thanks
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount
  2024-06-20 22:49       ` Reinette Chatre
@ 2024-06-21 16:29         ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-21 16:29 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/20/2024 5:49 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/20/24 7:46 AM, Moger, Babu wrote:
>> On 6/13/24 20:50, Reinette Chatre wrote:
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
> 
> 
>>>> +        if (resctrl_assign_events(&rdtgroup_default) < 0)
>>>> +            rdt_last_cmd_puts("Monitor assignment failed\n");
>>>
>>> How will user know that there is a warning? This does not return an 
>>> error
>>> so nothing will prompt user to check the status file. Perhaps add a 
>>> comment
>>> in the docs to help users at least know that this exists. Another 
>>> helpful
>>
>> Sure. Will add text about this in the documentation.
>>
>>> thing will be to have the counter return "Unassigned" if it has not been
>>> assigned
>>> since users familiar with old interface may see "Unavailable" without
>>> knowing that user action is required to get a working event.
>>
>> Not so sure about interpreting hardware reported error.
>>
> 
> I only see "Unavailable" as a hardware error and that is already
> supported. "Unassigned" is not a hardware error but instead indicates
> to user space that no hardware counter is backing the queried event.
> 

That is fine. Will change to report "Unassinged" when it queried when 
event is not assigned. thanks
-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
  2024-06-20 22:49     ` Reinette Chatre
@ 2024-06-21 16:41       ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-21 16:41 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/20/2024 5:49 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/18/24 2:02 PM, Moger, Babu wrote:
>> On 6/13/24 19:54, Reinette Chatre wrote:
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
>>>
>>>>     when reading, then the read will come back as Unavailable.
>>>
>>> Should this not rather be "Unassigned"? According to the docs the 
>>> counters
>>> will return "Unavailable" right after reconfigure so it seems that there
>>> are scenarios where an "assigned" counter returns "Unavailable". It 
>>> seems
>>> more
>>> useful to return "Unassigned" that will have a new specific meaning that
>>> overloading existing "Unavailable" that has original meaning of "try
>>> again" ....
>>> but in this case trying again will be futile.
>>
>> Hardware returns "Unavailable" in both the cases. So, thought of 
>> reporting the same without any interpretation.
>>
> 
> I do not see these as the same. When a counter is assigned and its
> read returns "Unavailable" then the user reasonably expects that
> retry will work.
> When a counter is not assigned then no amount of retries will result
> in data. How is user space expected to distinguish between the two
> scenarios that return the same error with such significantly different
> meaning? rdtgroup_mondata_shows() can just peek into the monitor
> group state and see if a counter is assigned and immediately
> return "Unassigned" if no counter is assigned, no? I see no need
> for it to IPI another CPU and try to read a counter that it already knows
> will be futile. This seems unnecessary and the generic "Unavailable" is
> not helpful to user space.
> 

Sure. Will change it to report "Unassigned" in these cases.
Thanks
-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  2024-06-14  1:43   ` Reinette Chatre
  2024-06-19 18:46     ` Moger, Babu
@ 2024-06-27 18:51     ` Moger, Babu
  2024-06-27 20:56       ` Reinette Chatre
  1 sibling, 1 reply; 66+ messages in thread
From: Moger, Babu @ 2024-06-27 18:51 UTC (permalink / raw)
  To: Reinette Chatre, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/13/24 20:43, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/24/24 5:23 AM, Babu Moger wrote:
>> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
>> supported, the bandwidth events can be configured to track specific
>> events. The event configuration is domain specific. ABMC (Assignable
>> Bandwidth Monitoring Counters) feature needs event configuration
>> information to assign hardware counter to an RMID. Event configurations
>> are not stored in resctrl but instead always read from or written to
>> hardware directly when prompted by user space.
>>
>> Read the event configuration from the hardware during the domain
>> initialization. Save the configuration information in the rdt_hw_domain,
>> so it can be used for counter assignment.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v4: Read the configuration information from the hardware to initialize.
>>      Added few commit messages.
>>      Fixed the tab spaces.
>>
>> v3: Minor changes related to rebase in mbm_config_write_domain.
>>
>> v2: No changes.
>> ---
>>   arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>>   arch/x86/kernel/cpu/resctrl/internal.h |  5 +++++
>>   arch/x86/kernel/cpu/resctrl/monitor.c  | 21 +++++++++++++++++++++
>>   3 files changed, 28 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
>> b/arch/x86/kernel/cpu/resctrl/core.c
>> index ec93f6a50308..856c46d12177 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -542,6 +542,8 @@ static void domain_add_cpu(int cpu, struct
>> rdt_resource *r)
>>           return;
>>       }
>>   +    arch_domain_mbm_evt_config(hw_dom);
>> +
>>       list_add_tail_rcu(&d->list, add_pos);
>>         err = resctrl_online_domain(r, d);
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>> b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 5e7e76cd512f..60a1ca0a11a7 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -373,6 +373,8 @@ struct arch_mbm_state {
>>    * @ctrl_val:    array of cache or mem ctrl values (indexed by CLOSID)
>>    * @arch_mbm_total:    arch private state for MBM total bandwidth
>>    * @arch_mbm_local:    arch private state for MBM local bandwidth
>> + * @mbm_total_cfg:    MBM total bandwidth configuration
>> + * @mbm_local_cfg:    MBM local bandwidth configuration
>>    *
>>    * Members of this structure are accessed via helpers that provide
>> abstraction.
>>    */
>> @@ -381,6 +383,8 @@ struct rdt_hw_domain {
>>       u32                *ctrl_val;
>>       struct arch_mbm_state        *arch_mbm_total;
>>       struct arch_mbm_state        *arch_mbm_local;
>> +    u32                mbm_total_cfg;
>> +    u32                mbm_local_cfg;
>>   };
> 
> Similar to the abmc_enabled member of rdt_hw_resource, these new
> members of rdt_hw_domain are architecture specific and should never be
> touched directly by resctrl fs code, for example, from mbm_config_show().

Need some clarification here.

I am thinking you want to introduce architecture specific routines to get
and set mbm_total_config/mbm_local_config for the domain.
Something like this.

+int arch_get_mbm_evt_cfg(struct rdt_domain *d, enum resctrl_event_id eventid)
+{
+       struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+
+       switch (eventid) {
+       case QOS_L3_OCCUP_EVENT_ID:
+               break;
+       case QOS_L3_MBM_TOTAL_EVENT_ID:
+               return hw_dom->mbm_total_cfg;
+       case QOS_L3_MBM_LOCAL_EVENT_ID:
+               return hw_dom->mbm_local_cfg;
+       }
+
+       /* Never expect to get here */
+       WARN_ON_ONCE(1);
+
+       return -1;
+}
+
+void arch_set_mbm_evt_cfg(struct rdt_domain *d,
+                         enum resctrl_event_id eventid, u32 val)
+{
+       struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+
+       switch (eventid) {
+       case QOS_L3_OCCUP_EVENT_ID:
+               break;
+       case QOS_L3_MBM_TOTAL_EVENT_ID:
+               hw_dom->mbm_total_cfg = val;
+               break;
+       case QOS_L3_MBM_LOCAL_EVENT_ID:
+               hw_dom->mbm_local_cfg = val;
+       }
+
+       return;
+}
+

I have added the functions in rdtgroup.c and prototypes in internal.h.
This will be called in mbm)config_show() and mbm_config_write_domain()

Did I understand this correctly?

Thanks
Babu

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  2024-06-27 18:51     ` Moger, Babu
@ 2024-06-27 20:56       ` Reinette Chatre
  2024-06-27 21:26         ` Moger, Babu
  0 siblings, 1 reply; 66+ messages in thread
From: Reinette Chatre @ 2024-06-27 20:56 UTC (permalink / raw)
  To: babu.moger, corbet, fenghua.yu, tglx, mingo, bp, dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Babu,

On 6/27/24 11:51 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 6/13/24 20:43, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 5/24/24 5:23 AM, Babu Moger wrote:
>>> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
>>> supported, the bandwidth events can be configured to track specific
>>> events. The event configuration is domain specific. ABMC (Assignable
>>> Bandwidth Monitoring Counters) feature needs event configuration
>>> information to assign hardware counter to an RMID. Event configurations
>>> are not stored in resctrl but instead always read from or written to
>>> hardware directly when prompted by user space.
>>>
>>> Read the event configuration from the hardware during the domain
>>> initialization. Save the configuration information in the rdt_hw_domain,
>>> so it can be used for counter assignment.
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>> v4: Read the configuration information from the hardware to initialize.
>>>       Added few commit messages.
>>>       Fixed the tab spaces.
>>>
>>> v3: Minor changes related to rebase in mbm_config_write_domain.
>>>
>>> v2: No changes.
>>> ---
>>>    arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>>>    arch/x86/kernel/cpu/resctrl/internal.h |  5 +++++
>>>    arch/x86/kernel/cpu/resctrl/monitor.c  | 21 +++++++++++++++++++++
>>>    3 files changed, 28 insertions(+)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
>>> b/arch/x86/kernel/cpu/resctrl/core.c
>>> index ec93f6a50308..856c46d12177 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>> @@ -542,6 +542,8 @@ static void domain_add_cpu(int cpu, struct
>>> rdt_resource *r)
>>>            return;
>>>        }
>>>    +    arch_domain_mbm_evt_config(hw_dom);
>>> +
>>>        list_add_tail_rcu(&d->list, add_pos);
>>>          err = resctrl_online_domain(r, d);
>>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>>> b/arch/x86/kernel/cpu/resctrl/internal.h
>>> index 5e7e76cd512f..60a1ca0a11a7 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>>> @@ -373,6 +373,8 @@ struct arch_mbm_state {
>>>     * @ctrl_val:    array of cache or mem ctrl values (indexed by CLOSID)
>>>     * @arch_mbm_total:    arch private state for MBM total bandwidth
>>>     * @arch_mbm_local:    arch private state for MBM local bandwidth
>>> + * @mbm_total_cfg:    MBM total bandwidth configuration
>>> + * @mbm_local_cfg:    MBM local bandwidth configuration
>>>     *
>>>     * Members of this structure are accessed via helpers that provide
>>> abstraction.
>>>     */
>>> @@ -381,6 +383,8 @@ struct rdt_hw_domain {
>>>        u32                *ctrl_val;
>>>        struct arch_mbm_state        *arch_mbm_total;
>>>        struct arch_mbm_state        *arch_mbm_local;
>>> +    u32                mbm_total_cfg;
>>> +    u32                mbm_local_cfg;
>>>    };
>>
>> Similar to the abmc_enabled member of rdt_hw_resource, these new
>> members of rdt_hw_domain are architecture specific and should never be
>> touched directly by resctrl fs code, for example, from mbm_config_show().
> 
> Need some clarification here.
> 
> I am thinking you want to introduce architecture specific routines to get
> and set mbm_total_config/mbm_local_config for the domain.
> Something like this.
> 
> +int arch_get_mbm_evt_cfg(struct rdt_domain *d, enum resctrl_event_id eventid)

The prefix for arch specific calls converged to "resctrl_arch_".

> +{
> +       struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> +
> +       switch (eventid) {
> +       case QOS_L3_OCCUP_EVENT_ID:
> +               break;
> +       case QOS_L3_MBM_TOTAL_EVENT_ID:
> +               return hw_dom->mbm_total_cfg;
> +       case QOS_L3_MBM_LOCAL_EVENT_ID:
> +               return hw_dom->mbm_local_cfg;
> +       }
> +
> +       /* Never expect to get here */
> +       WARN_ON_ONCE(1);
> +
> +       return -1;
> +}
> +
> +void arch_set_mbm_evt_cfg(struct rdt_domain *d,
> +                         enum resctrl_event_id eventid, u32 val)
> +{
> +       struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> +
> +       switch (eventid) {
> +       case QOS_L3_OCCUP_EVENT_ID:
> +               break;
> +       case QOS_L3_MBM_TOTAL_EVENT_ID:
> +               hw_dom->mbm_total_cfg = val;
> +               break;
> +       case QOS_L3_MBM_LOCAL_EVENT_ID:
> +               hw_dom->mbm_local_cfg = val;
> +       }
> +
> +       return;
> +}
> +

I expected that a call to set the event configuration on an architecture to
also interact with the hardware. Essentially what mon_event_config_write()
does today, but in addition also sets the cached mbm_total_cfg/mbm_local_cfg.

Part of this is done by the new MPAM portion [1]. With that the post-ABMC
implementation of resctrl_arch_mon_event_config_read() may be what you
have as arch_get_mbm_evt_cfg() above and resctrl_arch_mon_event_config_write()
does the same as in [1] but with addition of updating the cached
mbm_local_cfg/mbm_total_cfg within struct rdt_hw_domain. Would this work
for ABMC?
  
Reinette

[1] https://lore.kernel.org/lkml/20240614150033.10454-21-james.morse@arm.com/


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg
  2024-06-27 20:56       ` Reinette Chatre
@ 2024-06-27 21:26         ` Moger, Babu
  0 siblings, 0 replies; 66+ messages in thread
From: Moger, Babu @ 2024-06-27 21:26 UTC (permalink / raw)
  To: Reinette Chatre, babu.moger, corbet, fenghua.yu, tglx, mingo, bp,
	dave.hansen
  Cc: x86, hpa, paulmck, rdunlap, tj, peterz, yanjiewtw, kim.phillips,
	lukas.bulwahn, seanjc, jmattson, leitao, jpoimboe,
	rick.p.edgecombe, kirill.shutemov, jithu.joseph, kai.huang,
	kan.liang, daniel.sneddon, pbonzini, sandipan.das, ilpo.jarvinen,
	peternewman, maciej.wieczor-retman, linux-doc, linux-kernel,
	eranian, james.morse

Hi Reinette,

On 6/27/2024 3:56 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/27/24 11:51 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 6/13/24 20:43, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 5/24/24 5:23 AM, Babu Moger wrote:
>>>> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
>>>> supported, the bandwidth events can be configured to track specific
>>>> events. The event configuration is domain specific. ABMC (Assignable
>>>> Bandwidth Monitoring Counters) feature needs event configuration
>>>> information to assign hardware counter to an RMID. Event configurations
>>>> are not stored in resctrl but instead always read from or written to
>>>> hardware directly when prompted by user space.
>>>>
>>>> Read the event configuration from the hardware during the domain
>>>> initialization. Save the configuration information in the 
>>>> rdt_hw_domain,
>>>> so it can be used for counter assignment.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>> v4: Read the configuration information from the hardware to initialize.
>>>>       Added few commit messages.
>>>>       Fixed the tab spaces.
>>>>
>>>> v3: Minor changes related to rebase in mbm_config_write_domain.
>>>>
>>>> v2: No changes.
>>>> ---
>>>>    arch/x86/kernel/cpu/resctrl/core.c     |  2 ++
>>>>    arch/x86/kernel/cpu/resctrl/internal.h |  5 +++++
>>>>    arch/x86/kernel/cpu/resctrl/monitor.c  | 21 +++++++++++++++++++++
>>>>    3 files changed, 28 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
>>>> b/arch/x86/kernel/cpu/resctrl/core.c
>>>> index ec93f6a50308..856c46d12177 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>>> @@ -542,6 +542,8 @@ static void domain_add_cpu(int cpu, struct
>>>> rdt_resource *r)
>>>>            return;
>>>>        }
>>>>    +    arch_domain_mbm_evt_config(hw_dom);
>>>> +
>>>>        list_add_tail_rcu(&d->list, add_pos);
>>>>          err = resctrl_online_domain(r, d);
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h
>>>> b/arch/x86/kernel/cpu/resctrl/internal.h
>>>> index 5e7e76cd512f..60a1ca0a11a7 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>>>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>>>> @@ -373,6 +373,8 @@ struct arch_mbm_state {
>>>>     * @ctrl_val:    array of cache or mem ctrl values (indexed by 
>>>> CLOSID)
>>>>     * @arch_mbm_total:    arch private state for MBM total bandwidth
>>>>     * @arch_mbm_local:    arch private state for MBM local bandwidth
>>>> + * @mbm_total_cfg:    MBM total bandwidth configuration
>>>> + * @mbm_local_cfg:    MBM local bandwidth configuration
>>>>     *
>>>>     * Members of this structure are accessed via helpers that provide
>>>> abstraction.
>>>>     */
>>>> @@ -381,6 +383,8 @@ struct rdt_hw_domain {
>>>>        u32                *ctrl_val;
>>>>        struct arch_mbm_state        *arch_mbm_total;
>>>>        struct arch_mbm_state        *arch_mbm_local;
>>>> +    u32                mbm_total_cfg;
>>>> +    u32                mbm_local_cfg;
>>>>    };
>>>
>>> Similar to the abmc_enabled member of rdt_hw_resource, these new
>>> members of rdt_hw_domain are architecture specific and should never be
>>> touched directly by resctrl fs code, for example, from 
>>> mbm_config_show().
>>
>> Need some clarification here.
>>
>> I am thinking you want to introduce architecture specific routines to get
>> and set mbm_total_config/mbm_local_config for the domain.
>> Something like this.
>>
>> +int arch_get_mbm_evt_cfg(struct rdt_domain *d, enum resctrl_event_id 
>> eventid)
> 
> The prefix for arch specific calls converged to "resctrl_arch_".
> 
>> +{
>> +       struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
>> +
>> +       switch (eventid) {
>> +       case QOS_L3_OCCUP_EVENT_ID:
>> +               break;
>> +       case QOS_L3_MBM_TOTAL_EVENT_ID:
>> +               return hw_dom->mbm_total_cfg;
>> +       case QOS_L3_MBM_LOCAL_EVENT_ID:
>> +               return hw_dom->mbm_local_cfg;
>> +       }
>> +
>> +       /* Never expect to get here */
>> +       WARN_ON_ONCE(1);
>> +
>> +       return -1;
>> +}
>> +
>> +void arch_set_mbm_evt_cfg(struct rdt_domain *d,
>> +                         enum resctrl_event_id eventid, u32 val)
>> +{
>> +       struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
>> +
>> +       switch (eventid) {
>> +       case QOS_L3_OCCUP_EVENT_ID:
>> +               break;
>> +       case QOS_L3_MBM_TOTAL_EVENT_ID:
>> +               hw_dom->mbm_total_cfg = val;
>> +               break;
>> +       case QOS_L3_MBM_LOCAL_EVENT_ID:
>> +               hw_dom->mbm_local_cfg = val;
>> +       }
>> +
>> +       return;
>> +}
>> +
> 
> I expected that a call to set the event configuration on an architecture to
> also interact with the hardware. Essentially what mon_event_config_write()
> does today, but in addition also sets the cached 
> mbm_total_cfg/mbm_local_cfg.

Yea. Understood.
> 
> Part of this is done by the new MPAM portion [1]. With that the post-ABMC
> implementation of resctrl_arch_mon_event_config_read() may be what you
> have as arch_get_mbm_evt_cfg() above and 
> resctrl_arch_mon_event_config_write()
> does the same as in [1] but with addition of updating the cached
> mbm_local_cfg/mbm_total_cfg within struct rdt_hw_domain. Would this work
> for ABMC?


Yes. It does work. I can keep the same name as well.
Thanks for the clarifications.

> 
> Reinette
> 
> [1] 
> https://lore.kernel.org/lkml/20240614150033.10454-21-james.morse@arm.com/
> 
> 

-- 
- Babu Moger

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2024-06-27 21:26 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-24 12:23 [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Babu Moger
2024-05-24 12:23 ` [PATCH v4 01/19] x86/resctrl: Add support for " Babu Moger
2024-06-14  0:54   ` Reinette Chatre
2024-06-18 21:02     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 02/19] x86/resctrl: Add ABMC feature in the command line options Babu Moger
2024-05-24 12:23 ` [PATCH v4 03/19] x86/resctrl: Consolidate monitoring related data from rdt_resource Babu Moger
2024-06-14  0:55   ` Reinette Chatre
2024-06-18 21:02     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 04/19] x86/resctrl: Detect Assignable Bandwidth Monitoring feature details Babu Moger
2024-06-14  0:56   ` Reinette Chatre
2024-06-18 21:03     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 05/19] x86/resctrl: Introduce resctrl_file_fflags_init to initialize fflags Babu Moger
2024-06-14  0:57   ` Reinette Chatre
2024-06-18 21:03     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 06/19] x86/resctrl: Introduce interface to display number of ABMC counters Babu Moger
2024-06-14  0:57   ` Reinette Chatre
2024-06-18 21:04     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 07/19] x86/resctrl: Add support to enable/disable ABMC feature Babu Moger
2024-06-14  0:59   ` Reinette Chatre
2024-06-19 15:37     ` Moger, Babu
2024-06-20 22:02       ` Reinette Chatre
2024-06-21 15:44         ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 08/19] x86/resctrl: Introduce the interface to display monitor mode Babu Moger
2024-06-14  1:40   ` Reinette Chatre
2024-06-19 16:25     ` Moger, Babu
2024-06-20 22:05       ` Reinette Chatre
2024-06-21 15:47         ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 09/19] x86/resctrl: Initialize ABMC counters bitmap Babu Moger
2024-06-14  1:42   ` Reinette Chatre
2024-06-19 17:03     ` Moger, Babu
2024-06-20 22:20       ` Reinette Chatre
2024-06-21 16:01         ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 10/19] x86/resctrl: Introduce ABMC state for the monitor group Babu Moger
2024-05-24 12:23 ` [PATCH v4 11/19] x86/resctrl: Introduce mbm_total_cfg and mbm_local_cfg Babu Moger
2024-06-14  1:43   ` Reinette Chatre
2024-06-19 18:46     ` Moger, Babu
2024-06-27 18:51     ` Moger, Babu
2024-06-27 20:56       ` Reinette Chatre
2024-06-27 21:26         ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 12/19] x86/resctrl: Remove MSR reading of event configuration value Babu Moger
2024-05-24 12:23 ` [PATCH v4 13/19] x86/resctrl: Add data structures for ABMC assignment Babu Moger
2024-06-14  1:44   ` Reinette Chatre
2024-06-19 20:10     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 14/19] x86/resctrl: Add the interface to assign ABMC counter Babu Moger
2024-06-14  1:48   ` Reinette Chatre
2024-06-19 22:38     ` Moger, Babu
2024-06-20 22:50       ` Reinette Chatre
2024-06-21 16:07         ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 15/19] x86/resctrl: Add the interface to unassign " Babu Moger
2024-06-14  1:49   ` Reinette Chatre
2024-06-20 13:48     ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 16/19] x86/resctrl: Enable ABMC by default on resctrl mount Babu Moger
2024-06-14  1:50   ` Reinette Chatre
2024-06-20 14:46     ` Moger, Babu
2024-06-20 22:49       ` Reinette Chatre
2024-06-21 16:29         ` Moger, Babu
2024-05-24 12:23 ` [PATCH v4 17/19] x86/resctrl: Introduce the interface switch between ABMC and mbm_legacy Babu Moger
2024-06-14  1:51   ` Reinette Chatre
2024-06-20 14:53     ` Moger, Babu
2024-06-21 14:43   ` Markus Elfring
2024-05-24 12:23 ` [PATCH v4 18/19] x86/resctrl: Introduce interface to list monitor states of all the groups Babu Moger
2024-05-24 12:23 ` [PATCH v4 19/19] x86/resctrl: Introduce interface to modify assignment states of " Babu Moger
2024-06-14  0:54 ` [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC) Reinette Chatre
2024-06-18 21:02   ` Moger, Babu
2024-06-20 22:49     ` Reinette Chatre
2024-06-21 16:41       ` Moger, Babu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).