From: Reinette Chatre <reinette.chatre@intel.com>
To: <babu.moger@amd.com>, James Morse <james.morse@arm.com>,
	<corbet@lwn.net>, <fenghua.yu@intel.com>, <tglx@linutronix.de>,
	<mingo@redhat.com>, <bp@alien8.de>, <dave.hansen@linux.intel.com>
Cc: <x86@kernel.org>, <hpa@zytor.com>, <paulmck@kernel.org>,
	<rdunlap@infradead.org>, <tj@kernel.org>, <peterz@infradead.org>,
	<yanjiewtw@gmail.com>, <kim.phillips@amd.com>,
	<lukas.bulwahn@gmail.com>, <seanjc@google.com>,
	<jmattson@google.com>, <leitao@debian.org>, <jpoimboe@kernel.org>,
	<rick.p.edgecombe@intel.com>, <kirill.shutemov@linux.intel.com>,
	<jithu.joseph@intel.com>, <kai.huang@intel.com>,
	<kan.liang@linux.intel.com>, <daniel.sneddon@linux.intel.com>,
	<pbonzini@redhat.com>, <sandipan.das@amd.com>,
	<ilpo.jarvinen@linux.intel.com>, <peternewman@google.com>,
	<maciej.wieczor-retman@intel.com>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <eranian@google.com>
Subject: Re: [PATCH v2 00/17] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
Date: Mon, 26 Feb 2024 13:20:50 -0800
Message-ID: <32a588e2-7b09-4257-b838-4268583a724d@intel.com>
In-Reply-To: <1ae73c9a-cec4-4496-86c6-3ffcef7940d6@amd.com>

Hi Babu,

On 2/26/2024 9:59 AM, Moger, Babu wrote:
> On 2/23/24 16:21, Reinette Chatre wrote:
>> On 2/23/2024 12:11 PM, Moger, Babu wrote:
>>> On 2/23/24 11:17, Reinette Chatre wrote:
>>>>
>>>>
>>>> On 2/20/2024 12:48 PM, Moger, Babu wrote:
>>>>> On 2/20/24 09:21, James Morse wrote:
>>>>>> On 19/01/2024 18:22, Babu Moger wrote:
>>>>
>>>>>>> e. Enable ABMC mode.
>>>>>>>
>>>>>>> 	#echo 1 > /sys/fs/resctrl/info/L3_MON/mbm_assign_enable
>>>>>>>         #cat /sys/fs/resctrl/info/L3_MON/mbm_assign_enable
>>>>>>>         1
>>>>>>
>>>>>> Why does this mode need enabling? Can't it be enabled automatically on hardware that
>>>>>> supports it, or enabled implicitly when the first assignment attempt arrives?
>>>>>>
>>>>>> I guess this is really needed for a reset - could we implement that instead? This way
>>>>>> there isn't an extra step user-space has to do to make the assignments work.
>>>>>
>>>>> Mostly, new features are added as opt-in. So, I kept it that
>>>>> way. If we enable this feature automatically, then we have to provide an
>>>>> option to disable it.
>>>>>
>>>>
>>>> At the same time it sounds to me like ABMC can improve current users'
>>>> experience without requiring them to do anything. This sounds appealing.
>>>> For example, if I understand correctly, it may be possible to start resctrl
>>>> with ABMC enabled by default and the number of monitoring groups (currently
>>>> exposed to user space via "num_rmids") limited to the number of counters
>>>> supported by ABMC. Existing users would then by default obtain better behavior
>>>> of counters not resetting.
>>>
>>> Yes, I like the idea. But it will break compatibility with the pqos
>>> tool (intel_cmt_cat utility). pqos tool monitoring will not work without
>>> supporting ABMC enablement in the tool. The ABMC feature requires an extra
>>> step to assign the counters for monitoring to work.
>>
>> I am considering two scenarios, the "default behavior" is what a user will
>> experience when booting resctrl on an ABMC system and the "new feature
>> behavior" where a user can take full advantage of all that ABMC (and soft
>> RMID, and MPAM) can offer.
>>
>> So, first, on an ABMC system in the "default behavior" scenario I expect
>> that resctrl can do required ABMC counter configuration automatically at
>> the time a monitor group is created. In this "default behavior" scenario
>> resctrl would expose "num_rmids" to be half of the number of assignable
>> counters. When a user then creates a monitor group two counters will be
>> used and configured to count the local and total bytes respectively. If
>> two counters are not available then ENOSPC is returned, just like when the system
>> is out of closid/rmid. With this "default behavior" user space thus gets
>> improved behavior without making any changes on its part. I do not have
> 
> We can automatically assign the h/w counter when monitor group is created
> until we run out of h/w counters. That is good idea. By default user will
> not notice any difference in ABMC mode.
> 
>> insight into how many counters ABMC could be expected to expose though ...
>> so some users may be surprised at how few monitor groups can be created
>> with new hardware? This may not be an issue since that would accurately
>> reflect how many _reliable_ monitor groups can be created and if user needs
>> more monitor groups then that would be a time to explore the "new feature"
>> that requires changes in how user interacts with resctrl.
> 
> Currently, 32 h/w counters are available to configure. With two counters
> for each group, we can create 16 groups(15 new groups plus the default
> group). That should be fine as pqos tool creates only 16 groups when it is
> started.

User space can never assume that a certain number of groups can
be created.
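
To make the "default behavior" arithmetic concrete, here is a minimal
sketch in Python. It is a toy model, not resctrl code: the class name
CounterPool and its methods are hypothetical, and it only assumes what is
stated above (two counters per monitor group, ENOSPC when the pool is
exhausted).

```python
import errno

COUNTERS_PER_GROUP = 2  # one counter for total bytes, one for local bytes


class CounterPool:
    """Hypothetical model of a fixed pool of assignable hardware counters."""

    def __init__(self, num_counters):
        self.num_counters = num_counters
        self.free = list(range(num_counters))
        self.groups = {}  # group name -> {"total": counter id, "local": counter id}

    @property
    def num_rmids(self):
        # Value the "default behavior" would advertise: half the counters.
        return self.num_counters // COUNTERS_PER_GROUP

    def create_group(self, name):
        # Fail the same way a closid/rmid shortage would: ENOSPC.
        if len(self.free) < COUNTERS_PER_GROUP:
            raise OSError(errno.ENOSPC, "no assignable counters left")
        self.groups[name] = {"total": self.free.pop(0), "local": self.free.pop(0)}
        return self.groups[name]
```

With 32 counters this model advertises 16 groups, and a 17th create_group()
call fails with ENOSPC. The number is a property of the hardware, which is
why user space has to read it rather than assume it.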

>> Apart from the "default behavior" there are two options to consider ...
>> (a) the "original" behavior(? I do not know what to call it) - this would be
>>     where user space wants(?) to have the current non-ABMC behavior on an ABMC
>>     system, where the previous "num_rmids" monitor groups can be created but
>>     the counters are reset unpredictably ... should this still be supported
>>     on ABMC systems though?
> 
> I would say yes. If for some reason (hardware or software issues) the user is not
> able to use ABMC mode, they have an option to go back to legacy mode.

I see. Should this perhaps be protected behind the resctrl "debug" mount option?

>> (b) the "new feature" behavior where user space gets full benefit of ABMC
>>     that allows user space to create any number of monitor groups but then
>>     user space needs to let hardware (via resctrl) know which
>>     events should be counted.
> 
> Is this "new feature" enabled by default when ABMC is available?

Not in this design, no. In these scenarios ABMC will be available and enabled
in both the "default" and "new feature" behavior. The difference is that no user
space changes are needed in the "default" scenario, where resctrl limits the
number of monitor groups so that every monitor group can be backed by hardware
counters.
When "new feature" is enabled on a system where ABMC is available and enabled,
user space is able to create more monitor groups than there are hardware
counters, and a new user interface is required to manage associating counters
with monitor events.

> 
> Or we need to provide an interface to enable this feature?

Yes, an interface will be needed to enable this feature.

> 
> 
>>
>> I expect that only (b) above would require user space change. Considering
>> that per documentation, "num_rmids" means "This is the upper bound for how
>> many "CTRL_MON" + "MON" groups can be created" I expect that "num_rmids"
>> becomes undefined when "new feature" is enabled. When this new feature is enabled
>> then user space is no longer limited by number of RMIDs on how many monitor
> 
> With ABMC, we will have a new field "mbm_assignable_counters". We don't
> have to change the definition of "num_rmids".

The problem here is that "num_rmids" is (as per Documentation/arch/x86/resctrl.rst)
documented to be an upper bound for how many monitor groups can be created.
As I understand, when ABMC is enabled and its full capability exposed to user
space then there is no limit to how many monitor groups can be created, no?

For example, if I understand correctly, theoretically, when ABMC is enabled then
"num_rmids" can be U32_MAX (after a quick look it is not clear to me why r->num_rmid
is not unsigned, tbd if number of directories may also be limited by kernfs).
User space could theoretically create more monitor groups than the number of
rmids that a resource claims to support using current upstream enumeration.
Instead, it is the "mbm_assignable_counters" that is of interest, that is what
user space uses to determine how many of the (potentially very large number of)
monitor groups/monitor events can be counted at any particular time.
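
A rough Python sketch of that distinction, purely as an illustration: group
creation consumes no counter, and only the explicit assignment step is
bounded by the number of assignable counters. The class and method names
are hypothetical and do not correspond to any resctrl interface.

```python
import errno


class AssignableMonitor:
    """Hypothetical model of the "new feature" behavior."""

    def __init__(self, assignable_counters):
        self.assignable_counters = assignable_counters
        self.groups = set()
        self.assigned = {}  # (group, event) -> counter id

    def create_group(self, name):
        # Creating a monitor group does not consume a counter.
        self.groups.add(name)

    def assign(self, group, event):
        if group not in self.groups:
            raise KeyError(group)
        key = (group, event)
        if key in self.assigned:
            return self.assigned[key]
        used = set(self.assigned.values())
        for counter in range(self.assignable_counters):
            if counter not in used:
                self.assigned[key] = counter
                return counter
        # Only assignment is limited by the hardware counters.
        raise OSError(errno.ENOSPC, "all assignable counters are in use")

    def unassign(self, group, event):
        self.assigned.pop((group, event), None)
```

In this model the number of groups is unbounded, while the number of
(group, event) pairs counted at any one time never exceeds
assignable_counters.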

>> groups can be created and this is the point that the user interface that you
>> and Peter have ideas about comes into play. Specifically, user space needing
>> a way to specify:
>> (a) "let me create more monitor groups than the hardware can support"/"let me
>>      control which events/monitor groups are counted"
>>      (like the "mbm_assign" file in your proposal)
>> (b) "here are the events that need to be counted" 
>>      (like the "monitor_state" and "mbm_{local,total}_bytes_assigned" proposals)
> 
> With the global assignment option out of the way for now (it may be introduced later),
> we can provide two interfaces.
> 
> 1. /sys/fs/resctrl/info/L3_MON/mbm_assign
> This will be enabled by default when ABMC is available. Users can disable
> this option to go back to legacy mode.

Potentially (all names are placeholders, and each would only be visible on
systems that actually support that particular mode):
legacy [default] new_feature soft_rmid
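
As an illustration of how user space might consume such a file, here is a
small Python sketch. It assumes the square brackets mark the currently
active mode, in the style some sysfs files use; that reading, the function
name, and the mode names themselves are all placeholders.

```python
def parse_modes(line):
    """Return (all listed modes, active mode or None) from a mode line."""
    modes = []
    active = None
    for token in line.split():
        # Assumption: a bracketed token is the currently selected mode.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        modes.append(token)
    return modes, active
```

For the line above this returns the four mode names with "default" as the
active one.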

> 
> 2. /sys/fs/resctrl/monitor_state.
> This can be used to individually assign or unassign the counters in each group.
> 
> When assigned:
> #cat /sys/fs/resctrl/monitor_state
> 0=total-assign,local-assign;1=total-assign,local-assign
> 
> When unassigned:
> #cat /sys/fs/resctrl/monitor_state
> 0=total-unassign,local-unassign;1=total-unassign,local-unassign
> 
> 
> Thoughts?

How do you expect this interface to be used? I understand the mechanics
of this interface but on a higher level, do you expect user space to
once in a while assign a new counter to a single event or monitor group
(for which a fine grained interface works) or do you expect user space to
shift multiple counters across several monitor events at intervals?
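
For reference, the mechanics of the quoted "monitor_state" format can be
sketched in Python as below. This is only an illustration of the proposed
text format; it assumes that the leading number of each entry is a domain
id, which the example output does not state, and the function name is
hypothetical.

```python
def parse_monitor_state(line):
    """Parse e.g. "0=total-assign,local-unassign;1=..." into a dict.

    Returns {domain id: {event name: True if assigned, else False}}.
    """
    state = {}
    for entry in line.strip().split(";"):
        if not entry:
            continue
        domain, _, events = entry.partition("=")
        per_event = {}
        for item in events.split(","):
            # "total-assign" -> event "total", status "assign"
            event, _, status = item.partition("-")
            per_event[event] = status == "assign"
        state[int(domain)] = per_event
    return state
```

With this format, changing one event in one group is a read, a single
edit, and a write. Shifting counters across many groups means repeating
that once per group.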

Across resctrl's lifetime we have seen examples of user space wanting
to accomplish more with a single resctrl interaction. Examples are the
support you added for moving multiple tasks to a group, and the feature
from Peter for moving a monitor group.

I thus think that it would be valuable to consider more efficient
interfaces from the beginning. I do not think that this type of work
is an optimization to be delayed until an unspecified later time.
Instead, the different ways the interface may be used can be considered
from the start, so that the best-suited interface is created from the
beginning. Specifically, why does resctrl need to be "extended" at a later
time to support a global assignment as proposed by Peter? Why can it not be
done as the original and (ideally) only mechanism?

Reinette
