From: Reinette Chatre <reinette.chatre@intel.com>
To: Babu Moger <babu.moger@amd.com>, <corbet@lwn.net>,
<fenghua.yu@intel.com>, <tglx@linutronix.de>, <mingo@redhat.com>,
<bp@alien8.de>, <dave.hansen@linux.intel.com>
Cc: <x86@kernel.org>, <hpa@zytor.com>, <paulmck@kernel.org>,
<rdunlap@infradead.org>, <tj@kernel.org>, <peterz@infradead.org>,
<yanjiewtw@gmail.com>, <kim.phillips@amd.com>,
<lukas.bulwahn@gmail.com>, <seanjc@google.com>,
<jmattson@google.com>, <leitao@debian.org>, <jpoimboe@kernel.org>,
<rick.p.edgecombe@intel.com>, <kirill.shutemov@linux.intel.com>,
<jithu.joseph@intel.com>, <kai.huang@intel.com>,
<kan.liang@linux.intel.com>, <daniel.sneddon@linux.intel.com>,
<pbonzini@redhat.com>, <sandipan.das@amd.com>,
<ilpo.jarvinen@linux.intel.com>, <peternewman@google.com>,
<maciej.wieczor-retman@intel.com>, <linux-doc@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <eranian@google.com>,
<james.morse@arm.com>
Subject: Re: [PATCH v4 00/19] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
Date: Thu, 13 Jun 2024 17:54:10 -0700
Message-ID: <2e488812-671e-4aa9-a292-c54b174f2dd7@intel.com>
In-Reply-To: <cover.1716552602.git.babu.moger@amd.com>
Hi Babu,
On 5/24/24 5:23 AM, Babu Moger wrote:
>
>
> d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> to list and modify the group's assignment states.
There was a lot of discussion resulting in this centralized file. At first glance this
file appears to be very complicated and I believe any reasonable person would wonder if
all of this is necessary. I recommend that you add a motivation for why this file is needed.
Some items I recall are: it makes it easier for user space to learn how counters are used (no
need to traverse resctrl and open()/close() many files), and on the resctrl side it makes
it possible to support counter re-assignment with a single IPI. There may be other motivations
that I am forgetting now.
Also, could the name just be "mbm_control"? What is enabled at this time are "assignable
counters" but in the future we may want to add support for other flags that have nothing to
do with "assignable counters".
>
> The list follows the following format:
>
> "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
"assignment_flags" -> "flags" ? (throughout)
>
>
> Format for specific type of groups:
>
> * Default CTRL_MON group:
> "//<domain_id>=<assignment_flags>"
>
> * Non-default CTRL_MON group:
> "<CTRL_MON group>//<domain_id>=<assignment_flags>"
>
> * Child MON group of default CTRL_MON group:
> "/<MON group>/<domain_id>=<assignment_flags>"
>
> * Child MON group of non-default CTRL_MON group:
> "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
>
> Assignment flags can be one of the following:
>
> t MBM total event is enabled
> l MBM local event is enabled
> tl Both total and local MBM events are enabled
> _ None of the MBM events are enabled
>
> Examples:
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
> There are four groups and all the groups have local and total
> event enabled on domain 0 and 1.
>
> =tl means both total and local events are enabled.
>
> "//" - This is a default CONTROL MON group
>
> "non_default_ctrl_mon_grp//" - This is non default CONTROL MON group
Be consistent with "non-default" (vs non default) as well as "CTRL_MON" (vs
CONTROL MON).
>
> "/child_default_mon_grp/" - This is Child MON group of the defult group
"Child" -> "child"
"defult" -> "default"
>
> "non_default_ctrl_mon_grp/child_non_default_mon_grp/" - This is child
> MON group of the non default group
non-default
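For illustration only, here is a minimal Python sketch of how user space might split one line of the listing into its parts and name the group type. The function names and the returned tuple are hypothetical and not part of the series; the sketch assumes the four line shapes quoted above and tolerates the trailing ";" seen in the examples.

```python
def parse_control_line(line):
    """Split '<CTRL_MON group>/<MON group>/<domain_id>=<flags>;...' into parts."""
    # Group names are directory names, so they cannot contain '/'.
    ctrl, mon, assignments = line.strip().split("/", 2)
    domains = {}
    for item in assignments.split(";"):
        if not item:  # tolerate the trailing ';' seen in the examples
            continue
        dom, flags = item.split("=", 1)
        domains[int(dom)] = flags
    return ctrl, mon, domains


def describe_group(ctrl, mon):
    """Name the group type using the four cases from the cover letter."""
    parent = "default" if ctrl == "" else "non-default"
    if mon == "":
        return f"{parent} CTRL_MON group"
    return f"child MON group of the {parent} CTRL_MON group"


for line in ["//0=tl;1=tl;",
             "non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;"]:
    ctrl, mon, domains = parse_control_line(line)
    print(describe_group(ctrl, mon), domains)
```

An empty first field means the default CTRL_MON group and an empty second field means the CTRL_MON group itself, which is why a single split on "/" is enough to tell the four cases apart.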
>
> e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control.
>
> The write format is similar to the above list format with addition of
> op-code for the assignment operation.
>
> * Default CTRL_MON group:
> "//<domain_id><op-code><assignment_flags>"
>
> * Non-default CTRL_MON group:
> "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
>
> * Child MON group of default CTRL_MON group:
> "/<MON group>/<domain_id><op-code><assignment_flags>"
>
> * Child MON group of non-default CTRL_MON group:
> "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"
>
> Op-code can be one of the following:
>
> = Update the assignment to match the flags
> + Assign a new state
> - Unassign a new state
Looking here and at the implementation it seems that "+_" and "-_" are supported.
I think that should be invalid. Only "=_" seems appropriate to me.
Also please take care to not have a catchall "default" that does an
unassign. Doing something like that will prevent us from ever being
able to add any flags in the future.
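The semantics asked for here can be sketched in a few lines of Python. This is a user-space model only, not the kernel implementation: the name apply_op and the use of sets are hypothetical, and the sketch assumes "t" and "l" are the only flags that exist today. Unknown flags are rejected explicitly so that no catch-all path ends up unassigning.

```python
KNOWN_FLAGS = {"t", "l"}  # total and local: the only flags in this series


def apply_op(current, op, flags):
    """Return the new flag set for one domain after '<op-code><flags>'.

    current is a set such as {"t", "l"}; the empty set stands for "_".
    """
    if not flags:
        raise ValueError("missing flags")
    if flags == "_":
        # "_" means "no events", which only makes sense as a full replacement.
        if op != "=":
            raise ValueError(f"'{op}_' is not a valid request")
        return set()
    requested = set(flags)
    unknown = requested - KNOWN_FLAGS
    if unknown:
        # Reject explicitly; never fall through to an implicit unassign.
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    if op == "=":
        return requested
    if op == "+":
        return set(current) | requested
    if op == "-":
        return set(current) - requested
    raise ValueError(f"unknown op-code: {op!r}")
```

With this shape, adding a new flag later only means extending KNOWN_FLAGS; a request using a flag the kernel does not know fails instead of silently changing counter assignments.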
>
>
> Initial group status:
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
> To update the default group to enable only total event on domain 0:
> # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=t;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
> To update the MON group child_default_mon_grp to remove total event on domain 1:
> # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
> Assignment status after the update:
> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=t;1=l;
> /child_default_mon_grp/0=t;1=tl;
This does not look right. Why did domain #1 of the default CTRL_MON group change also?
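A quick way to check such examples is to diff the listing before and after the write. The Python sketch below is illustrative only (the helper names are hypothetical) and uses just the two default-group lines from the quoted output above.

```python
def parse_snapshot(text):
    """Map (ctrl_group, mon_group, domain_id) -> flags for a full listing."""
    state = {}
    for line in text.strip().splitlines():
        ctrl, mon, assignments = line.strip().split("/", 2)
        for item in filter(None, assignments.split(";")):
            dom, flags = item.split("=", 1)
            state[(ctrl, mon, int(dom))] = flags
    return state


def changed_entries(before, after):
    """Return {key: (old, new)} for every entry whose flags differ."""
    keys = before.keys() | after.keys()
    return {k: (before.get(k), after.get(k))
            for k in sorted(keys) if before.get(k) != after.get(k)}


BEFORE = """
//0=t;1=tl;
/child_default_mon_grp/0=tl;1=tl;
"""
AFTER_AS_QUOTED = """
//0=t;1=l;
/child_default_mon_grp/0=t;1=tl;
"""

diff = changed_entries(parse_snapshot(BEFORE), parse_snapshot(AFTER_AS_QUOTED))
for key, (old, new) in diff.items():
    print(key, old, "->", new)
```

For the quoted output this reports a change on domain 1 of the default CTRL_MON group and on domain 0 of child_default_mon_grp, while the entry the command targeted, domain 1 of child_default_mon_grp, is unchanged.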
>
> To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
> remove both local and total events on domain 1:
> # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
> /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> //0=t;1=l;
> /child_default_mon_grp/0=t;1=tl;
>
> To update the default group to add a total event domain 1.
> # echo "//1+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
It is unclear where the "t" flag was removed.
> Assignment status after the update:
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> //0=t;1=tl;
> /child_default_mon_grp/0=t;1=tl;
>
> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
> There is no change in reading the evetns with ABMC. If the event is unassigned
"evetns" -> "events"
> when reading, then the read will come back as Unavailable.
Should this not rather be "Unassigned"? According to the docs the counters
will return "Unavailable" right after reconfigure so it seems that there
are scenarios where an "assigned" counter returns "Unavailable". It seems more
useful to return "Unassigned", which will have a new specific meaning, than to
overload the existing "Unavailable" that has the original meaning of "try again" ...
but in this case trying again will be futile.
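To show why the distinction matters to user space, here is a hedged Python sketch of a reader that retries only when a retry could help. "Unassigned" is the value proposed above and does not exist in the series as posted; the function name, the retry count and the delay are all hypothetical.

```python
import time


def read_mbm_counter(read_text, retries=3, delay=0.0):
    """Read a counter, retrying only when a retry could help.

    read_text is any zero-argument callable returning the file contents,
    for example pathlib.Path(path).read_text.
    Returns an int byte count, or None if no value could be obtained.
    """
    for _ in range(retries):
        value = read_text().strip()
        if value == "Unassigned":
            # Assumed new value: no counter is assigned, so retrying is futile.
            return None
        if value == "Unavailable":
            # Existing value: transient, so try again after a short wait.
            time.sleep(delay)
            continue
        return int(value)
    return None
```

If both cases reported "Unavailable", the caller could not tell a transient condition from a missing assignment and would have to burn all its retries on a read that can never succeed.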
>
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 779247936
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> 765207488
>
> g. Users will have the option to go back to legacy_mbm mode if required.
> This can be done using the following command.
>
> # echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
> abmc
> [mbm_legacy]
It is confusing for the value written by user space to be different from
the value displayed: "legacy_mbm" vs "mbm_legacy".
This is still missing information about what happens to the counters/events on
such a switch. Will events just keep counting? Will they be reset? ...?
I also think we should try to find a more generic name for this file.
"mbm_cntr_mode" or "mbm_mode" maybe?
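The naming mismatch is easy to trip over from a script. The Python sketch below is illustrative only: it assumes, based on the quoted output, that the file lists one mode per line and marks the active one with square brackets, and the function name is hypothetical.

```python
def parse_mode_listing(text):
    """Return (active_mode, all_modes) from the mode file contents."""
    modes, active = [], None
    for token in text.split():
        name = token.strip("[]")
        if token.startswith("[") and token.endswith("]"):
            if active is not None:
                raise ValueError("more than one active mode")
            active = name
        modes.append(name)
    if active is None:
        raise ValueError("no active mode marked")
    return active, modes


active, modes = parse_mode_listing("abmc\n[mbm_legacy]\n")
print(active, modes)
print("legacy_mbm" in modes)  # the token that was written is not listed
```

A tool that validates its input against the listed modes before writing would refuse "legacy_mbm", which is one more reason for the written and displayed names to match.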
>
> h. Check the bandwidth configuration for the group. Note that bandwidth
> configuration has a domain scope. Total event defaults to 0x7F (to
> count all the events) and local event defaults to 0x15 (to count all
> the local numa events). The event bitmap decoding is available at
> https://www.kernel.org/doc/Documentation/x86/resctrl.rst
> in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x7f;1=0x7f
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> 0=0x15;1=0x15
>
> j. Change the bandwidth source for domain 0 for the total event to count only reads.
> Note that this change effects total events on the domain 0.
>
> #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x33;1=0x7F
>
> k. Now read the total event again. The mbm_total_bytes should display
> only the read events.
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 314101
According to the doc, right after a BMEC change the counter will read "Unavailable".
Is this not the case here?
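For readers checking the masks in steps h and j, here is a small Python sketch that parses the config string and decodes each mask. The bit meanings follow the table for mbm_total_bytes_config and mbm_local_bytes_config in the resctrl documentation; the function names are hypothetical.

```python
# Bit meanings as listed for mbm_total_bytes_config / mbm_local_bytes_config
# in the resctrl documentation.
EVENT_BITS = {
    0: "reads to memory in the local NUMA domain",
    1: "reads to memory in the non-local NUMA domain",
    2: "non-temporal writes to the local NUMA domain",
    3: "non-temporal writes to the non-local NUMA domain",
    4: "reads to slow memory in the local NUMA domain",
    5: "reads to slow memory in the non-local NUMA domain",
    6: "dirty victims from the QOS domain to all types of memory",
}


def parse_event_config(text):
    """Turn '0=0x33;1=0x7F' into {0: 0x33, 1: 0x7f}."""
    config = {}
    for item in filter(None, text.strip().split(";")):
        dom, value = item.split("=", 1)
        config[int(dom)] = int(value, 16)
    return config


def decode(mask):
    """List the memory transaction types selected by a config mask."""
    return [name for bit, name in EVENT_BITS.items() if mask & (1 << bit)]


for dom, mask in parse_event_config("0=0x33;1=0x7F").items():
    print(dom, hex(mask), decode(mask))
```

Decoding 0x33 selects bits 0, 1, 4 and 5, which are the four read types, and 0x15 selects bits 0, 2 and 4, which are the local NUMA types, matching the descriptions in the cover letter.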
>
> l. Unmount the resctrl
>
> #umount /sys/fs/resctrl/
Reinette