* [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
@ 2025-04-21 22:43 Babu Moger
2025-04-21 22:43 ` [PATCH v4 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement Babu Moger
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
This series adds support for L3 Smart Data Cache Injection Allocation
Enforcement (SDCIAE) to the resctrl infrastructure. It is referred to as
"io_alloc" in the resctrl subsystem.
Upcoming AMD hardware implements Smart Data Cache Injection (SDCI), a
mechanism that enables direct insertion of data from I/O devices into the
L3 cache. By caching I/O data directly rather than first storing it in
DRAM, SDCI reduces demands on DRAM bandwidth and reduces latency to the
processor consuming the I/O data.
The SDCIAE (SDCI Allocation Enforcement) PQE feature allows system software
to control the portion of the L3 cache used for SDCI devices.
When enabled, SDCIAE forces all SDCI lines to be placed into the L3 cache
partitions identified by the highest-supported L3_MASK_n register, where n
is the maximum supported CLOSID.
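As an illustrative sketch (not part of this series, using the existing
resctrl helper resctrl_arch_get_num_closid()), the CLOSID that io_alloc
claims is simply the highest one the resource reports:

  /*
   * Illustration only, ignoring CDP (which the series handles separately):
   * with CPUID Fn0000_0010_EDX_x1 reporting MAX_COS=15, SDCI lines honor
   * the CBM in L3_MASK_15, i.e. MSR_IA32_L3_CBM_BASE + 15.
   */
  static u32 io_alloc_closid(struct rdt_resource *r)
  {
          return resctrl_arch_get_num_closid(r) - 1;
  }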
The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
Injection Allocation Enforcement (SDCIAE)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
The feature requires Linux support for TPH (TLP Processing Hints), which
is available in the kernel since commit
48d0fd2b903e3 ("PCI/TPH: Add TPH documentation")
The patches are based on top of tip/master commit
84c319145cbad6 ("Merge branch 'x86/nmi'")
# Linux Implementation
This feature adds the following interface files when the resctrl "io_alloc" feature is
supported on the L3 resource:
/sys/fs/resctrl/info/L3/io_alloc: Report the feature status. Enable/disable the
feature by writing to the interface.
/sys/fs/resctrl/info/L3/io_alloc_cbm: List the Capacity Bit Masks (CBMs) available
for I/O devices when the io_alloc feature is enabled.
Configure the CBM by writing to the interface.
# Examples:
a. Check if the io_alloc feature is available
# mount -t resctrl resctrl /sys/fs/resctrl/
# cat /sys/fs/resctrl/info/L3/io_alloc
disabled
b. Enable the io_alloc feature.
# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
# cat /sys/fs/resctrl/info/L3/io_alloc
enabled
c. Check the CBM values for the io_alloc feature.
# cat /sys/fs/resctrl/info/L3/io_alloc_cbm
L3:0=ffff;1=ffff
d. Change the CBM value for domain 1:
# echo L3:1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
# cat /sys/fs/resctrl/info/L3/io_alloc_cbm
L3:0=ffff;1=00ff
e. Disable the io_alloc feature and exit.
# echo 0 > /sys/fs/resctrl/info/L3/io_alloc
# cat /sys/fs/resctrl/info/L3/io_alloc
disabled
# umount /sys/fs/resctrl/
---
v4: The "io_alloc" interface will report "enabled/disabled/not supported"
instead of 0 or 1.
Updated resctrl_io_alloc_closid_get() to verify the max closid availability
using closids_supported().
Updated the documentation for "shareable_bits" and "bit_usage".
NOTE: io_alloc is about a specific CLOS. rdt_bit_usage_show() is not
designed to handle bit_usage for a specific CLOS; it is about the overall
system. So we cannot really tell the user which CLOS is shared across both
hardware and software. This is something we need to discuss.
Introduced io_alloc_init() to initialize fflags.
Printed the group name when io_alloc enablement fails to help the user.
Added rdtgroup_mutex before rdt_last_cmd_puts() in resctrl_io_alloc_cbm_show().
Returned -ENODEV when resource type is CDP_DATA.
Kept the resource name while printing the CBM (L3:0=ffff) so that we don't
have to change show_doms() just for this feature and it is consistent
across all the schemata displays.
Added new patch to call parse_cbm() directly to avoid code duplication.
Checked the whole series (v1-v3) again to verify that no comments were missed.
v3: Rewrote the commit log for the last 3 patches. Changed the text to be
more generic than the AMD-specific feature, adding the AMD feature
specifics at the end.
Renamed the rdt_get_sdciae_alloc_cfg() to rdt_set_io_alloc_capable().
Renamed the _resctrl_io_alloc_enable() to _resctrl_sdciae_enable()
as it is arch specific.
Changed the return type of _resctrl_sdciae_enable() from int to void.
The number of CLOSIDs is determined based on the minimum supported
across all resources (in closid_init). It needs to match the maximum
supported on the resource. Added a check to verify MAX CLOSID
availability on the system.
Added CDP check to make sure io_alloc is configured in CDP_CODE.
Highest CLOSID corresponds to CDP_CODE.
Added resctrl_io_alloc_closid_free() to free the io_alloc CLOSID.
Added errors in a few cases when CLOSID allocation fails.
Fixed the splat reported when info/L3/bit_usage is accessed while io_alloc is enabled.
https://lore.kernel.org/lkml/SJ1PR11MB60837B532254E7B23BC27E84FC052@SJ1PR11MB6083.namprd11.prod.outlook.com/
v2: Added dependency on X86_FEATURE_CAT_L3
Removed the "" in the CPU feature definition.
Changed sdciae_capable to io_alloc_capable to make it a generic feature.
Moved the io_alloc_capable field into struct resctrl_cache.
Changed the names of a few arch functions to be similar to the ABMC series:
resctrl_arch_get_io_alloc_enabled()
resctrl_arch_io_alloc_enable()
Renamed the feature to "io_alloc".
Added generic texts for the feature in commit log and resctrl.rst doc.
Added resctrl_io_alloc_init_cat() to initialize io_alloc to default values
when enabled.
Fixed the io_alloc interface to show only on the L3 resource.
Added the locks while processing io_alloc CBMs.
Previous versions:
v3: https://lore.kernel.org/lkml/cover.1738272037.git.babu.moger@amd.com/
v2: https://lore.kernel.org/lkml/cover.1734556832.git.babu.moger@amd.com/
v1: https://lore.kernel.org/lkml/cover.1723824984.git.babu.moger@amd.com/
Babu Moger (8):
x86/cpufeatures: Add support for L3 Smart Data Cache Injection
Allocation Enforcement
x86/resctrl: Add SDCIAE feature in the command line options
x86/resctrl: Detect io_alloc feature
x86/resctrl: Implement "io_alloc" enable/disable handlers
x86/resctrl: Add user interface to enable/disable io_alloc feature
x86/resctrl: Introduce interface to display io_alloc CBMs
x86/resctrl: Modify rdt_parse_data to pass mode and CLOSID
x86/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
.../admin-guide/kernel-parameters.txt | 2 +-
Documentation/arch/x86/resctrl.rst | 55 +++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/resctrl/core.c | 9 +
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 35 +-
arch/x86/kernel/cpu/resctrl/internal.h | 19 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 372 +++++++++++++++++-
arch/x86/kernel/cpu/scattered.c | 1 +
include/linux/resctrl.h | 12 +
11 files changed, 487 insertions(+), 21 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v4 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 2/8] x86/resctrl: Add SDCIAE feature in the command line options Babu Moger
` (7 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
Smart Data Cache Injection (SDCI) is a mechanism that enables direct
insertion of data from I/O devices into the L3 cache. By directly caching
data from I/O devices rather than first storing the I/O data in DRAM,
SDCI reduces demands on DRAM bandwidth and reduces latency to the processor
consuming the I/O data.
The SDCIAE (SDCI Allocation Enforcement) PQE feature allows system software
to control the portion of the L3 cache used for SDCI.
When enabled, SDCIAE forces all SDCI lines to be placed into the L3 cache
partitions identified by the highest-supported L3_MASK_n register, where n
is the maximum supported CLOSID.
Add the CPUID feature bit that can be used to configure SDCIAE.
The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
Injection Allocation Enforcement (SDCIAE)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Resolved a minor conflict in cpufeatures.h.
v3: No changes.
v2: Added dependency on X86_FEATURE_CAT_L3
Removed the "" in CPU feature definition.
Minor text changes.
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6c2c152d8a67..8dfbea91bef6 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -481,6 +481,7 @@
#define X86_FEATURE_AMD_HETEROGENEOUS_CORES (21*32 + 6) /* Heterogeneous Core Topology */
#define X86_FEATURE_AMD_WORKLOAD_CLASS (21*32 + 7) /* Workload Classification */
#define X86_FEATURE_PREFER_YMM (21*32 + 8) /* Avoid ZMM registers due to downclocking */
+#define X86_FEATURE_SDCIAE (21*32 + 9) /* L3 Smart Data Cache Injection Allocation Enforcement */
/*
* BUG word(s)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 94c062cddfa4..24ff4a98d204 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -71,6 +71,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_BMEC, X86_FEATURE_CQM_MBM_TOTAL },
{ X86_FEATURE_BMEC, X86_FEATURE_CQM_MBM_LOCAL },
+ { X86_FEATURE_SDCIAE, X86_FEATURE_CAT_L3 },
{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 16f3ca30626a..d18a7ce16388 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -49,6 +49,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
{ X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 },
+ { X86_FEATURE_SDCIAE, CPUID_EBX, 6, 0x80000020, 0 },
{ X86_FEATURE_AMD_WORKLOAD_CLASS, CPUID_EAX, 22, 0x80000021, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ X86_FEATURE_AMD_LBR_V2, CPUID_EAX, 1, 0x80000022, 0 },
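As a hedged aside (not part of the patch): the scattered.c entry above
enumerates the bit in CPUID Fn8000_0020 EBX[6], so a minimal user-space
probe for the feature could look like:

  #include <cpuid.h>
  #include <stdio.h>

  int main(void)
  {
          unsigned int eax, ebx, ecx, edx;

          /* PQE leaf 0x80000020, subleaf 0: SDCIAE is enumerated in EBX bit 6 */
          if (!__get_cpuid_count(0x80000020, 0, &eax, &ebx, &ecx, &edx))
                  return 1;
          printf("SDCIAE %ssupported\n", (ebx & (1u << 6)) ? "" : "not ");
          return 0;
  }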
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 2/8] x86/resctrl: Add SDCIAE feature in the command line options
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
2025-04-21 22:43 ` [PATCH v4 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 3/8] x86/resctrl: Detect io_alloc feature Babu Moger
` (6 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
Add the command line option to enable or disable the new resctrl feature
L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE).
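For example, using the '!' prefix shown in the rdt= documentation below,
SDCIAE can be explicitly disabled at boot with:

  rdt=!sdciae

or explicitly enabled with rdt=sdciae.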
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: No changes.
v3: No changes.
v2: No changes.
---
Documentation/admin-guide/kernel-parameters.txt | 2 +-
arch/x86/kernel/cpu/resctrl/core.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 76e538c77e31..5e5abc270f91 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5991,7 +5991,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
- mba, smba, bmec.
+ mba, smba, bmec, sdciae.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index cf29681d01e0..422083dc4651 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -723,6 +723,7 @@ enum {
RDT_FLAG_MBA,
RDT_FLAG_SMBA,
RDT_FLAG_BMEC,
+ RDT_FLAG_SDCIAE,
};
#define RDT_OPT(idx, n, f) \
@@ -748,6 +749,7 @@ static struct rdt_options rdt_options[] __initdata = {
RDT_OPT(RDT_FLAG_MBA, "mba", X86_FEATURE_MBA),
RDT_OPT(RDT_FLAG_SMBA, "smba", X86_FEATURE_SMBA),
RDT_OPT(RDT_FLAG_BMEC, "bmec", X86_FEATURE_BMEC),
+ RDT_OPT(RDT_FLAG_SDCIAE, "sdciae", X86_FEATURE_SDCIAE),
};
#define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 3/8] x86/resctrl: Detect io_alloc feature
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
2025-04-21 22:43 ` [PATCH v4 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement Babu Moger
2025-04-21 22:43 ` [PATCH v4 2/8] x86/resctrl: Add SDCIAE feature in the command line options Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers Babu Moger
` (5 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
Data from I/O devices can be inserted directly into the L3 cache. This
reduces demands on DRAM bandwidth and reduces latency to the processor
consuming the I/O data.
Introduce cache resource property "io_alloc_capable" that an architecture
can set if a portion of the L3 cache can be allocated for I/O traffic.
Set this property on x86 systems that support SDCIAE (L3 Smart Data Cache
Injection Allocation Enforcement).
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Updated the commit message and code comment based on feedback.
v3: Rewrote the commit log. Changed the text to be more generic than the AMD-specific wording.
Renamed the rdt_get_sdciae_alloc_cfg() to rdt_set_io_alloc_capable().
Removed leftover comment from v2.
v2: Changed sdciae_capable to io_alloc_capable to make it generic feature.
Also moved the io_alloc_capable in struct resctrl_cache.
---
arch/x86/kernel/cpu/resctrl/core.c | 7 +++++++
include/linux/resctrl.h | 3 +++
2 files changed, 10 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 422083dc4651..c478f591b7c1 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -292,6 +292,11 @@ static void rdt_get_cdp_config(int level)
rdt_resources_all[level].r_resctrl.cdp_capable = true;
}
+static void rdt_set_io_alloc_capable(struct rdt_resource *r)
+{
+ r->cache.io_alloc_capable = true;
+}
+
static void rdt_get_cdp_l3_config(void)
{
rdt_get_cdp_config(RDT_RESOURCE_L3);
@@ -858,6 +863,8 @@ static __init bool get_rdt_alloc_resources(void)
rdt_get_cache_alloc_cfg(1, r);
if (rdt_cpu_has(X86_FEATURE_CDP_L3))
rdt_get_cdp_l3_config();
+ if (rdt_cpu_has(X86_FEATURE_SDCIAE))
+ rdt_set_io_alloc_capable(r);
ret = true;
}
if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 880351ca3dfc..dd09bb9a173b 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -177,6 +177,8 @@ struct rdt_mon_domain {
* @arch_has_sparse_bitmasks: True if a bitmask like f00f is valid.
* @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
* level has CPU scope.
+ * @io_alloc_capable: True if portion of the cache can be allocated
+ * for I/O traffic.
*/
struct resctrl_cache {
unsigned int cbm_len;
@@ -184,6 +186,7 @@ struct resctrl_cache {
unsigned int shareable_bits;
bool arch_has_sparse_bitmasks;
bool arch_has_per_cpu_cfg;
+ bool io_alloc_capable;
};
/**
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (2 preceding siblings ...)
2025-04-21 22:43 ` [PATCH v4 3/8] x86/resctrl: Detect io_alloc feature Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 5/8] x86/resctrl: Add user interface to enable/disable io_alloc feature Babu Moger
` (4 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
"io_alloc" enables direct insertion of data from I/O devices into the L3
cache.
On AMD, "io_alloc" feature is backed by L3 Smart Data Cache Injection
Allocation Enforcement (SDCIAE). Change SDCIAE state by setting (to enable)
or clearing (to disable) bit 1 of MSR L3_QOS_EXT_CFG on all logical
processors within the cache domain.
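As a hedged illustration (not part of the patch), the resulting MSR state
can be spot-checked from user space with msr-tools once the msr module is
loaded:

  # modprobe msr
  # rdmsr -p 0 0xc00003ff
  2

A value with bit 1 set (e.g. 2, assuming no other bits are set) indicates
SDCIAE is enabled on that CPU.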
Introduce architecture-specific handlers to enable and disable the feature.
The SDCIAE feature details are available in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
Injection Allocation Enforcement (SDCIAE)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Updated the commit log to address the feedback.
v3: Passed the struct rdt_resource to resctrl_arch_get_io_alloc_enabled() instead of resource id.
Renamed the _resctrl_io_alloc_enable() to _resctrl_sdciae_enable() as it is arch specific.
Changed the return type of _resctrl_sdciae_enable() from int to void.
Added more context in the commit log and fixed a few typos.
v2: Renamed the functions to simplify the code.
Renamed sdciae_capable to io_alloc_capable.
Changed the names of a few arch functions to be similar to the ABMC series:
resctrl_arch_get_io_alloc_enabled()
resctrl_arch_io_alloc_enable()
---
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/resctrl/internal.h | 10 ++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 32 ++++++++++++++++++++++++++
include/linux/resctrl.h | 9 ++++++++
4 files changed, 52 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e6134ef2263d..3970e0b16e47 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1203,6 +1203,7 @@
/* - AMD: */
#define MSR_IA32_MBA_BW_BASE 0xc0000200
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
+#define MSR_IA32_L3_QOS_EXT_CFG 0xc00003ff
#define MSR_IA32_EVT_CFG_BASE 0xc0000400
/* AMD-V MSRs */
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index eaae99602b61..6ead222904fe 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -32,6 +32,9 @@
*/
#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
+/* Setting bit 1 in L3_QOS_EXT_CFG enables the SDCIAE feature. */
+#define SDCIAE_ENABLE_BIT 1
+
/**
* cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
* aren't marked nohz_full
@@ -381,6 +384,7 @@ struct msr_param {
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
* @cdp_enabled: CDP state of this resource
+ * @sdciae_enabled: SDCIAE feature is enabled
*
* Members of this structure are either private to the architecture
* e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -394,6 +398,7 @@ struct rdt_hw_resource {
unsigned int mon_scale;
unsigned int mbm_width;
bool cdp_enabled;
+ bool sdciae_enabled;
};
static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
@@ -420,6 +425,11 @@ static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+static inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
+{
+ return resctrl_to_arch_res(r)->sdciae_enabled;
+}
+
void arch_mon_domain_online(struct rdt_resource *r, struct rdt_mon_domain *d);
/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 93ec829015f1..85796a186374 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1845,6 +1845,38 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}
+static void resctrl_sdciae_set_one_amd(void *arg)
+{
+ bool *enable = arg;
+
+ if (*enable)
+ msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
+ else
+ msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
+}
+
+static void _resctrl_sdciae_enable(struct rdt_resource *r, bool enable)
+{
+ struct rdt_ctrl_domain *d;
+
+ /* Update L3_QOS_EXT_CFG MSR on all the CPUs in all domains */
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list)
+ on_each_cpu_mask(&d->hdr.cpu_mask, resctrl_sdciae_set_one_amd, &enable, 1);
+}
+
+int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+ if (hw_res->r_resctrl.cache.io_alloc_capable &&
+ hw_res->sdciae_enabled != enable) {
+ _resctrl_sdciae_enable(r, enable);
+ hw_res->sdciae_enabled = enable;
+ }
+
+ return 0;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index dd09bb9a173b..92e242c13719 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -514,6 +514,15 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
*/
void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
+/**
+ * resctrl_arch_io_alloc_enable() - Enable/disable io_alloc feature.
+ * @r: The resctrl resource.
+ * @enable: Enable (true) or disable (false) io_alloc on resource @r.
+ *
+ * This can be called from any CPU.
+ */
+int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable);
+
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 5/8] x86/resctrl: Add user interface to enable/disable io_alloc feature
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (3 preceding siblings ...)
2025-04-21 22:43 ` [PATCH v4 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 6/8] x86/resctrl: Introduce interface to display io_alloc CBMs Babu Moger
` (3 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
The io_alloc feature in resctrl is a mechanism that enables direct
insertion of data from I/O devices into the L3 cache.
On AMD systems, the io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
lines to be placed into the L3 cache partitions identified by the
highest-supported L3_MASK_n register as reported by CPUID
Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
be allocated into the L3 cache partitions determined by the bitmask in
the L3_MASK_15 register.
When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
allocated for the instruction cache (L3CODE).
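As a worked example (illustrative, assuming the usual resctrl CDP index
mapping in which resctrl CLOSID n uses hardware COS 2n for data and 2n+1
for code): with MAX_COS=15 and CDP enabled, resctrl exposes 8 CLOSIDs,
io_alloc claims resctrl CLOSID 7, and that CLOSID's L3CODE configuration
lands in hardware L3_MASK_15, the register SDCIAE honors.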
Introduce a user interface to enable/disable the "io_alloc" feature.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Updated the change log.
Updated the user doc.
The "io_alloc" interface will report "enabled/disabled/not supported".
Updated resctrl_io_alloc_closid_get() to verify the max closid availability.
Updated the documentation for "shareable_bits" and "bit_usage".
Introduced io_alloc_init() to initialize fflags.
Printed the group name when io_alloc enablement fails.
NOTE: io_alloc is about a specific CLOS. rdt_bit_usage_show() is not
designed to handle bit_usage for a specific CLOS; it is about the overall
system. So we cannot really tell the user which CLOS is shared across both
hardware and software. We need to discuss this.
v3: Rewrote the change to make it generic.
Rewrote the documentation in resctrl.rst to be generic and added
AMD feature details at the end.
Added a check to verify MAX CLOSID availability on the system.
Added CDP check to make sure io_alloc is configured in CDP_CODE.
Added resctrl_io_alloc_closid_free() to free the io_alloc CLOSID.
Added errors in few cases when CLOSID allocation fails.
Fixed the splat reported when info/L3/bit_usage is accessed while io_alloc
is enabled.
https://lore.kernel.org/lkml/SJ1PR11MB60837B532254E7B23BC27E84FC052@SJ1PR11MB6083.namprd11.prod.outlook.com/
v2: Renamed the feature to "io_alloc".
Added generic texts for the feature in commit log and resctrl.rst doc.
Added resctrl_io_alloc_init_cat() to initialize io_alloc to default
values when enabled.
Fixed the io_alloc show functionality to display only on the L3 resource.
---
Documentation/arch/x86/resctrl.rst | 34 +++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 185 ++++++++++++++++++++++++-
2 files changed, 218 insertions(+), 1 deletion(-)
diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 6768fc1fad16..7672c5c52c1a 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -95,6 +95,11 @@ related to allocation:
some platforms support devices that have their
own settings for cache use which can over-ride
these bits.
+
+ When the "io_alloc" feature is enabled, a portion of the cache
+ is reserved for shared use between hardware and software. Refer
+ to "bit_usage" to see which portion is allocated for this purpose.
+
"bit_usage":
Annotated capacity bitmasks showing how all
instances of the resource are used. The legend is:
@@ -135,6 +140,35 @@ related to allocation:
"1":
Non-contiguous 1s value in CBM is supported.
+"io_alloc":
+ The "io_alloc" enables system software to configure the portion
+ of the L3 cache allocated for I/O traffic.
+
+ The feature routes the I/O traffic via specific CLOSID reserved
+ for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
+ for the CLOSID, users can control the L3 portions available for
+ I/0 traffic. The reserved CLOSID will be excluded for group creation.
+
+ The interface provides a means to query the status of feature support.
+
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3/io_alloc
+ disabled
+
+ The feature can be enabled/disabled by writing to the interface.
+ Example::
+
+ # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
+ # cat /sys/fs/resctrl/info/L3/io_alloc
+ enabled
+
+ On AMD systems, the io_alloc feature is supported by the L3 Smart
+ Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
+ io_alloc is determined by the highest CLOSID supported by the resource.
+ When CDP is enabled, io_alloc routes I/O traffic using the highest
+ CLOSID allocated for the instruction cache (L3CODE).
+
Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 85796a186374..d53a2068cde4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -68,6 +68,7 @@ static char last_cmd_status_buf[512];
static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
static void rdtgroup_destroy_root(void);
+static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
struct dentry *debugfs_resctrl;
@@ -199,6 +200,19 @@ void closid_free(int closid)
__set_bit(closid, &closid_free_map);
}
+static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
+{
+ if (__test_and_clear_bit(io_alloc_closid, &closid_free_map))
+ return io_alloc_closid;
+ else
+ return -ENOSPC;
+}
+
+static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
+{
+ closid_free(io_alloc_closid);
+}
+
/**
* closid_allocated - test if provided closid is in use
* @closid: closid to be tested
@@ -1033,6 +1047,31 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
return 0;
}
+/*
+ * resctrl_io_alloc_closid_get - io_alloc uses the max CLOSID to route
+ * I/O traffic. Get the max CLOSID and verify that it is available.
+ *
+ * The total number of CLOSIDs is determined in closid_init(), based on the
+ * minimum supported across all resources. If CDP (Code Data Prioritization)
+ * is enabled, the number of CLOSIDs is halved. The final value is returned
+ * by closids_supported() and stored in s->num_closid for each resource.
+ * Make sure this value aligns with the maximum CLOSID supported by the
+ * respective resource.
+ */
+static int resctrl_io_alloc_closid_get(struct rdt_resource *r,
+ struct resctrl_schema *s)
+{
+ int num_closids = closids_supported();
+
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ num_closids *= 2;
+
+ if (num_closids != resctrl_arch_get_num_closid(r))
+ return -ENOSPC;
+ else
+ return s->num_closid - 1;
+}
+
/*
* rdt_bit_usage_show - Display current usage of resources
*
@@ -1076,9 +1115,20 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
for (i = 0; i < closids_supported(); i++) {
if (!closid_allocated(i))
continue;
+ /*
+ * If io_alloc is enabled, the CLOSID will be
+ * allocated but will not be associated with any
+ * groups. The region is available for sharing with
+ * io_alloc feature as well as resctrl groups.
+ */
+ if (i == resctrl_io_alloc_closid_get(r, s) &&
+ resctrl_arch_get_io_alloc_enabled(r))
+ mode = RDT_MODE_SHAREABLE;
+ else
+ mode = rdtgroup_mode_by_closid(i);
+
ctrl_val = resctrl_arch_get_config(r, dom, i,
s->conf_type);
- mode = rdtgroup_mode_by_closid(i);
switch (mode) {
case RDT_MODE_SHAREABLE:
sw_shareable |= ctrl_val;
@@ -1877,6 +1927,121 @@ int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
return 0;
}
+static int resctrl_io_alloc_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+
+ if (r->cache.io_alloc_capable && s->conf_type != CDP_DATA) {
+ if (resctrl_arch_get_io_alloc_enabled(r))
+ seq_puts(seq, "enabled\n");
+ else
+ seq_puts(seq, "disabled\n");
+ } else {
+ seq_puts(seq, "not supported\n");
+ }
+
+ return 0;
+}
+
+/*
+ * Initialize io_alloc CLOSID cache resource with default CBM values.
+ */
+static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
+ struct resctrl_schema *s, u32 closid)
+{
+ int ret;
+
+ rdt_staged_configs_clear();
+
+ ret = rdtgroup_init_cat(s, closid);
+ if (ret < 0)
+ goto out_init_cat;
+
+ ret = resctrl_arch_update_domains(r, closid);
+
+out_init_cat:
+ rdt_staged_configs_clear();
+ return ret;
+}
+
+static const char *rdtgroup_name_by_closid(int closid)
+{
+ struct rdtgroup *rdtgrp;
+
+ list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+ if (rdtgrp->closid == closid)
+ return rdtgrp->kn->name;
+ }
+
+ return NULL;
+}
+
+static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, loff_t off)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ char const *grp_name;
+ int io_alloc_closid;
+ bool enable;
+ int ret;
+
+ ret = kstrtobool(buf, &enable);
+ if (ret)
+ return ret;
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ if (!r->cache.io_alloc_capable || s->conf_type == CDP_DATA) {
+ rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+ ret = -ENODEV;
+ goto out_io_alloc;
+ }
+
+ io_alloc_closid = resctrl_io_alloc_closid_get(r, s);
+ if (io_alloc_closid < 0) {
+ rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+ ret = -EINVAL;
+ goto out_io_alloc;
+ }
+
+ if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
+ if (enable) {
+ ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
+ if (ret < 0) {
+ grp_name = rdtgroup_name_by_closid(io_alloc_closid);
+ rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
+ grp_name ? grp_name : "another");
+ ret = -EINVAL;
+ goto out_io_alloc;
+ }
+
+ ret = resctrl_io_alloc_init_cat(r, s, io_alloc_closid);
+ if (ret) {
+ rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
+ resctrl_io_alloc_closid_free(io_alloc_closid);
+ goto out_io_alloc;
+ }
+
+ } else {
+ resctrl_io_alloc_closid_free(io_alloc_closid);
+ }
+
+ ret = resctrl_arch_io_alloc_enable(r, enable);
+ }
+
+out_io_alloc:
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -2029,6 +2194,13 @@ static struct rftype res_common_files[] = {
.seq_show = rdtgroup_schemata_show,
.fflags = RFTYPE_CTRL_BASE,
},
+ {
+ .name = "io_alloc",
+ .mode = 0644,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = resctrl_io_alloc_show,
+ .write = resctrl_io_alloc_write,
+ },
{
.name = "mba_MBps_event",
.mode = 0644,
@@ -2137,6 +2309,15 @@ static void thread_throttle_mode_init(void)
RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
}
+static void io_alloc_init(void)
+{
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+ if (r->cache.io_alloc_capable)
+ resctrl_file_fflags_init("io_alloc",
+ RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+}
+
void resctrl_file_fflags_init(const char *config, unsigned long fflags)
{
struct rftype *rft;
@@ -4381,6 +4562,8 @@ int __init resctrl_init(void)
thread_throttle_mode_init();
+ io_alloc_init();
+
ret = resctrl_mon_resource_init();
if (ret)
return ret;
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 6/8] x86/resctrl: Introduce interface to display io_alloc CBMs
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (4 preceding siblings ...)
2025-04-21 22:43 ` [PATCH v4 5/8] x86/resctrl: Add user interface to enable/disable io_alloc feature Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 7/8] x86/resctrl: Modify rdt_parse_data to pass mode and CLOSID Babu Moger
` (2 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
The io_alloc feature in resctrl enables system software to configure
the portion of the L3 cache allocated for I/O traffic.
Add the interface to display CBMs (Capacity Bit Mask) of io_alloc
feature.
When CDP is enabled, io_alloc routes traffic using the highest CLOSID
of the L3CODE resource. Add a check for the CDP resource type.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Updated the change log.
Added rdtgroup_mutex before rdt_last_cmd_puts().
Returned -ENODEV when resource type is CDP_DATA.
Kept the resource name while printing the CBM (L3:0=ffff) so that
I don't have to change show_doms() just for this feature and it is
consistent across all the schemata displays.
v3: Minor changes due to changes in resctrl_arch_get_io_alloc_enabled()
and resctrl_io_alloc_closid_get().
Added the check to verify CDP resource type.
Updated the commit log.
v2: Fixed to display only on L3 resources.
Added the locks while processing.
Renamed the display file to io_alloc_cbm (from sdciae_cmd).
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 51 ++++++++++++++++++++++-
3 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 0a0ac5f6112e..d1a59b56a456 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -454,7 +454,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
return hw_dom->ctrl_val[idx];
}
-static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
+void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
{
struct rdt_resource *r = schema->res;
struct rdt_ctrl_domain *dom;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6ead222904fe..2ac78650500a 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -519,6 +519,7 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
+void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid);
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d53a2068cde4..5633437ea85d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2042,6 +2042,46 @@ static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
return ret ?: nbytes;
}
+static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ int io_alloc_closid;
+ int ret = 0;
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ if (!r->cache.io_alloc_capable || s->conf_type == CDP_DATA) {
+ rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+ ret = -ENODEV;
+ goto cbm_show_out;
+ }
+
+ if (!resctrl_arch_get_io_alloc_enabled(r)) {
+ rdt_last_cmd_puts("io_alloc feature is not enabled\n");
+ ret = -EINVAL;
+ goto cbm_show_out;
+ }
+
+ io_alloc_closid = resctrl_io_alloc_closid_get(r, s);
+ if (io_alloc_closid < 0) {
+ rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+ ret = -EINVAL;
+ goto cbm_show_out;
+ }
+
+ show_doms(seq, s, io_alloc_closid);
+
+cbm_show_out:
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+ return ret;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -2201,6 +2241,12 @@ static struct rftype res_common_files[] = {
.seq_show = resctrl_io_alloc_show,
.write = resctrl_io_alloc_write,
},
+ {
+ .name = "io_alloc_cbm",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = resctrl_io_alloc_cbm_show,
+ },
{
.name = "mba_MBps_event",
.mode = 0644,
@@ -2313,9 +2359,12 @@ static void io_alloc_init(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
- if (r->cache.io_alloc_capable)
+ if (r->cache.io_alloc_capable) {
resctrl_file_fflags_init("io_alloc",
RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+ resctrl_file_fflags_init("io_alloc_cbm",
+ RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+ }
}
void resctrl_file_fflags_init(const char *config, unsigned long fflags)
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 7/8] x86/resctrl: Modify rdt_parse_data to pass mode and CLOSID
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (5 preceding siblings ...)
2025-04-21 22:43 ` [PATCH v4 6/8] x86/resctrl: Introduce interface to display io_alloc CBMs Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-04-21 22:43 ` [PATCH v4 8/8] x86/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks Babu Moger
2025-05-02 21:20 ` [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Reinette Chatre
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
The functions parse_cbm() and parse_bw() require the mode and CLOSID to
validate the Capacity Bit Mask (CBM). These are currently passed through
struct rdtgroup in struct rdt_parse_data. Instead of passing the whole
struct rdtgroup, pass the mode and CLOSID directly.
This change enables parse_cbm() to be used for verifying the CBM of the
io_alloc feature.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: New patch to call parse_cbm() directly to avoid code duplication.
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 29 ++++++++++-------------
arch/x86/kernel/cpu/resctrl/internal.h | 6 +++++
2 files changed, 19 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index d1a59b56a456..e5d1e77e1995 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -23,11 +23,6 @@
#include "internal.h"
-struct rdt_parse_data {
- struct rdtgroup *rdtgrp;
- char *buf;
-};
-
typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
struct resctrl_schema *s,
struct rdt_ctrl_domain *d);
@@ -77,8 +72,8 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_ctrl_domain *d)
{
struct resctrl_staged_config *cfg;
- u32 closid = data->rdtgrp->closid;
struct rdt_resource *r = s->res;
+ u32 closid = data->closid;
u32 bw_val;
cfg = &d->staged_config[s->conf_type];
@@ -156,9 +151,10 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_ctrl_domain *d)
{
- struct rdtgroup *rdtgrp = data->rdtgrp;
+ enum rdtgrp_mode mode = data->mode;
struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
+ u32 closid = data->closid;
u32 cbm_val;
cfg = &d->staged_config[s->conf_type];
@@ -171,7 +167,7 @@ static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
* Cannot set up more than one pseudo-locked region in a cache
* hierarchy.
*/
- if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
+ if (mode == RDT_MODE_PSEUDO_LOCKSETUP &&
rdtgroup_pseudo_locked_in_hierarchy(d)) {
rdt_last_cmd_puts("Pseudo-locked region in hierarchy\n");
return -EINVAL;
@@ -180,9 +176,9 @@ static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
if (!cbm_validate(data->buf, &cbm_val, r))
return -EINVAL;
- if ((rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
- rdtgrp->mode == RDT_MODE_SHAREABLE) &&
- rdtgroup_cbm_overlaps_pseudo_locked(d, cbm_val)) {
+ if ((mode == RDT_MODE_EXCLUSIVE ||
+ mode == RDT_MODE_SHAREABLE) &&
+ rdtgroup_cbm_overlaps_pseudo_locked(d, cbm_val)) {
rdt_last_cmd_puts("CBM overlaps with pseudo-locked region\n");
return -EINVAL;
}
@@ -191,14 +187,14 @@ static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
* The CBM may not overlap with the CBM of another closid if
* either is exclusive.
*/
- if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, true)) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, closid, true)) {
rdt_last_cmd_puts("Overlaps with exclusive group\n");
return -EINVAL;
}
- if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, false)) {
- if (rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
- rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, closid, false)) {
+ if (mode == RDT_MODE_EXCLUSIVE ||
+ mode == RDT_MODE_PSEUDO_LOCKSETUP) {
rdt_last_cmd_puts("Overlaps with other group\n");
return -EINVAL;
}
@@ -262,7 +258,8 @@ static int parse_line(char *line, struct resctrl_schema *s,
list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
if (d->hdr.id == dom_id) {
data.buf = dom;
- data.rdtgrp = rdtgrp;
+ data.closid = rdtgrp->closid;
+ data.mode = rdtgrp->mode;
if (parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2ac78650500a..92246d2b91c8 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -241,6 +241,12 @@ struct rdtgroup {
struct pseudo_lock_region *plr;
};
+struct rdt_parse_data {
+ u32 closid;
+ enum rdtgrp_mode mode;
+ char *buf;
+};
+
/* rdtgroup.flags */
#define RDT_DELETED 1
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v4 8/8] x86/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (6 preceding siblings ...)
2025-04-21 22:43 ` [PATCH v4 7/8] x86/resctrl: Modify rdt_parse_data to pass mode and CLOSID Babu Moger
@ 2025-04-21 22:43 ` Babu Moger
2025-05-02 21:20 ` [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Reinette Chatre
8 siblings, 0 replies; 20+ messages in thread
From: Babu Moger @ 2025-04-21 22:43 UTC (permalink / raw)
To: tony.luck, reinette.chatre, tglx, mingo, bp, dave.hansen
Cc: babu.moger, corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, thomas.lendacky, mario.limonciello, perry.yuan, seanjc,
kai.huang, xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta,
ak, ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du,
linux-doc, linux-kernel, james.morse, fenghuay, peternewman
"io_alloc" feature is a mechanism that enables direct insertion of data
from I/O devices into the L3 cache. By directly caching data from I/O
devices rather than first storing the I/O data in DRAM, it reduces the
demands on DRAM bandwidth and reduces latency to the processor consuming
the I/O data.
io_alloc feature uses the highest CLOSID to route the traffic from I/O
devices. Provide the interface to modify io_alloc CBMs (Capacity Bit Mask)
when feature is enabled.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v4: Removed resctrl_io_alloc_parse_cbm and called parse_cbm() directly.
v3: Minor changes due to changes in resctrl_arch_get_io_alloc_enabled()
and resctrl_io_alloc_closid_get().
Taken care of handling the CBM update when CDP is enabled.
Updated the commit log to make it generic.
v2: Added more generic text in documentation.
---
Documentation/arch/x86/resctrl.rst | 21 +++++
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 +-
arch/x86/kernel/cpu/resctrl/internal.h | 2 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 108 +++++++++++++++++++++-
4 files changed, 132 insertions(+), 3 deletions(-)
diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 7672c5c52c1a..6fdea77a1675 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -169,6 +169,27 @@ related to allocation:
When CDP is enabled, io_alloc routes I/O traffic using the highest
CLOSID allocated for the instruction cache (L3CODE).
+"io_alloc_cbm":
+ Capacity Bit Masks (CBMs) available to supported I/O devices, which
+ can insert cache lines directly into the L3 cache to help reduce
+ latency. The CBM can be configured by writing to the interface in the
+ following format::
+
+ <resource_name>:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
+ L3:0=ffff;1=ffff
+
+ # echo L3:1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
+ # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
+ L3:0=ffff;1=00ff
+
+ When CDP is enabled, L3 control is divided into two separate resources:
+ L3CODE and L3DATA. However, the CBM can only be updated on the L3CODE
+ resource.
+
Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index e5d1e77e1995..315584415cc4 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -148,8 +148,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d)
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d)
{
enum rdtgrp_mode mode = data->mode;
struct resctrl_staged_config *cfg;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 92246d2b91c8..1d3b60741a39 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -526,6 +526,8 @@ void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid);
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d);
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5633437ea85d..73532c363e57 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2082,6 +2082,111 @@ static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
return ret;
}
+static int resctrl_io_alloc_parse_line(char *line, struct rdt_resource *r,
+ struct resctrl_schema *s, u32 closid)
+{
+ struct rdt_parse_data data;
+ struct rdt_ctrl_domain *d;
+ char *dom = NULL, *id;
+ unsigned long dom_id;
+
+next:
+ if (!line || line[0] == '\0')
+ return 0;
+
+ dom = strsep(&line, ";");
+ id = strsep(&dom, "=");
+ if (!dom || kstrtoul(id, 10, &dom_id)) {
+ rdt_last_cmd_puts("Missing '=' or non-numeric domain\n");
+ return -EINVAL;
+ }
+
+ dom = strim(dom);
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
+ if (d->hdr.id == dom_id) {
+ data.buf = dom;
+ data.mode = RDT_MODE_SHAREABLE;
+ data.closid = closid;
+ if (parse_cbm(&data, s, d))
+ return -EINVAL;
+ goto next;
+ }
+ }
+ return -EINVAL;
+}
+
+static ssize_t resctrl_io_alloc_cbm_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes, loff_t off)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ int io_alloc_closid;
+ char *resname;
+ int ret = 0;
+
+ /* Valid input requires a trailing newline */
+ if (nbytes == 0 || buf[nbytes - 1] != '\n')
+ return -EINVAL;
+
+ buf[nbytes - 1] = '\0';
+
+ if (!r->cache.io_alloc_capable || s->conf_type == CDP_DATA) {
+ rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+ return -EINVAL;
+ }
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+ rdt_staged_configs_clear();
+
+ if (!resctrl_arch_get_io_alloc_enabled(r)) {
+ rdt_last_cmd_puts("io_alloc feature is not enabled\n");
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ resname = strim(strsep(&buf, ":"));
+ if (!buf) {
+ rdt_last_cmd_puts("Missing ':'\n");
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ if (strcmp(resname, s->name)) {
+ rdt_last_cmd_printf("Unsupported resource name '%s'\n", resname);
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ if (buf[0] == '\0') {
+ rdt_last_cmd_printf("Missing '%s' value\n", resname);
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ io_alloc_closid = resctrl_io_alloc_closid_get(r, s);
+ if (io_alloc_closid < 0) {
+ rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ ret = resctrl_io_alloc_parse_line(buf, r, s, io_alloc_closid);
+ if (ret)
+ goto cbm_write_out;
+
+ ret = resctrl_arch_update_domains(r, io_alloc_closid);
+
+cbm_write_out:
+ rdt_staged_configs_clear();
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -2243,9 +2348,10 @@ static struct rftype res_common_files[] = {
},
{
.name = "io_alloc_cbm",
- .mode = 0444,
+ .mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = resctrl_io_alloc_cbm_show,
+ .write = resctrl_io_alloc_cbm_write,
},
{
.name = "mba_MBps_event",
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (7 preceding siblings ...)
2025-04-21 22:43 ` [PATCH v4 8/8] x86/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks Babu Moger
@ 2025-05-02 21:20 ` Reinette Chatre
2025-05-03 0:53 ` Moger, Babu
8 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2025-05-02 21:20 UTC (permalink / raw)
To: Babu Moger, tony.luck, tglx, mingo, bp, dave.hansen
Cc: corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh,
thomas.lendacky, mario.limonciello, perry.yuan, seanjc, kai.huang,
xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta, ak,
ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du, linux-doc,
linux-kernel, james.morse, fenghuay, peternewman
Hi Babu,
On 4/21/25 3:43 PM, Babu Moger wrote:
> # Linux Implementation
>
> This feature adds the following interface files when the resctrl "io_alloc" feature is
> supported on the L3 resource:
>
> /sys/fs/resctrl/info/L3/io_alloc: Report the feature status. Enable/disable the
> feature by writing to the interface.
>
> /sys/fs/resctrl/info/L3/io_alloc_cbm: List the Capacity Bit Masks (CBMs) available
> for I/O devices when the io_alloc feature is enabled.
> Configure the CBM by writing to the interface.
>
> # Examples:
>
> a. Check if the io_alloc feature is available
> # mount -t resctrl resctrl /sys/fs/resctrl/
>
> # cat /sys/fs/resctrl/info/L3/io_alloc
> disabled
>
> b. Enable the io_alloc feature.
>
> # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
> # cat /sys/fs/resctrl/info/L3/io_alloc
> enabled
>
> c. Check the CBM values for the io_alloc feature.
>
> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
> L3:0=ffff;1=ffff
>
> d. Change the CBM value for domain 1:
> # echo L3:1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
>
> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
> L3:0=ffff;1=00ff
>
> d. Disable io_alloc feature and exit.
>
> # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
> # cat /sys/fs/resctrl/info/L3/io_alloc
> disabled
>
> #umount /sys/fs/resctrl/
>
From what I can tell the interface when CDP is enabled will look
as follows:
# mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
# cat /sys/fs/resctrl/info/L3CODE/io_alloc
disabled
# cat /sys/fs/resctrl/info/L3DATA/io_alloc
not supported
"io_alloc" can thus be enabled for L3CODE but not for L3DATA.
This is unexpected considering the feature is called
"L3 Smart *Data* Cache Injection Allocation Enforcement".
I understand that the interface evolved into this because the
"code" allocation of CDP uses the CLOSID required by SDCIAE but I think
leaking implementation details like this to the user interface can
cause confusion.
Since there is no distinction between code and data in these
IO allocations, what do you think of connecting the io_alloc and
io_alloc_cbm files within L3CODE and L3DATA so that the user can
read or write either file, with reads showing the same data and
writes through either taking effect? For example,
# mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
# cat /sys/fs/resctrl/info/L3CODE/io_alloc
disabled
# cat /sys/fs/resctrl/info/L3DATA/io_alloc
disabled
# echo 1 > /sys/fs/resctrl/info/L3CODE/io_alloc
# cat /sys/fs/resctrl/info/L3CODE/io_alloc
enabled
# cat /sys/fs/resctrl/info/L3DATA/io_alloc
enabled
# cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
0=ffff;1=ffff
# cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
0=ffff;1=ffff
# echo 1=FF > /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
# cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
0=ffff;1=00ff
# cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
0=ffff;1=00ff
(Note in above I removed the resource name from io_alloc_cbm to match
what was discussed during previous version:
https://lore.kernel.org/lkml/251c8fe1-603f-4993-a822-afb35b49cdfa@amd.com/ )
What do you think?
> ---
> v4: The "io_alloc" interface will report "enabled/disabled/not supported"
> instead of 0 or 1..
>
> Updated resctrl_io_alloc_closid_get() to verify the max closid availability
> using closids_supported().
>
> Updated the documentation for "shareable_bits" and "bit_usage".
>
> NOTE: io_alloc is about specific CLOS. rdt_bit_usage_show() is not designed
> handle bit_usage for specific CLOS. Its about overall system. So, we cannot
> really tell the user which CLOS is shared across both hardware and software.
"bit_usage" is not about CLOS but how the resource is used. Per the doc:
"bit_usage":
Annotated capacity bitmasks showing how all
instances of the resource are used.
The key here is the CBM, not CLOS. For each bit in the *CBM* "bit_usage" shows
how that portion of the cache is used with the legend documented in
Documentation/arch/x86/resctrl.rst.
Consider a system with the following allocations:
# cat /sys/fs/resctrl/schemata
L3:0=0ff0
# cat /sys/fs/resctrl/info/L3/io_alloc_cbm
0=ff00
Then "bit_usage" will look like:
# cat /sys/fs/resctrl/info/L3/bit_usage
0=HHHHXXXXSSSS0000
"bit_usage" shows how the cache is being used. It shows that the portion of cache represented
by first four bits of CBM is unused, portion of cache represented by bits 4 to 7 of CBM is
only used by software, portion of cache represented by bits 8 to 11 of CBM is shared between
software and hardware, portion of cache represented by bits 12 to 15 is only used by hardware.
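As a hedged aside, the annotation for this example can be reproduced from just the
two masks; a minimal user-space C sketch (not kernel code; the mask values are the
CBMs from the example above):

#include <stdio.h>

int main(void)
{
	unsigned int sw = 0x0ff0;	/* schemata CBM: software use */
	unsigned int hw = 0xff00;	/* io_alloc_cbm: hardware (IO) use */
	int cbm_len = 16;
	int i;

	for (i = cbm_len - 1; i >= 0; i--) {
		int s = (sw >> i) & 1;
		int h = (hw >> i) & 1;

		/* X: shared, H: hardware only, S: software only, 0: unused */
		putchar(h && s ? 'X' : h ? 'H' : s ? 'S' : '0');
	}
	putchar('\n');	/* prints HHHHXXXXSSSS0000 */
	return 0;
}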
> This is something we need to discuss.
Looking at the implementation in patch #5, the "io_alloc_cbm" bits of the CBM are presented
as software bits; since "io_alloc_cbm" represents IO from devices, shouldn't they be "hardware"
bits (hw_shareable)?
Reinette
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-02 21:20 ` [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Reinette Chatre
@ 2025-05-03 0:53 ` Moger, Babu
2025-05-05 16:22 ` Reinette Chatre
0 siblings, 1 reply; 20+ messages in thread
From: Moger, Babu @ 2025-05-03 0:53 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, tony.luck, tglx, mingo, bp,
dave.hansen
Cc: corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh,
thomas.lendacky, mario.limonciello, perry.yuan, seanjc, kai.huang,
xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta, ak,
ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du, linux-doc,
linux-kernel, james.morse, fenghuay, peternewman
Hi Reinette,
Thanks for quick turnaround.
On 5/2/2025 4:20 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/21/25 3:43 PM, Babu Moger wrote:
>> # Linux Implementation
>>
>> Feature adds following interface files when the resctrl "io_alloc" feature is
>> supported on L3 resource:
>>
>> /sys/fs/resctrl/info/L3/io_alloc: Report the feature status. Enable/disable the
>> feature by writing to the interface.
>>
>> /sys/fs/resctrl/info/L3/io_alloc_cbm: List the Capacity Bit Masks (CBMs) available
>> for I/O devices when io_alloc feature is enabled.
>> Configure the CBM by writing to the interface.
>>
>> # Examples:
>>
>> a. Check if io_alloc feature is available
>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>
>> # cat /sys/fs/resctrl/info/L3/io_alloc
>> disabled
>>
>> b. Enable the io_alloc feature.
>>
>> # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>> # cat /sys/fs/resctrl/info/L3/io_alloc
>> enabled
>>
>> c. Check the CBM values for the io_alloc feature.
>>
>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>> L3:0=ffff;1=ffff
>>
>> d. Change the CBM value for the domain 1:
>> # echo L3:1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
>>
>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>> L3:0=ffff;1=00ff
>>
>> d. Disable io_alloc feature and exit.
>>
>> # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
>> # cat /sys/fs/resctrl/info/L3/io_alloc
>> disabled
>>
>> #umount /sys/fs/resctrl/
>>
>
> From what I can tell the interface when CDP is enabled will look
> as follows:
>
> # mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
> disabled
> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
> not supported
>
> "io_alloc" can thus be enabled for L3CODE but not for L3DATA.
> This is unexpected considering the feature is called
> "L3 Smart *Data* Cache Injection Allocation Enforcement".
>
> I understand that the interface evolved into this because the
> "code" allocation of CDP uses the CLOSID required by SDCIAE but I think
> leaking implementation details like this to the user interface can
> cause confusion.
>
> Since there is no distinction between code and data in these
> IO allocations, what do you think of connecting the io_alloc and
> io_alloc_cbm files within L3CODE and L3DATA so that the user can
> read/write from either with a read showing the same data and
> user able to write to either? For example,
>
> # mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
> disabled
> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
> disabled
> # echo 1 > /sys/fs/resctrl/info/L3CODE/io_alloc
> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
> enabled
> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
> enabled
> # cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
> 0=ffff;1=ffff
> # cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
> 0=ffff;1=ffff
> # echo 1=FF > /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
> # cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
> 0=ffff;1=00ff
> # cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
> 0=ffff;1=00ff
I agree. There is no right or wrong here. It can be done the way
you mentioned above. But I am not sure if it will clear the confusion.
We have already added this text to the user documentation (the spec says the same):
"On AMD systems, the io_alloc feature is supported by the L3 Smart
Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
io_alloc is determined by the highest CLOSID supported by the resource.
When CDP is enabled, io_alloc routes I/O traffic using the highest
CLOSID allocated for the instruction cache (L3CODE)."
Don't you think this text might clear the confusion? We can also add
examples if that makes it even clearer.
>
> (Note in above I removed the resource name from io_alloc_cbm to match
> what was discussed during previous version:
> https://lore.kernel.org/lkml/251c8fe1-603f-4993-a822-afb35b49cdfa@amd.com/ )
> What do you think?
Yes. I remember. I kept the resource name while printing the CBM for
io_alloc so we don't have to change show_doms() just for this feature,
and it stays consistent with the rest of the schemata display.
I added a note about it here:
https://lore.kernel.org/lkml/784fbc61e02e9a834473c3476ee196ef6a44e338.1745275431.git.babu.moger@amd.com/
I will change it if you feel strongly about it. We will have to change
show_doms() to handle this.
>
>
>> ---
>> v4: The "io_alloc" interface will report "enabled/disabled/not supported"
>> instead of 0 or 1..
>>
>> Updated resctrl_io_alloc_closid_get() to verify the max closid availability
>> using closids_supported().
>>
>> Updated the documentation for "shareable_bits" and "bit_usage".
>>
>> NOTE: io_alloc is about specific CLOS. rdt_bit_usage_show() is not designed
>> handle bit_usage for specific CLOS. Its about overall system. So, we cannot
>> really tell the user which CLOS is shared across both hardware and software.
>
> "bit_usage" is not about CLOS but how the resource is used. Per the doc:
>
> "bit_usage":
> Annotated capacity bitmasks showing how all
> instances of the resource are used.
>
> The key here is the CBM, not CLOS. For each bit in the *CBM* "bit_usage" shows
> how that portion of the cache is used with the legend documented in
> Documentation/arch/x86/resctrl.rst.
>
> Consider a system with the following allocations:
> # cat /sys/fs/resctrl/schemata
> L3:0=0ff0
This is CLOS 0.
> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
> 0=ff00
This is CLOS 15.
>
> Then "bit_usage" will look like:
>
> # cat /sys/fs/resctrl/info/L3/bit_usage
> 0=HHHHXXXXSSSS0000
It is confusing here. To make it clear, we may have to print all the
CLOSes in each domain.
# cat /sys/fs/resctrl/info/L3/bit_usage
DOM0=CLOS0:SSSSSSSSSSSSSSSS;... ;CLOS15=HHHHXXXXSSSS0000;
DOM1=CLOS0:SSSSSSSSSSSSSSSS;... ;CLOS15=HHHHXXXXSSSS0000
>
> "bit_usage" shows how the cache is being used. It shows that the portion of cache represented
> by first four bits of CBM is unused, portion of cache represented by bits 4 to 7 of CBM is
> only used by software, portion of cache represented by bits 8 to 11 of CBM is shared between
> software and hardware, portion of cache represented by bits 12 to 15 is only used by hardware.
>
>> This is something we need to discuss.
>
> Looking at implementation in patch #5 the "io_alloc_cbm" bits of CBM are presented
> as software bits, since "io_alloc_cbm" represents IO from devices it should be "hardware" bits
> (hw_shareable), no?
>
Yes, it is. But the logic is a bit different there.
It loops through all the CLOSes on the domain. So, it will again print
something like this:
# cat bit_usage
0=HHHHXXXXSSSS0000
It tells the user that all the CLOSes in domain 0 have this sharing
property, which is not correct.
To make it clear we really need to print every CLOS here. What do you think?
Thanks
Babu
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-03 0:53 ` Moger, Babu
@ 2025-05-05 16:22 ` Reinette Chatre
2025-05-05 17:01 ` Luck, Tony
2025-05-05 19:54 ` Moger, Babu
0 siblings, 2 replies; 20+ messages in thread
From: Reinette Chatre @ 2025-05-05 16:22 UTC (permalink / raw)
To: Moger, Babu, Babu Moger, tony.luck, tglx, mingo, bp, dave.hansen
Cc: corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh,
thomas.lendacky, mario.limonciello, perry.yuan, seanjc, kai.huang,
xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta, ak,
ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du, linux-doc,
linux-kernel, james.morse, fenghuay, peternewman
Hi Babu,
On 5/2/25 5:53 PM, Moger, Babu wrote:
> Hi Reinette,
>
> Thanks for quick turnaround.
>
> On 5/2/2025 4:20 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 4/21/25 3:43 PM, Babu Moger wrote:
>>> # Linux Implementation
>>>
>>> Feature adds following interface files when the resctrl "io_alloc" feature is
>>> supported on L3 resource:
>>>
>>> /sys/fs/resctrl/info/L3/io_alloc: Report the feature status. Enable/disable the
>>> feature by writing to the interface.
>>>
>>> /sys/fs/resctrl/info/L3/io_alloc_cbm: List the Capacity Bit Masks (CBMs) available
>>> for I/O devices when io_alloc feature is enabled.
>>> Configure the CBM by writing to the interface.
>>>
>>> # Examples:
>>>
>>> a. Check if io_alloc feature is available
>>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>>
>>> # cat /sys/fs/resctrl/info/L3/io_alloc
>>> disabled
>>>
>>> b. Enable the io_alloc feature.
>>>
>>> # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>>> # cat /sys/fs/resctrl/info/L3/io_alloc
>>> enabled
>>>
>>> c. Check the CBM values for the io_alloc feature.
>>>
>>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>>> L3:0=ffff;1=ffff
>>>
>>> d. Change the CBM value for the domain 1:
>>> # echo L3:1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
>>>
>>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>>> L3:0=ffff;1=00ff
>>>
>>> d. Disable io_alloc feature and exit.
>>>
>>> # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
>>> # cat /sys/fs/resctrl/info/L3/io_alloc
>>> disabled
>>>
>>> #umount /sys/fs/resctrl/
>>>
>>
>> From what I can tell the interface when CDP is enabled will look
>> as follows:
>>
>> # mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
>> disabled
>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
>> not supported
>> "io_alloc" can thus be enabled for L3CODE but not for L3DATA.
>> This is unexpected considering the feature is called
>> "L3 Smart *Data* Cache Injection Allocation Enforcement".
>>
>> I understand that the interface evolved into this because the
>> "code" allocation of CDP uses the CLOSID required by SDCIAE but I think
>> leaking implementation details like this to the user interface can
>> cause confusion.
>>
>> Since there is no distinction between code and data in these
>> IO allocations, what do you think of connecting the io_alloc and
>> io_alloc_cbm files within L3CODE and L3DATA so that the user can
>> read/write from either with a read showing the same data and
>> user able to write to either? For example,
>>
>> # mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
>> disabled
>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
>> disabled
>> # echo 1 > /sys/fs/resctrl/info/L3CODE/io_alloc
>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
>> enabled
>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
>> enabled
>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
>> 0=ffff;1=ffff
>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
>> 0=ffff;1=ffff
>> # echo 1=FF > /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
>> 0=ffff;1=00ff
>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
>> 0=ffff;1=00ff
>
> I agree. There is no right or wrong here. It can be done this way like you mentioned above. But I am not sure if will clear the confusion.
>
> We have already added the text in user doc (also spec says the same).
>
> "On AMD systems, the io_alloc feature is supported by the L3 Smart
> Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
> io_alloc is determined by the highest CLOSID supported by the resource.
> When CDP is enabled, io_alloc routes I/O traffic using the highest
> CLOSID allocated for the instruction cache (L3CODE).
>
> Dont you think this text might clear the confusion? We can add examples also if that makes it even more clear.
The user interface is not intended to be a mirror of the hardware interface.
If it was, doing so is becoming increasingly difficult with multiple
architectures with different hardware interfaces needing to use the same
user interface for control. Remember, there are no "CLOSID" in MPAM and
I do not know details of what RISC-V brings.
We should aim to have something as generic as possible that makes sense
for user space. All the hardware interface details should be hidden as much
as possible from user interface. When we expose the hardware interface details
it becomes very difficult to support new use cases.
The only aspect of "closids" that has been exposed to user space thus far
is the "num_closids" and in user documentation a CLOSid has been linked to the
number of control groups. That is the only constraint we need to think about
here. I have repeatedly asked for IO alloc connection with CLOSIDs to not be exposed
to user space (yet user documentation and messages to user space keep doing so
in this series). Support for IO alloc in this way is unique to AMD. We do not want
resctrl to be constrained like this if another architecture needs to support
some form of IO alloc and does so in a different way.
I understand that IO alloc backed by CLOSID is forming part of resctrl fs in this
implementation and that is ok for now. As long as we do not leak this to user space
it gives us flexibility to change resctrl fs when/if we learn of different architectures'
needs later.
>> (Note in above I removed the resource name from io_alloc_cbm to match
>> what was discussed during previous version:
>> https://lore.kernel.org/lkml/251c8fe1-603f-4993-a822-afb35b49cdfa@amd.com/ )
>> What do you think?
>
> Yes. I remember. "Kept the resource name while printing the CBM for io_alloc, so we dont have to change show_doms() just for this feature and it is consistant across all the schemata display.
It almost sounds like you do not want to implement something because the
code to support it does not exist?
>
> I added the note in here.
> https://lore.kernel.org/lkml/784fbc61e02e9a834473c3476ee196ef6a44e338.1745275431.git.babu.moger@amd.com/
You mention "I dont have to change show_doms() just for this feature and it is
consistant across all the schemata display."
I am indeed seeing a pattern where one goal is to add changes by changing minimum
amount of code. Please let this not be a goal but instead make it a goal to integrate
changes into resctrl appropriately, not just pasted on top.
When it comes to the schemata display then it makes sense to add the resource name since
the schemata file is within a resource group containing multiple resources and the schemata
file thus needs to identify resources. Compare this to, for example, the "bit_usage" file
that is unique to a resource and thus no need to identify the resource.
>
> I will change it if you feel strongly about it. We will have to change show_doms() to handle this.
What is the problem with changing show_doms()?
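(For illustration, one hedged way to do it: give show_doms() an optional
resource-name argument so the io_alloc_cbm path can suppress the prefix. The extra
parameter and the NULL convention are assumptions for this sketch, not code from
the series:)

static int show_doms(struct seq_file *s, struct resctrl_schema *schema,
		     const char *name, int closid)
{
	/* Print the resource-name prefix only when the caller provides one. */
	if (name)
		seq_printf(s, "%*s:", max_name_width, name);
	/* ... existing per-domain CBM printing unchanged ... */
}

The schemata path would pass schema->name as before, while the io_alloc_cbm path
would pass NULL to drop the prefix.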
>
>>
>>
>>> ---
>>> v4: The "io_alloc" interface will report "enabled/disabled/not supported"
>>> instead of 0 or 1..
>>>
>>> Updated resctrl_io_alloc_closid_get() to verify the max closid availability
>>> using closids_supported().
>>>
>>> Updated the documentation for "shareable_bits" and "bit_usage".
>>>
>>> NOTE: io_alloc is about specific CLOS. rdt_bit_usage_show() is not designed
>>> handle bit_usage for specific CLOS. Its about overall system. So, we cannot
>>> really tell the user which CLOS is shared across both hardware and software.
>>
>> "bit_usage" is not about CLOS but how the resource is used. Per the doc:
>>
>> "bit_usage":
>> Annotated capacity bitmasks showing how all
>> instances of the resource are used.
>>
>> The key here is the CBM, not CLOS. For each bit in the *CBM* "bit_usage" shows
>> how that portion of the cache is used with the legend documented in
>> Documentation/arch/x86/resctrl.rst.
>>
>> Consider a system with the following allocations:
>> # cat /sys/fs/resctrl/schemata
>> L3:0=0ff0
>
> This is CLOS 0.
>
>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>> 0=ff00
>
> This is CLOS 15.
>
>>
>> Then "bit_usage" will look like:
>>
>> # cat /sys/fs/resctrl/info/L3/bit_usage
>> 0=HHHHXXXXSSSS0000
>
> It is confusing here. To make it clear we may have to print all the CLOSes in each domain.
Could you please elaborate how this is confusing?
>
> # cat /sys/fs/resctrl/info/L3/bit_usage
> DOM0=CLOS0:SSSSSSSSSSSSSSSS;... ;CLOS15=HHHHXXXXSSSS0000;
> DOM1=CLOS0:SSSSSSSSSSSSSSSS;... ;CLOS15=HHHHXXXXSSSS0000
Please no. Not only does this change the existing user interface, it also breaks the goal of
"bit_usage".
Please think of it from user perspective. If user wants to know, for example, "how is my
L3 cache allocated" then the "bit_usage" file provides that summary.
>> "bit_usage" shows how the cache is being used. It shows that the portion of cache represented
>> by first four bits of CBM is unused, portion of cache represented by bits 4 to 7 of CBM is
>> only used by software, portion of cache represented by bits 8 to 11 of CBM is shared between
>> software and hardware, portion of cache represented by bits 12 to 15 is only used by hardware.
>>
>>> This is something we need to discuss.
>>
>> Looking at implementation in patch #5 the "io_alloc_cbm" bits of CBM are presented
>> as software bits, since "io_alloc_cbm" represents IO from devices it should be "hardware" bits
>> (hw_shareable), no?
>>
> Yes. It is. But logic is bit different there.
>
> It loops thru all the CLOSes on the domain. So, it will print again like this below.
This is what the current code does, but the code can be changed, no? For example, rdt_bit_usage_show()
does not need to treat the IO allocation like all the other resource groups but can instead handle it
separately. Below is some pseudocode that presents the idea, untested, not compiled.
hw_shareable = r->cache.shareable_bits;
for (i = 0; i < closids_supported(); i++) {
if (!closid_allocated(i) ||
(resctrl_arch_get_io_alloc_enabled(r) && i == resctrl_io_alloc_closid_get(r, s)))
continue;
/* Initialize sw_shareable and exclusive */
}
if (resctrl_arch_get_io_alloc_enabled(r)) {
/*
* Sidenote: I do not think schemata parameter is needed for
* resctrl_io_alloc_closid_get()
*/
io_alloc_closid = resctrl_io_alloc_closid_get(r, s);
if (resctrl_arch_get_cdp_enabled(r->rid))
ctrl_val = resctrl_arch_get_config(r, dom, io_alloc_closid, CDP_CODE);
else
ctrl_val = resctrl_arch_get_config(r, dom, io_alloc_closid, CDP_NONE);
hw_shareable |= ctrl_val;
}
for (i = r->cache.cbm_len - 1; i >= 0; i--) {
/* Write annotated bitmask to user space */
}
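(For completeness, a hedged sketch of the elided annotation loop, modeled on the
legend handling in the existing rdt_bit_usage_show(); the mask names assume the
values accumulated above, and the 'E' case only matters when exclusive groups
exist:)

for (i = r->cache.cbm_len - 1; i >= 0; i--) {
	hwb = test_bit(i, &hw_shareable);
	swb = test_bit(i, &sw_shareable);
	if (hwb && swb)
		seq_putc(seq, 'X');	/* shared by hardware and software */
	else if (hwb)
		seq_putc(seq, 'H');	/* hardware (IO) only */
	else if (swb)
		seq_putc(seq, 'S');	/* software only */
	else if (test_bit(i, &exclusive))
		seq_putc(seq, 'E');	/* exclusive resource group */
	else
		seq_putc(seq, '0');	/* unused */
}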
>
> #cat bit_usage
> 0=HHHHXXXXSSSS0000
>
> It tells the user that all the CLOSes in domain 0 has this sharing propery which is not correct.
>
> To make it clear we really need to print every CLOS here. What do you think?
No. We cannot just change the user space API like this. The way I see it, the implementation can
support the existing user space API. I am sure the above can be improved, but it presents an idea
that we can use as a starting point.
Reinette
* RE: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 16:22 ` Reinette Chatre
@ 2025-05-05 17:01 ` Luck, Tony
2025-05-05 17:14 ` Reinette Chatre
2025-05-05 19:54 ` Moger, Babu
1 sibling, 1 reply; 20+ messages in thread
From: Luck, Tony @ 2025-05-05 17:01 UTC (permalink / raw)
To: Chatre, Reinette, Moger, Babu, Babu Moger, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: corbet@lwn.net, x86@kernel.org, hpa@zytor.com,
akpm@linux-foundation.org, paulmck@kernel.org,
rostedt@goodmis.org, thuth@redhat.com, ardb@kernel.org,
gregkh@linuxfoundation.org, thomas.lendacky@amd.com,
mario.limonciello@amd.com, perry.yuan@amd.com, seanjc@google.com,
Huang, Kai, Li, Xiaoyao, kan.liang@linux.intel.com,
riel@surriel.com, Li, Xin3, xin@zytor.com, Mehta, Sohil,
ak@linux.intel.com, ebiggers@google.com,
andrew.cooper3@citrix.com, gautham.shenoy@amd.com,
Xiaojian.Du@amd.com, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, james.morse@arm.com,
fenghuay@nvidia.com, peternewman@google.com
> The only aspect of "closids" that has been exposed to user space thus far
> is the "num_closids" and in user documentation a CLOSid has been linked to the
> number of control groups. That is the only constraint we need to think about
> here. I have repeatedly asked for IO alloc connection with CLOSIDs to not be exposed
> to user space (yet user documentation and messages to user space keeps doing so
> in this series). Support for IO alloc in this way is unique to AMD. We do not want
> resctrl to be constrained like this if another architecture needs to support
> some form of IO alloc and does so in a different way.
This isn't unique to AMD. Intel also ties CLOSid to control features associated with
I/O (likewise with RMIDs for monitoring).
See the Intel RDT architecture specification[1] chapter 4.4:
" Non-CPU agent RDT uses the RMID and CLOS tags in the same way that they are used for CPU agents."
-Tony
[1] https://cdrdv2.intel.com/v1/dl/getContent/789566
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 17:01 ` Luck, Tony
@ 2025-05-05 17:14 ` Reinette Chatre
2025-05-05 17:27 ` Luck, Tony
0 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2025-05-05 17:14 UTC (permalink / raw)
To: Luck, Tony, Moger, Babu, Babu Moger, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: corbet@lwn.net, x86@kernel.org, hpa@zytor.com,
akpm@linux-foundation.org, paulmck@kernel.org,
rostedt@goodmis.org, thuth@redhat.com, ardb@kernel.org,
gregkh@linuxfoundation.org, thomas.lendacky@amd.com,
mario.limonciello@amd.com, perry.yuan@amd.com, seanjc@google.com,
Huang, Kai, Li, Xiaoyao, kan.liang@linux.intel.com,
riel@surriel.com, Li, Xin3, xin@zytor.com, Mehta, Sohil,
ak@linux.intel.com, ebiggers@google.com,
andrew.cooper3@citrix.com, gautham.shenoy@amd.com,
Xiaojian.Du@amd.com, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, james.morse@arm.com,
fenghuay@nvidia.com, peternewman@google.com
Hi Tony,
On 5/5/25 10:01 AM, Luck, Tony wrote:
>> The only aspect of "closids" that has been exposed to user space thus far
>> is the "num_closids" and in user documentation a CLOSid has been linked to the
>> number of control groups. That is the only constraint we need to think about
>> here. I have repeatedly asked for IO alloc connection with CLOSIDs to not be exposed
>> to user space (yet user documentation and messages to user space keeps doing so
>> in this series). Support for IO alloc in this way is unique to AMD. We do not want
>> resctrl to be constrained like this if another architecture needs to support
>> some form of IO alloc and does so in a different way.
>
> This isn't unique to AMD. Intel also ties CLOSid to control features associated with
> I/O (likewise with RMIDs for monitoring).
>
> See the Intel RDT architecture specification[1] chapter 4.4:
>
> " Non-CPU agent RDT uses the RMID and CLOS tags in the same way that they are used for CPU agents."
As I understand it, AMD uses a single specific CLOS (the highest CLOSid supported by L3)
that is then reserved for IO allocation. While both Intel and AMD technically
"use CLOSids", it is done differently, no?
Specifically, is this documentation introduced in patch #5 accurate for Intel?
+ The feature routes the I/O traffic via specific CLOSID reserved
+ for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
+ for the CLOSID, users can control the L3 portions available for
+ I/0 traffic. The reserved CLOSID will be excluded for group creation.
Reinette
* RE: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 17:14 ` Reinette Chatre
@ 2025-05-05 17:27 ` Luck, Tony
2025-05-05 17:39 ` Reinette Chatre
0 siblings, 1 reply; 20+ messages in thread
From: Luck, Tony @ 2025-05-05 17:27 UTC (permalink / raw)
To: Chatre, Reinette, Moger, Babu, Babu Moger, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: corbet@lwn.net, x86@kernel.org, hpa@zytor.com,
akpm@linux-foundation.org, paulmck@kernel.org,
rostedt@goodmis.org, thuth@redhat.com, ardb@kernel.org,
gregkh@linuxfoundation.org, thomas.lendacky@amd.com,
mario.limonciello@amd.com, perry.yuan@amd.com, seanjc@google.com,
Huang, Kai, Li, Xiaoyao, kan.liang@linux.intel.com,
riel@surriel.com, Li, Xin3, xin@zytor.com, Mehta, Sohil,
ak@linux.intel.com, ebiggers@google.com,
andrew.cooper3@citrix.com, gautham.shenoy@amd.com,
Xiaojian.Du@amd.com, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, james.morse@arm.com,
fenghuay@nvidia.com, peternewman@google.com
> > " Non-CPU agent RDT uses the RMID and CLOS tags in the same way that they are used for CPU agents."
>
> As I understand AMD uses a single specific (the highest CLOSid supported by L3)
> CLOS that is then reserved for IO allocation. While both Intel and AMD technically
> "uses CLOSid", it is done differently, no?
>
> Specifically, is this documentation introduced in patch #5 accurate for Intel?
> + The feature routes the I/O traffic via specific CLOSID reserved
> + for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
> + for the CLOSID, users can control the L3 portions available for
> + I/0 traffic. The reserved CLOSID will be excluded for group creation.
No. Intel doesn't reserve a single CLOS. It allows assigning RMIDs and CLOSids
for I/O monitoring and control. Different IDs can be assigned to different groups
of devices (the "grouping" depends on h/w routing to devices and is not
assignable by the OS).
I had some patches for this in my abandoned "resctrl2" implementation. No
immediate plans to resurrect them since it became clear that the h/w implementation
was model specific for just one generation.
-Tony
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 17:27 ` Luck, Tony
@ 2025-05-05 17:39 ` Reinette Chatre
2025-05-05 17:50 ` Luck, Tony
0 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2025-05-05 17:39 UTC (permalink / raw)
To: Luck, Tony, Moger, Babu, Babu Moger, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: corbet@lwn.net, x86@kernel.org, hpa@zytor.com,
akpm@linux-foundation.org, paulmck@kernel.org,
rostedt@goodmis.org, thuth@redhat.com, ardb@kernel.org,
gregkh@linuxfoundation.org, thomas.lendacky@amd.com,
mario.limonciello@amd.com, perry.yuan@amd.com, seanjc@google.com,
Huang, Kai, Li, Xiaoyao, kan.liang@linux.intel.com,
riel@surriel.com, Li, Xin3, xin@zytor.com, Mehta, Sohil,
ak@linux.intel.com, ebiggers@google.com,
andrew.cooper3@citrix.com, gautham.shenoy@amd.com,
Xiaojian.Du@amd.com, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, james.morse@arm.com,
fenghuay@nvidia.com, peternewman@google.com
Hi Tony,
On 5/5/25 10:27 AM, Luck, Tony wrote:
>>> " Non-CPU agent RDT uses the RMID and CLOS tags in the same way that they are used for CPU agents."
>>
>> As I understand AMD uses a single specific (the highest CLOSid supported by L3)
>> CLOS that is then reserved for IO allocation. While both Intel and AMD technically
>> "uses CLOSid", it is done differently, no?
>>
>> Specifically, is this documentation introduced in patch #5 accurate for Intel?
>> + The feature routes the I/O traffic via specific CLOSID reserved
>> + for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
>> + for the CLOSID, users can control the L3 portions available for
>> + I/0 traffic. The reserved CLOSID will be excluded for group creation.
>
> No. Intel doesn't reserve a single CLOS. It allows to assign RMIDs and CLOSids
> for I/O monitoring and control. Different IDs can be assigned to different groups
> of devices (the "grouping" is dependent on h/w routing to devices, not
> assignable by the OS).
How does this work with CDP on Intel? Can CDP be enabled for CPU agents while the
"code" and "data" CLOSids be used for I/O control?
Reinette
* RE: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 17:39 ` Reinette Chatre
@ 2025-05-05 17:50 ` Luck, Tony
0 siblings, 0 replies; 20+ messages in thread
From: Luck, Tony @ 2025-05-05 17:50 UTC (permalink / raw)
To: Chatre, Reinette, Moger, Babu, Babu Moger, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
Cc: corbet@lwn.net, x86@kernel.org, hpa@zytor.com,
akpm@linux-foundation.org, paulmck@kernel.org,
rostedt@goodmis.org, thuth@redhat.com, ardb@kernel.org,
gregkh@linuxfoundation.org, thomas.lendacky@amd.com,
mario.limonciello@amd.com, perry.yuan@amd.com, seanjc@google.com,
Huang, Kai, Li, Xiaoyao, kan.liang@linux.intel.com,
riel@surriel.com, Li, Xin3, xin@zytor.com, Mehta, Sohil,
ak@linux.intel.com, ebiggers@google.com,
andrew.cooper3@citrix.com, gautham.shenoy@amd.com,
Xiaojian.Du@amd.com, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, james.morse@arm.com,
fenghuay@nvidia.com, peternewman@google.com
>> No. Intel doesn't reserve a single CLOS. It allows to assign RMIDs and CLOSids
>> for I/O monitoring and control. Different IDs can be assigned to different groups
>> of devices (the "grouping" is dependent on h/w routing to devices, not
>> assignable by the OS).
>
> How does this work with CDP on Intel? Can CDP be enabled for CPU agents while the
> "code" and "data" CLOSids be used for I/O control?
Reinette,
Good question. I'll have to check with h/w folks.
-Tony
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 16:22 ` Reinette Chatre
2025-05-05 17:01 ` Luck, Tony
@ 2025-05-05 19:54 ` Moger, Babu
2025-05-05 21:13 ` Reinette Chatre
1 sibling, 1 reply; 20+ messages in thread
From: Moger, Babu @ 2025-05-05 19:54 UTC (permalink / raw)
To: Reinette Chatre, Moger, Babu, tony.luck, tglx, mingo, bp,
dave.hansen
Cc: corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh,
thomas.lendacky, mario.limonciello, perry.yuan, seanjc, kai.huang,
xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta, ak,
ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du, linux-doc,
linux-kernel, james.morse, fenghuay, peternewman
Hi Reinette,
On 5/5/25 11:22, Reinette Chatre wrote:
> Hi Babu,
>
> On 5/2/25 5:53 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> Thanks for quick turnaround.
>>
>> On 5/2/2025 4:20 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/21/25 3:43 PM, Babu Moger wrote:
>>>> # Linux Implementation
>>>>
>>>> Feature adds following interface files when the resctrl "io_alloc" feature is
>>>> supported on L3 resource:
>>>>
>>>> /sys/fs/resctrl/info/L3/io_alloc: Report the feature status. Enable/disable the
>>>> feature by writing to the interface.
>>>>
>>>> /sys/fs/resctrl/info/L3/io_alloc_cbm: List the Capacity Bit Masks (CBMs) available
>>>> for I/O devices when io_alloc feature is enabled.
>>>> Configure the CBM by writing to the interface.
>>>>
>>>> # Examples:
>>>>
>>>> a. Check if io_alloc feature is available
>>>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>>>
>>>> # cat /sys/fs/resctrl/info/L3/io_alloc
>>>> disabled
>>>>
>>>> b. Enable the io_alloc feature.
>>>>
>>>> # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>>>> # cat /sys/fs/resctrl/info/L3/io_alloc
>>>> enabled
>>>>
>>>> c. Check the CBM values for the io_alloc feature.
>>>>
>>>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>>>> L3:0=ffff;1=ffff
>>>>
>>>> d. Change the CBM value for the domain 1:
>>>> # echo L3:1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
>>>>
>>>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>>>> L3:0=ffff;1=00ff
>>>>
>>>> d. Disable io_alloc feature and exit.
>>>>
>>>> # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
>>>> # cat /sys/fs/resctrl/info/L3/io_alloc
>>>> disabled
>>>>
>>>> #umount /sys/fs/resctrl/
>>>>
>>>
>>> From what I can tell the interface when CDP is enabled will look
>>> as follows:
>>>
>>> # mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
>>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
>>> disabled
>>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
>>> not supported
>>> "io_alloc" can thus be enabled for L3CODE but not for L3DATA.
>>> This is unexpected considering the feature is called
>>> "L3 Smart *Data* Cache Injection Allocation Enforcement".
>>>
>>> I understand that the interface evolved into this because the
>>> "code" allocation of CDP uses the CLOSID required by SDCIAE but I think
>>> leaking implementation details like this to the user interface can
>>> cause confusion.
>>>
>>> Since there is no distinction between code and data in these
>>> IO allocations, what do you think of connecting the io_alloc and
>>> io_alloc_cbm files within L3CODE and L3DATA so that the user can
>>> read/write from either with a read showing the same data and
>>> user able to write to either? For example,
>>>
>>> # mount -o cdp -t resctrl resctrl /sys/fs/resctrl/
>>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
>>> disabled
>>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
>>> disabled
>>> # echo 1 > /sys/fs/resctrl/info/L3CODE/io_alloc
>>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc
>>> enabled
>>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc
>>> enabled
>>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
>>> 0=ffff;1=ffff
>>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
>>> 0=ffff;1=ffff
>>> # echo 1=FF > /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
>>> # cat /sys/fs/resctrl/info/L3DATA/io_alloc_cbm
>>> 0=ffff;1=00ff
>>> # cat /sys/fs/resctrl/info/L3CODE/io_alloc_cbm
>>> 0=ffff;1=00ff
>>
>> I agree. There is no right or wrong here. It can be done this way like you mentioned above. But I am not sure if will clear the confusion.
>>
>> We have already added the text in user doc (also spec says the same).
>>
>> "On AMD systems, the io_alloc feature is supported by the L3 Smart
>> Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
>> io_alloc is determined by the highest CLOSID supported by the resource.
>> When CDP is enabled, io_alloc routes I/O traffic using the highest
>> CLOSID allocated for the instruction cache (L3CODE).
>>
>> Dont you think this text might clear the confusion? We can add examples also if that makes it even more clear.
>
> The user interface is not intended to be a mirror of the hardware interface.
> If it was, doing so is becoming increasingly difficult with multiple
> architectures with different hardware intefaces needing to use the same
> user interface for control. Remember, there are no "CLOSID" in MPAM and
> I do not know details of what RISC-V brings.
>
> We should aim to have something as generic as possible that makes sense
> for user space. All the hardware interface details should be hidden as much
> as possible from user interface. When we expose the hardware interface details
> it becomes very difficult to support new use cases.
>
> The only aspect of "closids" that has been exposed to user space thus far
> is the "num_closids" and in user documentation a CLOSid has been linked to the
> number of control groups. That is the only constraint we need to think about
> here. I have repeatedly asked for IO alloc connection with CLOSIDs to not be exposed
> to user space (yet user documentation and messages to user space keeps doing so
> in this series). Support for IO alloc in this way is unique to AMD. We do not want
> resctrl to be constrained like this if another architecture needs to support
> some form of IO alloc and does so in a different way.
>
> I understand that IO alloc backed by CLOSID is forming part of resctrl fs in this
> implementation and that is ok for now. As long as we do not leak this to user space
> it gives use flexibility to change resctrl fs when/if we learn different architecture
> needs later.
That makes sense. I’ll go ahead and adjust it as suggested.
>
>>> (Note in above I removed the resource name from io_alloc_cbm to match
>>> what was discussed during previous version:
>>> https://lore.kernel.org/lkml/251c8fe1-603f-4993-a822-afb35b49cdfa@amd.com/ )
>>> What do you think?
>>
>> Yes. I remember. "Kept the resource name while printing the CBM for io_alloc, so we dont have to change show_doms() just for this feature and it is consistant across all the schemata display.
>
> It almost sounds like you do not want to implement something because the
> code to support it does not exist?
>
>>
>> I added the note in here.
>> https://lore.kernel.org/lkml/784fbc61e02e9a834473c3476ee196ef6a44e338.1745275431.git.babu.moger@amd.com/
>
> You mention "I dont have to change show_doms() just for this feature and it is
> consistant across all the schemata display."
> I am indeed seeing a pattern where one goal is to add changes by changing minimum
> amount of code. Please let this not be a goal but instead make it a goal to integrate
> changes into resctrl appropriately, not just pasted on top.
>
> When it comes to the schemata display then it makes sense to add the resource name since
> the schemata file is within a resource group containing multiple resources and the schemata
> file thus needs to identify resources. Compare this to, for example, the "bit_usage" file
> that is unique to a resource and thus no need to identify the resource.
>
>>
>> I will change it if you feel strongly about it. We will have to change show_doms() to handle this.
>
> What is the problem with changing show_doms()?
There is no problem changing show_doms(). My intention was to keep the
change as minimal as possible.
Sure. Will make the changes so the resource name is not printed for io_alloc_cbm.
>
>>
>>>
>>>
>>>> ---
>>>> v4: The "io_alloc" interface will report "enabled/disabled/not supported"
>>>> instead of 0 or 1..
>>>>
>>>> Updated resctrl_io_alloc_closid_get() to verify the max closid availability
>>>> using closids_supported().
>>>>
>>>> Updated the documentation for "shareable_bits" and "bit_usage".
>>>>
>>>> NOTE: io_alloc is about specific CLOS. rdt_bit_usage_show() is not designed
>>>> handle bit_usage for specific CLOS. Its about overall system. So, we cannot
>>>> really tell the user which CLOS is shared across both hardware and software.
>>>
>>> "bit_usage" is not about CLOS but how the resource is used. Per the doc:
>>>
>>> "bit_usage":
>>> Annotated capacity bitmasks showing how all
>>> instances of the resource are used.
>>>
>>> The key here is the CBM, not CLOS. For each bit in the *CBM* "bit_usage" shows
>>> how that portion of the cache is used with the legend documented in
>>> Documentation/arch/x86/resctrl.rst.
>>>
>>> Consider a system with the following allocations:
>>> # cat /sys/fs/resctrl/schemata
>>> L3:0=0ff0
>>
>> This is CLOS 0.
>>
>>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>>> 0=ff00
>>
>> This is CLOS 15.
>>
>>>
>>> Then "bit_usage" will look like:
>>>
>>> # cat /sys/fs/resctrl/info/L3/bit_usage
>>> 0=HHHHXXXXSSSS0000
>>
>> It is confusing here. To make it clear we may have to print all the CLOSes in each domain.
>
> Could you please elaborate how this is confusing?
# cat /sys/fs/resctrl/info/L3/bit_usage
0=HHHHXXXXSSSS0000
This may give the impression that all CLOSes in all domains carry
this property, but in reality it applies only to one CLOS (15) within each
domain.
Example below....
>
>>
>> # cat /sys/fs/resctrl/info/L3/bit_usage
>> DOM0=CLOS0:SSSSSSSSSSSSSSSS;... ;CLOS15=HHHHXXXXSSSS0000;
>> DOM1=CLOS0:SSSSSSSSSSSSSSSS;... ;CLOS15=HHHHXXXXSSSS0000
>
> Please no. Not just does this change existing user interface it also breaks the goal of
> "bit_usage".
>
> Please think of it from user perspective. If user wants to know, for example, "how is my
> L3 cache allocated" then the "bit_usage" file provides that summary.
>
>>> "bit_usage" shows how the cache is being used. It shows that the portion of cache represented
>>> by first four bits of CBM is unused, portion of cache represented by bits 4 to 7 of CBM is
>>> only used by software, portion of cache represented by bits 8 to 11 of CBM is shared between
>>> software and hardware, portion of cache represented by bits 12 to 15 is only used by hardware.
>>>
>>>> This is something we need to discuss.
>>>
>>> Looking at implementation in patch #5 the "io_alloc_cbm" bits of CBM are presented
>>> as software bits, since "io_alloc_cbm" represents IO from devices it should be "hardware" bits
>>> (hw_shareable), no?
>>>
>> Yes. It is. But logic is bit different there.
>>
>> It loops thru all the CLOSes on the domain. So, it will print again like this below.
>
> This is what current code does, but the code can be changed, no? For example, rdt_bit_usage_show()
> does not need to treat the IO allocation like all the other resource groups but instead handle it
> separately. Below us some pseudo code that presents the idea, untested, not compiled.
>
> hw_shareable = r->cache.shareable_bits;
>
> for (i = 0; i < closids_supported(); i++) {
> if (!closid_allocated(i) ||
> (resctrl_arch_get_io_alloc_enabled(r) && i == resctrl_io_alloc_closid_get(r, s)))
> continue;
>
> /* Intitialize sw_shareable and exclusive */
> }
>
> if (resctrl_arch_get_io_alloc_enabled(r)) {
> /*
> * Sidenote: I do not think schemata parameter is needed for
> * resctrl_io_alloc_closid_get()
Sure. Got it.
> */
> io_alloc_closid = resctrl_io_alloc_closid_get(r, s);
> if (resctrl_arch_get_cdp_enabled(r->rid))
> ctrl_val = resctrl_arch_get_config(r, dom, io_alloc_closid, CDP_CODE);
> else
> ctrl_val = resctrl_arch_get_config(r, dom, io_alloc_closid, CDP_NONE);
> hw_shareable |= ctrl_val;
> }
>
> for (i = r->cache.cbm_len - 1; i >= 0; i--) {
> /* Write annotated bitmask to user space */
> }
>
Here is the behaviour after these changes.
=== Before io_alloc enabled==============================
#cd /sys/fs/resctrl/L3/
# cat io_alloc
disabled
# cat shareable_bits
0 (This is always 0 for AMD)
# cat bit_usage
0=SSSSSSSSSSSSSSSS;1=SSSSSSSSSSSSSSSS;2=SSSSSSSSSSSSSSSS;3=SSSSSSSSSSSSSSSS
==== After io_alloc enabled=================================
# echo 1 > io_alloc
# cat io_alloc
enabled
# cat io_alloc_cbm
L3:0=ffff;1=ffff;2=ffff;3=ffff
#cat bit_usage
0=XXXXXXXXXXXXXXXX;1=XXXXXXXXXXXXXXXX;2=XXXXXXXXXXXXXXXX;3=XXXXXXXXXXXXXXXX
==== After changing io_alloc_cbm ============================
#echo "L3:0=ff00;1=ff00;2=ff00;3=ff00 > io_alloc_cbm
# cat io_alloc_cbm
L3:0=ff00;1=ff00;2=ff00;3=ff00
#cat bit_usage
0=XXXXXXXXSSSSSSSS;1=XXXXXXXXSSSSSSSS;2=XXXXXXXXSSSSSSSS;3=XXXXXXXXSSSSSSSS
=============================================================
My concern here is that this may imply the property is present across all
CLOSes in all the domains, while in fact it only applies to a single
CLOS (15) within each domain.
Thanks
Babu Moger
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 19:54 ` Moger, Babu
@ 2025-05-05 21:13 ` Reinette Chatre
2025-05-05 22:29 ` Moger, Babu
0 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2025-05-05 21:13 UTC (permalink / raw)
To: babu.moger, Moger, Babu, tony.luck, tglx, mingo, bp, dave.hansen
Cc: corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh,
thomas.lendacky, mario.limonciello, perry.yuan, seanjc, kai.huang,
xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta, ak,
ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du, linux-doc,
linux-kernel, james.morse, fenghuay, peternewman
Hi Babu,
On 5/5/25 12:54 PM, Moger, Babu wrote:
> On 5/5/25 11:22, Reinette Chatre wrote:
>> On 5/2/25 5:53 PM, Moger, Babu wrote:
>>> On 5/2/2025 4:20 PM, Reinette Chatre wrote:
>>>> On 4/21/25 3:43 PM, Babu Moger wrote:
...
>>>>
>>>> Then "bit_usage" will look like:
>>>>
>>>> # cat /sys/fs/resctrl/info/L3/bit_usage
>>>> 0=HHHHXXXXSSSS0000
>>>
>>> It is confusing here. To make it clear we may have to print all the CLOSes in each domain.
>>
>> Could you please elaborate how this is confusing?
>
> # cat /sys/fs/resctrl/info/L3/bit_usage
> 0=HHHHXXXXSSSS0000
>
> This may give the impression that the all CLOSes in all domains carries
> this property, but in reality, it applies only to one CLOS(15) within each
> domain.
>
> Example below....
>
...
> Here is the behaviour after these cahnges.
>
> === Before io_alloc enabled==============================
>
> #cd /sys/fs/resctrl/L3/
> # cat io_alloc
> disabled
>
> # cat shareable_bits
> 0 (This is always 0 for AMD)
>
> # cat bit_usage
> 0=SSSSSSSSSSSSSSSS;1=SSSSSSSSSSSSSSSS;2=SSSSSSSSSSSSSSSS;3=SSSSSSSSSSSSSSSS
Please note that the "S" above does not have anything to do with
"shareable_bits" at this point. The "S" indicates that all L3 instances
are currently used by software and that sharing is allowed.
"bit_usage" gives insight to user space how all L3 instances are used.
If at this point a new resource group is created and it has an "exclusive"
allocation then "bit_usage" will change to reflect that. For example,
you can try this on the system you are testing on:
# echo 'L3:0=fff0;1=fff0;2=fff0;3=fff0' > /sys/fs/resctrl/schemata
# mkdir /sys/fs/resctrl/g1
# echo 'L3:0=f;1=f;2=f;3=f' > /sys/fs/resctrl/g1/schemata
# echo 'exclusive' > /sys/fs/resctrl/g1/mode
The above isolates a portion of all L3 instances for exclusive use by g1.
After above changes:
# cat /sys/fs/resctrl/info/L3/bit_usage
0=SSSSSSSSSSSSEEEE;1=SSSSSSSSSSSSEEEE;2=SSSSSSSSSSSSEEEE;3=SSSSSSSSSSSSEEEE
Note that there is no "closid" or resource group information but instead,
"bit_usage" shows to user space how each cache instance is being used
across all resource groups and hardware (IO) allocations.
>
> ==== After io_alloc enabled=================================
>
> # echo 1 > io_alloc
>
> # cat io_alloc
> enabled
>
> # cat io_alloc_cbm
> L3:0=ffff;1=ffff;2=ffff;3=ffff
>
> #cat bit_usage
> 0=XXXXXXXXXXXXXXXX;1=XXXXXXXXXXXXXXXX;2=XXXXXXXXXXXXXXXX;3=XXXXXXXXXXXXXXXX
Looks accurate to me. It shows that both hardware and software can
allocate into all portions of all caches.
>
> ==== After changing io_alloc_cbm ============================
>
> #echo "L3:0=ff00;1=ff00;2=ff00;3=ff00 > io_alloc_cbm
>
> # cat io_alloc_cbm
> L3:0=ff00;1=ff00;2=ff00;3=ff00
>
> #cat bit_usage
> 0=XXXXXXXXSSSSSSSS;1=XXXXXXXXSSSSSSSS;2=XXXXXXXXSSSSSSSS;3=XXXXXXXXSSSSSSSS
Looks accurate to me.
> =============================================================
>
> My concern here is, this may imply that the property is present across all
> CLOSes in all the domains, while in fact, it only applies to a single
> CLOS(15) within each domain.
If a user wants a resource group specific view then the schemata should be used.
"bit_usage" presents the view from the cache instance perspective and reflects
how each L3 cache instance is being used at that moment in time. It helps
the system administrator answer the question "how are the caches used at the moment?"
"bit_usage" does so by presenting a summary of all allocations across all resource
groups and any hardware allocations that may exist. This file helps user space
to understand how the cache is being used without needing to correlate the CBMs
of all resource groups and IO allocations. For example, "bit_usage" is to be used
by the system administrator to ensure the cache is used optimally (for example, that there
are no unused portions). Also, a user may be investigating a performance issue in
a particular resource group, and "bit_usage" will help determine whether
the tasks in that resource group may be competing with IO.
Reinette
* Re: [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE)
2025-05-05 21:13 ` Reinette Chatre
@ 2025-05-05 22:29 ` Moger, Babu
0 siblings, 0 replies; 20+ messages in thread
From: Moger, Babu @ 2025-05-05 22:29 UTC (permalink / raw)
To: Reinette Chatre, babu.moger, tony.luck, tglx, mingo, bp,
dave.hansen
Cc: corbet, x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh,
thomas.lendacky, mario.limonciello, perry.yuan, seanjc, kai.huang,
xiaoyao.li, kan.liang, riel, xin3.li, xin, sohil.mehta, ak,
ebiggers, andrew.cooper3, gautham.shenoy, Xiaojian.Du, linux-doc,
linux-kernel, james.morse, fenghuay, peternewman
Hi Reinette,
On 5/5/2025 4:13 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 5/5/25 12:54 PM, Moger, Babu wrote:
>> On 5/5/25 11:22, Reinette Chatre wrote:
>>> On 5/2/25 5:53 PM, Moger, Babu wrote:
>>>> On 5/2/2025 4:20 PM, Reinette Chatre wrote:
>>>>> On 4/21/25 3:43 PM, Babu Moger wrote:
>
> ...
>
>>>>>
>>>>> Then "bit_usage" will look like:
>>>>>
>>>>> # cat /sys/fs/resctrl/info/L3/bit_usage
>>>>> 0=HHHHXXXXSSSS0000
>>>>
>>>> It is confusing here. To make it clear we may have to print all the CLOSes in each domain.
>>>
>>> Could you please elaborate how this is confusing?
>>
>> # cat /sys/fs/resctrl/info/L3/bit_usage
>> 0=HHHHXXXXSSSS0000
>>
>> This may give the impression that the all CLOSes in all domains carries
>> this property, but in reality, it applies only to one CLOS(15) within each
>> domain.
>>
>> Example below....
>>
>
> ...
>
>> Here is the behaviour after these cahnges.
>>
>> === Before io_alloc enabled==============================
>>
>> #cd /sys/fs/resctrl/L3/
>> # cat io_alloc
>> disabled
>>
>> # cat shareable_bits
>> 0 (This is always 0 for AMD)
>>
>> # cat bit_usage
>> 0=SSSSSSSSSSSSSSSS;1=SSSSSSSSSSSSSSSS;2=SSSSSSSSSSSSSSSS;3=SSSSSSSSSSSSSSSS
>
> Please note that the "S" in above does not have anything to do with
> "shareable_bits" at this point. The "S" indicates that all L3 instances
> are currently used by software and that sharing is allowed.
>
> "bit_usage" gives insight to user space how all L3 instances are used.
>
> If at this point a new resource group is created and it has an "exclusive"
> allocation then "bit_usage" will change to reflect that. For example,
> you can try this on the system you are testing on:
>
> # echo 'L3:0=fff0;1=fff0;2=fff0;3=fff0' > /sys/fs/resctrl/schemata
> # mkdir /sys/fs/resctrl/g1
> # echo 'L3:0=f;1=f;2=f;3=f' > /sys/fs/resctrl/g1/schemata
> # echo 'exclusive' > /sys/fs/resctrl/g1/mode
>
> The above isolates a portion of all L3 instances for exclusive use by g1.
> After above changes:
> # cat /sys/fs/resctrl/info/L3/bit_usage
> 0=SSSSSSSSSSSSEEEE;1=SSSSSSSSSSSSEEEE;2=SSSSSSSSSSSSEEEE;3=SSSSSSSSSSSSEEEE
>
Yes. I see the same output.
> Note that there is no "closid" or resource group information but instead,
> "bit_usage" shows to user space how each cache instance is being used
> across all resource groups and hardware (IO) allocations.
Ok. Got it.
>
>>
>> ==== After io_alloc enabled=================================
>>
>> # echo 1 > io_alloc
>>
>> # cat io_alloc
>> enabled
>>
>> # cat io_alloc_cbm
>> L3:0=ffff;1=ffff;2=ffff;3=ffff
>>
>> #cat bit_usage
>> 0=XXXXXXXXXXXXXXXX;1=XXXXXXXXXXXXXXXX;2=XXXXXXXXXXXXXXXX;3=XXXXXXXXXXXXXXXX
>
> Looks accurate to me. It shows that both hardware and software can
> allocate into all portions of all caches.
>
>>
>> ==== After changing io_alloc_cbm ============================
>>
>> #echo "L3:0=ff00;1=ff00;2=ff00;3=ff00 > io_alloc_cbm
>>
>> # cat io_alloc_cbm
>> L3:0=ff00;1=ff00;2=ff00;3=ff00
>>
>> #cat bit_usage
>> 0=XXXXXXXXSSSSSSSS;1=XXXXXXXXSSSSSSSS;2=XXXXXXXXSSSSSSSS;3=XXXXXXXXSSSSSSSS
>
> Looks accurate to me.
>
>> =============================================================
>>
>> My concern here is, this may imply that the property is present across all
>> CLOSes in all the domains, while in fact, it only applies to a single
>> CLOS(15) within each domain.
>
> If a user wants a resource group specific view then the schemata should be used.
> "bit_usage" presents the view from the cache instance perspective and reflects
> how each L3 cache instance is being used at that moment in time. It helps
> system administrator answer the question "how are the caches used at the moment"?
> "bit_usage" does so by presenting a summary of all allocations across all resource
> groups and any hardware allocations that may exist. This file helps user space
> to understand how the cache is being used without needing to correlate the CBMs
> of all resource groups and IO allocations. For example, "bit_usage" is to be used
> by system administrator to ensure cache is used optimally (for example, there are
> no unused portions). Also, a user may be investigating a performance issue in
> a particular resource group and "bit_usage" will help with that to see if
> the tasks in that resource group may be competing with IO.
>
Ok, "bit_usage" is a summary across all the groups. That is a good
point. Thanks for the detailed explanation.
Will make those changes in the next revision.
Thank you.
Babu
end of thread [~2025-05-05 22:29 UTC | newest]
Thread overview: 20+ messages
2025-04-21 22:43 [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
2025-04-21 22:43 ` [PATCH v4 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement Babu Moger
2025-04-21 22:43 ` [PATCH v4 2/8] x86/resctrl: Add SDCIAE feature in the command line options Babu Moger
2025-04-21 22:43 ` [PATCH v4 3/8] x86/resctrl: Detect io_alloc feature Babu Moger
2025-04-21 22:43 ` [PATCH v4 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers Babu Moger
2025-04-21 22:43 ` [PATCH v4 5/8] x86/resctrl: Add user interface to enable/disable io_alloc feature Babu Moger
2025-04-21 22:43 ` [PATCH v4 6/8] x86/resctrl: Introduce interface to display io_alloc CBMs Babu Moger
2025-04-21 22:43 ` [PATCH v4 7/8] x86/resctrl: Modify rdt_parse_data to pass mode and CLOSID Babu Moger
2025-04-21 22:43 ` [PATCH v4 8/8] x86/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks Babu Moger
2025-05-02 21:20 ` [PATCH v4 0/8] Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Reinette Chatre
2025-05-03 0:53 ` Moger, Babu
2025-05-05 16:22 ` Reinette Chatre
2025-05-05 17:01 ` Luck, Tony
2025-05-05 17:14 ` Reinette Chatre
2025-05-05 17:27 ` Luck, Tony
2025-05-05 17:39 ` Reinette Chatre
2025-05-05 17:50 ` Luck, Tony
2025-05-05 19:54 ` Moger, Babu
2025-05-05 21:13 ` Reinette Chatre
2025-05-05 22:29 ` Moger, Babu