* [PATCH v6 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-11 21:23 ` [PATCH v6 2/8] x86/resctrl: Add SDCIAE feature in the command line options Babu Moger
` (6 subsequent siblings)
7 siblings, 0 replies; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Smart Data Cache Injection (SDCI) is a mechanism that enables direct
insertion of data from I/O devices into the L3 cache. By directly caching
data from I/O devices rather than first storing the I/O data in DRAM,
SDCI reduces demands on DRAM bandwidth and reduces latency to the processor
consuming the I/O data.
The SDCIAE (SDCI Allocation Enforcement) PQE feature allows system software
to control the portion of the L3 cache used for SDCI.
When enabled, SDCIAE forces all SDCI lines to be placed into the L3 cache
partitions identified by the highest-supported L3_MASK_n register, where n
is the maximum supported CLOSID.
Add the CPUID feature bit that can be used to configure SDCIAE.
The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
Injection Allocation Enforcement (SDCIAE)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Resolved conflicts in cpufeatures.h.
v5: No changes.
v4: Resolved a minor conflict in cpufeatures.h.
v3: No changes.
v2: Added dependency on X86_FEATURE_CAT_L3
Removed the "" in CPU feature definition.
Minor text changes.
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index ee176236c2be..1e3d4d8a45ec 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -487,6 +487,7 @@
#define X86_FEATURE_PREFER_YMM (21*32+ 8) /* Avoid ZMM registers due to downclocking */
#define X86_FEATURE_APX (21*32+ 9) /* Advanced Performance Extensions */
#define X86_FEATURE_INDIRECT_THUNK_ITS (21*32+10) /* Use thunk for indirect branches in lower half of cacheline */
+#define X86_FEATURE_SDCIAE (21*32+11) /* L3 Smart Data Cache Injection Allocation Enforcement */
/*
* BUG word(s)
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 46efcbd6afa4..87e78586395b 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -72,6 +72,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_BMEC, X86_FEATURE_CQM_MBM_TOTAL },
{ X86_FEATURE_BMEC, X86_FEATURE_CQM_MBM_LOCAL },
+ { X86_FEATURE_SDCIAE, X86_FEATURE_CAT_L3 },
{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index dbf6d71bdf18..3d1f4d718a84 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -50,6 +50,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
{ X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 },
+ { X86_FEATURE_SDCIAE, CPUID_EBX, 6, 0x80000020, 0 },
{ X86_FEATURE_AMD_WORKLOAD_CLASS, CPUID_EAX, 22, 0x80000021, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ X86_FEATURE_AMD_LBR_V2, CPUID_EAX, 1, 0x80000022, 0 },
--
2.34.1
* [PATCH v6 2/8] x86/resctrl: Add SDCIAE feature in the command line options
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
2025-06-11 21:23 ` [PATCH v6 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-11 21:23 ` [PATCH v6 3/8] x86/resctrl: Detect io_alloc feature Babu Moger
` (5 subsequent siblings)
7 siblings, 0 replies; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Add a command line option to enable or disable the new resctrl feature,
L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE).
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: No changes.
v5: No changes.
v4: No changes.
v3: No changes.
v2: No changes.
---
Documentation/admin-guide/kernel-parameters.txt | 2 +-
arch/x86/kernel/cpu/resctrl/core.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f1f2c0874da9..95d89f881e10 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6066,7 +6066,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
- mba, smba, bmec.
+ mba, smba, bmec, sdciae.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7109cbfcad4f..326c679ade5c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -705,6 +705,7 @@ enum {
RDT_FLAG_MBA,
RDT_FLAG_SMBA,
RDT_FLAG_BMEC,
+ RDT_FLAG_SDCIAE,
};
#define RDT_OPT(idx, n, f) \
@@ -730,6 +731,7 @@ static struct rdt_options rdt_options[] __ro_after_init = {
RDT_OPT(RDT_FLAG_MBA, "mba", X86_FEATURE_MBA),
RDT_OPT(RDT_FLAG_SMBA, "smba", X86_FEATURE_SMBA),
RDT_OPT(RDT_FLAG_BMEC, "bmec", X86_FEATURE_BMEC),
+ RDT_OPT(RDT_FLAG_SDCIAE, "sdciae", X86_FEATURE_SDCIAE),
};
#define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
--
2.34.1
* [PATCH v6 3/8] x86/resctrl: Detect io_alloc feature
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
2025-06-11 21:23 ` [PATCH v6 1/8] x86/cpufeatures: Add support for L3 Smart Data Cache Injection Allocation Enforcement Babu Moger
2025-06-11 21:23 ` [PATCH v6 2/8] x86/resctrl: Add SDCIAE feature in the command line options Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-18 3:45 ` Reinette Chatre
2025-06-11 21:23 ` [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers Babu Moger
` (4 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Smart Data Cache Injection (SDCI) is a mechanism that enables direct
insertion of data from I/O devices into the L3 cache. It can the demands
on DRAM bandwidth and reduces latency to the processor consuming the I/O
data.
Introduce cache resource property "io_alloc_capable" that an architecture
can set if a portion of the L3 cache can be allocated for I/O traffic.
Set this property on x86 systems that support SDCIAE (L3 Smart Data Cache
Injection Allocation Enforcement).
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: No changes.
v5: No changes.
v4: Updated the commit message and code comment based on feedback.
v3: Rewrote commit log. Made the text more generic rather than AMD-specific.
Renamed the rdt_get_sdciae_alloc_cfg() to rdt_set_io_alloc_capable().
Removed leftover comment from v2.
v2: Changed sdciae_capable to io_alloc_capable to make it generic feature.
Also moved the io_alloc_capable in struct resctrl_cache.
---
arch/x86/kernel/cpu/resctrl/core.c | 7 +++++++
include/linux/resctrl.h | 3 +++
2 files changed, 10 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 326c679ade5c..a3d174362249 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -274,6 +274,11 @@ static void rdt_get_cdp_config(int level)
rdt_resources_all[level].r_resctrl.cdp_capable = true;
}
+static void rdt_set_io_alloc_capable(struct rdt_resource *r)
+{
+ r->cache.io_alloc_capable = true;
+}
+
static void rdt_get_cdp_l3_config(void)
{
rdt_get_cdp_config(RDT_RESOURCE_L3);
@@ -840,6 +845,8 @@ static __init bool get_rdt_alloc_resources(void)
rdt_get_cache_alloc_cfg(1, r);
if (rdt_cpu_has(X86_FEATURE_CDP_L3))
rdt_get_cdp_l3_config();
+ if (rdt_cpu_has(X86_FEATURE_SDCIAE))
+ rdt_set_io_alloc_capable(r);
ret = true;
}
if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 9ba771f2ddea..0e8641e41100 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -191,6 +191,8 @@ struct rdt_mon_domain {
* @arch_has_sparse_bitmasks: True if a bitmask like f00f is valid.
* @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
* level has CPU scope.
+ * @io_alloc_capable: True if portion of the cache can be allocated
+ * for I/O traffic.
*/
struct resctrl_cache {
unsigned int cbm_len;
@@ -198,6 +200,7 @@ struct resctrl_cache {
unsigned int shareable_bits;
bool arch_has_sparse_bitmasks;
bool arch_has_per_cpu_cfg;
+ bool io_alloc_capable;
};
/**
--
2.34.1
* Re: [PATCH v6 3/8] x86/resctrl: Detect io_alloc feature
2025-06-11 21:23 ` [PATCH v6 3/8] x86/resctrl: Detect io_alloc feature Babu Moger
@ 2025-06-18 3:45 ` Reinette Chatre
2025-06-18 19:27 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 3:45 UTC (permalink / raw)
To: Babu Moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
Please modify subject prefix to "x86,fs/resctrl" since this adds new arch
API called by resctrl fs.
On 6/11/25 2:23 PM, Babu Moger wrote:
> Smart Data Cache Injection (SDCI) is a mechanism that enables direct
> insertion of data from I/O devices into the L3 cache. It can the demands
"It can the demands" -> "It can reduce demands"?
> on DRAM bandwidth and reduces latency to the processor consuming the I/O
> data.
>
> Introduce cache resource property "io_alloc_capable" that an architecture
> can set if a portion of the L3 cache can be allocated for I/O traffic.
I think "L3" should be dropped because the cache resource property is not
unique to L3. It is a property of all cache resources.
> Set this property on x86 systems that support SDCIAE (L3 Smart Data Cache
> Injection Allocation Enforcement).
Here it can be mentioned that this property is only set for the L3 cache
resource on systems that support SDCIAE.
>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v6: No changes.
>
> v5: No changes.
>
> v4: Updated the commit message and code comment based on feedback.
>
> v3: Rewrote commit log. Changed the text to bit generic than the AMD specific.
> Renamed the rdt_get_sdciae_alloc_cfg() to rdt_set_io_alloc_capable().
> Removed leftover comment from v2.
>
> v2: Changed sdciae_capable to io_alloc_capable to make it generic feature.
> Also moved the io_alloc_capable in struct resctrl_cache.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 7 +++++++
> include/linux/resctrl.h | 3 +++
> 2 files changed, 10 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 326c679ade5c..a3d174362249 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -274,6 +274,11 @@ static void rdt_get_cdp_config(int level)
> rdt_resources_all[level].r_resctrl.cdp_capable = true;
> }
>
> +static void rdt_set_io_alloc_capable(struct rdt_resource *r)
> +{
> + r->cache.io_alloc_capable = true;
> +}
> +
> static void rdt_get_cdp_l3_config(void)
> {
> rdt_get_cdp_config(RDT_RESOURCE_L3);
> @@ -840,6 +845,8 @@ static __init bool get_rdt_alloc_resources(void)
> rdt_get_cache_alloc_cfg(1, r);
> if (rdt_cpu_has(X86_FEATURE_CDP_L3))
> rdt_get_cdp_l3_config();
> + if (rdt_cpu_has(X86_FEATURE_SDCIAE))
> + rdt_set_io_alloc_capable(r);
> ret = true;
> }
> if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 9ba771f2ddea..0e8641e41100 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -191,6 +191,8 @@ struct rdt_mon_domain {
> * @arch_has_sparse_bitmasks: True if a bitmask like f00f is valid.
> * @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
> * level has CPU scope.
> + * @io_alloc_capable: True if portion of the cache can be allocated
> + * for I/O traffic.
Reading this again I think the description can be improved since technically
resctrl does not allocate a portion of cache but instead configures the
portion of cache that device can allocate into.
So perhaps this can be: "True if portion of the cache can be configured
for I/O traffic allocation."
> */
> struct resctrl_cache {
> unsigned int cbm_len;
> @@ -198,6 +200,7 @@ struct resctrl_cache {
> unsigned int shareable_bits;
> bool arch_has_sparse_bitmasks;
> bool arch_has_per_cpu_cfg;
> + bool io_alloc_capable;
> };
>
> /**
Reinette
* Re: [PATCH v6 3/8] x86/resctrl: Detect io_alloc feature
2025-06-18 3:45 ` Reinette Chatre
@ 2025-06-18 19:27 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-18 19:27 UTC (permalink / raw)
To: Reinette Chatre, corbet, tony.luck, Dave.Martin, james.morse,
tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/17/25 22:45, Reinette Chatre wrote:
> Hi Babu,
>
> Please modify subject prefix to "x86,fs/resctrl" since this adds new arch
> API called by resctrl fs.
Sure.
>
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> Smart Data Cache Injection (SDCI) is a mechanism that enables direct
>> insertion of data from I/O devices into the L3 cache. It can the demands
>
> "It can the demands" -> "It can reduce demands"?
Sure.
>
>> on DRAM bandwidth and reduces latency to the processor consuming the I/O
>> data.
>>
>> Introduce cache resource property "io_alloc_capable" that an architecture
>> can set if a portion of the L3 cache can be allocated for I/O traffic.
>
> I think "L3" should be dropped because the cache resource property is not
> unique to L3. It is a property of all cache resources.
>
>> Set this property on x86 systems that support SDCIAE (L3 Smart Data Cache
>> Injection Allocation Enforcement).
>
> Here it can be mentioned that this property is only set for the L3 cache
> resource on systems that supports SDCIAE.
>
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v6: No changes.
>>
>> v5: No changes.
>>
>> v4: Updated the commit message and code comment based on feedback.
>>
>> v3: Rewrote commit log. Changed the text to bit generic than the AMD specific.
>> Renamed the rdt_get_sdciae_alloc_cfg() to rdt_set_io_alloc_capable().
>> Removed leftover comment from v2.
>>
>> v2: Changed sdciae_capable to io_alloc_capable to make it generic feature.
>> Also moved the io_alloc_capable in struct resctrl_cache.
>> ---
>> arch/x86/kernel/cpu/resctrl/core.c | 7 +++++++
>> include/linux/resctrl.h | 3 +++
>> 2 files changed, 10 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 326c679ade5c..a3d174362249 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -274,6 +274,11 @@ static void rdt_get_cdp_config(int level)
>> rdt_resources_all[level].r_resctrl.cdp_capable = true;
>> }
>>
>> +static void rdt_set_io_alloc_capable(struct rdt_resource *r)
>> +{
>> + r->cache.io_alloc_capable = true;
>> +}
>> +
>> static void rdt_get_cdp_l3_config(void)
>> {
>> rdt_get_cdp_config(RDT_RESOURCE_L3);
>> @@ -840,6 +845,8 @@ static __init bool get_rdt_alloc_resources(void)
>> rdt_get_cache_alloc_cfg(1, r);
>> if (rdt_cpu_has(X86_FEATURE_CDP_L3))
>> rdt_get_cdp_l3_config();
>> + if (rdt_cpu_has(X86_FEATURE_SDCIAE))
>> + rdt_set_io_alloc_capable(r);
>> ret = true;
>> }
>> if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 9ba771f2ddea..0e8641e41100 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -191,6 +191,8 @@ struct rdt_mon_domain {
>> * @arch_has_sparse_bitmasks: True if a bitmask like f00f is valid.
>> * @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
>> * level has CPU scope.
>> + * @io_alloc_capable: True if portion of the cache can be allocated
>> + * for I/O traffic.
>
> Reading this again I think the description can be improved since technically
> resctrl does not allocate a portion of cache but instead configures the
> portion of cache that device can allocate into.
> So perhaps this can be: "True if portion of the cache can be configured
> for I/O traffic allocation."
Sure.
>
>> */
>> struct resctrl_cache {
>> unsigned int cbm_len;
>> @@ -198,6 +200,7 @@ struct resctrl_cache {
>> unsigned int shareable_bits;
>> bool arch_has_sparse_bitmasks;
>> bool arch_has_per_cpu_cfg;
>> + bool io_alloc_capable;
>> };
>>
>> /**
>
> Reinette
>
--
Thanks
Babu Moger
* [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (2 preceding siblings ...)
2025-06-11 21:23 ` [PATCH v6 3/8] x86/resctrl: Detect io_alloc feature Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-18 3:51 ` Reinette Chatre
2025-06-11 21:23 ` [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature Babu Moger
` (3 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
"io_alloc" enables direct insertion of data from I/O devices into the L3
cache.
On AMD, the "io_alloc" feature is backed by L3 Smart Data Cache Injection
Allocation Enforcement (SDCIAE). Change SDCIAE state by setting (to enable)
or clearing (to disable) bit 1 of MSR L3_QOS_EXT_CFG on all logical
processors within the cache domain.
Introduce architecture-specific handlers to enable and disable the feature.
The SDCIAE feature details are available in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
Injection Allocation Enforcement (SDCIAE)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Added lockdep_assert_cpus_held() in _resctrl_sdciae_enable() to protect
r->ctrl_domains.
Added more comments in include/linux/resctrl.h.
v5: Resolved conflicts due to recent resctrl FS/ARCH code restructure.
The files monitor.c/rdtgroup.c have been split between FS and ARCH directories.
Moved prototypes of resctrl_arch_io_alloc_enable() and
resctrl_arch_get_io_alloc_enabled() to include/linux/resctrl.h.
v4: Updated the commit log to address the feedback.
v3: Passed the struct rdt_resource to resctrl_arch_get_io_alloc_enabled() instead of resource id.
Renamed the _resctrl_io_alloc_enable() to _resctrl_sdciae_enable() as it is arch specific.
Changed _resctrl_sdciae_enable() to return void instead of int.
Added more context in the commit log and fixed a few typos.
v2: Renamed the functions to simplify the code.
Renamed sdciae_capable to io_alloc_capable.
Changed the name of few arch functions similar to ABMC series.
resctrl_arch_get_io_alloc_enabled()
resctrl_arch_io_alloc_enable()
---
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/resctrl/internal.h | 5 ++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 ++++++++++++++++++++++++++
include/linux/resctrl.h | 21 ++++++++++++++
4 files changed, 67 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b7dded3c8113..b92b04fa9888 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1215,6 +1215,7 @@
/* - AMD: */
#define MSR_IA32_MBA_BW_BASE 0xc0000200
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
+#define MSR_IA32_L3_QOS_EXT_CFG 0xc00003ff
#define MSR_IA32_EVT_CFG_BASE 0xc0000400
/* AMD-V MSRs */
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 5e3c41b36437..cfa519ea2875 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -37,6 +37,9 @@ struct arch_mbm_state {
u64 prev_msr;
};
+/* Setting bit 1 in L3_QOS_EXT_CFG enables the SDCIAE feature. */
+#define SDCIAE_ENABLE_BIT 1
+
/**
* struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
* a resource for a control function
@@ -102,6 +105,7 @@ struct msr_param {
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
* @cdp_enabled: CDP state of this resource
+ * @sdciae_enabled: SDCIAE feature is enabled
*
* Members of this structure are either private to the architecture
* e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -115,6 +119,7 @@ struct rdt_hw_resource {
unsigned int mon_scale;
unsigned int mbm_width;
bool cdp_enabled;
+ bool sdciae_enabled;
};
static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 885026468440..3bdcd53b3ce3 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -229,6 +229,46 @@ bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
return rdt_resources_all[l].cdp_enabled;
}
+inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
+{
+ return resctrl_to_arch_res(r)->sdciae_enabled;
+}
+
+static void resctrl_sdciae_set_one_amd(void *arg)
+{
+ bool *enable = arg;
+
+ if (*enable)
+ msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
+ else
+ msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
+}
+
+static void _resctrl_sdciae_enable(struct rdt_resource *r, bool enable)
+{
+ struct rdt_ctrl_domain *d;
+
+ /* Walking r->ctrl_domains, ensure it can't race with cpuhp */
+ lockdep_assert_cpus_held();
+
+ /* Update L3_QOS_EXT_CFG MSR on all the CPUs in all domains */
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list)
+ on_each_cpu_mask(&d->hdr.cpu_mask, resctrl_sdciae_set_one_amd, &enable, 1);
+}
+
+int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+ if (hw_res->r_resctrl.cache.io_alloc_capable &&
+ hw_res->sdciae_enabled != enable) {
+ _resctrl_sdciae_enable(r, enable);
+ hw_res->sdciae_enabled = enable;
+ }
+
+ return 0;
+}
+
void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0e8641e41100..06e8a1821702 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -531,6 +531,27 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
*/
void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
+/**
+ * resctrl_arch_io_alloc_enable() - Enable/disable io_alloc feature.
+ * @r: The resctrl resource.
+ * @enable: Enable (true) or disable (false) io_alloc on resource @r.
+ *
+ * This can be called from any CPU.
+ *
+ * Return:
+ * 0 on success, or non-zero on error.
+ */
+int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable);
+
+/**
+ * resctrl_arch_get_io_alloc_enabled() - Get io_alloc feature state.
+ * @r: The resctrl resource.
+ *
+ * Return:
+ * true if io_alloc is enabled or false if disabled.
+ */
+inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r);
+
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
--
2.34.1
* Re: [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers
2025-06-11 21:23 ` [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers Babu Moger
@ 2025-06-18 3:51 ` Reinette Chatre
2025-06-18 19:27 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 3:51 UTC (permalink / raw)
To: Babu Moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/11/25 2:23 PM, Babu Moger wrote:
> "io_alloc" enables direct insertion of data from I/O devices into the L3
> cache.
Above is from resctrl perspective and resctrl does not limit this to L3. Here also
I think L3 should be dropped.
>
> On AMD, "io_alloc" feature is backed by L3 Smart Data Cache Injection
> Allocation Enforcement (SDCIAE). Change SDCIAE state by setting (to enable)
> or clearing (to disable) bit 1 of MSR L3_QOS_EXT_CFG on all logical
> processors within the cache domain.
>
> Introduce architecture-specific handlers to enable and disable the feature.
>
> The SDCIAE feature details are available in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
> Injection Allocation Enforcement (SDCIAE)
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...
> ---
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/kernel/cpu/resctrl/internal.h | 5 ++++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 ++++++++++++++++++++++++++
> include/linux/resctrl.h | 21 ++++++++++++++
This hints the subject prefix should be "x86,fs/resctrl".
> 4 files changed, 67 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index b7dded3c8113..b92b04fa9888 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -1215,6 +1215,7 @@
> /* - AMD: */
> #define MSR_IA32_MBA_BW_BASE 0xc0000200
> #define MSR_IA32_SMBA_BW_BASE 0xc0000280
> +#define MSR_IA32_L3_QOS_EXT_CFG 0xc00003ff
> #define MSR_IA32_EVT_CFG_BASE 0xc0000400
>
> /* AMD-V MSRs */
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 5e3c41b36437..cfa519ea2875 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -37,6 +37,9 @@ struct arch_mbm_state {
> u64 prev_msr;
> };
>
> +/* Setting bit 1 in L3_QOS_EXT_CFG enables the SDCIAE feature. */
> +#define SDCIAE_ENABLE_BIT 1
> +
> /**
> * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
> * a resource for a control function
> @@ -102,6 +105,7 @@ struct msr_param {
> * @mon_scale: cqm counter * mon_scale = occupancy in bytes
> * @mbm_width: Monitor width, to detect and correct for overflow.
> * @cdp_enabled: CDP state of this resource
> + * @sdciae_enabled: SDCIAE feature is enabled
nit: "SDCIAE feature (backing "io_alloc") is enabled"
> *
> * Members of this structure are either private to the architecture
> * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
> @@ -115,6 +119,7 @@ struct rdt_hw_resource {
> unsigned int mon_scale;
> unsigned int mbm_width;
> bool cdp_enabled;
> + bool sdciae_enabled;
> };
>
> static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 885026468440..3bdcd53b3ce3 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -229,6 +229,46 @@ bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
> return rdt_resources_all[l].cdp_enabled;
> }
>
> +inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
As indicated by lkp the inline usage needs to be fixed.
> +{
> + return resctrl_to_arch_res(r)->sdciae_enabled;
> +}
> +
> +static void resctrl_sdciae_set_one_amd(void *arg)
> +{
> + bool *enable = arg;
> +
> + if (*enable)
> + msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
> + else
> + msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
> +}
> +
> +static void _resctrl_sdciae_enable(struct rdt_resource *r, bool enable)
> +{
> + struct rdt_ctrl_domain *d;
> +
> + /* Walking r->ctrl_domains, ensure it can't race with cpuhp */
> + lockdep_assert_cpus_held();
> +
> + /* Update L3_QOS_EXT_CFG MSR on all the CPUs in all domains */
> + list_for_each_entry(d, &r->ctrl_domains, hdr.list)
> + on_each_cpu_mask(&d->hdr.cpu_mask, resctrl_sdciae_set_one_amd, &enable, 1);
> +}
> +
> +int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
> +{
> + struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> +
> + if (hw_res->r_resctrl.cache.io_alloc_capable &&
> + hw_res->sdciae_enabled != enable) {
> + _resctrl_sdciae_enable(r, enable);
> + hw_res->sdciae_enabled = enable;
> + }
> +
> + return 0;
> +}
> +
> void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
> {
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 0e8641e41100..06e8a1821702 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -531,6 +531,27 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
> */
> void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
>
> +/**
> + * resctrl_arch_io_alloc_enable() - Enable/disable io_alloc feature.
> + * @r: The resctrl resource.
> + * @enable: Enable (true) or disable (false) io_alloc on resource @r.
> + *
> + * This can be called from any CPU.
> + *
> + * Return:
> + * 0 on success, or non-zero on error.
Please change to "0 on success, <0 on error" to make clear it needs to be
non-zero *and* negative to be considered error by resctrl fs.
> + */
> +int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable);
> +
> +/**
> + * resctrl_arch_get_io_alloc_enabled() - Get io_alloc feature state.
> + * @r: The resctrl resource.
> + *
> + * Return:
> + * true if io_alloc is enabled or false if disabled.
> + */
> +inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r);
> +
> extern unsigned int resctrl_rmid_realloc_threshold;
> extern unsigned int resctrl_rmid_realloc_limit;
>
Reinette
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers
2025-06-18 3:51 ` Reinette Chatre
@ 2025-06-18 19:27 ` Moger, Babu
2025-06-18 20:32 ` Reinette Chatre
0 siblings, 1 reply; 26+ messages in thread
From: Moger, Babu @ 2025-06-18 19:27 UTC (permalink / raw)
To: Reinette Chatre, corbet, tony.luck, Dave.Martin, james.morse,
tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/17/25 22:51, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> "io_alloc" enables direct insertion of data from I/O devices into the L3
>> cache.
>
> Above is from resctrl perspective and resctrl does not limit this to L3. Here also
> I think L3 should be dropped.
Sure.
>
>>
>> On AMD, "io_alloc" feature is backed by L3 Smart Data Cache Injection
>> Allocation Enforcement (SDCIAE). Change SDCIAE state by setting (to enable)
>> or clearing (to disable) bit 1 of MSR L3_QOS_EXT_CFG on all logical
>> processors within the cache domain.
>>
>> Introduce architecture-specific handlers to enable and disable the feature.
>>
>> The SDCIAE feature details are available in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.4.7 L3 Smart Data Cache
>> Injection Allocation Enforcement (SDCIAE)
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>
> ...
>
>> ---
>> arch/x86/include/asm/msr-index.h | 1 +
>> arch/x86/kernel/cpu/resctrl/internal.h | 5 ++++
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 40 ++++++++++++++++++++++++++
>> include/linux/resctrl.h | 21 ++++++++++++++
>
> This hints the subject prefix should be "x86,fs/resctrl".
Sure.
>
>> 4 files changed, 67 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index b7dded3c8113..b92b04fa9888 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -1215,6 +1215,7 @@
>> /* - AMD: */
>> #define MSR_IA32_MBA_BW_BASE 0xc0000200
>> #define MSR_IA32_SMBA_BW_BASE 0xc0000280
>> +#define MSR_IA32_L3_QOS_EXT_CFG 0xc00003ff
>> #define MSR_IA32_EVT_CFG_BASE 0xc0000400
>>
>> /* AMD-V MSRs */
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 5e3c41b36437..cfa519ea2875 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -37,6 +37,9 @@ struct arch_mbm_state {
>> u64 prev_msr;
>> };
>>
>> +/* Setting bit 1 in L3_QOS_EXT_CFG enables the SDCIAE feature. */
>> +#define SDCIAE_ENABLE_BIT 1
>> +
>> /**
>> * struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs that share
>> * a resource for a control function
>> @@ -102,6 +105,7 @@ struct msr_param {
>> * @mon_scale: cqm counter * mon_scale = occupancy in bytes
>> * @mbm_width: Monitor width, to detect and correct for overflow.
>> * @cdp_enabled: CDP state of this resource
>> + * @sdciae_enabled: SDCIAE feature is enabled
>
> nit: "SDCIAE feature (backing "io_alloc") is enabled"
Sure.
>
>> *
>> * Members of this structure are either private to the architecture
>> * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
>> @@ -115,6 +119,7 @@ struct rdt_hw_resource {
>> unsigned int mon_scale;
>> unsigned int mbm_width;
>> bool cdp_enabled;
>> + bool sdciae_enabled;
>> };
>>
>> static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 885026468440..3bdcd53b3ce3 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -229,6 +229,46 @@ bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
>> return rdt_resources_all[l].cdp_enabled;
>> }
>>
>> +inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
>
> As indicated by lkp the inline usage needs to be fixed.
I am assuming that you are referring to
https://www.kernel.org/doc/html/next/process/coding-style.html#the-inline-disease
I will remove inline attribute.
>
>> +{
>> + return resctrl_to_arch_res(r)->sdciae_enabled;
>> +}
>> +
>> +static void resctrl_sdciae_set_one_amd(void *arg)
>> +{
>> + bool *enable = arg;
>> +
>> + if (*enable)
>> + msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
>> + else
>> + msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, SDCIAE_ENABLE_BIT);
>> +}
>> +
>> +static void _resctrl_sdciae_enable(struct rdt_resource *r, bool enable)
>> +{
>> + struct rdt_ctrl_domain *d;
>> +
>> + /* Walking r->ctrl_domains, ensure it can't race with cpuhp */
>> + lockdep_assert_cpus_held();
>> +
>> + /* Update L3_QOS_EXT_CFG MSR on all the CPUs in all domains */
>> + list_for_each_entry(d, &r->ctrl_domains, hdr.list)
>> + on_each_cpu_mask(&d->hdr.cpu_mask, resctrl_sdciae_set_one_amd, &enable, 1);
>> +}
>> +
>> +int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
>> +{
>> + struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> +
>> + if (hw_res->r_resctrl.cache.io_alloc_capable &&
>> + hw_res->sdciae_enabled != enable) {
>> + _resctrl_sdciae_enable(r, enable);
>> + hw_res->sdciae_enabled = enable;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
>> {
>> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 0e8641e41100..06e8a1821702 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -531,6 +531,27 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
>> */
>> void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
>>
>> +/**
>> + * resctrl_arch_io_alloc_enable() - Enable/disable io_alloc feature.
>> + * @r: The resctrl resource.
>> + * @enable: Enable (true) or disable (false) io_alloc on resource @r.
>> + *
>> + * This can be called from any CPU.
>> + *
>> + * Return:
>> + * 0 on success, or non-zero on error.
>
> Please change to "0 on success, <0 on error" to make clear it needs to be
> non-zero *and* negative to be considered error by resctrl fs.
Sure.
>
>> + */
>> +int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable);
>> +
>> +/**
>> + * resctrl_arch_get_io_alloc_enabled() - Get io_alloc feature state.
>> + * @r: The resctrl resource.
>> + *
>> + * Return:
>> + * true if io_alloc is enabled or false if disabled.
>> + */
>> +inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r);
>> +
>> extern unsigned int resctrl_rmid_realloc_threshold;
>> extern unsigned int resctrl_rmid_realloc_limit;
>>
>
> Reinette
>
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers
2025-06-18 19:27 ` Moger, Babu
@ 2025-06-18 20:32 ` Reinette Chatre
2025-06-18 21:57 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 20:32 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/18/25 12:27 PM, Moger, Babu wrote:
> On 6/17/25 22:51, Reinette Chatre wrote:
>> On 6/11/25 2:23 PM, Babu Moger wrote:
>>> static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 885026468440..3bdcd53b3ce3 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -229,6 +229,46 @@ bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
>>> return rdt_resources_all[l].cdp_enabled;
>>> }
>>>
>>> +inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
>>
>> As indicated by lkp the inline usage needs to be fixed.
>
> I am assuming that you are referring to
> https://www.kernel.org/doc/html/next/process/coding-style.html#the-inline-disease
No. I am referring to the lkp test report of an issue detected by sparse. Looks like
the message was not cc'd to lkml so I cannot provide link. I paste it below. You are in
"To:".
>
> I will remove inline attribute.
The goal was to fix the broken usage of inline, but you are right that it may
not be needed here.
Here is the original report:
> Date: Fri, 13 Jun 2025 12:18:35 +0800
> From: kernel test robot <lkp@intel.com>
> To: Babu Moger <babu.moger@amd.com>, corbet@lwn.net, tony.luck@intel.com, reinette.chatre@intel.com, Dave.Martin@arm.com, james.morse@arm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com
> CC: oe-kbuild-all@lists.linux.dev, x86@kernel.org, hpa@zytor.com, akpm@linux-foundation.org, paulmck@kernel.org, rostedt@goodmis.org, thuth@redhat.com, ardb@kernel.org, gregkh@linuxfoundation.org, seanjc@google.com, thomas.lendacky@amd.com, pawan.kumar.gupta@linux.intel.com, perry.yuan@amd.com, yosry.ahmed@linux.dev, kai.huang@intel.com, xiaoyao.li@intel.com, peterz@infradead.org, kan.liang@linux.intel.com, mario.limonciello@amd.com, xin3.li@intel.com, sohil.mehta@intel.com
> Subject: Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
> Message-ID: <202506131104.d1oo8NWe-lkp@intel.com>
> In-Reply-To: <b3d8e2ccd23b295f3735fc9f5420458cfc18a896.1749677012.git.babu.moger@amd.com>
>
> Hi Babu,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on brauner-vfs/vfs.all]
> [also build test WARNING on linus/master v6.16-rc1 next-20250612]
> [cannot apply to tip/x86/core aegl/next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Babu-Moger/x86-cpufeatures-Add-support-for-L3-Smart-Data-Cache-Injection-Allocation-Enforcement/20250612-053050
> base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
> patch link: https://lore.kernel.org/r/b3d8e2ccd23b295f3735fc9f5420458cfc18a896.1749677012.git.babu.moger%40amd.com
> patch subject: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
> config: i386-randconfig-061-20250613 (https://download.01.org/0day-ci/archive/20250613/202506131104.d1oo8NWe-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250613/202506131104.d1oo8NWe-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202506131104.d1oo8NWe-lkp@intel.com/
>
> sparse warnings: (new ones prefixed by >>)
> fs/resctrl/rdtgroup.c: note: in included file:
>>> include/linux/resctrl.h:553:46: sparse: sparse: marked inline, but without a definition
>>> include/linux/resctrl.h:553:46: sparse: sparse: marked inline, but without a definition
>>> include/linux/resctrl.h:553:46: sparse: sparse: marked inline, but without a definition
>>> include/linux/resctrl.h:553:46: sparse: sparse: marked inline, but without a definition
>
> vim +553 include/linux/resctrl.h
>
> 48e63934badb71 Babu Moger 2025-06-11 545
> 48e63934badb71 Babu Moger 2025-06-11 546 /**
> 48e63934badb71 Babu Moger 2025-06-11 547 * resctrl_arch_get_io_alloc_enabled() - Get io_alloc feature state.
> 48e63934badb71 Babu Moger 2025-06-11 548 * @r: The resctrl resource.
> 48e63934badb71 Babu Moger 2025-06-11 549 *
> 48e63934badb71 Babu Moger 2025-06-11 550 * Return:
> 48e63934badb71 Babu Moger 2025-06-11 551 * true if io_alloc is enabled or false if disabled.
> 48e63934badb71 Babu Moger 2025-06-11 552 */
> 48e63934badb71 Babu Moger 2025-06-11 @553 inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r);
> 48e63934badb71 Babu Moger 2025-06-11 554
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
Reinette
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers
2025-06-18 20:32 ` Reinette Chatre
@ 2025-06-18 21:57 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-18 21:57 UTC (permalink / raw)
To: Reinette Chatre, babu.moger, corbet, tony.luck, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/18/2025 3:32 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/18/25 12:27 PM, Moger, Babu wrote:
>
>> On 6/17/25 22:51, Reinette Chatre wrote:
>
>>> On 6/11/25 2:23 PM, Babu Moger wrote:
>
>>>> static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index 885026468440..3bdcd53b3ce3 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -229,6 +229,46 @@ bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
>>>> return rdt_resources_all[l].cdp_enabled;
>>>> }
>>>>
>>>> +inline bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
>>>
>>> As indicated by lkp the inline usage needs to be fixed.
>>
>> I am assuming that you are referring to
>> https://www.kernel.org/doc/html/next/process/coding-style.html#the-inline-disease
>
> No. I am referring to the lkp test report of an issue detected by sparse. Looks like
> the message was not cc'd to lkml so I cannot provide link. I paste it below. You are in
> "To:".
Yes. I saw that report. It's strange that it was not cc'd to lkml.
This has happened a few times before.
>
>>
>> I will remove inline attribute.
>
> The goal was to fix the broken usage of inline, but you are right that it may
> not be needed here.
Sure. Will remove inline.
>
> Here is the original report:
Thanks
Babu
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (3 preceding siblings ...)
2025-06-11 21:23 ` [PATCH v6 4/8] x86/resctrl: Implement "io_alloc" enable/disable handlers Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-18 3:59 ` Reinette Chatre
2025-06-11 21:23 ` [PATCH v6 6/8] fs/resctrl: Introduce interface to display io_alloc CBMs Babu Moger
` (2 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
The io_alloc feature in resctrl is a mechanism that enables direct
insertion of data from I/O devices into the L3 cache.
On AMD systems, the io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
lines to be placed into the L3 cache partitions identified by the
highest-supported L3_MASK_n register as reported by CPUID
Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
be allocated into the L3 cache partitions determined by the bitmask in
the L3_MASK_15 register.
When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
allocated for the instruction cache (L3CODE).
Introduce user interface to enable/disable "io_alloc" feature.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Changed the subject to fs/resctrl:
v5: Resolved conflicts due to recent resctrl FS/ARCH code restructure.
Used rdt_kn_name to get the rdtgroup name instead of accessing it directly
while printing group name used by the io_alloc_closid.
Updated bit_usage to reflect the io_alloc CBM as discussed in the thread:
https://lore.kernel.org/lkml/3ca0a5dc-ad9c-4767-9011-b79d986e1e8d@intel.com/
Modified rdt_bit_usage_show() to read io_alloc_cbm in hw_shareable, ensuring
that bit_usage accurately represents the CBMs.
Updated the code to modify io_alloc either with L3CODE or L3DATA.
https://lore.kernel.org/lkml/c00c00ea-a9ac-4c56-961c-dc5bf633476b@intel.com/
v4: Updated the change log.
Updated the user doc.
The "io_alloc" interface will report "enabled/disabled/not supported".
Updated resctrl_io_alloc_closid_get() to verify the max closid availability.
Updated the documentation for "shareable_bits" and "bit_usage".
Introduced io_alloc_init() to initialize fflags.
Printed the group name when io_alloc enablement fails.
NOTE: io_alloc is about a specific CLOS. rdt_bit_usage_show() is not
designed to handle bit_usage for a specific CLOS; it is about the overall
system. So we cannot really tell the user which CLOS is shared across both
hardware and software. We need to discuss this.
v3: Rewrote the change to make it generic.
Rewrote the documentation in resctrl.rst to be generic and added
AMD feature details in the end.
Added the check to verify if MAX CLOSID availability on the system.
Added CDP check to make sure io_alloc is configured in CDP_CODE.
Added resctrl_io_alloc_closid_free() to free the io_alloc CLOSID.
Added errors in few cases when CLOSID allocation fails.
Fixed the splat reported when info/L3/bit_usage is accessed while io_alloc
is enabled.
https://lore.kernel.org/lkml/SJ1PR11MB60837B532254E7B23BC27E84FC052@SJ1PR11MB6083.namprd11.prod.outlook.com/
v2: Renamed the feature to "io_alloc".
Added generic texts for the feature in commit log and resctrl.rst doc.
Added resctrl_io_alloc_init_cat() to initialize io_alloc to default
values when enabled.
Fixed io_alloc show functionality to display only on L3 resource.
---
Documentation/filesystems/resctrl.rst | 34 ++++
fs/resctrl/rdtgroup.c | 216 +++++++++++++++++++++++++-
2 files changed, 248 insertions(+), 2 deletions(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index c7949dd44f2f..03c829b2c276 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -95,6 +95,11 @@ related to allocation:
some platforms support devices that have their
own settings for cache use which can over-ride
these bits.
+
+ When the "io_alloc" feature is enabled, a portion of the cache
+ is reserved for shared use between hardware and software. Refer
+ to "bit_usage" to see which portion is allocated for this purpose.
+
"bit_usage":
Annotated capacity bitmasks showing how all
instances of the resource are used. The legend is:
@@ -135,6 +140,35 @@ related to allocation:
"1":
Non-contiguous 1s value in CBM is supported.
+"io_alloc":
+ The "io_alloc" interface enables system software to configure the
+ portion of the L3 cache allocated for I/O traffic.
+
+ The feature routes I/O traffic via a specific CLOSID reserved for
+ the io_alloc feature. By configuring the CBM (Capacity Bit Mask)
+ for that CLOSID, users can control the L3 portions available for
+ I/O traffic. The reserved CLOSID is excluded from group creation.
+
+ The interface provides a means to query the status of the feature.
+
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3/io_alloc
+ disabled
+
+ The feature can be enabled or disabled by writing to the interface.
+ Example::
+
+ # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
+ # cat /sys/fs/resctrl/info/L3/io_alloc
+ enabled
+
+ On AMD systems, the io_alloc feature is supported by the L3 Smart
+ Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
+ io_alloc is determined by the highest CLOSID supported by the resource.
+ When CDP is enabled, io_alloc routes I/O traffic using the highest
+ CLOSID allocated for the instruction cache (L3CODE).
+
Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 1beb124e25f6..bbc032b4d0e9 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
static char last_cmd_status_buf[512];
static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
+static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
static void rdtgroup_destroy_root(void);
@@ -232,6 +233,19 @@ bool closid_allocated(unsigned int closid)
return !test_bit(closid, closid_free_map);
}
+static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
+{
+ if (__test_and_clear_bit(io_alloc_closid, closid_free_map))
+ return io_alloc_closid;
+ else
+ return -ENOSPC;
+}
+
+static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
+{
+ closid_free(io_alloc_closid);
+}
+
/**
* rdtgroup_mode_by_closid - Return mode of resource group with closid
* @closid: closid if the resource group
@@ -1030,6 +1044,29 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
return 0;
}
+/*
+ * resctrl_io_alloc_closid_get - The io_alloc feature routes I/O traffic
+ * using the max CLOSID. Get that CLOSID and verify it is available.
+ *
+ * The total number of CLOSIDs is determined in closid_init(), based on the
+ * minimum supported across all resources. If CDP (Code Data Prioritization)
+ * is enabled, the number of CLOSIDs is halved. The final value is returned
+ * by closids_supported(). Make sure this value aligns with the maximum
+ * CLOSID supported by the respective resource.
+ */
+static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
+{
+ int num_closids = closids_supported();
+
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ num_closids *= 2;
+
+ if (num_closids != resctrl_arch_get_num_closid(r))
+ return -ENOSPC;
+
+ return closids_supported() - 1;
+}
+
/*
* rdt_bit_usage_show - Display current usage of resources
*
@@ -1058,20 +1095,23 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
struct rdt_ctrl_domain *dom;
int i, hwb, swb, excl, psl;
enum rdtgrp_mode mode;
+ int io_alloc_closid;
bool sep = false;
u32 ctrl_val;
cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
- hw_shareable = r->cache.shareable_bits;
list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
if (sep)
seq_putc(seq, ';');
+ hw_shareable = r->cache.shareable_bits;
sw_shareable = 0;
exclusive = 0;
seq_printf(seq, "%d=", dom->hdr.id);
for (i = 0; i < closids_supported(); i++) {
- if (!closid_allocated(i))
+ if (!closid_allocated(i) ||
+ (resctrl_arch_get_io_alloc_enabled(r) &&
+ i == resctrl_io_alloc_closid_get(r)))
continue;
ctrl_val = resctrl_arch_get_config(r, dom, i,
s->conf_type);
@@ -1099,6 +1139,24 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
break;
}
}
+
+ /*
+ * When the "io_alloc" feature is enabled, a portion of the
+ * cache is reserved for shared use between hardware and software.
+ */
+ if (resctrl_arch_get_io_alloc_enabled(r)) {
+ io_alloc_closid = resctrl_io_alloc_closid_get(r);
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ ctrl_val = resctrl_arch_get_config(r, dom,
+ io_alloc_closid,
+ CDP_CODE);
+ else
+ ctrl_val = resctrl_arch_get_config(r, dom,
+ io_alloc_closid,
+ CDP_NONE);
+ hw_shareable |= ctrl_val;
+ }
+
for (i = r->cache.cbm_len - 1; i >= 0; i--) {
pseudo_locked = dom->plr ? dom->plr->cbm : 0;
hwb = test_bit(i, &hw_shareable);
@@ -1803,6 +1861,142 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}
+static int resctrl_io_alloc_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+
+ if (r->cache.io_alloc_capable) {
+ if (resctrl_arch_get_io_alloc_enabled(r))
+ seq_puts(seq, "enabled\n");
+ else
+ seq_puts(seq, "disabled\n");
+ } else {
+ seq_puts(seq, "not supported\n");
+ }
+
+ return 0;
+}
+
+/*
+ * Initialize io_alloc CLOSID cache resource with default CBM values.
+ */
+static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
+ struct resctrl_schema *s, u32 closid)
+{
+ int ret;
+
+ rdt_staged_configs_clear();
+
+ ret = rdtgroup_init_cat(s, closid);
+ if (ret < 0)
+ goto out_init_cat;
+
+ ret = resctrl_arch_update_domains(r, closid);
+
+out_init_cat:
+ rdt_staged_configs_clear();
+ return ret;
+}
+
+static const char *rdtgroup_name_by_closid(int closid)
+{
+ struct rdtgroup *rdtgrp;
+
+ list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+ if (rdtgrp->closid == closid)
+ return rdt_kn_name(rdtgrp->kn);
+ }
+
+ return NULL;
+}
+
+/*
+ * When CDP is enabled, io_alloc directs traffic using the highest CLOSID
+ * linked to an L3CODE resource. Although CBMs can be accessed through
+ * either L3CODE or L3DATA resources, any updates to the schemata must
+ * always be performed on L3CODE.
+ */
+static struct resctrl_schema *resctrl_schema_io_alloc(struct resctrl_schema *s)
+{
+ struct resctrl_schema *schema;
+
+ if (s->conf_type == CDP_DATA) {
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ if (schema->conf_type == CDP_CODE)
+ return schema;
+ }
+ }
+
+ return s;
+}
+
+static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, loff_t off)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ char const *grp_name;
+ int io_alloc_closid;
+ bool enable;
+ int ret;
+
+ ret = kstrtobool(buf, &enable);
+ if (ret)
+ return ret;
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ if (!r->cache.io_alloc_capable) {
+ rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+ ret = -ENODEV;
+ goto out_io_alloc;
+ }
+
+ io_alloc_closid = resctrl_io_alloc_closid_get(r);
+ if (io_alloc_closid < 0) {
+ rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+ ret = -EINVAL;
+ goto out_io_alloc;
+ }
+
+ if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
+ if (enable) {
+ ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
+ if (ret < 0) {
+ grp_name = rdtgroup_name_by_closid(io_alloc_closid);
+ rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
+ grp_name ? grp_name : "another");
+ ret = -EINVAL;
+ goto out_io_alloc;
+ }
+
+ ret = resctrl_io_alloc_init_cat(r, resctrl_schema_io_alloc(s),
+ io_alloc_closid);
+ if (ret) {
+ rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
+ resctrl_io_alloc_closid_free(io_alloc_closid);
+ goto out_io_alloc;
+ }
+
+ } else {
+ resctrl_io_alloc_closid_free(io_alloc_closid);
+ }
+
+ ret = resctrl_arch_io_alloc_enable(r, enable);
+ }
+
+out_io_alloc:
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1955,6 +2149,13 @@ static struct rftype res_common_files[] = {
.seq_show = rdtgroup_schemata_show,
.fflags = RFTYPE_CTRL_BASE,
},
+ {
+ .name = "io_alloc",
+ .mode = 0644,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = resctrl_io_alloc_show,
+ .write = resctrl_io_alloc_write,
+ },
{
.name = "mba_MBps_event",
.mode = 0644,
@@ -2062,6 +2263,15 @@ static void thread_throttle_mode_init(void)
RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
}
+static void io_alloc_init(void)
+{
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+ if (r->cache.io_alloc_capable)
+ resctrl_file_fflags_init("io_alloc",
+ RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+}
+
void resctrl_file_fflags_init(const char *config, unsigned long fflags)
{
struct rftype *rft;
@@ -4249,6 +4459,8 @@ int resctrl_init(void)
thread_throttle_mode_init();
+ io_alloc_init();
+
ret = resctrl_mon_resource_init();
if (ret)
return ret;
--
2.34.1
* Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
2025-06-11 21:23 ` [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature Babu Moger
@ 2025-06-18 3:59 ` Reinette Chatre
2025-06-19 18:41 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 3:59 UTC (permalink / raw)
To: Babu Moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/11/25 2:23 PM, Babu Moger wrote:
> The io_alloc feature in resctrl is a mechanism that enables direct
"The ... feature ... is a mechanism"? What does it mean when a feature
is a mechanism? How about just "The io_alloc feature in resctrl enables ..."?
> insertion of data from I/O devices into the L3 cache.
Drop "L3"?
>
> On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
> Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
> lines to be placed into the L3 cache partitions identified by the
> highest-supported L3_MASK_n register as reported by CPUID
> Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
> be allocated into the L3 cache partitions determined by the bitmask in
> the L3_MASK_15 register.
This is a resctrl fs patch but most of changelog so far is about AMD's implementation
details. I do think it is relevant, but all the register details can
probably be dropped since it is too low level to support the goal. I believe the
goal here is to motivate resctrl fs's implementation that needs to pick highest
CLOSID. So, essentially the changelog needs to highlight that AMD's implementation
requires highest CLOSID and without any other reference that is what resctrl fs's
implementation supports. I think it is important to highlight that the
user interface is not tied to this implementation decision to avoid future issues
if another architecture supports "io_alloc" "differently". Internals of
resctrl fs can then be changed.
>
> When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
> allocated for the instruction cache (L3CODE).
Again, this is a resctrl fs patch and above is an AMD implementation detail. resctrl
fs should not be limited by how AMD implements the feature but can use it as first
reference.
>
> Introduce user interface to enable/disable "io_alloc" feature.
This patch does more than this.
>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...
> ---
> Documentation/filesystems/resctrl.rst | 34 ++++
> fs/resctrl/rdtgroup.c | 216 +++++++++++++++++++++++++-
This patch contains several logical changes that are not adequately described
in changelog.
I think the following can be separate patches:
- rdt_bit_usage_show() change
- io_alloc_init() definition and usage
- show() and write() helpers
- possibly the io_alloc_closid helpers (more later)
> 2 files changed, 248 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index c7949dd44f2f..03c829b2c276 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -95,6 +95,11 @@ related to allocation:
> some platforms support devices that have their
> own settings for cache use which can over-ride
> these bits.
> +
> + When the "io_alloc" feature is enabled, a portion of the cache
> + is reserved for shared use between hardware and software. Refer
"reserved" and "shared" sounds like a contradiction. How about "is reserved" ->
"can be configured"?
> + to "bit_usage" to see which portion is allocated for this purpose.
This is the "shareable_bits" section and the reason why user is pointed to
"bit_usage" is because "shareable_bits" is a global CBM that applies to all
cache domains/instances while "bit_usage" is per cache domain/instance. I think
it will be helpful to point this out to the user.
Perhaps second sentence can be replaced with something like:
""bit_usage" should be used to see which portions of each cache instance
is configured for hardware use via the "io_alloc" feature because every cache
instance can have its "io_alloc" bitmask configured independently".
Please do improve this.
To complete this the first part of the "shareable_bits" doc can be amended:
"Bitmask of shareable resource" -> "Bitmask (applicable to all instances of this resource) of shareable resource"
What do you think?
> +
> "bit_usage":
> Annotated capacity bitmasks showing how all
> instances of the resource are used. The legend is:
> @@ -135,6 +140,35 @@ related to allocation:
> "1":
> Non-contiguous 1s value in CBM is supported.
>
> +"io_alloc":
> + The "io_alloc" enables system software to configure the portion
"The "io_alloc" enables"? Maybe just ""io_alloc" enables"?
> + of the L3 cache allocated for I/O traffic.
Drop "L3"?
Can append, for example, "File may only exist if the system supports this feature on some of its cache
resources".
> +
> + The feature routes the I/O traffic via specific CLOSID reserved
> + for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
> + for the CLOSID, users can control the L3 portions available for
> + I/O traffic. The reserved CLOSID will be excluded for group creation.
Looking back I've commented *four* times already about how resctrl fs user interface
should not be made specific to AMD's implementation.
Should this paragraph just be dropped?
The rest can be something like:
"disabled": Portions of cache used for allocation of I/O traffic cannot be configured.
"enabled": Portions of cache used for allocation of I/O traffic can be configured using "io_alloc_cbm"
"not supported": ...
The underlying implementation may reduce resources available to
general (CPU) cache allocation. See architecture specific notes below.
Depending on usage requirements the feature can be enabled or disabled:
To enable:
# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
To disable:
# echo 0 > /sys/fs/resctrl/info/L3/io_alloc
> +
> + The interface provides a means to query the status of the feature.
> +
> + Example::
> +
> + # cat /sys/fs/resctrl/info/L3/io_alloc
> + disabled
> +
> + Feature can be enabled/disabled by writing to the interface.
> + Example::
> +
> + # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
> + # cat /sys/fs/resctrl/info/L3/io_alloc
> + enabled
> +
> + On AMD systems, the io_alloc feature is supported by the L3 Smart
> + Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
> + io_alloc is determined by the highest CLOSID supported by the resource.
> + When CDP is enabled, io_alloc routes I/O traffic using the highest
> + CLOSID allocated for the instruction cache (L3CODE).
> +
> Memory bandwidth(MB) subdirectory contains the following files
> with respect to allocation:
>
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 1beb124e25f6..bbc032b4d0e9 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
> static char last_cmd_status_buf[512];
>
> static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>
> static void rdtgroup_destroy_root(void);
>
> @@ -232,6 +233,19 @@ bool closid_allocated(unsigned int closid)
> return !test_bit(closid, closid_free_map);
> }
>
> +static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
> +{
> + if (__test_and_clear_bit(io_alloc_closid, closid_free_map))
> + return io_alloc_closid;
> + else
> + return -ENOSPC;
> +}
> +
> +static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
> +{
> + closid_free(io_alloc_closid);
> +}
> +
> /**
> * rdtgroup_mode_by_closid - Return mode of resource group with closid
> * @closid: closid if the resource group
> @@ -1030,6 +1044,29 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
> return 0;
> }
>
> +/*
> + * resctrl_io_alloc_closid_get - io_alloc feature uses max CLOSID to route
> + * the IO traffic. Get the max CLOSID and verify if the CLOSID is available.
The function name should be followed by a *brief* description.
> + *
> + * The total number of CLOSIDs is determined in closid_init(), based on the
> + * minimum supported across all resources. If CDP (Code Data Prioritization)
> + * is enabled, the number of CLOSIDs is halved. The final value is returned
> + * by closids_supported(). Make sure this value aligns with the maximum
> + * CLOSID supported by the respective resource.
All but the last sentence is unrelated to this function and the last sentence
is very vague. What is "this value" that it refers to?
> + */
> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
> +{
> + int num_closids = closids_supported();
> +
> + if (resctrl_arch_get_cdp_enabled(r->rid))
> + num_closids *= 2;
> +
> + if (num_closids != resctrl_arch_get_num_closid(r))
> + return -ENOSPC;
> +
> + return closids_supported() - 1;
> +}
resctrl_io_alloc_closid_get() seems to be trying to do two things:
- determine what the io_alloc_closid is
- make sure the io_alloc_closid is supported
I think this should be split into two functions. Once the
io_alloc_closid is determined to be supported and io_alloc
enabled then there is no reason to keep checking if it is
supported whenever the io_alloc_closid is queried.
How about simplifying this to:
/*
* note how this returns u32 that will eliminate
* unnecessary error checking in usages where io_alloc_closid
* needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
* already confirmed io_alloc is enabled
* function comment could note that this returns the CLOSID
* required by io_alloc but not whether the CLOSID can
* be supported, for this resctrl_io_alloc_closid_supported() should
* be used.
* Can also note that returned value will always be valid if
* resctrl_arch_get_io_alloc_enabled(r) is true.
*/
u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
if (resctrl_arch_get_cdp_enabled(r->rid))
return resctrl_arch_get_num_closid(r)/2 - 1
else
return resctrl_arch_get_num_closid(r) -1
}
/*
* note how below already makes resctrl's io_alloc implementation
* more generic
*/
resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
return io_alloc_closid < closids_supported()
}
> +
> /*
> * rdt_bit_usage_show - Display current usage of resources
> *
> @@ -1058,20 +1095,23 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
> struct rdt_ctrl_domain *dom;
> int i, hwb, swb, excl, psl;
> enum rdtgrp_mode mode;
> + int io_alloc_closid;
> bool sep = false;
> u32 ctrl_val;
>
> cpus_read_lock();
> mutex_lock(&rdtgroup_mutex);
> - hw_shareable = r->cache.shareable_bits;
> list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
> if (sep)
> seq_putc(seq, ';');
> + hw_shareable = r->cache.shareable_bits;
> sw_shareable = 0;
> exclusive = 0;
> seq_printf(seq, "%d=", dom->hdr.id);
> for (i = 0; i < closids_supported(); i++) {
> - if (!closid_allocated(i))
> + if (!closid_allocated(i) ||
> + (resctrl_arch_get_io_alloc_enabled(r) &&
> + i == resctrl_io_alloc_closid_get(r)))
> continue;
> ctrl_val = resctrl_arch_get_config(r, dom, i,
> s->conf_type);
> @@ -1099,6 +1139,24 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
> break;
> }
> }
> +
> + /*
> + * When the "io_alloc" feature is enabled, a portion of the
> + * cache is reserved for shared use between hardware and software.
> + */
> + if (resctrl_arch_get_io_alloc_enabled(r)) {
> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
In this implementation io_alloc_closid can be negative and the static
checker I used complained about the subsequent usage in
resctrl_arch_get_config() that must be unsigned.
Since resctrl_arch_get_io_alloc_enabled(r) already passed this is one
example where resctrl_io_alloc_closid() can be used.
> + if (resctrl_arch_get_cdp_enabled(r->rid))
> + ctrl_val = resctrl_arch_get_config(r, dom,
> + io_alloc_closid,
> + CDP_CODE);
> + else
> + ctrl_val = resctrl_arch_get_config(r, dom,
> + io_alloc_closid,
> + CDP_NONE);
> + hw_shareable |= ctrl_val;
> + }
> +
> for (i = r->cache.cbm_len - 1; i >= 0; i--) {
> pseudo_locked = dom->plr ? dom->plr->cbm : 0;
> hwb = test_bit(i, &hw_shareable);
> @@ -1803,6 +1861,142 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
> return ret ?: nbytes;
> }
>
> +static int resctrl_io_alloc_show(struct kernfs_open_file *of,
> + struct seq_file *seq, void *v)
> +{
> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> + struct rdt_resource *r = s->res;
> +
Needs rdtgroup_mutex
> + if (r->cache.io_alloc_capable) {
> + if (resctrl_arch_get_io_alloc_enabled(r))
> + seq_puts(seq, "enabled\n");
> + else
> + seq_puts(seq, "disabled\n");
> + } else {
> + seq_puts(seq, "not supported\n");
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * Initialize io_alloc CLOSID cache resource with default CBM values.
It is unclear what "default" means.
Could be "Initialize io_alloc CLOSID cache resource CBM with all usable (shared and unused) cache portions".
> + */
> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
> + struct resctrl_schema *s, u32 closid)
> +{
> + int ret;
> +
> + rdt_staged_configs_clear();
> +
> + ret = rdtgroup_init_cat(s, closid);
> + if (ret < 0)
> + goto out_init_cat;
The "out" label should reflect what is done at target, not the source.
Consider all the usages of "out_unlock" in resctrl.
Since this is the only label it can just be "out".
> +
> + ret = resctrl_arch_update_domains(r, closid);
> +
> +out_init_cat:
> + rdt_staged_configs_clear();
> + return ret;
> +}
> +
> +static const char *rdtgroup_name_by_closid(int closid)
> +{
> + struct rdtgroup *rdtgrp;
> +
> + list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
> + if (rdtgrp->closid == closid)
> + return rdt_kn_name(rdtgrp->kn);
> + }
> +
> + return NULL;
> +}
> +
> +/*
> + * When CDP is enabled, io_alloc directs traffic using the highest CLOSID
> + * linked to an L3CODE resource. Although CBMs can be accessed through
> + * either L3CODE or L3DATA resources, any updates to the schemata must
> + * always be performed on L3CODE.
> + */
I think that updates to the schemata need to be made on *both* L3DATA and L3CODE.
Consider how __init_one_rdt_domain() works when a new resource group is created ...
the algorithm looks through all allocated CLOSID and the associated schemata impact
the CBM of the new group. If an allocated CLOSID is associated with L3CODE
then its "peer" L3DATA is also taken into account, similar for L3DATA.
If only L3CODE is updated for the io_alloc_closid then it looks to me that
whatever the original L3DATA schema was will keep impacting new resource
groups. To avoid that and ensure only accurate CBMs are used it looks to me
as though the L3DATA and L3CODE schema needs to be kept in sync.
> +static struct resctrl_schema *resctrl_schema_io_alloc(struct resctrl_schema *s)
> +{
> + struct resctrl_schema *schema;
> +
> + if (s->conf_type == CDP_DATA) {
> + list_for_each_entry(schema, &resctrl_schema_all, list) {
> + if (schema->conf_type == CDP_CODE)
> + return schema;
> + }
> + }
> +
> + return s;
> +}
> +
> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
> + size_t nbytes, loff_t off)
> +{
> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> + struct rdt_resource *r = s->res;
> + char const *grp_name;
> + u32 io_alloc_closid;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret)
> + return ret;
> +
> + cpus_read_lock();
> + mutex_lock(&rdtgroup_mutex);
> +
> + rdt_last_cmd_clear();
> +
> + if (!r->cache.io_alloc_capable) {
> + rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
This could be more useful with a small change:
"io_alloc is not supported on %s\n", s->name
> + ret = -ENODEV;
> + goto out_io_alloc;
out_io_alloc -> out_unlock
> + }
> +
> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
> + if (io_alloc_closid < 0) {
> + rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
This could be more useful to help debug by printing the value of io_alloc_closid
that user can compare against output of num_closids files. Here the terms become
a bit complicated since ideally we want to use ctrl_hw_id but that does not match
"num_closids", so perhaps use both terms, for example "CLOSID (ctrl_hw_id)"?
I am not sure here.
> + ret = -EINVAL;
> + goto out_io_alloc;
> + }
> +
> + if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
> + if (enable) {
> + ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
> + if (ret < 0) {
> + grp_name = rdtgroup_name_by_closid(io_alloc_closid);
Below handles !grp_name but that would be a kernel bug, no? Maybe WARN_ON_ONCE()?
> + rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
> + grp_name ? grp_name : "another");
CLOSID -> ctrl_hw_id
> + ret = -EINVAL;
> + goto out_io_alloc;
> + }
> +
> + ret = resctrl_io_alloc_init_cat(r, resctrl_schema_io_alloc(s),
> + io_alloc_closid);
> + if (ret) {
> + rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
> + resctrl_io_alloc_closid_free(io_alloc_closid);
> + goto out_io_alloc;
> + }
> +
> + } else {
> + resctrl_io_alloc_closid_free(io_alloc_closid);
> + }
> +
> + ret = resctrl_arch_io_alloc_enable(r, enable);
> + }
> +
> +out_io_alloc:
out_unlock ... to match the other places in resctrl.
> + mutex_unlock(&rdtgroup_mutex);
> + cpus_read_unlock();
> +
> + return ret ?: nbytes;
> +}
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
> @@ -1955,6 +2149,13 @@ static struct rftype res_common_files[] = {
> .seq_show = rdtgroup_schemata_show,
> .fflags = RFTYPE_CTRL_BASE,
> },
> + {
> + .name = "io_alloc",
> + .mode = 0644,
> + .kf_ops = &rdtgroup_kf_single_ops,
> + .seq_show = resctrl_io_alloc_show,
> + .write = resctrl_io_alloc_write,
> + },
Please match existing code wrt tab usage.
> {
> .name = "mba_MBps_event",
> .mode = 0644,
> @@ -2062,6 +2263,15 @@ static void thread_throttle_mode_init(void)
> RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
> }
>
> +static void io_alloc_init(void)
> +{
> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> +
> + if (r->cache.io_alloc_capable)
> + resctrl_file_fflags_init("io_alloc",
> + RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
> +}
Note that even if it only checks the L3 cache resource, this will make the
file visible to all cache resources, including L2. This is why it is important to
ensure the documentation and implementation also accommodate resources that
do not support io_alloc.
> +
> void resctrl_file_fflags_init(const char *config, unsigned long fflags)
> {
> struct rftype *rft;
> @@ -4249,6 +4459,8 @@ int resctrl_init(void)
>
> thread_throttle_mode_init();
>
> + io_alloc_init();
> +
> ret = resctrl_mon_resource_init();
> if (ret)
> return ret;
Reinette
* Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
2025-06-18 3:59 ` Reinette Chatre
@ 2025-06-19 18:41 ` Moger, Babu
2025-06-20 15:53 ` Reinette Chatre
0 siblings, 1 reply; 26+ messages in thread
From: Moger, Babu @ 2025-06-19 18:41 UTC (permalink / raw)
To: Reinette Chatre, corbet, tony.luck, Dave.Martin, james.morse,
tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/17/25 22:59, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> The io_alloc feature in resctrl is a mechanism that enables direct
>
> "The ... feature ... is a mechanism"? What does it mean when a feature
> is a mechanism? How about just "The io_alloc feature in resctrl enables ..."?
Sure.
>
>> insertion of data from I/O devices into the L3 cache.
>
> Drop "L3"?
>
Sure.
>>
>> On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
>> Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
>> lines to be placed into the L3 cache partitions identified by the
>> highest-supported L3_MASK_n register as reported by CPUID
>> Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
>> be allocated into the L3 cache partitions determined by the bitmask in
>> the L3_MASK_15 register.
>
> This is a resctrl fs patch but most of changelog so far is about AMD's implementation
> details. I do think it is relevant, but all the register details can
> probably be dropped since it is too low level to support the goal. I believe the
> goal here is to motivate resctrl fs's implementation that needs to pick highest
> CLOSID. So, essentially the changelog needs to highlight that AMD's implementation
> requires highest CLOSID and without any other reference that is what resctrl fs's
> implementation supports. I think it is important to highlight that the
> user interface is not tied to this implementation decision to avoid future issues
> if another architecture supports "io_alloc" "differently". Internals of
> resctrl fs can then be changed.
>
Sure. Will split these patches. Will try to make the text more generic.
>>
>> When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
>> allocated for the instruction cache (L3CODE).
>
> Again, this is a resctrl fs patch and above is an AMD implementation detail. resctrl
> fs should not be limited by how AMD implements the feature but can use it as first
> reference.
Sure.
>
>>
>> Introduce user interface to enable/disable "io_alloc" feature.
>
> This patch does more than this.
Sure.
>
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
>
>> ---
>> Documentation/filesystems/resctrl.rst | 34 ++++
>> fs/resctrl/rdtgroup.c | 216 +++++++++++++++++++++++++-
>
> This patch contains several logical changes that are not adequately described
> in changelog.
> I think the following can be separate patches:
> - rdt_bit_usage_show() change
> - io_alloc_init() definition and usage
> - show() and write() helpers
> - possibly the io_alloc_closid helpers (more later)
Yes. Splitting this into 3 patches.
1. rdt_bit_usage_show() change
Updates the shareable_bits section.
Calls resctrl_io_alloc_closid().
2. io_alloc_init() definition and usage
Introduce the interface and add show().
3. Add io_alloc write().
Calls resctrl_io_alloc_closid_supported() and
resctrl_io_alloc_closid_get().
>
>> 2 files changed, 248 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index c7949dd44f2f..03c829b2c276 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -95,6 +95,11 @@ related to allocation:
>> some platforms support devices that have their
>> own settings for cache use which can over-ride
>> these bits.
>> +
>> + When the "io_alloc" feature is enabled, a portion of the cache
>> + is reserved for shared use between hardware and software. Refer
>
> "reserved" and "shared" sounds like a contradiction. How about "is reserved" ->
> "can be configured"?
>
>> + to "bit_usage" to see which portion is allocated for this purpose.
>
> This is the "shareable_bits" section and the reason why user is pointed to
> "bit_usage" is because "shareable_bits" is a global CBM that applies to all
> cache domains/instances while "bit_usage" is per cache domain/instance. I think
> it will be helpful to point this out to the user.
> Perhaps second sentence can be replaced with something like:
> ""bit_usage" should be used to see which portions of each cache instance
> is configured for hardware use via the "io_alloc" feature because every cache
> instance can have its "io_alloc" bitmask configured independently".
Sure.
""bit_usage" should be used to see which portions of each cache
instance is configured for hardware use via the "io_alloc" feature
because every cache instance can have its "io_alloc" bitmask
configured independently.
>
> Please do improve this.
>
> To complete this the first part of the "shareable_bits" doc can be amended:
> "Bitmask of shareable resource" -> "Bitmask (applicable to all instances of this resource) of shareable resource"
> What do you think?
Sure. Added one sentence for that.
Bitmask of shareable resource with other executing entities
(e.g. I/O). Applies to all instances of this resource.
>
>> +
>> "bit_usage":
>> Annotated capacity bitmasks showing how all
>> instances of the resource are used. The legend is:
>> @@ -135,6 +140,35 @@ related to allocation:
>> "1":
>> Non-contiguous 1s value in CBM is supported.
>>
>> +"io_alloc":
>> + The "io_alloc" enables system software to configure the portion
>
> "The "io_alloc" enables"? Maybe just ""io_alloc" enables"?
>
>> + of the L3 cache allocated for I/O traffic.
>
> Drop "L3"?
>
> Can append, for example, "File may only exist if the system supports this feature on some of its cache
> resources".
>
>> +
>> + The feature routes the I/O traffic via specific CLOSID reserved
>> + for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
>> + for the CLOSID, users can control the L3 portions available for
>> + I/O traffic. The reserved CLOSID will be excluded for group creation.
>
> Looking back I've commented *four* times already about how resctrl fs user interface
> should not be made specific to AMD's implementation.
> This paragraph just be dropped?
>
>
> The rest can be something like:
>
> "disabled": Portions of cache used for allocation of I/O traffic cannot be configured.
> "enabled": Portions of cache used for allocation of I/O traffic can be configured using "io_alloc_cbm"
> "not supported": ...
>
> The underlying implementation may reduce resources available to
> general (CPU) cache allocation. See architecture specific notes below.
> Depending on usage requirements the feature can be enabled or disabled:
>
> To enable:
> # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>
> To disable:
> # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
Sure.
>
>> +
>> + The interface provides a means to query the status of the feature.
>> +
>> + Example::
>> +
>> + # cat /sys/fs/resctrl/info/L3/io_alloc
>> + disabled
>> +
>> + Feature can be enabled/disabled by writing to the interface.
>> + Example::
>> +
>> + # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>> + # cat /sys/fs/resctrl/info/L3/io_alloc
>> + enabled
>> +
>> + On AMD systems, the io_alloc feature is supported by the L3 Smart
>> + Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
>> + io_alloc is determined by the highest CLOSID supported by the resource.
>> + When CDP is enabled, io_alloc routes I/O traffic using the highest
>> + CLOSID allocated for the instruction cache (L3CODE).
>> +
>> Memory bandwidth(MB) subdirectory contains the following files
>> with respect to allocation:
>>
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 1beb124e25f6..bbc032b4d0e9 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
>> static char last_cmd_status_buf[512];
>>
>> static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
>> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>>
>> static void rdtgroup_destroy_root(void);
>>
>> @@ -232,6 +233,19 @@ bool closid_allocated(unsigned int closid)
>> return !test_bit(closid, closid_free_map);
>> }
>>
>> +static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
>> +{
>> + if (__test_and_clear_bit(io_alloc_closid, closid_free_map))
>> + return io_alloc_closid;
>> + else
>> + return -ENOSPC;
>> +}
>> +
>> +static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
>> +{
>> + closid_free(io_alloc_closid);
>> +}
>> +
>> /**
>> * rdtgroup_mode_by_closid - Return mode of resource group with closid
>> * @closid: closid if the resource group
>> @@ -1030,6 +1044,29 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
>> return 0;
>> }
>>
>> +/*
>> + * resctrl_io_alloc_closid_get - io_alloc feature uses max CLOSID to route
>> + * the IO traffic. Get the max CLOSID and verify if the CLOSID is available.
>
> The function name should be followed by a *brief* description.
Sure.
>
>> + *
>> + * The total number of CLOSIDs is determined in closid_init(), based on the
>> + * minimum supported across all resources. If CDP (Code Data Prioritization)
>> + * is enabled, the number of CLOSIDs is halved. The final value is returned
>> + * by closids_supported(). Make sure this value aligns with the maximum
>> + * CLOSID supported by the respective resource.
>
> All but the last sentence is unrelated to this function and the last sentence
> is very vague. What is "this value" that it refers to?
Removed it.
>
>> + */
>> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
>> +{
>> + int num_closids = closids_supported();
>> +
>> + if (resctrl_arch_get_cdp_enabled(r->rid))
>> + num_closids *= 2;
>> +
>> + if (num_closids != resctrl_arch_get_num_closid(r))
>> + return -ENOSPC;
>> +
>> + return closids_supported() - 1;
>> +}
>
> resctrl_io_alloc_closid_get() seems to be trying to do two things:
> - determine what the io_alloc_closid is
> - make sure the io_alloc_closid is supported
>
> I think this should be split into two functions. Once the
> io_alloc_closid is determined to be supported and io_alloc
> enabled then there is no reason to keep checking if it is
> supported whenever the io_alloc_closid is queried.
>
> How about simplifying this to:
>
> /*
> * note how this returns u32 that will eliminate
> * unnecessary error checking in usages where io_alloc_closid
> * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
> * already confirmed io_alloc is enabled
> * function comment could note that this returns the CLOSID
> * required by io_alloc but not whether the CLOSID can
> * be supported, for this resctrl_io_alloc_closid_supported() should
> * be used.
> * Can also note that returned value will always be valid if
> * resctrl_arch_get_io_alloc_enabled(r) is true.
> */
> u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
> if (resctrl_arch_get_cdp_enabled(r->rid))
> return resctrl_arch_get_num_closid(r)/2 - 1
> else
> return resctrl_arch_get_num_closid(r) -1
> }
>
> /*
> * note how below already makes resctrl's io_alloc implementation
> * more generic
> */
> resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
> return io_alloc_closid < closids_supported()
> }
>
Sure.
Changed the check to
return io_alloc_closid == (closids_supported() -1)
>
>
>> +
>> /*
>> * rdt_bit_usage_show - Display current usage of resources
>> *
>> @@ -1058,20 +1095,23 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>> struct rdt_ctrl_domain *dom;
>> int i, hwb, swb, excl, psl;
>> enum rdtgrp_mode mode;
>> + int io_alloc_closid;
>> bool sep = false;
>> u32 ctrl_val;
>>
>> cpus_read_lock();
>> mutex_lock(&rdtgroup_mutex);
>> - hw_shareable = r->cache.shareable_bits;
>> list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
>> if (sep)
>> seq_putc(seq, ';');
>> + hw_shareable = r->cache.shareable_bits;
>> sw_shareable = 0;
>> exclusive = 0;
>> seq_printf(seq, "%d=", dom->hdr.id);
>> for (i = 0; i < closids_supported(); i++) {
>> - if (!closid_allocated(i))
>> + if (!closid_allocated(i) ||
>> + (resctrl_arch_get_io_alloc_enabled(r) &&
>> + i == resctrl_io_alloc_closid_get(r)))
>> continue;
>> ctrl_val = resctrl_arch_get_config(r, dom, i,
>> s->conf_type);
>> @@ -1099,6 +1139,24 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>> break;
>> }
>> }
>> +
>> + /*
>> + * When the "io_alloc" feature is enabled, a portion of the
>> + * cache is reserved for shared use between hardware and software.
>> + */
>> + if (resctrl_arch_get_io_alloc_enabled(r)) {
>> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
>
> In this implementation io_alloc_closid can be negative and the static
> checker I used complained about the subsequent usage in
> resctrl_arch_get_config() that must be unsigned.
> Since resctrl_arch_get_io_alloc_enabled(r) already passed this is one
> example where resctrl_io_alloc_closid() can be used.
Sure.
>
>> + if (resctrl_arch_get_cdp_enabled(r->rid))
>> + ctrl_val = resctrl_arch_get_config(r, dom,
>> + io_alloc_closid,
>> + CDP_CODE);
>> + else
>> + ctrl_val = resctrl_arch_get_config(r, dom,
>> + io_alloc_closid,
>> + CDP_NONE);
>> + hw_shareable |= ctrl_val;
>> + }
>> +
>> for (i = r->cache.cbm_len - 1; i >= 0; i--) {
>> pseudo_locked = dom->plr ? dom->plr->cbm : 0;
>> hwb = test_bit(i, &hw_shareable);
>> @@ -1803,6 +1861,142 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>> return ret ?: nbytes;
>> }
>>
>> +static int resctrl_io_alloc_show(struct kernfs_open_file *of,
>> + struct seq_file *seq, void *v)
>> +{
>> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> + struct rdt_resource *r = s->res;
>> +
>
> Needs rdtgroup_mutex
Sure.
>
>> + if (r->cache.io_alloc_capable) {
>> + if (resctrl_arch_get_io_alloc_enabled(r))
>> + seq_puts(seq, "enabled\n");
>> + else
>> + seq_puts(seq, "disabled\n");
>> + } else {
>> + seq_puts(seq, "not supported\n");
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/*
>> + * Initialize io_alloc CLOSID cache resource with default CBM values.
>
> It is unclear what "default" means.
> Could be "Initialize io_alloc CLOSID cache resource CBM with all usable (shared and unused) cache portions".
Sure.
>
>> + */
>> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
>> + struct resctrl_schema *s, u32 closid)
>> +{
>> + int ret;
>> +
>> + rdt_staged_configs_clear();
>> +
>> + ret = rdtgroup_init_cat(s, closid);
>> + if (ret < 0)
>> + goto out_init_cat;
>
> The "out" label should reflect what is done at target, not the source.
> Consider all the usages of "out_unlock" in resctrl.
> Since this is the only label it can just be "out".
Sure.
>
>> +
>> + ret = resctrl_arch_update_domains(r, closid);
>> +
>> +out_init_cat:
>> + rdt_staged_configs_clear();
>> + return ret;
>> +}
>> +
>> +static const char *rdtgroup_name_by_closid(int closid)
>> +{
>> + struct rdtgroup *rdtgrp;
>> +
>> + list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
>> + if (rdtgrp->closid == closid)
>> + return rdt_kn_name(rdtgrp->kn);
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +/*
>> + * When CDP is enabled, io_alloc directs traffic using the highest CLOSID
>> + * linked to an L3CODE resource. Although CBMs can be accessed through
>> + * either L3CODE or L3DATA resources, any updates to the schemata must
>> + * always be performed on L3CODE.
>> + */
>
> I think that updates to the schemata needs to be made on *both* L3DATA and L3CODE.
> Consider how __init_one_rdt_domain() works when a new resource group is created ...
> the algorithm looks through all allocated CLOSID and the associated schemata impact
> the CBM of the new group. If an allocated CLOSID is associated with L3CODE
> then its "peer" L3DATA is also taken into account, similar for L3DATA.
> If only L3CODE is updated for the io_alloc_closid then it looks to me that
> whatever the original L3DATA schema was will keep impacting new resource
> groups. To avoid that and ensure only accurate CBMs are used it looks to me
> as though the L3DATA and L3CODE schema needs to be kept in sync.
Sure. Will verify this.
>
>> +static struct resctrl_schema *resctrl_schema_io_alloc(struct resctrl_schema *s)
>> +{
>> + struct resctrl_schema *schema;
>> +
>> + if (s->conf_type == CDP_DATA) {
>> + list_for_each_entry(schema, &resctrl_schema_all, list) {
>> + if (schema->conf_type == CDP_CODE)
>> + return schema;
>> + }
>> + }
>> +
>> + return s;
>> +}
>> +
>> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
>> + size_t nbytes, loff_t off)
>> +{
>> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> + struct rdt_resource *r = s->res;
>> + char const *grp_name;
>> + u32 io_alloc_closid;
>> + bool enable;
>> + int ret;
>> +
>> + ret = kstrtobool(buf, &enable);
>> + if (ret)
>> + return ret;
>> +
>> + cpus_read_lock();
>> + mutex_lock(&rdtgroup_mutex);
>> +
>> + rdt_last_cmd_clear();
>> +
>> + if (!r->cache.io_alloc_capable) {
>> + rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
>
> This could be more useful with a small change:
> "io_alloc is not supported on %s\n", s->name
Yes.
>
>> + ret = -ENODEV;
>> + goto out_io_alloc;
>
> out_io_alloc -> out_unlock
Sure.
>
>> + }
>> +
>> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
>> + if (io_alloc_closid < 0) {
>> + rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
>
> This could be more useful to help debug by printing the value of io_alloc_closid
> that user can compare against output of num_closids files. Here the terms become
> a bit complicated since ideally we want to use ctrl_hw_id but that does not match
> "num_closids", so perhaps use both terms, for example "CLOSID (ctrl_hw_id)"?
> I am not sure here.
Fine with me.
>
>> + ret = -EINVAL;
>> + goto out_io_alloc;
>> + }
>> +
>> + if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
>> + if (enable) {
>> + ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
>> + if (ret < 0) {
>> + grp_name = rdtgroup_name_by_closid(io_alloc_closid);
>
> Below handles !grp_name but that would be a kernel bug, no? Maybe WARN_ON_ONCE()?
Sure.
>
>> + rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
>> + grp_name ? grp_name : "another");
>
>
> CLOSID -> ctrl_hw_id
>
Sure.
>> + ret = -EINVAL;
>> + goto out_io_alloc;
>> + }
>> +
>> + ret = resctrl_io_alloc_init_cat(r, resctrl_schema_io_alloc(s),
>> + io_alloc_closid);
>> + if (ret) {
>> + rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
>> + resctrl_io_alloc_closid_free(io_alloc_closid);
>> + goto out_io_alloc;
>> + }
>> +
>> + } else {
>> + resctrl_io_alloc_closid_free(io_alloc_closid);
>> + }
>> +
>> + ret = resctrl_arch_io_alloc_enable(r, enable);
>> + }
>> +
>> +out_io_alloc:
>
> out_unlock ... to match the other places in resctrl.
Sure.
>
>> + mutex_unlock(&rdtgroup_mutex);
>> + cpus_read_unlock();
>> +
>> + return ret ?: nbytes;
>> +}
>> +
>> /* rdtgroup information files for one cache resource. */
>> static struct rftype res_common_files[] = {
>> {
>> @@ -1955,6 +2149,13 @@ static struct rftype res_common_files[] = {
>> .seq_show = rdtgroup_schemata_show,
>> .fflags = RFTYPE_CTRL_BASE,
>> },
>> + {
>> + .name = "io_alloc",
>> + .mode = 0644,
>> + .kf_ops = &rdtgroup_kf_single_ops,
>> + .seq_show = resctrl_io_alloc_show,
>> + .write = resctrl_io_alloc_write,
>> + },
>
> Please match existing code wrt tab usage.
Sure.
>
>> {
>> .name = "mba_MBps_event",
>> .mode = 0644,
>> @@ -2062,6 +2263,15 @@ static void thread_throttle_mode_init(void)
>> RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
>> }
>>
>> +static void io_alloc_init(void)
>> +{
>> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>> +
>> + if (r->cache.io_alloc_capable)
>> + resctrl_file_fflags_init("io_alloc",
>> + RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
>> +}
>
> Note that even if it only checks the L3 cache resource, this will make the
> file visible to all cache resources, including L2. This is why it is important to
> ensure that documentation and implementation also accommodate resources that
> do not support io_alloc.
Agree.
>
>> +
>> void resctrl_file_fflags_init(const char *config, unsigned long fflags)
>> {
>> struct rftype *rft;
>> @@ -4249,6 +4459,8 @@ int resctrl_init(void)
>>
>> thread_throttle_mode_init();
>>
>> + io_alloc_init();
>> +
>> ret = resctrl_mon_resource_init();
>> if (ret)
>> return ret;
>
> Reinette
>
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
2025-06-19 18:41 ` Moger, Babu
@ 2025-06-20 15:53 ` Reinette Chatre
2025-06-20 21:57 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-20 15:53 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/19/25 11:41 AM, Moger, Babu wrote:
> On 6/17/25 22:59, Reinette Chatre wrote:
>> On 6/11/25 2:23 PM, Babu Moger wrote:
...
>>> + */
>>> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
>>> +{
>>> + int num_closids = closids_supported();
>>> +
>>> + if (resctrl_arch_get_cdp_enabled(r->rid))
>>> + num_closids *= 2;
>>> +
>>> + if (num_closids != resctrl_arch_get_num_closid(r))
>>> + return -ENOSPC;
>>> +
>>> + return closids_supported() - 1;
>>> +}
>>
>> resctrl_io_alloc_closid_get() seems to be trying to do two things:
>> - determine what the io_alloc_closid is
>> - make sure the io_alloc_closid is supported
>>
>> I think this should be split into two functions. Once the
>> io_alloc_closid is determined to be supported and io_alloc
>> enabled then there is no reason to keep checking if it is
>> supported whenever the io_alloc_closid is queried.
>>
>> How about simplifying this to:
>>
>> /*
>> * note how this returns u32 that will eliminate
>> * unnecessary error checking in usages where io_alloc_closid
>> * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
>> * already confirmed io_alloc is enabled
>> * function comment could note that this returns the CLOSID
>> * required by io_alloc but not whether the CLOSID can
>> * be supported, for this resctrl_io_alloc_closid_supported() should
>> * be used.
>> * Can also note that returned value will always be valid if
>> * resctrl_arch_get_io_alloc_enabled(r) is true.
>> */
>> u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
>> if (resctrl_arch_get_cdp_enabled(r->rid))
>> return resctrl_arch_get_num_closid(r)/2 - 1
>> else
>> return resctrl_arch_get_num_closid(r) -1
>> }
>>
>> /*
>> * note how below already makes resctrl's io_alloc implementation
>> * more generic
>> */
>> resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
>> return io_alloc_closid < closids_supported()
>> }
>>
>
> Sure.
> Changed the check to
>
> return io_alloc_closid == (closids_supported() -1)
>
resctrl_io_alloc_closid_supported() is not intended to reflect what the
value is but just to check whether the provided value is supported. By
changing the check to the above, resctrl_io_alloc_closid_supported() again
does two things (what the move to new functions aimed to avoid): checking
that the CLOSID is supported while also requiring that it is the highest
supported CLOSID.
What issue(s) do you see with using "io_alloc_closid < closids_supported()"
as the check?
Reinette
* Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
2025-06-20 15:53 ` Reinette Chatre
@ 2025-06-20 21:57 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-20 21:57 UTC (permalink / raw)
To: Reinette Chatre, babu.moger, corbet, tony.luck, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/20/2025 10:53 AM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/19/25 11:41 AM, Moger, Babu wrote:
>> On 6/17/25 22:59, Reinette Chatre wrote:
>>> On 6/11/25 2:23 PM, Babu Moger wrote:
>
> ...
>
>>>> + */
>>>> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
>>>> +{
>>>> + int num_closids = closids_supported();
>>>> +
>>>> + if (resctrl_arch_get_cdp_enabled(r->rid))
>>>> + num_closids *= 2;
>>>> +
>>>> + if (num_closids != resctrl_arch_get_num_closid(r))
>>>> + return -ENOSPC;
>>>> +
>>>> + return closids_supported() - 1;
>>>> +}
>>>
>>> resctrl_io_alloc_closid_get() seems to be trying to do two things:
>>> - determine what the io_alloc_closid is
>>> - make sure the io_alloc_closid is supported
>>>
>>> I think this should be split into two functions. Once the
>>> io_alloc_closid is determined to be supported and io_alloc
>>> enabled then there is no reason to keep checking if it is
>>> supported whenever the io_alloc_closid is queried.
>>>
>>> How about simplifying this to:
>>>
>>> /*
>>> * note how this returns u32 that will eliminate
>>> * unnecessary error checking in usages where io_alloc_closid
>>> * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
>>> * already confirmed io_alloc is enabled
>>> * function comment could note that this returns the CLOSID
>>> * required by io_alloc but not whether the CLOSID can
>>> * be supported, for this resctrl_io_alloc_closid_supported() should
>>> * be used.
>>> * Can also note that returned value will always be valid if
>>> * resctrl_arch_get_io_alloc_enabled(r) is true.
>>> */
>>> u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
>>> if (resctrl_arch_get_cdp_enabled(r->rid))
>>> return resctrl_arch_get_num_closid(r)/2 - 1
>>> else
>>> return resctrl_arch_get_num_closid(r) -1
>>> }
>>>
>>> /*
>>> * note how below already makes resctrl's io_alloc implementation
>>> * more generic
>>> */
>>> resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
>>> return io_alloc_closid < closids_supported()
>>> }
>>>
>>
>> Sure.
>> Changed the check to
>>
>> return io_alloc_closid == (closids_supported() -1)
>>
>
> resctrl_io_alloc_closid_supported() is not intended to reflect what the
> value is but just to check whether the provided value is supported. By
> changing the check to the above, resctrl_io_alloc_closid_supported() again
> does two things (what the move to new functions aimed to avoid): checking
> that the CLOSID is supported while also requiring that it is the highest
> supported CLOSID.
> What issue(s) do you see with using "io_alloc_closid < closids_supported()"
> as the check?
I don't see any issue. It should be fine. Will test and verify it.
thanks
Babu
* [PATCH v6 6/8] fs/resctrl: Introduce interface to display io_alloc CBMs
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (4 preceding siblings ...)
2025-06-11 21:23 ` [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-18 4:01 ` Reinette Chatre
2025-06-11 21:23 ` [PATCH v6 7/8] fs/resctrl: Modify rdt_parse_data to pass mode and CLOSID Babu Moger
2025-06-11 21:23 ` [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks Babu Moger
7 siblings, 1 reply; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
The io_alloc feature in resctrl enables system software to configure
the portion of the L3 cache allocated for I/O traffic.
Add the interface to display CBMs (Capacity Bit Mask) of io_alloc
feature.
The CBM interface file io_alloc_cbm will reside in the info directory
(e.g., /sys/fs/resctrl/info/L3/). Displaying the resource name is not
necessary. Pass the resource name to show_doms() and print it only if
the name is valid. For io_alloc, pass NULL to suppress printing the
resource name.
When CDP is enabled, io_alloc routes traffic using the highest CLOSID
associated with an L3CODE resource. However, CBMs can be accessed via
either L3CODE or L3DATA resources.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Added "io_alloc_cbm" details in user doc resctrl.rst.
Resource name is not printed in CBM now. Corrected the texts about it
in resctrl.rst.
v5: Resolved conflicts due to recent resctrl FS/ARCH code restructure.
Updated show_doms() to print the resource if only it is valid. Pass NULL while
printing io_alloc CBM.
Changed the code to access the CBMs via either L3CODE or L3DATA resources.
v4: Updated the change log.
Added rdtgroup_mutex before rdt_last_cmd_puts().
Returned -ENODEV when resource type is CDP_DATA.
Kept the resource name while printing the CBM (L3:0=fff). That way
I don't have to change show_doms() just for this feature and it is
consistent across all the schemata displays.
v3: Minor changes due to changes in resctrl_arch_get_io_alloc_enabled()
and resctrl_io_alloc_closid_get().
Added the check to verify CDP resource type.
Updated the commit log.
v2: Fixed to display only on L3 resources.
Added the locks while processing.
Renamed the display to io_alloc_cbm (from sdciae_cmd).
---
Documentation/filesystems/resctrl.rst | 13 +++++++
fs/resctrl/ctrlmondata.c | 8 +++--
fs/resctrl/internal.h | 2 ++
fs/resctrl/rdtgroup.c | 51 ++++++++++++++++++++++++++-
4 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 03c829b2c276..b31748ec8c61 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -169,6 +169,19 @@ related to allocation:
When CDP is enabled, io_alloc routes I/O traffic using the highest
CLOSID allocated for the instruction cache (L3CODE).
+"io_alloc_cbm":
+ Capacity Bit Masks (CBMs) available to supported IO devices which
+ can directly insert cache lines in L3 which can help to reduce the
+ latency. CBMs are displayed in the following format:
+
+ <cache_id0>=<cbm>;<cache_id1>=<cbm>;...
+
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
+ 0=ffff;1=ffff
+
+
Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index 6ed2dfd4dbbd..ea039852569a 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -381,7 +381,8 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}
-static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
+void show_doms(struct seq_file *s, struct resctrl_schema *schema, char *resource_name,
+ int closid)
{
struct rdt_resource *r = schema->res;
struct rdt_ctrl_domain *dom;
@@ -391,7 +392,8 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
/* Walking r->domains, ensure it can't race with cpuhp */
lockdep_assert_cpus_held();
- seq_printf(s, "%*s:", max_name_width, schema->name);
+ if (resource_name)
+ seq_printf(s, "%*s:", max_name_width, resource_name);
list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
if (sep)
seq_puts(s, ";");
@@ -437,7 +439,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
closid = rdtgrp->closid;
list_for_each_entry(schema, &resctrl_schema_all, list) {
if (closid < schema->num_closid)
- show_doms(s, schema, closid);
+ show_doms(s, schema, schema->name, closid);
}
}
} else {
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 9a8cf6f11151..14f3697c1187 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -374,6 +374,8 @@ void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
+void show_doms(struct seq_file *s, struct resctrl_schema *schema,
+ char *name, int closid);
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index bbc032b4d0e9..0c2d2cf4baa1 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1997,6 +1997,46 @@ static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
return ret ?: nbytes;
}
+static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ u32 io_alloc_closid;
+ int ret = 0;
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ if (!r->cache.io_alloc_capable) {
+ rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+ ret = -ENODEV;
+ goto cbm_show_out;
+ }
+
+ if (!resctrl_arch_get_io_alloc_enabled(r)) {
+ rdt_last_cmd_puts("io_alloc feature is not enabled\n");
+ ret = -EINVAL;
+ goto cbm_show_out;
+ }
+
+ io_alloc_closid = resctrl_io_alloc_closid_get(r);
+ if (io_alloc_closid < 0) {
+ rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+ ret = -EINVAL;
+ goto cbm_show_out;
+ }
+
+ show_doms(seq, resctrl_schema_io_alloc(s), NULL, io_alloc_closid);
+
+cbm_show_out:
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+ return ret;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -2156,6 +2196,12 @@ static struct rftype res_common_files[] = {
.seq_show = resctrl_io_alloc_show,
.write = resctrl_io_alloc_write,
},
+ {
+ .name = "io_alloc_cbm",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = resctrl_io_alloc_cbm_show,
+ },
{
.name = "mba_MBps_event",
.mode = 0644,
@@ -2267,9 +2313,12 @@ static void io_alloc_init(void)
{
struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
- if (r->cache.io_alloc_capable)
+ if (r->cache.io_alloc_capable) {
resctrl_file_fflags_init("io_alloc",
RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+ resctrl_file_fflags_init("io_alloc_cbm",
+ RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+ }
}
void resctrl_file_fflags_init(const char *config, unsigned long fflags)
--
2.34.1
* Re: [PATCH v6 6/8] fs/resctrl: Introduce interface to display io_alloc CBMs
2025-06-11 21:23 ` [PATCH v6 6/8] fs/resctrl: Introduce interface to display io_alloc CBMs Babu Moger
@ 2025-06-18 4:01 ` Reinette Chatre
2025-06-20 21:57 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 4:01 UTC (permalink / raw)
To: Babu Moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/11/25 2:23 PM, Babu Moger wrote:
> The io_alloc feature in resctrl enables system software to configure
> the portion of the L3 cache allocated for I/O traffic.
Drop L3?
>
> Add the interface to display CBMs (Capacity Bit Mask) of io_alloc
> feature.
After the fs/arch split it is not always obvious what is meant with
"interface" ... it could be new API between fs and arch or it could be
new resctrl file.
This can be specific:
Add "io_alloc_cbm" resctrl file to display ...
>
> The CBM interface file io_alloc_cbm will reside in the info directory
> (e.g., /sys/fs/resctrl/info/L3/). Displaying the resource name is not
> necessary. Pass the resource name to show_doms() and print it only if
> the name is valid. For io_alloc, pass NULL to suppress printing the
> resource name.
>
> When CDP is enabled, io_alloc routes traffic using the highest CLOSID
> associated with an L3CODE resource. However, CBMs can be accessed via
> either L3CODE or L3DATA resources.
>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...
> ---
> Documentation/filesystems/resctrl.rst | 13 +++++++
> fs/resctrl/ctrlmondata.c | 8 +++--
> fs/resctrl/internal.h | 2 ++
> fs/resctrl/rdtgroup.c | 51 ++++++++++++++++++++++++++-
> 4 files changed, 70 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 03c829b2c276..b31748ec8c61 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -169,6 +169,19 @@ related to allocation:
> When CDP is enabled, io_alloc routes I/O traffic using the highest
> CLOSID allocated for the instruction cache (L3CODE).
>
> +"io_alloc_cbm":
> + Capacity Bit Masks (CBMs) available to supported IO devices which
> + can directly insert cache lines in L3 which can help to reduce the
"CBMs that describe the portions of cache instances to which I/O traffic
from supported IO devices are routed."
Please check ... there seems to be some inconsistency in "IO" vs "I/O" use.
Also consider something like,
"When CDP is enabled "io_alloc_cbm" associated with the DATA
and CODE resources may reflect the same values. For example, values read from
and written to /sys/fs/resctrl/info/L3DATA/io_alloc_cbm may be reflected by
/sys/fs/resctrl/info/L3CODE/io_alloc_cbm and vice versa."
What do you think?
> + latency. CBMs are displayed in the following format:
> +
> + <cache_id0>=<cbm>;<cache_id1>=<cbm>;...
> +
> + Example::
> +
> + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
> + 0=ffff;1=ffff
> +
> +
> Memory bandwidth(MB) subdirectory contains the following files
> with respect to allocation:
>
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index 6ed2dfd4dbbd..ea039852569a 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -381,7 +381,8 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
> return ret ?: nbytes;
> }
>
> -static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
> +void show_doms(struct seq_file *s, struct resctrl_schema *schema, char *resource_name,
> + int closid)
> {
> struct rdt_resource *r = schema->res;
> struct rdt_ctrl_domain *dom;
> @@ -391,7 +392,8 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
> /* Walking r->domains, ensure it can't race with cpuhp */
> lockdep_assert_cpus_held();
>
> - seq_printf(s, "%*s:", max_name_width, schema->name);
> + if (resource_name)
> + seq_printf(s, "%*s:", max_name_width, resource_name);
> list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
> if (sep)
> seq_puts(s, ";");
> @@ -437,7 +439,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
> closid = rdtgrp->closid;
> list_for_each_entry(schema, &resctrl_schema_all, list) {
> if (closid < schema->num_closid)
> - show_doms(s, schema, closid);
> + show_doms(s, schema, schema->name, closid);
> }
> }
> } else {
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 9a8cf6f11151..14f3697c1187 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -374,6 +374,8 @@ void rdt_staged_configs_clear(void);
> bool closid_allocated(unsigned int closid);
>
> int resctrl_find_cleanest_closid(void);
> +void show_doms(struct seq_file *s, struct resctrl_schema *schema,
> + char *name, int closid);
>
> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index bbc032b4d0e9..0c2d2cf4baa1 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -1997,6 +1997,46 @@ static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
> return ret ?: nbytes;
> }
>
> +static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
> + struct seq_file *seq, void *v)
> +{
> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> + struct rdt_resource *r = s->res;
> + u32 io_alloc_closid;
> + int ret = 0;
> +
> + cpus_read_lock();
> + mutex_lock(&rdtgroup_mutex);
> +
> + rdt_last_cmd_clear();
> +
> + if (!r->cache.io_alloc_capable) {
> + rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
> + ret = -ENODEV;
> + goto cbm_show_out;
out_unlock
> + }
> +
> + if (!resctrl_arch_get_io_alloc_enabled(r)) {
> + rdt_last_cmd_puts("io_alloc feature is not enabled\n");
> + ret = -EINVAL;
> + goto cbm_show_out;
> + }
> +
> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
> + if (io_alloc_closid < 0) {
Another example where io_alloc_closid must be valid thanks to earlier resctrl_arch_get_io_alloc_enabled(r).
> + rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
> + ret = -EINVAL;
> + goto cbm_show_out;
> + }
> +
> + show_doms(seq, resctrl_schema_io_alloc(s), NULL, io_alloc_closid);
> +
> +cbm_show_out:
out_unlock ... to match rest of resctrl
> + mutex_unlock(&rdtgroup_mutex);
> + cpus_read_unlock();
> + return ret;
> +}
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
Reinette
* Re: [PATCH v6 6/8] fs/resctrl: Introduce interface to display io_alloc CBMs
2025-06-18 4:01 ` Reinette Chatre
@ 2025-06-20 21:57 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-20 21:57 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet, tony.luck, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/17/2025 11:01 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> The io_alloc feature in resctrl enables system software to configure
>> the portion of the L3 cache allocated for I/O traffic.
>
> Drop L3?
Sure.
>>
>> Add the interface to display CBMs (Capacity Bit Mask) of io_alloc
>> feature.
>
> After the fs/arch split it is not always obvious what is meant with
> "interface" ... it could be new API between fs and arch or it could be
> new resctrl file.
> This can be specific:
> Add "io_alloc_cbm" resctrl file to display ...
>
Sure.
>>
>> The CBM interface file io_alloc_cbm will reside in the info directory
>> (e.g., /sys/fs/resctrl/info/L3/). Displaying the resource name is not
>> necessary. Pass the resource name to show_doms() and print it only if
>> the name is valid. For io_alloc, pass NULL to suppress printing the
>> resource name.
>>
>> When CDP is enabled, io_alloc routes traffic using the highest CLOSID
>> associated with an L3CODE resource. However, CBMs can be accessed via
>> either L3CODE or L3DATA resources.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>
> ...
>
>> ---
>> Documentation/filesystems/resctrl.rst | 13 +++++++
>> fs/resctrl/ctrlmondata.c | 8 +++--
>> fs/resctrl/internal.h | 2 ++
>> fs/resctrl/rdtgroup.c | 51 ++++++++++++++++++++++++++-
>> 4 files changed, 70 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index 03c829b2c276..b31748ec8c61 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -169,6 +169,19 @@ related to allocation:
>> When CDP is enabled, io_alloc routes I/O traffic using the highest
>> CLOSID allocated for the instruction cache (L3CODE).
>>
>> +"io_alloc_cbm":
>> + Capacity Bit Masks (CBMs) available to supported IO devices which
>> + can directly insert cache lines in L3 which can help to reduce the
>
> "CBMs that describe the portions of cache instances to which I/O traffic
> from supported IO devices are routed."
Sure.
> Please check ... there seems to be some inconsistency in "IO" vs "I/O" use.
Changed to "I/O" in all the places.
>
> Also consider something like,
> "When CDP is enabled "io_alloc_cbm" associated with the DATA
> and CODE resources may reflect the same values. For example, values read from
> and written to /sys/fs/resctrl/info/L3DATA/io_alloc_cbm may be reflected by
> /sys/fs/resctrl/info/L3CODE/io_alloc_cbm and vice versa."
> What do you think?
Looks good.
>
>> + latency. CBMs are displayed in the following format:
>> +
>> + <cache_id0>=<cbm>;<cache_id1>=<cbm>;...
>> +
>> + Example::
>> +
>> + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>> + 0=ffff;1=ffff
>> +
>> +
>> Memory bandwidth(MB) subdirectory contains the following files
>> with respect to allocation:
>>
>> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
>> index 6ed2dfd4dbbd..ea039852569a 100644
>> --- a/fs/resctrl/ctrlmondata.c
>> +++ b/fs/resctrl/ctrlmondata.c
>> @@ -381,7 +381,8 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
>> return ret ?: nbytes;
>> }
>>
>> -static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
>> +void show_doms(struct seq_file *s, struct resctrl_schema *schema, char *resource_name,
>> + int closid)
>> {
>> struct rdt_resource *r = schema->res;
>> struct rdt_ctrl_domain *dom;
>> @@ -391,7 +392,8 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
>> /* Walking r->domains, ensure it can't race with cpuhp */
>> lockdep_assert_cpus_held();
>>
>> - seq_printf(s, "%*s:", max_name_width, schema->name);
>> + if (resource_name)
>> + seq_printf(s, "%*s:", max_name_width, resource_name);
>> list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
>> if (sep)
>> seq_puts(s, ";");
>> @@ -437,7 +439,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
>> closid = rdtgrp->closid;
>> list_for_each_entry(schema, &resctrl_schema_all, list) {
>> if (closid < schema->num_closid)
>> - show_doms(s, schema, closid);
>> + show_doms(s, schema, schema->name, closid);
>> }
>> }
>> } else {
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index 9a8cf6f11151..14f3697c1187 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -374,6 +374,8 @@ void rdt_staged_configs_clear(void);
>> bool closid_allocated(unsigned int closid);
>>
>> int resctrl_find_cleanest_closid(void);
>> +void show_doms(struct seq_file *s, struct resctrl_schema *schema,
>> + char *name, int closid);
>>
>> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index bbc032b4d0e9..0c2d2cf4baa1 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -1997,6 +1997,46 @@ static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
>> return ret ?: nbytes;
>> }
>>
>> +static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
>> + struct seq_file *seq, void *v)
>> +{
>> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> + struct rdt_resource *r = s->res;
>> + u32 io_alloc_closid;
>> + int ret = 0;
>> +
>> + cpus_read_lock();
>> + mutex_lock(&rdtgroup_mutex);
>> +
>> + rdt_last_cmd_clear();
>> +
>> + if (!r->cache.io_alloc_capable) {
>> + rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
>> + ret = -ENODEV;
>> + goto cbm_show_out;
>
> out_unlock
>
Sure.
>> + }
>> +
>> + if (!resctrl_arch_get_io_alloc_enabled(r)) {
>> + rdt_last_cmd_puts("io_alloc feature is not enabled\n");
>> + ret = -EINVAL;
>> + goto cbm_show_out;
>> + }
>> +
>> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
>> + if (io_alloc_closid < 0) {
>
> Another example where io_alloc_closid must be valid thanks to earlier resctrl_arch_get_io_alloc_enabled(r).
Sure. Will remove this check.
>
>> + rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
>> + ret = -EINVAL;
>> + goto cbm_show_out;
>> + }
>> +
>> + show_doms(seq, resctrl_schema_io_alloc(s), NULL, io_alloc_closid);
>> +
>> +cbm_show_out:
>
> out_unlock ... to match rest of resctrl
sure.
>
>> + mutex_unlock(&rdtgroup_mutex);
>> + cpus_read_unlock();
>> + return ret;
>> +}
>> +
>> /* rdtgroup information files for one cache resource. */
>> static struct rftype res_common_files[] = {
>> {
>
> Reinette
>
thanks
Babu
* [PATCH v6 7/8] fs/resctrl: Modify rdt_parse_data to pass mode and CLOSID
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (5 preceding siblings ...)
2025-06-11 21:23 ` [PATCH v6 6/8] fs/resctrl: Introduce interface to display io_alloc CBMs Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
2025-06-18 4:03 ` Reinette Chatre
2025-06-11 21:23 ` [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks Babu Moger
7 siblings, 1 reply; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
The functions parse_cbm() and parse_bw() require mode and CLOSID to
validate the Capacity Bit Mask (CBM). It is passed through struct
rdtgroup in rdt_parse_data. Instead of passing them through struct
rdtgroup, pass mode and closid directly.
This change enables parse_cbm() to be used for verifying CBM in io_alloc
feature.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Changed the subject line to fs/resctrl.
v5: Resolved conflicts due to recent resctrl FS/ARCH code restructure.
v4: New patch to call parse_cbm() directly to avoid code duplication.
---
fs/resctrl/ctrlmondata.c | 29 +++++++++++++----------------
fs/resctrl/internal.h | 6 ++++++
2 files changed, 19 insertions(+), 16 deletions(-)
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index ea039852569a..6409637b4de6 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -23,11 +23,6 @@
#include "internal.h"
-struct rdt_parse_data {
- struct rdtgroup *rdtgrp;
- char *buf;
-};
-
typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
struct resctrl_schema *s,
struct rdt_ctrl_domain *d);
@@ -77,8 +72,8 @@ static int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_ctrl_domain *d)
{
struct resctrl_staged_config *cfg;
- u32 closid = data->rdtgrp->closid;
struct rdt_resource *r = s->res;
+ u32 closid = data->closid;
u32 bw_val;
cfg = &d->staged_config[s->conf_type];
@@ -156,9 +151,10 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_ctrl_domain *d)
{
- struct rdtgroup *rdtgrp = data->rdtgrp;
+ enum rdtgrp_mode mode = data->mode;
struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
+ u32 closid = data->closid;
u32 cbm_val;
cfg = &d->staged_config[s->conf_type];
@@ -171,7 +167,7 @@ static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
* Cannot set up more than one pseudo-locked region in a cache
* hierarchy.
*/
- if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
+ if (mode == RDT_MODE_PSEUDO_LOCKSETUP &&
rdtgroup_pseudo_locked_in_hierarchy(d)) {
rdt_last_cmd_puts("Pseudo-locked region in hierarchy\n");
return -EINVAL;
@@ -180,9 +176,9 @@ static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
if (!cbm_validate(data->buf, &cbm_val, r))
return -EINVAL;
- if ((rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
- rdtgrp->mode == RDT_MODE_SHAREABLE) &&
- rdtgroup_cbm_overlaps_pseudo_locked(d, cbm_val)) {
+ if ((mode == RDT_MODE_EXCLUSIVE ||
+ mode == RDT_MODE_SHAREABLE) &&
+ rdtgroup_cbm_overlaps_pseudo_locked(d, cbm_val)) {
rdt_last_cmd_puts("CBM overlaps with pseudo-locked region\n");
return -EINVAL;
}
@@ -191,14 +187,14 @@ static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
* The CBM may not overlap with the CBM of another closid if
* either is exclusive.
*/
- if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, true)) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, closid, true)) {
rdt_last_cmd_puts("Overlaps with exclusive group\n");
return -EINVAL;
}
- if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, false)) {
- if (rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
- rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, closid, false)) {
+ if (mode == RDT_MODE_EXCLUSIVE ||
+ mode == RDT_MODE_PSEUDO_LOCKSETUP) {
rdt_last_cmd_puts("Overlaps with other group\n");
return -EINVAL;
}
@@ -262,7 +258,8 @@ static int parse_line(char *line, struct resctrl_schema *s,
list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
if (d->hdr.id == dom_id) {
data.buf = dom;
- data.rdtgrp = rdtgrp;
+ data.closid = rdtgrp->closid;
+ data.mode = rdtgrp->mode;
if (parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 14f3697c1187..10a3188ffa54 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -201,6 +201,12 @@ struct rdtgroup {
struct pseudo_lock_region *plr;
};
+struct rdt_parse_data {
+ u32 closid;
+ enum rdtgrp_mode mode;
+ char *buf;
+};
+
/* rdtgroup.flags */
#define RDT_DELETED 1
--
2.34.1
* Re: [PATCH v6 7/8] fs/resctrl: Modify rdt_parse_data to pass mode and CLOSID
2025-06-11 21:23 ` [PATCH v6 7/8] fs/resctrl: Modify rdt_parse_data to pass mode and CLOSID Babu Moger
@ 2025-06-18 4:03 ` Reinette Chatre
2025-06-20 21:58 ` Moger, Babu
0 siblings, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 4:03 UTC (permalink / raw)
To: Babu Moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/11/25 2:23 PM, Babu Moger wrote:
> The functions parse_cbm() and parse_bw() require mode and CLOSID to
> validate the Capacity Bit Mask (CBM). It is passed through struct
> rdtgroup in rdt_parse_data. Instead of passing them through struct
> rdtgroup, pass mode and closid directly.
Above looks like a combination of context and solution description.
Expectation is for context, problem, and solution to be in this order
and in separate paragraphs.
>
> This change enables parse_cbm() to be used for verifying CBM in io_alloc
> feature.
Is this the problem? It is not clear from changelog what problem is
being solved.
Also, please drop "This change" that is semantically the same as "This patch".
Reinette
* Re: [PATCH v6 7/8] fs/resctrl: Modify rdt_parse_data to pass mode and CLOSID
2025-06-18 4:03 ` Reinette Chatre
@ 2025-06-20 21:58 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-20 21:58 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet, tony.luck, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/17/2025 11:03 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> The functions parse_cbm() and parse_bw() require mode and CLOSID to
>> validate the Capacity Bit Mask (CBM). It is passed through struct
>> rdtgroup in rdt_parse_data. Instead of passing them through struct
>> rdtgroup, pass mode and closid directly.
>
> Above looks like a combination of context and solution description.
> Expectation is for context, problem, and solution to be in this order
> and in separate paragraphs.
>
Will rephrase it.
>>
>> This change enables parse_cbm() to be used for verifying CBM in io_alloc
>> feature.
>
> Is this the problem? It is not clear from changelog what problem is
> being solved.
>
> Also, please drop "This change" that is semantically the same as "This patch".
>
Sure. Will change it.
Thanks
Babu
* [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
2025-06-11 21:23 [PATCH v6 0/8] x86/resctrl: Support L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE) Babu Moger
` (6 preceding siblings ...)
2025-06-11 21:23 ` [PATCH v6 7/8] fs/resctrl: Modify rdt_parse_data to pass mode and CLOSID Babu Moger
@ 2025-06-11 21:23 ` Babu Moger
[not found] ` <202506131403.MWHHLsxB-lkp@intel.com>
2025-06-18 4:03 ` Reinette Chatre
7 siblings, 2 replies; 26+ messages in thread
From: Babu Moger @ 2025-06-11 21:23 UTC (permalink / raw)
To: babu.moger, corbet, tony.luck, reinette.chatre, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
"io_alloc" feature is a mechanism that enables direct insertion of data
from I/O devices into the L3 cache. By directly caching data from I/O
devices rather than first storing the I/O data in DRAM, it reduces the
demands on DRAM bandwidth and reduces latency to the processor consuming
the I/O data.
"io_alloc" feature uses the highest CLOSID to route the traffic from I/O
devices. Provide the interface to modify io_alloc CBMs (Capacity Bit Mask)
when feature is enabled.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Updated the user doc resctrl.rst with minor text changes.
Changed the subject to fs/resctrl.
v5: Changes due to FS/ARCH code restructure. The files monitor.c/rdtgroup.c
have been split between FS and ARCH directories.
Changed the code to access the CBMs via either L3CODE or L3DATA resources.
v4: Removed resctrl_io_alloc_parse_cbm and called parse_cbm() directly.
v3: Minor changes due to changes in resctrl_arch_get_io_alloc_enabled()
and resctrl_io_alloc_closid_get().
Taken care of handling the CBM update when CDP is enabled.
Updated the commit log to make it generic.
v2: Added more generic text in documentation.
---
Documentation/filesystems/resctrl.rst | 13 ++++
fs/resctrl/ctrlmondata.c | 4 +-
fs/resctrl/internal.h | 2 +
fs/resctrl/rdtgroup.c | 89 ++++++++++++++++++++++++++-
4 files changed, 105 insertions(+), 3 deletions(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index b31748ec8c61..ae1157dcb4a3 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -181,6 +181,19 @@ related to allocation:
# cat /sys/fs/resctrl/info/L3/io_alloc_cbm
0=ffff;1=ffff
+ CBM can be configured by writing to the interface.
+
+ Example::
+
+ # echo 1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
+ # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
+ 0=ffff;1=00ff
+
+ When CDP is enabled, io_alloc directs traffic using the highest CLOSID
+ linked to an L3CODE resource. Although CBMs can be accessed through
+ either L3CODE or L3DATA resources, any updates to the schemata are
+ always routed through L3CODE.
+
Memory bandwidth(MB) subdirectory contains the following files
with respect to allocation:
diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
index 6409637b4de6..f3e5e697945c 100644
--- a/fs/resctrl/ctrlmondata.c
+++ b/fs/resctrl/ctrlmondata.c
@@ -148,8 +148,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
- struct rdt_ctrl_domain *d)
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d)
{
enum rdtgrp_mode mode = data->mode;
struct resctrl_staged_config *cfg;
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 10a3188ffa54..755f23934295 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -382,6 +382,8 @@ bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
void show_doms(struct seq_file *s, struct resctrl_schema *schema,
char *name, int closid);
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
+ struct rdt_ctrl_domain *d);
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 0c2d2cf4baa1..f6c44fae4b72 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2037,6 +2037,92 @@ static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
return ret;
}
+static int resctrl_io_alloc_parse_line(char *line, struct rdt_resource *r,
+ struct resctrl_schema *s, u32 closid)
+{
+ struct rdt_parse_data data;
+ struct rdt_ctrl_domain *d;
+ char *dom = NULL, *id;
+ unsigned long dom_id;
+
+next:
+ if (!line || line[0] == '\0')
+ return 0;
+
+ dom = strsep(&line, ";");
+ id = strsep(&dom, "=");
+ if (!dom || kstrtoul(id, 10, &dom_id)) {
+ rdt_last_cmd_puts("Missing '=' or non-numeric domain\n");
+ return -EINVAL;
+ }
+
+ dom = strim(dom);
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
+ if (d->hdr.id == dom_id) {
+ data.buf = dom;
+ data.mode = RDT_MODE_SHAREABLE;
+ data.closid = closid;
+ if (parse_cbm(&data, s, d))
+ return -EINVAL;
+ goto next;
+ }
+ }
+ return -EINVAL;
+}
+
+static ssize_t resctrl_io_alloc_cbm_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes, loff_t off)
+{
+ struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+ struct rdt_resource *r = s->res;
+ u32 io_alloc_closid;
+ int ret = 0;
+
+ /* Valid input requires a trailing newline */
+ if (nbytes == 0 || buf[nbytes - 1] != '\n')
+ return -EINVAL;
+
+ buf[nbytes - 1] = '\0';
+
+ if (!r->cache.io_alloc_capable) {
+ rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+ return -EINVAL;
+ }
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+ rdt_staged_configs_clear();
+
+ if (!resctrl_arch_get_io_alloc_enabled(r)) {
+ rdt_last_cmd_puts("io_alloc feature is not enabled\n");
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ io_alloc_closid = resctrl_io_alloc_closid_get(r);
+ if (io_alloc_closid < 0) {
+ rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+ ret = -EINVAL;
+ goto cbm_write_out;
+ }
+
+ ret = resctrl_io_alloc_parse_line(buf, r, resctrl_schema_io_alloc(s),
+ io_alloc_closid);
+ if (ret)
+ goto cbm_write_out;
+
+ ret = resctrl_arch_update_domains(r, io_alloc_closid);
+
+cbm_write_out:
+ rdt_staged_configs_clear();
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -2198,9 +2284,10 @@ static struct rftype res_common_files[] = {
},
{
.name = "io_alloc_cbm",
- .mode = 0444,
+ .mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = resctrl_io_alloc_cbm_show,
+ .write = resctrl_io_alloc_cbm_write,
},
{
.name = "mba_MBps_event",
--
2.34.1
[parent not found: <202506131403.MWHHLsxB-lkp@intel.com>]
* Re: [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
[not found] ` <202506131403.MWHHLsxB-lkp@intel.com>
@ 2025-06-16 14:51 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-16 14:51 UTC (permalink / raw)
To: kernel test robot, corbet, tony.luck, reinette.chatre,
Dave.Martin, james.morse, tglx, mingo, bp, dave.hansen
Cc: oe-kbuild-all, x86, hpa, akpm, paulmck, rostedt, thuth, ardb,
gregkh, seanjc, thomas.lendacky, pawan.kumar.gupta, perry.yuan,
yosry.ahmed, kai.huang, xiaoyao.li, peterz, kan.liang,
mario.limonciello, xin3.li, sohil.mehta,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
On 6/13/25 02:04, kernel test robot wrote:
> Hi Babu,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on brauner-vfs/vfs.all]
> [also build test WARNING on linus/master v6.16-rc1 next-20250612]
> [cannot apply to tip/x86/core aegl/next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Babu-Moger/x86-cpufeatures-Add-support-for-L3-Smart-Data-Cache-Injection-Allocation-Enforcement/20250612-053050
> base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
> patch link: https://lore.kernel.org/r/6f8158ba0eebf41c9ec59e82dcdc28d4d3151c3b.1749677012.git.babu.moger%40amd.com
> patch subject: [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
> config: x86_64-randconfig-r071-20250612 (https://download.01.org/0day-ci/archive/20250613/202506131403.MWHHLsxB-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202506131403.MWHHLsxB-lkp@intel.com/
>
> New smatch warnings:
> fs/resctrl/rdtgroup.c:2105 resctrl_io_alloc_cbm_write() warn: unsigned 'io_alloc_closid' is never less than zero.
> fs/resctrl/rdtgroup.c:2105 resctrl_io_alloc_cbm_write() warn: error code type promoted to positive: 'io_alloc_closid'
>
> Old smatch warnings:
> fs/resctrl/rdtgroup.c:1961 resctrl_io_alloc_write() warn: unsigned 'io_alloc_closid' is never less than zero.
> fs/resctrl/rdtgroup.c:1961 resctrl_io_alloc_write() warn: error code type promoted to positive: 'io_alloc_closid'
> fs/resctrl/rdtgroup.c:2026 resctrl_io_alloc_cbm_show() warn: unsigned 'io_alloc_closid' is never less than zero.
> fs/resctrl/rdtgroup.c:2026 resctrl_io_alloc_cbm_show() warn: error code type promoted to positive: 'io_alloc_closid'
>
> vim +/io_alloc_closid +2105 fs/resctrl/rdtgroup.c
>
> 2072
> 2073 static ssize_t resctrl_io_alloc_cbm_write(struct kernfs_open_file *of,
> 2074 char *buf, size_t nbytes, loff_t off)
> 2075 {
> 2076 struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> 2077 struct rdt_resource *r = s->res;
> 2078 u32 io_alloc_closid;
> 2079 int ret = 0;
> 2080
> 2081 /* Valid input requires a trailing newline */
> 2082 if (nbytes == 0 || buf[nbytes - 1] != '\n')
> 2083 return -EINVAL;
> 2084
> 2085 buf[nbytes - 1] = '\0';
> 2086
> 2087 if (!r->cache.io_alloc_capable) {
> 2088 rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
> 2089 return -EINVAL;
> 2090 }
> 2091
> 2092 cpus_read_lock();
> 2093 mutex_lock(&rdtgroup_mutex);
> 2094
> 2095 rdt_last_cmd_clear();
> 2096 rdt_staged_configs_clear();
> 2097
> 2098 if (!resctrl_arch_get_io_alloc_enabled(r)) {
> 2099 rdt_last_cmd_puts("io_alloc feature is not enabled\n");
> 2100 ret = -EINVAL;
> 2101 goto cbm_write_out;
> 2102 }
> 2103
> 2104 io_alloc_closid = resctrl_io_alloc_closid_get(r);
>> 2105 if (io_alloc_closid < 0) {
> 2106 rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
Thanks. Will fix it in next revision.
--
Thanks
Babu Moger
* Re: [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
2025-06-11 21:23 ` [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks Babu Moger
[not found] ` <202506131403.MWHHLsxB-lkp@intel.com>
@ 2025-06-18 4:03 ` Reinette Chatre
2025-06-20 21:58 ` Moger, Babu
1 sibling, 1 reply; 26+ messages in thread
From: Reinette Chatre @ 2025-06-18 4:03 UTC (permalink / raw)
To: Babu Moger, corbet, tony.luck, Dave.Martin, james.morse, tglx,
mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Babu,
On 6/11/25 2:23 PM, Babu Moger wrote:
> "io_alloc" feature is a mechanism that enables direct insertion of data
""io_alloc" feature is a mechanism that enables" -> ""io_alloc" feature enables"
> from I/O devices into the L3 cache. By directly caching data from I/O
Drop L3
> devices rather than first storing the I/O data in DRAM, it reduces the
> demands on DRAM bandwidth and reduces latency to the processor consuming
> the I/O data.
>
> "io_alloc" feature uses the highest CLOSID to route the traffic from I/O
How is the CLOSID related here?
> devices. Provide the interface to modify io_alloc CBMs (Capacity Bit Mask)
> when feature is enabled.
>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...
> ---
> Documentation/filesystems/resctrl.rst | 13 ++++
> fs/resctrl/ctrlmondata.c | 4 +-
> fs/resctrl/internal.h | 2 +
> fs/resctrl/rdtgroup.c | 89 ++++++++++++++++++++++++++-
> 4 files changed, 105 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index b31748ec8c61..ae1157dcb4a3 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -181,6 +181,19 @@ related to allocation:
> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
> 0=ffff;1=ffff
>
> + CBM can be configured by writing to the interface.
> +
> + Example::
> +
> + # echo 1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
> + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
> + 0=ffff;1=00ff
> +
> + When CDP is enabled, io_alloc directs traffic using the highest CLOSID
> + linked to an L3CODE resource. Although CBMs can be accessed through
> + either L3CODE or L3DATA resources, any updates to the schemata are
> + always routed through L3CODE.
Please do not commit resctrl to this implementation by documenting it as part of
user interface. Could snippet about CDP I shared in patch 6 be placed here as
a replacement? Note the usage of *may* to keep implementation options open.
> +
>
> Memory bandwidth(MB) subdirectory contains the following files
> with respect to allocation:
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index 6409637b4de6..f3e5e697945c 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -148,8 +148,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> * Read one cache bit mask (hex). Check that it is valid for the current
> * resource type.
> */
> -static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
> - struct rdt_ctrl_domain *d)
> +int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
> + struct rdt_ctrl_domain *d)
> {
> enum rdtgrp_mode mode = data->mode;
> struct resctrl_staged_config *cfg;
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 10a3188ffa54..755f23934295 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -382,6 +382,8 @@ bool closid_allocated(unsigned int closid);
> int resctrl_find_cleanest_closid(void);
> void show_doms(struct seq_file *s, struct resctrl_schema *schema,
> char *name, int closid);
> +int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
> + struct rdt_ctrl_domain *d);
>
> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 0c2d2cf4baa1..f6c44fae4b72 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -2037,6 +2037,92 @@ static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
> return ret;
> }
>
> +static int resctrl_io_alloc_parse_line(char *line, struct rdt_resource *r,
> + struct resctrl_schema *s, u32 closid)
> +{
> + struct rdt_parse_data data;
> + struct rdt_ctrl_domain *d;
> + char *dom = NULL, *id;
> + unsigned long dom_id;
> +
> +next:
> + if (!line || line[0] == '\0')
> + return 0;
> +
> + dom = strsep(&line, ";");
> + id = strsep(&dom, "=");
> + if (!dom || kstrtoul(id, 10, &dom_id)) {
> + rdt_last_cmd_puts("Missing '=' or non-numeric domain\n");
> + return -EINVAL;
> + }
> +
> + dom = strim(dom);
> + list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> + if (d->hdr.id == dom_id) {
> + data.buf = dom;
> + data.mode = RDT_MODE_SHAREABLE;
> + data.closid = closid;
> + if (parse_cbm(&data, s, d))
> + return -EINVAL;
> + goto next;
> + }
> + }
> + return -EINVAL;
> +}
> +
> +static ssize_t resctrl_io_alloc_cbm_write(struct kernfs_open_file *of,
> + char *buf, size_t nbytes, loff_t off)
> +{
> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> + struct rdt_resource *r = s->res;
> + u32 io_alloc_closid;
> + int ret = 0;
> +
> + /* Valid input requires a trailing newline */
> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
> + return -EINVAL;
> +
> + buf[nbytes - 1] = '\0';
> +
> + if (!r->cache.io_alloc_capable) {
> + rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
rdt_last_cmd_puts() requires rdtgroup_mutex to be held.
> + return -EINVAL;
> + }
> +
> + cpus_read_lock();
> + mutex_lock(&rdtgroup_mutex);
> +
> + rdt_last_cmd_clear();
> + rdt_staged_configs_clear();
> +
> + if (!resctrl_arch_get_io_alloc_enabled(r)) {
> + rdt_last_cmd_puts("io_alloc feature is not enabled\n");
> + ret = -EINVAL;
> + goto cbm_write_out;
can just be "out"
> + }
> +
> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
> + if (io_alloc_closid < 0) {
Similar to other places, since this is preceded by resctrl_arch_get_io_alloc_enabled(r)
passing the io_alloc_closid has to be valid and can use proposed resctrl_io_alloc_closid()
helper to simplify the code.
> + rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
> + ret = -EINVAL;
> + goto cbm_write_out;
> + }
> +
> + ret = resctrl_io_alloc_parse_line(buf, r, resctrl_schema_io_alloc(s),
> + io_alloc_closid);
Here too I think both schemata need to be updated.
> + if (ret)
> + goto cbm_write_out;
> +
> + ret = resctrl_arch_update_domains(r, io_alloc_closid);
> +
> +cbm_write_out:
> + rdt_staged_configs_clear();
> + mutex_unlock(&rdtgroup_mutex);
> + cpus_read_unlock();
> +
> + return ret ?: nbytes;
> +}
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
> @@ -2198,9 +2284,10 @@ static struct rftype res_common_files[] = {
> },
> {
> .name = "io_alloc_cbm",
> - .mode = 0444,
> + .mode = 0644,
> .kf_ops = &rdtgroup_kf_single_ops,
> .seq_show = resctrl_io_alloc_cbm_show,
> + .write = resctrl_io_alloc_cbm_write,
> },
> {
> .name = "mba_MBps_event",
Reinette
* Re: [PATCH v6 8/8] fs/resctrl: Introduce interface to modify io_alloc Capacity Bit Masks
2025-06-18 4:03 ` Reinette Chatre
@ 2025-06-20 21:58 ` Moger, Babu
0 siblings, 0 replies; 26+ messages in thread
From: Moger, Babu @ 2025-06-20 21:58 UTC (permalink / raw)
To: Reinette Chatre, Babu Moger, corbet, tony.luck, Dave.Martin,
james.morse, tglx, mingo, bp, dave.hansen
Cc: x86, hpa, akpm, paulmck, rostedt, thuth, ardb, gregkh, seanjc,
thomas.lendacky, pawan.kumar.gupta, perry.yuan, yosry.ahmed,
kai.huang, xiaoyao.li, peterz, kan.liang, mario.limonciello,
xin3.li, sohil.mehta, chang.seok.bae, andrew.cooper3, ebiggers,
ak, xin, linux-doc, linux-kernel
Hi Reinette,
On 6/17/2025 11:03 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> "io_alloc" feature is a mechanism that enables direct insertion of data
>
> ""io_alloc" feature is a mechanism that enables" -> ""io_alloc" feature enables"
Sure.
>
>> from I/O devices into the L3 cache. By directly caching data from I/O
>
> Drop L3
>
Sure.
>> devices rather than first storing the I/O data in DRAM, it reduces the
>> demands on DRAM bandwidth and reduces latency to the processor consuming
>> the I/O data.
>>
>> "io_alloc" feature uses the highest CLOSID to route the traffic from I/O
>
> How is the CLOSID related here?
Will remove it.
>
>> devices. Provide the interface to modify io_alloc CBMs (Capacity Bit Mask)
>> when feature is enabled.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>
> ...
>
>> ---
>> Documentation/filesystems/resctrl.rst | 13 ++++
>> fs/resctrl/ctrlmondata.c | 4 +-
>> fs/resctrl/internal.h | 2 +
>> fs/resctrl/rdtgroup.c | 89 ++++++++++++++++++++++++++-
>> 4 files changed, 105 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index b31748ec8c61..ae1157dcb4a3 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -181,6 +181,19 @@ related to allocation:
>> # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>> 0=ffff;1=ffff
>>
>> + CBM can be configured by writing to the interface.
>> +
>> + Example::
>> +
>> + # echo 1=FF > /sys/fs/resctrl/info/L3/io_alloc_cbm
>> + # cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>> + 0=ffff;1=00ff
>> +
>> + When CDP is enabled, io_alloc directs traffic using the highest CLOSID
>> + linked to an L3CODE resource. Although CBMs can be accessed through
>> + either L3CODE or L3DATA resources, any updates to the schemata are
>> + always routed through L3CODE.
>
> Please do not commit resctrl to this implementation by documenting it as part of
> the user interface. Could the snippet about CDP I shared in patch 6 be placed here as
> a replacement? Note the usage of *may* to keep implementation options open.
>
Sure. Yea. Already added in show().
>> +
>>
>> Memory bandwidth(MB) subdirectory contains the following files
>> with respect to allocation:
>> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
>> index 6409637b4de6..f3e5e697945c 100644
>> --- a/fs/resctrl/ctrlmondata.c
>> +++ b/fs/resctrl/ctrlmondata.c
>> @@ -148,8 +148,8 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> * Read one cache bit mask (hex). Check that it is valid for the current
>> * resource type.
>> */
>> -static int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
>> - struct rdt_ctrl_domain *d)
>> +int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
>> + struct rdt_ctrl_domain *d)
>> {
>> enum rdtgrp_mode mode = data->mode;
>> struct resctrl_staged_config *cfg;
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index 10a3188ffa54..755f23934295 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -382,6 +382,8 @@ bool closid_allocated(unsigned int closid);
>> int resctrl_find_cleanest_closid(void);
>> void show_doms(struct seq_file *s, struct resctrl_schema *schema,
>> char *name, int closid);
>> +int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
>> + struct rdt_ctrl_domain *d);
>>
>> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 0c2d2cf4baa1..f6c44fae4b72 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -2037,6 +2037,92 @@ static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
>> return ret;
>> }
>>
>> +static int resctrl_io_alloc_parse_line(char *line, struct rdt_resource *r,
>> + struct resctrl_schema *s, u32 closid)
>> +{
>> + struct rdt_parse_data data;
>> + struct rdt_ctrl_domain *d;
>> + char *dom = NULL, *id;
>> + unsigned long dom_id;
>> +
>> +next:
>> + if (!line || line[0] == '\0')
>> + return 0;
>> +
>> + dom = strsep(&line, ";");
>> + id = strsep(&dom, "=");
>> + if (!dom || kstrtoul(id, 10, &dom_id)) {
>> + rdt_last_cmd_puts("Missing '=' or non-numeric domain\n");
>> + return -EINVAL;
>> + }
>> +
>> + dom = strim(dom);
>> + list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
>> + if (d->hdr.id == dom_id) {
>> + data.buf = dom;
>> + data.mode = RDT_MODE_SHAREABLE;
>> + data.closid = closid;
>> + if (parse_cbm(&data, s, d))
>> + return -EINVAL;
>> + goto next;
>> + }
>> + }
>> + return -EINVAL;
>> +}
>> +
>> +static ssize_t resctrl_io_alloc_cbm_write(struct kernfs_open_file *of,
>> + char *buf, size_t nbytes, loff_t off)
>> +{
>> + struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> + struct rdt_resource *r = s->res;
>> + u32 io_alloc_closid;
>> + int ret = 0;
>> +
>> + /* Valid input requires a trailing newline */
>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>> + return -EINVAL;
>> +
>> + buf[nbytes - 1] = '\0';
>> +
>> + if (!r->cache.io_alloc_capable) {
>> + rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
>
> rdt_last_cmd_puts() requires rdtgroup_mutex to be held.
Sure.
>
>> + return -EINVAL;
>> + }
>> +
>> + cpus_read_lock();
>> + mutex_lock(&rdtgroup_mutex);
>> +
>> + rdt_last_cmd_clear();
>> + rdt_staged_configs_clear();
>> +
>> + if (!resctrl_arch_get_io_alloc_enabled(r)) {
>> + rdt_last_cmd_puts("io_alloc feature is not enabled\n");
>> + ret = -EINVAL;
>> + goto cbm_write_out;
>
> can just be "out"
Sure.
>
>> + }
>> +
>> + io_alloc_closid = resctrl_io_alloc_closid_get(r);
>> + if (io_alloc_closid < 0) {
>
> Similar to other places: since this is preceded by the resctrl_arch_get_io_alloc_enabled(r)
> check, io_alloc_closid has to be valid here, so the proposed resctrl_io_alloc_closid()
> helper can be used to simplify the code.
Yes. Check not required.
>
>> + rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
>> + ret = -EINVAL;
>> + goto cbm_write_out;
>> + }
>> +
>> + ret = resctrl_io_alloc_parse_line(buf, r, resctrl_schema_io_alloc(s),
>> + io_alloc_closid);
>
> Here too I think both schemata need to be updated.
Yes. Added it.
>
>> + if (ret)
>> + goto cbm_write_out;
>> +
>> + ret = resctrl_arch_update_domains(r, io_alloc_closid);
>> +
>> +cbm_write_out:
>> + rdt_staged_configs_clear();
>> + mutex_unlock(&rdtgroup_mutex);
>> + cpus_read_unlock();
>> +
>> + return ret ?: nbytes;
>> +}
>> +
>> /* rdtgroup information files for one cache resource. */
>> static struct rftype res_common_files[] = {
>> {
>> @@ -2198,9 +2284,10 @@ static struct rftype res_common_files[] = {
>> },
>> {
>> .name = "io_alloc_cbm",
>> - .mode = 0444,
>> + .mode = 0644,
>> .kf_ops = &rdtgroup_kf_single_ops,
>> .seq_show = resctrl_io_alloc_cbm_show,
>> + .write = resctrl_io_alloc_cbm_write,
>> },
>> {
>> .name = "mba_MBps_event",
>
> Reinette
>
Thanks
Babu