* [PATCH v2 0/3] selftests/resctrl: Add Hygon CPUs support and bug fixes
@ 2025-12-05 9:25 Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon Xiaochen Shen
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-05 9:25 UTC (permalink / raw)
To: tony.luck, reinette.chatre, bp, fenghuay, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
The resctrl selftest currently exhibits several failures on Hygon CPUs
due to missing vendor detection and edge-case handling specific to
Hygon's architecture.
This patch series addresses three distinct issues:
1. Missing CPU vendor detection, causing the test to fail with
"# Can not get vendor info..." on Hygon CPUs.
2. A division-by-zero crash in SNC detection on Hygon CPUs.
3. Incorrect handling of non-contiguous CBM support on Hygon CPUs.
These changes enable resctrl selftest to run successfully on
Hygon CPUs that support Platform QoS features.
Changelog:
v2:
- Patch 1: switch all of the vendor id bitmasks to use BIT() (Reinette)
- Patch 2: add Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
- Patch 3: add Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
add a maintainer note to highlight it is not a candidate for
backport (Reinette)
Xiaochen Shen (3):
selftests/resctrl: Add CPU vendor detection for Hygon
selftests/resctrl: Fix a division by zero error on Hygon
selftests/resctrl: Fix non-contiguous CBM check for Hygon
tools/testing/selftests/resctrl/cat_test.c | 4 ++--
tools/testing/selftests/resctrl/resctrl.h | 6 ++++--
tools/testing/selftests/resctrl/resctrl_tests.c | 2 ++
tools/testing/selftests/resctrl/resctrlfs.c | 10 ++++++++++
4 files changed, 18 insertions(+), 4 deletions(-)
--
2.47.3
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-05 9:25 [PATCH v2 0/3] selftests/resctrl: Add Hygon CPUs support and bug fixes Xiaochen Shen
@ 2025-12-05 9:25 ` Xiaochen Shen
2025-12-05 19:28 ` Fenghua Yu
2025-12-05 9:25 ` [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon Xiaochen Shen
2 siblings, 1 reply; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-05 9:25 UTC (permalink / raw)
To: tony.luck, reinette.chatre, bp, fenghuay, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
The resctrl selftest currently fails on Hygon CPUs that support Platform
QoS features, printing the error:
"# Can not get vendor info..."
This occurs because vendor detection is missing for Hygon CPUs.
Fix this by extending the CPU vendor detection logic to include
Hygon's vendor ID.
Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
---
tools/testing/selftests/resctrl/resctrl.h | 6 ++++--
tools/testing/selftests/resctrl/resctrl_tests.c | 2 ++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index cd3adfc14969..411ee10380a5 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -23,6 +23,7 @@
#include <asm/unistd.h>
#include <linux/perf_event.h>
#include <linux/compiler.h>
+#include <linux/bits.h>
#include "../kselftest.h"
#define MB (1024 * 1024)
@@ -36,8 +37,9 @@
* Define as bits because they're used for vendor_specific bitmask in
* the struct resctrl_test.
*/
-#define ARCH_INTEL 1
-#define ARCH_AMD 2
+#define ARCH_INTEL BIT(0)
+#define ARCH_AMD BIT(1)
+#define ARCH_HYGON BIT(2)
#define END_OF_TESTS 1
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 5154ffd821c4..9bf35f3beb6b 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -42,6 +42,8 @@ static int detect_vendor(void)
vendor_id = ARCH_INTEL;
else if (s && !strcmp(s, ": AuthenticAMD\n"))
vendor_id = ARCH_AMD;
+ else if (s && !strcmp(s, ": HygonGenuine\n"))
+ vendor_id = ARCH_HYGON;
fclose(inf);
free(res);
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon
2025-12-05 9:25 [PATCH v2 0/3] selftests/resctrl: Add Hygon CPUs support and bug fixes Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon Xiaochen Shen
@ 2025-12-05 9:25 ` Xiaochen Shen
2025-12-05 18:53 ` Fenghua Yu
2025-12-05 9:25 ` [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon Xiaochen Shen
2 siblings, 1 reply; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-05 9:25 UTC (permalink / raw)
To: tony.luck, reinette.chatre, bp, fenghuay, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
Commit
a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled")
introduced the snc_nodes_per_l3_cache() function to detect the Intel
Sub-NUMA Clustering (SNC) feature by comparing #CPUs in node0 with #CPUs
sharing LLC with CPU0. The function was designed to return:
(1) >1: SNC mode is enabled.
(2) 1: SNC mode is not enabled or not supported.
However, on certain Hygon CPUs, #CPUs sharing LLC with CPU0 is actually
less than #CPUs in node0. This results in snc_nodes_per_l3_cache()
returning 0 (calculated as cache_cpus / node_cpus).
This leads to a division by zero error in get_cache_size():
*cache_size /= snc_nodes_per_l3_cache();
Causing the resctrl selftest to fail with:
"Floating point exception (core dumped)"
Fix the issue by ensuring snc_nodes_per_l3_cache() returns 1 when SNC
mode is not supported on the platform.
Fixes: a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled")
Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
tools/testing/selftests/resctrl/resctrlfs.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 195f04c4d158..2b075e7334bf 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -243,6 +243,16 @@ int snc_nodes_per_l3_cache(void)
}
snc_mode = cache_cpus / node_cpus;
+ /*
+ * On certain Hygon platforms:
+ * cache_cpus < node_cpus, the calculated snc_mode is 0.
+ *
+ * Set snc_mode = 1 to indicate that SNC mode is not
+ * supported on the platform.
+ */
+ if (!snc_mode)
+ snc_mode = 1;
+
if (snc_mode > 1)
ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
}
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon
2025-12-05 9:25 [PATCH v2 0/3] selftests/resctrl: Add Hygon CPUs support and bug fixes Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon Xiaochen Shen
@ 2025-12-05 9:25 ` Xiaochen Shen
2025-12-05 19:39 ` Fenghua Yu
2 siblings, 1 reply; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-05 9:25 UTC (permalink / raw)
To: tony.luck, reinette.chatre, bp, fenghuay, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
The resctrl selftest currently fails on Hygon CPUs that always supports
non-contiguous CBM, printing the error:
"# Hardware and kernel differ on non-contiguous CBM support!"
This occurs because the arch_supports_noncont_cat() function lacks
vendor detection for Hygon CPUs, preventing proper identification of
their non-contiguous CBM capability.
Fix this by adding Hygon vendor ID detection to
arch_supports_noncont_cat().
Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
---
Maintainer note:
Even though this is a fix it is not a candidate for backport since it is
based on another patch series (x86/resctrl: Fix Platform QoS issues for
Hygon) which is in process of being added to resctrl.
tools/testing/selftests/resctrl/cat_test.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 94cfdba5308d..59a0f80fdc5a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -290,8 +290,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
static bool arch_supports_noncont_cat(const struct resctrl_test *test)
{
- /* AMD always supports non-contiguous CBM. */
- if (get_vendor() == ARCH_AMD)
+ /* AMD and Hygon always supports non-contiguous CBM. */
+ if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_HYGON)
return true;
#if defined(__i386__) || defined(__x86_64__) /* arch */
--
2.47.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon
2025-12-05 9:25 ` [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon Xiaochen Shen
@ 2025-12-05 18:53 ` Fenghua Yu
2025-12-08 2:27 ` Xiaochen Shen
0 siblings, 1 reply; 18+ messages in thread
From: Fenghua Yu @ 2025-12-05 18:53 UTC (permalink / raw)
To: Xiaochen Shen, tony.luck, reinette.chatre, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
Hi, Xiaochen,
On 12/5/25 01:25, Xiaochen Shen wrote:
> Commit
>
> a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled")
>
> introduced the snc_nodes_per_l3_cache() function to detect the Intel
> Sub-NUMA Clustering (SNC) feature by comparing #CPUs in node0 with #CPUs
> sharing LLC with CPU0. The function was designed to return:
> (1) >1: SNC mode is enabled.
> (2) 1: SNC mode is not enabled or not supported.
>
> However, on certain Hygon CPUs, #CPUs sharing LLC with CPU0 is actually
> less than #CPUs in node0. This results in snc_nodes_per_l3_cache()
> returning 0 (calculated as cache_cpus / node_cpus).
>
> This leads to a division by zero error in get_cache_size():
> *cache_size /= snc_nodes_per_l3_cache();
>
> Causing the resctrl selftest to fail with:
> "Floating point exception (core dumped)"
>
> Fix the issue by ensuring snc_nodes_per_l3_cache() returns 1 when SNC
> mode is not supported on the platform.
>
> Fixes: a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled")
> Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> tools/testing/selftests/resctrl/resctrlfs.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> index 195f04c4d158..2b075e7334bf 100644
> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> @@ -243,6 +243,16 @@ int snc_nodes_per_l3_cache(void)
> }
> snc_mode = cache_cpus / node_cpus;
>
> + /*
> + * On certain Hygon platforms:
nit. This situation could happen on other platforms than Hygon. Maybe
it's better to have a more generic comment here?
* On some platforms (e.g. Hygon),
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
> + * cache_cpus < node_cpus, the calculated snc_mode is 0.
> + *
> + * Set snc_mode = 1 to indicate that SNC mode is not
> + * supported on the platform.
> + */
> + if (!snc_mode)
> + snc_mode = 1;
> +
> if (snc_mode > 1)
> ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
> }
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-05 9:25 ` [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon Xiaochen Shen
@ 2025-12-05 19:28 ` Fenghua Yu
2025-12-08 8:01 ` Xiaochen Shen
0 siblings, 1 reply; 18+ messages in thread
From: Fenghua Yu @ 2025-12-05 19:28 UTC (permalink / raw)
To: Xiaochen Shen, tony.luck, reinette.chatre, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
Hi, Xiaochen,
On 12/5/25 01:25, Xiaochen Shen wrote:
> The resctrl selftest currently fails on Hygon CPUs that support Platform
> QoS features, printing the error:
>
> "# Can not get vendor info..."
>
> This occurs because vendor detection is missing for Hygon CPUs.
>
> Fix this by extending the CPU vendor detection logic to include
> Hygon's vendor ID.
>
> Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
> ---
> tools/testing/selftests/resctrl/resctrl.h | 6 ++++--
> tools/testing/selftests/resctrl/resctrl_tests.c | 2 ++
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index cd3adfc14969..411ee10380a5 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -23,6 +23,7 @@
> #include <asm/unistd.h>
> #include <linux/perf_event.h>
> #include <linux/compiler.h>
> +#include <linux/bits.h>
> #include "../kselftest.h"
>
> #define MB (1024 * 1024)
> @@ -36,8 +37,9 @@
> * Define as bits because they're used for vendor_specific bitmask in
> * the struct resctrl_test.
> */
> -#define ARCH_INTEL 1
> -#define ARCH_AMD 2
> +#define ARCH_INTEL BIT(0)
> +#define ARCH_AMD BIT(1)
> +#define ARCH_HYGON BIT(2)
>
> #define END_OF_TESTS 1
>
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index 5154ffd821c4..9bf35f3beb6b 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -42,6 +42,8 @@ static int detect_vendor(void)
> vendor_id = ARCH_INTEL;
> else if (s && !strcmp(s, ": AuthenticAMD\n"))
> vendor_id = ARCH_AMD;
> + else if (s && !strcmp(s, ": HygonGenuine\n"))
> + vendor_id = ARCH_HYGON;
>
Since vendor_id is bitmask now and BIT() is a UL value, it's better to
define it as "unsigned int" (unsigned long is a bit overkill).
Otherwise, type conversion may be risky.
Is it better to change vendor_id as "unsigned int", static unsigned int
detect_vendor(), and a couple of other places?
> fclose(inf);
> free(res);
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon
2025-12-05 9:25 ` [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon Xiaochen Shen
@ 2025-12-05 19:39 ` Fenghua Yu
2025-12-05 21:30 ` Reinette Chatre
2025-12-08 8:06 ` Xiaochen Shen
0 siblings, 2 replies; 18+ messages in thread
From: Fenghua Yu @ 2025-12-05 19:39 UTC (permalink / raw)
To: Xiaochen Shen, tony.luck, reinette.chatre, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
On 12/5/25 01:25, Xiaochen Shen wrote:
> The resctrl selftest currently fails on Hygon CPUs that always supports
> non-contiguous CBM, printing the error:
>
> "# Hardware and kernel differ on non-contiguous CBM support!"
>
> This occurs because the arch_supports_noncont_cat() function lacks
> vendor detection for Hygon CPUs, preventing proper identification of
> their non-contiguous CBM capability.
>
> Fix this by adding Hygon vendor ID detection to
> arch_supports_noncont_cat().
>
> Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Maintainer note:
> Even though this is a fix it is not a candidate for backport since it is
> based on another patch series (x86/resctrl: Fix Platform QoS issues for
> Hygon) which is in process of being added to resctrl.
>
> tools/testing/selftests/resctrl/cat_test.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 94cfdba5308d..59a0f80fdc5a 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -290,8 +290,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>
> static bool arch_supports_noncont_cat(const struct resctrl_test *test)
> {
> - /* AMD always supports non-contiguous CBM. */
> - if (get_vendor() == ARCH_AMD)
> + /* AMD and Hygon always supports non-contiguous CBM. */
> + if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_HYGON)
nit. Better to avoid call get_vendor() twice (or even more in the future)?
unsigned int vendor_id = get_vendor();
if (vendor_id == ARCH_AMD || vendor_id == ARCH_HYGON)
> return true;
>
> #if defined(__i386__) || defined(__x86_64__) /* arch */
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon
2025-12-05 19:39 ` Fenghua Yu
@ 2025-12-05 21:30 ` Reinette Chatre
2025-12-05 21:51 ` Fenghua Yu
2025-12-08 8:06 ` Xiaochen Shen
1 sibling, 1 reply; 18+ messages in thread
From: Reinette Chatre @ 2025-12-05 21:30 UTC (permalink / raw)
To: Fenghua Yu, Xiaochen Shen, tony.luck, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
Hi Fenghua,
On 12/5/25 11:39 AM, Fenghua Yu wrote:
>
>
> On 12/5/25 01:25, Xiaochen Shen wrote:
>> The resctrl selftest currently fails on Hygon CPUs that always supports
>> non-contiguous CBM, printing the error:
>>
>> "# Hardware and kernel differ on non-contiguous CBM support!"
>>
>> This occurs because the arch_supports_noncont_cat() function lacks
>> vendor detection for Hygon CPUs, preventing proper identification of
>> their non-contiguous CBM capability.
>>
>> Fix this by adding Hygon vendor ID detection to
>> arch_supports_noncont_cat().
>>
>> Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>> Maintainer note:
>> Even though this is a fix it is not a candidate for backport since it is
>> based on another patch series (x86/resctrl: Fix Platform QoS issues for
>> Hygon) which is in process of being added to resctrl.
>>
>> tools/testing/selftests/resctrl/cat_test.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>> index 94cfdba5308d..59a0f80fdc5a 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -290,8 +290,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>> static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>> {
>> - /* AMD always supports non-contiguous CBM. */
>> - if (get_vendor() == ARCH_AMD)
>> + /* AMD and Hygon always supports non-contiguous CBM. */
>> + if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_HYGON)
>
> nit. Better to avoid call get_vendor() twice (or even more in the future)?
Are you perhaps referring to detect_vendor()? detect_vendor() does the actual digging to
determine the vendor ID and is indeed called just once by get_vendor(). In subsequent calls
get_vendor() just returns the static ID.
Reinette
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon
2025-12-05 21:30 ` Reinette Chatre
@ 2025-12-05 21:51 ` Fenghua Yu
0 siblings, 0 replies; 18+ messages in thread
From: Fenghua Yu @ 2025-12-05 21:51 UTC (permalink / raw)
To: Reinette Chatre, Xiaochen Shen, tony.luck, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
Hi, Reinette,
On 12/5/25 13:30, Reinette Chatre wrote:
> Hi Fenghua,
>
> On 12/5/25 11:39 AM, Fenghua Yu wrote:
>>
>>
>> On 12/5/25 01:25, Xiaochen Shen wrote:
>>> The resctrl selftest currently fails on Hygon CPUs that always supports
>>> non-contiguous CBM, printing the error:
>>>
>>> "# Hardware and kernel differ on non-contiguous CBM support!"
>>>
>>> This occurs because the arch_supports_noncont_cat() function lacks
>>> vendor detection for Hygon CPUs, preventing proper identification of
>>> their non-contiguous CBM capability.
>>>
>>> Fix this by adding Hygon vendor ID detection to
>>> arch_supports_noncont_cat().
>>>
>>> Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
>>> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
>>> ---
>>> Maintainer note:
>>> Even though this is a fix it is not a candidate for backport since it is
>>> based on another patch series (x86/resctrl: Fix Platform QoS issues for
>>> Hygon) which is in process of being added to resctrl.
>>>
>>> tools/testing/selftests/resctrl/cat_test.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 94cfdba5308d..59a0f80fdc5a 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -290,8 +290,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>>> static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>>> {
>>> - /* AMD always supports non-contiguous CBM. */
>>> - if (get_vendor() == ARCH_AMD)
>>> + /* AMD and Hygon always supports non-contiguous CBM. */
>>> + if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_HYGON)
>>
>> nit. Better to avoid call get_vendor() twice (or even more in the future)?
>
> Are you perhaps referring to detect_vendor()? detect_vendor() does the actual digging to
> determine the vendor ID and is indeed called just once by get_vendor(). In subsequent calls
> get_vendor() just returns the static ID.
There is still cost to call get_vendor() (call, push, cmp, pop, ret,
etc) in subsequent calls. I just feel it's redundant to call it multiple
times in just one sentence.
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon
2025-12-05 18:53 ` Fenghua Yu
@ 2025-12-08 2:27 ` Xiaochen Shen
0 siblings, 0 replies; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-08 2:27 UTC (permalink / raw)
To: Fenghua Yu, tony.luck, reinette.chatre, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
Hi Fenghua,
On 12/6/2025 2:53 AM, Fenghua Yu wrote:
>> @@ -243,6 +243,16 @@ int snc_nodes_per_l3_cache(void)
>> }
>> snc_mode = cache_cpus / node_cpus;
>> + /*
>> + * On certain Hygon platforms:
>
> nit. This situation could happen on other platforms than Hygon. Maybe it's better to have a more generic comment here?
> * On some platforms (e.g. Hygon),
>
I will update the comment as you suggested. Thank you!
> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Thank you!
>
>> + * cache_cpus < node_cpus, the calculated snc_mode is 0.
>> + *
>> + * Set snc_mode = 1 to indicate that SNC mode is not
>> + * supported on the platform.
>> + */
>> + if (!snc_mode)
>> + snc_mode = 1;
>> +
>> if (snc_mode > 1)
>> ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
>> }
> Thanks.
> -Fenghua
Best regards,
Xiaochen Shen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-05 19:28 ` Fenghua Yu
@ 2025-12-08 8:01 ` Xiaochen Shen
2025-12-08 17:57 ` Reinette Chatre
0 siblings, 1 reply; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-08 8:01 UTC (permalink / raw)
To: Fenghua Yu, tony.luck, reinette.chatre, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
Hi Fenghua,
On 12/6/2025 3:28 AM, Fenghua Yu wrote:
>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
>> @@ -42,6 +42,8 @@ static int detect_vendor(void)
>> vendor_id = ARCH_INTEL;
>> else if (s && !strcmp(s, ": AuthenticAMD\n"))
>> vendor_id = ARCH_AMD;
>> + else if (s && !strcmp(s, ": HygonGenuine\n"))
>> + vendor_id = ARCH_HYGON;
>>
> Since vendor_id is bitmask now and BIT() is a UL value, it's better to define it as "unsigned int" (unsigned long is a bit overkill). Otherwise, type conversion may be risky.
Thank you for the suggestion. How about using BIT_U8() instead of BIT()?
In my opinion, 8-bits type "unsigned int" is enough for "vendor id".
>
> Is it better to change vendor_id as "unsigned int", static unsigned int detect_vendor(), and a couple of other places?
Yes. It is better to update the return types of detect_vendor() and get_vendor() from 'int' to 'unsigned int'
to align with their usage as bitmask values and to prevent potentially risky type conversions.
Should I split the code changes (using BIT_xx(), updates of type 'unsigned int') into a separate patch?
The patch may look like:
-----------------------------
commit baaabb7bd3a3e45a8093422b576383da20488aca
Author: Xiaochen Shen <shenxiaochen@open-hieco.net>
Date: Mon Dec 8 14:26:45 2025 +0800
selftests/resctrl: Improve type definitions of CPU vendor IDs
In file resctrl.h:
-----------------
/*
* CPU vendor IDs
*
* Define as bits because they're used for vendor_specific bitmask in
* the struct resctrl_test.
*/
#define ARCH_INTEL 1
#define ARCH_AMD 2
-----------------
The comment before the CPU vendor IDs defines attempts to provide
guidance but it is clearly still quite subtle that these values are
required to be unique bits. Consider for example their usage in
test_vendor_specific_check():
return get_vendor() & test->vendor_specific
A clearer and more maintainable approach is to define these CPU vendor
IDs using BIT(). This ensures each vendor corresponds to a distinct bit
and makes it obvious when adding new vendor IDs.
Additionally, update the return types of detect_vendor() and
get_vendor() from 'int' to 'unsigned int' to align with their usage as
bitmask values and to prevent potentially risky type conversions.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Suggested-by: Fenghua Yu <fenghuay@nvidia.com>
Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index cd3adfc14969..2922dfbf9090 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -23,6 +23,7 @@
#include <asm/unistd.h>
#include <linux/perf_event.h>
#include <linux/compiler.h>
+#include <linux/bits.h>
#include "../kselftest.h"
#define MB (1024 * 1024)
@@ -36,8 +37,8 @@
* Define as bits because they're used for vendor_specific bitmask in
* the struct resctrl_test.
*/
-#define ARCH_INTEL 1
-#define ARCH_AMD 2
+#define ARCH_INTEL BIT_U8(0)
+#define ARCH_AMD BIT_U8(1)
#define END_OF_TESTS 1
@@ -163,7 +164,7 @@ extern int snc_unreliable;
extern char llc_occup_path[1024];
int snc_nodes_per_l3_cache(void);
-int get_vendor(void);
+unsigned int get_vendor(void);
bool check_resctrlfs_support(void);
int filter_dmesg(void);
int get_domain_id(const char *resource, int cpu_no, int *domain_id);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 5154ffd821c4..0fef2e4171e7 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -23,10 +23,10 @@ static struct resctrl_test *resctrl_tests[] = {
&l2_noncont_cat_test,
};
-static int detect_vendor(void)
+static unsigned int detect_vendor(void)
{
FILE *inf = fopen("/proc/cpuinfo", "r");
- int vendor_id = 0;
+ unsigned int vendor_id = 0;
char *s = NULL;
char *res;
@@ -48,12 +48,14 @@ static int detect_vendor(void)
return vendor_id;
}
-int get_vendor(void)
+unsigned int get_vendor(void)
{
- static int vendor = -1;
+ static unsigned int vendor;
- if (vendor == -1)
+ if (vendor == 0)
vendor = detect_vendor();
+
+ /* detect_vendor() returns invalid vendor id */
if (vendor == 0)
ksft_print_msg("Can not get vendor info...\n");
-----------------------------
Thank you!
Best regards,
Xiaochen Shen
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon
2025-12-05 19:39 ` Fenghua Yu
2025-12-05 21:30 ` Reinette Chatre
@ 2025-12-08 8:06 ` Xiaochen Shen
1 sibling, 0 replies; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-08 8:06 UTC (permalink / raw)
To: Fenghua Yu, tony.luck, reinette.chatre, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
Hi Fenghua,
On 12/6/2025 3:39 AM, Fenghua Yu wrote:
>> static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>> {
>> - /* AMD always supports non-contiguous CBM. */
>> - if (get_vendor() == ARCH_AMD)
>> + /* AMD and Hygon always supports non-contiguous CBM. */
>> + if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_HYGON)
>
> nit. Better to avoid call get_vendor() twice (or even more in the future)?
>
> unsigned int vendor_id = get_vendor();
>
> if (vendor_id == ARCH_AMD || vendor_id == ARCH_HYGON)
Thank you! I will update the code as you suggested.
>
>
>> return true;
>> #if defined(__i386__) || defined(__x86_64__) /* arch */
> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Thank you!
>
> Thanks.
>
> -Fenghua
Best regards,
Xiaochen Shen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-08 8:01 ` Xiaochen Shen
@ 2025-12-08 17:57 ` Reinette Chatre
2025-12-09 6:10 ` Xiaochen Shen
0 siblings, 1 reply; 18+ messages in thread
From: Reinette Chatre @ 2025-12-08 17:57 UTC (permalink / raw)
To: Xiaochen Shen, Fenghua Yu, tony.luck, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
Hi Xiaochen,
On 12/8/25 12:01 AM, Xiaochen Shen wrote:
> Hi Fenghua,
>
> On 12/6/2025 3:28 AM, Fenghua Yu wrote:
>>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
>>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
>>> @@ -42,6 +42,8 @@ static int detect_vendor(void)
>>> vendor_id = ARCH_INTEL;
>>> else if (s && !strcmp(s, ": AuthenticAMD\n"))
>>> vendor_id = ARCH_AMD;
>>> + else if (s && !strcmp(s, ": HygonGenuine\n"))
>>> + vendor_id = ARCH_HYGON;
>>>
>> Since vendor_id is bitmask now and BIT() is a UL value, it's better to define it as "unsigned int" (unsigned long is a bit overkill). Otherwise, type conversion may be risky.
>
> Thank you for the suggestion. How about using BIT_U8() instead of BIT()?
> In my opinion, 8-bits type "unsigned int" is enough for "vendor id".
BIT() is fine here. I prefer that types used by selftests are consistent, that is, not
a mix of user space and kernel types.
There may be good motivation to switch to kernel types but then it needs to be
throughout the resctrl selftests, which is not something this work needs to take on.
>
>>
>> Is it better to change vendor_id as "unsigned int", static unsigned int detect_vendor(), and a couple of other places?
>
> Yes. It is better to update the return types of detect_vendor() and get_vendor() from 'int' to 'unsigned int'
> to align with their usage as bitmask values and to prevent potentially risky type conversions.
>
> Should I split the code changes (using BIT_xx(), updates of type 'unsigned int') into a separate patch?
I agree this would be better as a separate patch.
>
> The patch may look like:
> -----------------------------
> commit baaabb7bd3a3e45a8093422b576383da20488aca
> Author: Xiaochen Shen <shenxiaochen@open-hieco.net>
> Date: Mon Dec 8 14:26:45 2025 +0800
>
> selftests/resctrl: Improve type definitions of CPU vendor IDs
Instead of a generic "Improve" it can just be specific about what it does:
"selftests/resctrl: Define CPU vendor IDs as bits to match usage"
>
> In file resctrl.h:
> -----------------
> /*
> * CPU vendor IDs
> *
> * Define as bits because they're used for vendor_specific bitmask in
> * the struct resctrl_test.
> */
> #define ARCH_INTEL 1
> #define ARCH_AMD 2
> -----------------
>
> The comment before the CPU vendor IDs defines attempts to provide
> guidance but it is clearly still quite subtle that these values are
I wrote "clearly" in response to the earlier patch that did not follow the quoted
documentation, implying that the documentation was not sufficient. I do not
think "clearly" applies here. This can just be specific about how these values
are used ... which this paragraph duplicates from the quoted comment so either this
paragraph or the code quote could be dropped?
> required to be unique bits. Consider for example their usage in
> test_vendor_specific_check():
> return get_vendor() & test->vendor_specific
>
> A clearer and more maintainable approach is to define these CPU vendor
> IDs using BIT(). This ensures each vendor corresponds to a distinct bit
> and makes it obvious when adding new vendor IDs.
>
> Additionally, update the return types of detect_vendor() and
> get_vendor() from 'int' to 'unsigned int' to align with their usage as
> bitmask values and to prevent potentially risky type conversions.
>
> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
> Suggested-by: Fenghua Yu <fenghuay@nvidia.com>
> Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index cd3adfc14969..2922dfbf9090 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -23,6 +23,7 @@
> #include <asm/unistd.h>
> #include <linux/perf_event.h>
> #include <linux/compiler.h>
> +#include <linux/bits.h>
> #include "../kselftest.h"
>
> #define MB (1024 * 1024)
> @@ -36,8 +37,8 @@
> * Define as bits because they're used for vendor_specific bitmask in
> * the struct resctrl_test.
> */
> -#define ARCH_INTEL 1
> -#define ARCH_AMD 2
> +#define ARCH_INTEL BIT_U8(0)
> +#define ARCH_AMD BIT_U8(1)
>
> #define END_OF_TESTS 1
>
> @@ -163,7 +164,7 @@ extern int snc_unreliable;
> extern char llc_occup_path[1024];
>
> int snc_nodes_per_l3_cache(void);
> -int get_vendor(void);
> +unsigned int get_vendor(void);
> bool check_resctrlfs_support(void);
> int filter_dmesg(void);
> int get_domain_id(const char *resource, int cpu_no, int *domain_id);
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index 5154ffd821c4..0fef2e4171e7 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -23,10 +23,10 @@ static struct resctrl_test *resctrl_tests[] = {
> &l2_noncont_cat_test,
> };
>
> -static int detect_vendor(void)
> +static unsigned int detect_vendor(void)
> {
> FILE *inf = fopen("/proc/cpuinfo", "r");
> - int vendor_id = 0;
> + unsigned int vendor_id = 0;
> char *s = NULL;
> char *res;
>
> @@ -48,12 +48,14 @@ static int detect_vendor(void)
> return vendor_id;
> }
>
> -int get_vendor(void)
> +unsigned int get_vendor(void)
> {
> - static int vendor = -1;
> + static unsigned int vendor;
>
> - if (vendor == -1)
> + if (vendor == 0)
> vendor = detect_vendor();
> +
> + /* detect_vendor() returns invalid vendor id */
> if (vendor == 0)
> ksft_print_msg("Can not get vendor info...\n");
detect_vendor() returns 0 if it cannot detect the vendor. Using "0" as well as
return value of detect_vendor() to indicate that detect_vendor() should be run will
thus cause detect_vendor() to always be called on failure even though it will keep
failing.
Can vendor be kept as int and just cast it on return? This may be introducing the
risky type conversion that the changelog claims to avoid though ....
Reinette
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-08 17:57 ` Reinette Chatre
@ 2025-12-09 6:10 ` Xiaochen Shen
2025-12-09 23:02 ` Reinette Chatre
0 siblings, 1 reply; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-09 6:10 UTC (permalink / raw)
To: Reinette Chatre, Fenghua Yu, tony.luck, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
Hi Reinette,
On 12/9/2025 1:57 AM, Reinette Chatre wrote:
>> Thank you for the suggestion. How about using BIT_U8() instead of BIT()?
>> In my opinion, 8-bits type "unsigned int" is enough for "vendor id".
> BIT() is fine here. I prefer that types used by selftests are consistent, that is, not
> a mix of user space and kernel types.
> There may be good motivation to switch to kernel types but then it needs to be
> throughout the resctrl selftests, which is not something this work needs to take on.
Thank you. I will keep BIT() here.
>> Should I split the code changes (using BIT_xx(), updates of type 'unsigned int') into a separate patch?
> I agree this would be better as a separate patch.
Sure. I will add a prerequisite patch in this series.
>> The patch may look like:
>> -----------------------------
>> commit baaabb7bd3a3e45a8093422b576383da20488aca
>> Author: Xiaochen Shen <shenxiaochen@open-hieco.net>
>> Date: Mon Dec 8 14:26:45 2025 +0800
>>
>> selftests/resctrl: Improve type definitions of CPU vendor IDs
> Instead of a generic "Improve" it can just be specific about what it does:
> "selftests/resctrl: Define CPU vendor IDs as bits to match usage"
Thank you for the suggestion. The subject of the patch looks much better.
>> In file resctrl.h:
>> -----------------
>> /*
>> * CPU vendor IDs
>> *
>> * Define as bits because they're used for vendor_specific bitmask in
>> * the struct resctrl_test.
>> */
>> #define ARCH_INTEL 1
>> #define ARCH_AMD 2
>> -----------------
>>
>> The comment before the CPU vendor IDs defines attempts to provide
>> guidance but it is clearly still quite subtle that these values are
> I wrote "clearly" in response to the earlier patch that did not follow the quoted
> documentation, implying that the documentation was not sufficient. I do not
> think "clearly" applies here. This can just be specific about how these values
> are used ... which this paragraph duplicates from the quoted comment so either this
> paragraph or the code quote could be dropped?
Thank you for the suggestion.
The revised patch description as below:
--------------------------------------
The CPU vendor IDs are required to be unique bits because they're used
for vendor_specific bitmask in the struct resctrl_test.
Consider for example their usage in test_vendor_specific_check():
return get_vendor() & test->vendor_specific
However, the definitions of CPU vendor IDs in file resctrl.h is quite
subtle as a bitmask value:
#define ARCH_INTEL 1
#define ARCH_AMD 2
A clearer and more maintainable approach is to define these CPU vendor
IDs using BIT(). This ensures each vendor corresponds to a distinct bit
and makes it obvious when adding new vendor IDs.
...
--------------------------------------
>
>> required to be unique bits. Consider for example their usage in
>> test_vendor_specific_check():
>> return get_vendor() & test->vendor_specific
>> -int get_vendor(void)
>> +unsigned int get_vendor(void)
>> {
>> - static int vendor = -1;
>> + static unsigned int vendor;
>>
>> - if (vendor == -1)
>> + if (vendor == 0)
>> vendor = detect_vendor();
>> +
>> + /* detect_vendor() returns invalid vendor id */
>> if (vendor == 0)
>> ksft_print_msg("Can not get vendor info...\n");
> detect_vendor() returns 0 if it cannot detect the vendor. Using "0" as well as
> return value of detect_vendor() to indicate that detect_vendor() should be run will
> thus cause detect_vendor() to always be called on failure even though it will keep
> failing.
Thank you.
I got it. In original code, "static int vendor = -1;" does it intentionally.
>
> Can vendor be kept as int and just cast it on return? This may be introducing the
> risky type conversion that the changelog claims to avoid though ....
This is really a dilemma.
I could keep vendor as int, even thought the code doesn't look graceful. I will try to add a comment for it.
The code changes may look like:
-------------------------------
-int get_vendor(void)
+unsigned int get_vendor(void)
{
static int vendor = -1;
+ /*
+ * Notes on vendor:
+ * -1: initial value, detect_vendor() is not called yet.
+ * 0: detect_vendor() returns 0 if it cannot detect the vendor.
+ * > 0: detect_vendor() returns valid vendor id.
+ *
+ * The return type of detect_vendor() is 'unsigned int'.
+ * Cast vendor from 'int' to 'unsigned int' on return.
+ */
if (vendor == -1)
vendor = detect_vendor();
+
if (vendor == 0)
ksft_print_msg("Can not get vendor info...\n");
- return vendor;
+ return (unsigned int) vendor;
}
-------------------------------
Thank you!
Best regards,
Xiaochen Shen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-09 6:10 ` Xiaochen Shen
@ 2025-12-09 23:02 ` Reinette Chatre
2025-12-09 23:42 ` Luck, Tony
2025-12-10 4:46 ` Xiaochen Shen
0 siblings, 2 replies; 18+ messages in thread
From: Reinette Chatre @ 2025-12-09 23:02 UTC (permalink / raw)
To: Xiaochen Shen, Fenghua Yu, tony.luck, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest
Hi Xiaochen,
On 12/8/25 10:10 PM, Xiaochen Shen wrote:
> On 12/9/2025 1:57 AM, Reinette Chatre wrote:
...
>>> In file resctrl.h:
>>> -----------------
>>> /*
>>> * CPU vendor IDs
>>> *
>>> * Define as bits because they're used for vendor_specific bitmask in
>>> * the struct resctrl_test.
>>> */
>>> #define ARCH_INTEL 1
>>> #define ARCH_AMD 2
>>> -----------------
>>>
>>> The comment before the CPU vendor IDs defines attempts to provide
>>> guidance but it is clearly still quite subtle that these values are
>> I wrote "clearly" in response to the earlier patch that did not follow the quoted
>> documentation, implying that the documentation was not sufficient. I do not
>> think "clearly" applies here. This can just be specific about how these values
>> are used ... which this paragraph duplicates from the quoted comment so either this
>> paragraph or the code quote could be dropped?
>
> Thank you for the suggestion.
> The revised patch description as below:
> --------------------------------------
> The CPU vendor IDs are required to be unique bits because they're used
> for vendor_specific bitmask in the struct resctrl_test.
> Consider for example their usage in test_vendor_specific_check():
> return get_vendor() & test->vendor_specific
>
> However, the definitions of CPU vendor IDs in file resctrl.h is quite
> subtle as a bitmask value:
> #define ARCH_INTEL 1
> #define ARCH_AMD 2
>
> A clearer and more maintainable approach is to define these CPU vendor
> IDs using BIT(). This ensures each vendor corresponds to a distinct bit
> and makes it obvious when adding new vendor IDs.
Thank you. Looks good to me.
> ...
> --------------------------------------
>
>>
>>> required to be unique bits. Consider for example their usage in
>>> test_vendor_specific_check():
>>> return get_vendor() & test->vendor_specific
>>> -int get_vendor(void)
>>> +unsigned int get_vendor(void)
>>> {
>>> - static int vendor = -1;
>>> + static unsigned int vendor;
>>>
>>> - if (vendor == -1)
>>> + if (vendor == 0)
>>> vendor = detect_vendor();
>>> +
>>> + /* detect_vendor() returns invalid vendor id */
>>> if (vendor == 0)
>>> ksft_print_msg("Can not get vendor info...\n");
>> detect_vendor() returns 0 if it cannot detect the vendor. Using "0" as well as
>> return value of detect_vendor() to indicate that detect_vendor() should be run will
>> thus cause detect_vendor() to always be called on failure even though it will keep
>> failing.
>
> Thank you.
> I got it. In original code, "static int vendor = -1;" does it intentionally.
>
>
>>
>> Can vendor be kept as int and just cast it on return? This may be introducing the
>> risky type conversion that the changelog claims to avoid though ....
>
> This is really a dilemma.
> I could keep vendor as int, even thought the code doesn't look graceful. I will try to add a comment for it.
> The code changes may look like:
> -------------------------------
> -int get_vendor(void)
> +unsigned int get_vendor(void)
> {
> static int vendor = -1;
>
> + /*
> + * Notes on vendor:
> + * -1: initial value, detect_vendor() is not called yet.
> + * 0: detect_vendor() returns 0 if it cannot detect the vendor.
> + * > 0: detect_vendor() returns valid vendor id.
> + *
> + * The return type of detect_vendor() is 'unsigned int'.
> + * Cast vendor from 'int' to 'unsigned int' on return.
> + */
> if (vendor == -1)
> vendor = detect_vendor();
> +
> if (vendor == 0)
> ksft_print_msg("Can not get vendor info...\n");
>
> - return vendor;
> + return (unsigned int) vendor;
> }
I suggest this be simplified to not have the vendor ID be used both as a value and as a state.
Here is some pseudo-code that should be able to accomplish this:
unsigned int detect_vendor(void)
{
static bool initialized = false;
static unsigned int vendor_id;
...
FILE *inf;
if (initialized)
return vendor_id;
inf = fopen("/proc/cpuinfo", "r");
if (!inf) {
vendor_id = 0;
initialized = true;
return vendor_id;
}
/* initialize vendor_id from /proc/cpuinfo */
initialized = true;
return vendor_id;
}
unsigned int get_vendor(void)
{
unsigned int vendor;
vendor = detect_vendor();
if (vendor == 0)
ksft_print_msg(...);
return vendor;
}
Reinette
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-09 23:02 ` Reinette Chatre
@ 2025-12-09 23:42 ` Luck, Tony
2025-12-10 0:30 ` Reinette Chatre
2025-12-10 4:46 ` Xiaochen Shen
1 sibling, 1 reply; 18+ messages in thread
From: Luck, Tony @ 2025-12-09 23:42 UTC (permalink / raw)
To: Reinette Chatre
Cc: Xiaochen Shen, Fenghua Yu, bp, shuah, skhan, babu.moger,
james.morse, Dave.Martin, x86, linux-kernel, linux-kselftest
On Tue, Dec 09, 2025 at 03:02:14PM -0800, Reinette Chatre wrote:
> I suggest this be simplified to not have the vendor ID be used both as a value and as a state.
> Here is some pseudo-code that should be able to accomplish this:
>
>
> unsigned int detect_vendor(void)
> {
> static bool initialized = false;
> static unsigned int vendor_id;
> ...
> FILE *inf;
>
>
> if (initialized)
> return vendor_id;
>
> inf = fopen("/proc/cpuinfo", "r");
> if (!inf) {
> vendor_id = 0;
> initialized = true;
> return vendor_id;
> }
>
> /* initialize vendor_id from /proc/cpuinfo */
>
> initialized = true;
> return vendor_id;
> }
>
> unsigned int get_vendor(void)
> {
> unsigned int vendor;
>
> vendor = detect_vendor();
>
> if (vendor == 0)
> ksft_print_msg(...);
>
> return vendor;
> }
If detect_vendor() failed, this you'd get the ksft_print_msg() for every
call to get_vendor().
Why not split completly.
static unsigned int vendor_id;
void detect_vendor(void)
{
FILE *inf = fopen("/proc/cpuinfo", "r");
if (!inf) {
... warning unable to get vendor id ...
}
... initialize from /proc/cpuinfo ...
... warn if doesn't find a known vendor ...
}
Call detect_vendor() at the beginning of main() in each test.
Then just use "vendor_id" whenever you need to test for some vendor
specific feature.
-Tony
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-09 23:42 ` Luck, Tony
@ 2025-12-10 0:30 ` Reinette Chatre
0 siblings, 0 replies; 18+ messages in thread
From: Reinette Chatre @ 2025-12-10 0:30 UTC (permalink / raw)
To: Luck, Tony
Cc: Xiaochen Shen, Fenghua Yu, bp, shuah, skhan, babu.moger,
james.morse, Dave.Martin, x86, linux-kernel, linux-kselftest
Hi Tony,
On 12/9/25 3:42 PM, Luck, Tony wrote:
> On Tue, Dec 09, 2025 at 03:02:14PM -0800, Reinette Chatre wrote:
>
>> I suggest this be simplified to not have the vendor ID be used both as a value and as a state.
>> Here is some pseudo-code that should be able to accomplish this:
>>
>>
>> unsigned int detect_vendor(void)
>> {
>> static bool initialized = false;
>> static unsigned int vendor_id;
>> ...
>> FILE *inf;
>>
>>
>> if (initialized)
>> return vendor_id;
>>
>> inf = fopen("/proc/cpuinfo", "r");
>> if (!inf) {
>> vendor_id = 0;
>> initialized = true;
>> return vendor_id;
>> }
>>
>> /* initialize vendor_id from /proc/cpuinfo */
>>
>> initialized = true;
>> return vendor_id;
>> }
>>
>> unsigned int get_vendor(void)
>> {
>> unsigned int vendor;
>>
>> vendor = detect_vendor();
>>
>> if (vendor == 0)
>> ksft_print_msg(...);
>>
>> return vendor;
>> }
>
> If detect_vendor() failed, this you'd get the ksft_print_msg() for every
> call to get_vendor().
Right. This is intended to match existing behavior. The goal is to
only do the work of querying the vendor information once. The tests are
independent so to avoid the failure message about obtaining vendor information
to only be in the test that did the original query it is printed in every
test's output.
>
> Why not split completly.
>
> static unsigned int vendor_id;
>
> void detect_vendor(void)
> {
> FILE *inf = fopen("/proc/cpuinfo", "r");
>
> if (!inf) {
> ... warning unable to get vendor id ...
> }
>
> ... initialize from /proc/cpuinfo ...
>
> ... warn if doesn't find a known vendor ...
> }
>
> Call detect_vendor() at the beginning of main() in each test.
This will repeat the vendor detection for every (currently six) test?
This seems unnecessary work to me considering this only needs to be done
once.
>
> Then just use "vendor_id" whenever you need to test for some vendor
> specific feature.
Including the warning within detect_vendor() and calling it in each test
does address the goal of including any vendor detection failure message in log of
every test that depends on it. Even so, from what I can tell this separates the message
from where test actually fails though, making things harder to debug, and I expect will
result in more code duplication: the duplicated calls to detect_vendor() and likely
tests needing to duplicate current get_vendor() (more below):
If I understand correctly this would look something like:
some_function()
{
if (vendor_id == ARCH_TBD)
/* Do something */
}
resctrl_test::run_test(...)
{
/* prep */
detect_vendor(); /* may print warning */
/*
* Do not fail the test if vendor detection fails since not all
* flows may depend on vendor information.
*/
/* run actual test */
/* do some things that result in messages in test log */
some_function();
/* do some things that result in messages in test log */
}
In above the test log will show early that vendor detection failed but it is not
clear in the test log what the impact on the test is by being clear where in the
test the failed vendor detection is needed/used.
I anticipate that tests will start to be defensive and do things like below that
would create unnecessary duplication that is currently handled by get_vendor().
some_function()
{
if (vendor_id == 0) {
ksft_print_msg("Vendor invalid, cannot do <something>\n");
return;
}
if (vendor_id == ARCH_TBD)
/* Do something */
}
Just failing tests if the vendor cannot be determined may be an easy solution but
since not all tests depend on vendor information this seems too severe.
Reinette
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon
2025-12-09 23:02 ` Reinette Chatre
2025-12-09 23:42 ` Luck, Tony
@ 2025-12-10 4:46 ` Xiaochen Shen
1 sibling, 0 replies; 18+ messages in thread
From: Xiaochen Shen @ 2025-12-10 4:46 UTC (permalink / raw)
To: Reinette Chatre, Fenghua Yu, tony.luck, bp, shuah, skhan
Cc: babu.moger, james.morse, Dave.Martin, x86, linux-kernel,
linux-kselftest, shenxiaochen
Hi Reinette,
On 12/10/2025 7:02 AM, Reinette Chatre wrote:
> I suggest this be simplified to not have the vendor ID be used both as a value and as a state.
> Here is some pseudo-code that should be able to accomplish this:
>
>
> unsigned int detect_vendor(void)
> {
> static bool initialized = false;
> static unsigned int vendor_id;
> ...
> FILE *inf;
>
>
> if (initialized)
> return vendor_id;
>
> inf = fopen("/proc/cpuinfo", "r");
> if (!inf) {
> vendor_id = 0;
> initialized = true;
> return vendor_id;
> }
>
> /* initialize vendor_id from /proc/cpuinfo */
>
> initialized = true;
> return vendor_id;
> }
>
> unsigned int get_vendor(void)
> {
> unsigned int vendor;
>
> vendor = detect_vendor();
>
> if (vendor == 0)
> ksft_print_msg(...);
>
> return vendor;
> }
>
> Reinette
Thank you very much! I will make the change in v3 patch series.
Could you help review the revised patch description for the change?
--------------------------------
...
and makes it obvious when adding new vendor IDs.
Accordingly, update the return types of detect_vendor() and get_vendor()
from 'int' to 'unsigned int' to align with their usage as bitmask values
and to prevent potentially risky type conversions.
Furthermore, introduce a bool flag 'initialized' to simplify the
get_vendor() -> detect_vendor() logic. This ensures the vendor ID is
detected only once and resolves the ambiguity of using the same variable
'vendor' both as a value and as a state.
--------------------------------
Thank you!
Best regards,
Xiaochen Shen
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-12-10 4:46 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-05 9:25 [PATCH v2 0/3] selftests/resctrl: Add Hygon CPUs support and bug fixes Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 1/3] selftests/resctrl: Add CPU vendor detection for Hygon Xiaochen Shen
2025-12-05 19:28 ` Fenghua Yu
2025-12-08 8:01 ` Xiaochen Shen
2025-12-08 17:57 ` Reinette Chatre
2025-12-09 6:10 ` Xiaochen Shen
2025-12-09 23:02 ` Reinette Chatre
2025-12-09 23:42 ` Luck, Tony
2025-12-10 0:30 ` Reinette Chatre
2025-12-10 4:46 ` Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 2/3] selftests/resctrl: Fix a division by zero error on Hygon Xiaochen Shen
2025-12-05 18:53 ` Fenghua Yu
2025-12-08 2:27 ` Xiaochen Shen
2025-12-05 9:25 ` [PATCH v2 3/3] selftests/resctrl: Fix non-contiguous CBM check for Hygon Xiaochen Shen
2025-12-05 19:39 ` Fenghua Yu
2025-12-05 21:30 ` Reinette Chatre
2025-12-05 21:51 ` Fenghua Yu
2025-12-08 8:06 ` Xiaochen Shen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox