* [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms
@ 2026-04-07 20:38 Chun-Tse Shao
  2026-04-07 20:38 ` [PATCH v5 2/2] perf pmu intel: Adjust cpumaks for sub-NUMA clusters on Emeraldrapids Chun-Tse Shao
  2026-04-10  4:39 ` [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms Namhyung Kim
  0 siblings, 2 replies; 5+ messages in thread
From: Chun-Tse Shao @ 2026-04-07 20:38 UTC
  To: linux-kernel
  Cc: Chun-Tse Shao, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, james.clark,
	zide.chen, ravi.bangoria, linux-perf-users

Prepare for supporting more Intel platforms with sub-NUMA clustering by
generalizing the GNR specific logic.

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
v5:
  Split patch.

v4: lore.kernel.org/20260402205300.1953706-1-ctshao@google.com
  Rebase.

v3: lore.kernel.org/20260212223942.3832857-1-ctshao@google.com
  Fix a typo.

v2: lore.kernel.org/20260205232220.1980168-1-ctshao@google.com
  Split EMR and GNR in the SNC2 IMC cpu map.

v1: lore.kernel.org/20260108184430.1210223-1-ctshao@google.com

 tools/perf/arch/x86/util/pmu.c | 44 +++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 0661e0f0b02d..938be36ec0f7 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -23,20 +23,28 @@
 #include "util/env.h"
 #include "util/header.h"
 
-static bool x86__is_intel_graniterapids(void)
+static bool x86__is_snc_supported(void)
 {
-	static bool checked_if_graniterapids;
-	static bool is_graniterapids;
+	static bool checked_if_snc_supported;
+	static bool is_supported;
 
-	if (!checked_if_graniterapids) {
-		const char *graniterapids_cpuid = "GenuineIntel-6-A[DE]";
+	if (!checked_if_snc_supported) {
+
+		/* Graniterapids supports SNC configuration. */
+		static const char *const supported_cpuids[] = {
+			"GenuineIntel-6-A[DE]", /* Graniterapids */
+		};
 		char *cpuid = get_cpuid_str((struct perf_cpu){0});
 
-		is_graniterapids = cpuid && strcmp_cpuid_str(graniterapids_cpuid, cpuid) == 0;
+		for (size_t i = 0; i < ARRAY_SIZE(supported_cpuids); i++) {
+			is_supported = cpuid && strcmp_cpuid_str(supported_cpuids[i], cpuid) == 0;
+			if (is_supported)
+				break;
+		}
 		free(cpuid);
-		checked_if_graniterapids = true;
+		checked_if_snc_supported = true;
 	}
-	return is_graniterapids;
+	return is_supported;
 }
 
 static struct perf_cpu_map *read_sysfs_cpu_map(const char *sysfs_path)
@@ -133,8 +141,8 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
 	// Compute the IMC SNC using lookup tables.
 	unsigned int imc_num;
 	int snc_nodes = snc_nodes_per_l3_cache();
-	const u8 snc2_map[] = {1, 1, 0, 0, 1, 1, 0, 0};
-	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2, 1, 1, 0, 0, 2, 2};
+	const u8 snc2_map[] = {1, 1, 0, 0};
+	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2};
 	const u8 *snc_map;
 	size_t snc_map_len;
 
@@ -157,11 +165,12 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
 		pr_warning("Unexpected: unable to compute IMC number '%s'\n", pmu->name);
 		return 0;
 	}
-	if (imc_num >= snc_map_len) {
+	if (imc_num >= snc_map_len * perf_cpu_map__nr(pmu->cpus)) {
 		pr_warning("Unexpected IMC %d for SNC%d mapping\n", imc_num, snc_nodes);
 		return 0;
 	}
-	return snc_map[imc_num];
+
+	return snc_map[imc_num % snc_map_len];
 }
 
 static int uncore_cha_imc_compute_cpu_adjust(int pmu_snc)
@@ -201,7 +210,7 @@ static int uncore_cha_imc_compute_cpu_adjust(int pmu_snc)
 	return cpu_adjust[pmu_snc];
 }
 
-static void gnr_uncore_cha_imc_adjust_cpumask_for_snc(struct perf_pmu *pmu, bool cha)
+static void uncore_cha_imc_adjust_cpumask_for_snc(struct perf_pmu *pmu, bool cha)
 {
 	// With sub-NUMA clustering (SNC) there is a NUMA node per SNC in the
 	// topology. For example, a two socket graniterapids machine may be set
@@ -301,11 +310,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
 				pmu->mem_events = perf_mem_events_intel_aux;
 			else
 				pmu->mem_events = perf_mem_events_intel;
-		} else if (x86__is_intel_graniterapids()) {
+		} else if (x86__is_snc_supported()) {
 			if (strstarts(pmu->name, "uncore_cha_"))
-				gnr_uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/true);
-			else if (strstarts(pmu->name, "uncore_imc_"))
-				gnr_uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/false);
+				uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/true);
+			else if (strstarts(pmu->name, "uncore_imc_") &&
+				 !strstarts(pmu->name, "uncore_imc_free_running"))
+				uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/false);
 		}
 	}
 }
-- 
2.53.0.1213.gd9a14994de-goog

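The index wrap-around in the uncore_imc_snc() hunk of this patch can be tried outside perf. The following is a minimal standalone sketch, not the patch's code: imc_snc_lookup() and its nr_instances parameter are hypothetical names, nr_instances stands in for the value the patch reads from perf_cpu_map__nr(pmu->cpus), and the sketch returns -1 on an out-of-range IMC where the patch warns and returns 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-instance SNC tables with the values from the patch (u8 spelled uint8_t). */
static const uint8_t snc2_map[] = { 1, 1, 0, 0 };
static const uint8_t snc3_map[] = { 1, 1, 0, 0, 2, 2 };

/*
 * Hypothetical helper: map an IMC number to its SNC index.
 *
 * The table describes one instance only; IMC numbers beyond the first
 * instance wrap around with a modulo, as in the patch. nr_instances is an
 * assumed stand-in for perf_cpu_map__nr(pmu->cpus).
 *
 * Returns -1 when imc_num is out of range (the patch returns 0 after a
 * warning instead).
 */
static int imc_snc_lookup(unsigned int imc_num, const uint8_t *map,
			  size_t map_len, size_t nr_instances)
{
	if (imc_num >= map_len * nr_instances)
		return -1;

	return map[imc_num % map_len];
}
```

With nr_instances == 2, IMC 5 wraps to snc2_map[1]. With nr_instances == 1, any IMC number of 4 or higher fails the bounds check for the SNC2 table.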
* [PATCH v5 2/2] perf pmu intel: Adjust cpumaks for sub-NUMA clusters on Emeraldrapids
  2026-04-07 20:38 [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms Chun-Tse Shao
@ 2026-04-07 20:38 ` Chun-Tse Shao
  2026-04-10  4:43   ` Namhyung Kim
  2026-04-10  4:39 ` [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms Namhyung Kim
  1 sibling, 1 reply; 5+ messages in thread
From: Chun-Tse Shao @ 2026-04-07 20:38 UTC
  To: linux-kernel
  Cc: Chun-Tse Shao, Zide Chen, Ian Rogers, peterz, mingo, acme,
	namhyung, mark.rutland, alexander.shishkin, jolsa, adrian.hunter,
	james.clark, ravi.bangoria, linux-perf-users

Similar to GNR [1], Emeraldrapids supports sub-NUMA clusters as well.
Adjust cpumasks as the logic for GNR in [1].

Tested on Emeraldrapids with SNC2 enabled:
  $ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a -- sleep 1

   Performance counter stats for 'system wide':

  N0       30        72125876670      UNC_CHA_CLOCKTICKS
  N0        4         8815163648      UNC_M_CLOCKTICKS
  N1       30        72124958844      UNC_CHA_CLOCKTICKS
  N1        4         8815014974      UNC_M_CLOCKTICKS
  N2       30        72121049022      UNC_CHA_CLOCKTICKS
  N2        4         8814592626      UNC_M_CLOCKTICKS
  N3       30        72117133854      UNC_CHA_CLOCKTICKS
  N3        4         8814012840      UNC_M_CLOCKTICKS

         1.001574118 seconds time elapsed

[1] lore.kernel.org/20250515181417.491401-1-irogers@google.com

Reviewed-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
 tools/perf/arch/x86/util/pmu.c | 56 +++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 938be36ec0f7..3743f5145505 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -30,8 +30,9 @@ static bool x86__is_snc_supported(void)
 
 	if (!checked_if_snc_supported) {
 
-		/* Graniterapids supports SNC configuration. */
+		/* Emeraldrapids Graniterapids support SNC configuration. */
 		static const char *const supported_cpuids[] = {
+			"GenuineIntel-6-CF", /* Emeraldrapids */
 			"GenuineIntel-6-A[DE]", /* Graniterapids */
 		};
 		char *cpuid = get_cpuid_str((struct perf_cpu){0});
@@ -141,23 +142,42 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
 	// Compute the IMC SNC using lookup tables.
 	unsigned int imc_num;
 	int snc_nodes = snc_nodes_per_l3_cache();
-	const u8 snc2_map[] = {1, 1, 0, 0};
-	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2};
-	const u8 *snc_map;
-	size_t snc_map_len;
-
-	switch (snc_nodes) {
-	case 2:
-		snc_map = snc2_map;
-		snc_map_len = ARRAY_SIZE(snc2_map);
-		break;
-	case 3:
-		snc_map = snc3_map;
-		snc_map_len = ARRAY_SIZE(snc3_map);
-		break;
-	default:
-		/* Error or no lookup support for SNC with >3 nodes. */
-		return 0;
+	char *cpuid;
+	static const u8 emr_snc2_map[] = { 0, 0, 1, 1 };
+	static const u8 gnr_snc2_map[] = { 1, 1, 0, 0 };
+	static const u8 snc3_map[] = { 1, 1, 0, 0, 2, 2 };
+	static const u8 *snc_map;
+	static size_t snc_map_len;
+
+	/* snc_map is not inited yet. We only look up once to avoid expensive operations. */
+	if (!snc_map) {
+		switch (snc_nodes) {
+		case 2:
+			cpuid = get_cpuid_str((struct perf_cpu){ 0 });
+			if (cpuid) {
+				if (strcmp_cpuid_str("GenuineIntel-6-CF", cpuid) == 0) {
+					snc_map = emr_snc2_map;
+					snc_map_len = ARRAY_SIZE(emr_snc2_map);
+				} else if (strcmp_cpuid_str("GenuineIntel-6-A[DE]", cpuid) == 0) {
+					snc_map = gnr_snc2_map;
+					snc_map_len = ARRAY_SIZE(gnr_snc2_map);
+				}
+				free(cpuid);
+			}
+			break;
+		case 3:
+			snc_map = snc3_map;
+			snc_map_len = ARRAY_SIZE(snc3_map);
+			break;
+		default:
+			/* Error or no lookup support for SNC with >3 nodes. */
+			return 0;
+		}
+
+		if (!snc_map) {
+			pr_warning("Unexpected: can not find snc map config");
+			return 0;
+		}
 	}
 
 	/* Compute SNC for PMU. */
-- 
2.53.0.1213.gd9a14994de-goog

* Re: [PATCH v5 2/2] perf pmu intel: Adjust cpumaks for sub-NUMA clusters on Emeraldrapids
  2026-04-07 20:38 ` [PATCH v5 2/2] perf pmu intel: Adjust cpumaks for sub-NUMA clusters on Emeraldrapids Chun-Tse Shao
@ 2026-04-10  4:43   ` Namhyung Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Namhyung Kim @ 2026-04-10 4:43 UTC
  To: Chun-Tse Shao
  Cc: linux-kernel, Zide Chen, Ian Rogers, peterz, mingo, acme,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter,
	james.clark, ravi.bangoria, linux-perf-users

On Tue, Apr 07, 2026 at 01:38:43PM -0700, Chun-Tse Shao wrote:
> Similar to GNR [1], Emeraldrapids supports sub-NUMA clusters as well.
> Adjust cpumasks as the logic for GNR in [1].
>
> Tested on Emeraldrapids with SNC2 enabled:
>   $ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a -- sleep 1
>
>    Performance counter stats for 'system wide':
>
>   N0       30        72125876670      UNC_CHA_CLOCKTICKS
>   N0        4         8815163648      UNC_M_CLOCKTICKS
>   N1       30        72124958844      UNC_CHA_CLOCKTICKS
>   N1        4         8815014974      UNC_M_CLOCKTICKS
>   N2       30        72121049022      UNC_CHA_CLOCKTICKS
>   N2        4         8814592626      UNC_M_CLOCKTICKS
>   N3       30        72117133854      UNC_CHA_CLOCKTICKS
>   N3        4         8814012840      UNC_M_CLOCKTICKS
>
>          1.001574118 seconds time elapsed
>
> [1] lore.kernel.org/20250515181417.491401-1-irogers@google.com
>
> Reviewed-by: Zide Chen <zide.chen@intel.com>
> Reviewed-by: Ian Rogers <irogers@google.com>
> Signed-off-by: Chun-Tse Shao <ctshao@google.com>
> ---
>  tools/perf/arch/x86/util/pmu.c | 56 +++++++++++++++++++++++-----------
>  1 file changed, 38 insertions(+), 18 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index 938be36ec0f7..3743f5145505 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -30,8 +30,9 @@ static bool x86__is_snc_supported(void)
>  
>  	if (!checked_if_snc_supported) {
>  
> -		/* Graniterapids supports SNC configuration. */
> +		/* Emeraldrapids Graniterapids support SNC configuration. */
>  		static const char *const supported_cpuids[] = {
> +			"GenuineIntel-6-CF", /* Emeraldrapids */
>  			"GenuineIntel-6-A[DE]", /* Graniterapids */

It'd be great if we can share these string literals..

>  		};
>  		char *cpuid = get_cpuid_str((struct perf_cpu){0});
> @@ -141,23 +142,42 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
>  	// Compute the IMC SNC using lookup tables.
>  	unsigned int imc_num;
>  	int snc_nodes = snc_nodes_per_l3_cache();
> -	const u8 snc2_map[] = {1, 1, 0, 0};
> -	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2};
> -	const u8 *snc_map;
> -	size_t snc_map_len;
> -
> -	switch (snc_nodes) {
> -	case 2:
> -		snc_map = snc2_map;
> -		snc_map_len = ARRAY_SIZE(snc2_map);
> -		break;
> -	case 3:
> -		snc_map = snc3_map;
> -		snc_map_len = ARRAY_SIZE(snc3_map);
> -		break;
> -	default:
> -		/* Error or no lookup support for SNC with >3 nodes. */
> -		return 0;
> +	char *cpuid;
> +	static const u8 emr_snc2_map[] = { 0, 0, 1, 1 };
> +	static const u8 gnr_snc2_map[] = { 1, 1, 0, 0 };
> +	static const u8 snc3_map[] = { 1, 1, 0, 0, 2, 2 };
> +	static const u8 *snc_map;
> +	static size_t snc_map_len;
> +
> +	/* snc_map is not inited yet. We only look up once to avoid expensive operations. */
> +	if (!snc_map) {
> +		switch (snc_nodes) {
> +		case 2:
> +			cpuid = get_cpuid_str((struct perf_cpu){ 0 });
> +			if (cpuid) {
> +				if (strcmp_cpuid_str("GenuineIntel-6-CF", cpuid) == 0) {
> +					snc_map = emr_snc2_map;
> +					snc_map_len = ARRAY_SIZE(emr_snc2_map);
> +				} else if (strcmp_cpuid_str("GenuineIntel-6-A[DE]", cpuid) == 0) {
> +					snc_map = gnr_snc2_map;
> +					snc_map_len = ARRAY_SIZE(gnr_snc2_map);

... in here as well.

Thanks,
Namhyung

> +				}
> +				free(cpuid);
> +			}
> +			break;
> +		case 3:
> +			snc_map = snc3_map;
> +			snc_map_len = ARRAY_SIZE(snc3_map);
> +			break;
> +		default:
> +			/* Error or no lookup support for SNC with >3 nodes. */
> +			return 0;
> +		}
> +
> +		if (!snc_map) {
> +			pr_warning("Unexpected: can not find snc map config");
> +			return 0;
> +		}
>  	}
>  
>  	/* Compute SNC for PMU. */
> -- 
> 2.53.0.1213.gd9a14994de-goog
>

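One possible way to act on the suggestion to share the CPUID string literals is to define each pattern once and keep it next to the data that depends on it. The sketch below is only an illustration under assumptions, not code from the series: the macro names, struct snc_platform, snc_platform_find() and cpuid_matches() are all hypothetical, and cpuid_matches() is a POSIX-regex stand-in for perf's strcmp_cpuid_str(), whose exact matching rules are not reproduced here. The sample CPUID strings in the usage note are made up for illustration.

```c
#include <regex.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical shared names for the two patterns that appear in the patch. */
#define CPUID_INTEL_EMERALDRAPIDS "GenuineIntel-6-CF"
#define CPUID_INTEL_GRANITERAPIDS "GenuineIntel-6-A[DE]"

static const uint8_t emr_snc2_map[] = { 0, 0, 1, 1 };
static const uint8_t gnr_snc2_map[] = { 1, 1, 0, 0 };

/* Hypothetical table entry tying a CPUID pattern to its SNC2 IMC map. */
struct snc_platform {
	const char *cpuid_pattern;
	const uint8_t *snc2_map;
	size_t snc2_map_len;
};

static const struct snc_platform snc_platforms[] = {
	{ CPUID_INTEL_EMERALDRAPIDS, emr_snc2_map, ARRAY_SIZE(emr_snc2_map) },
	{ CPUID_INTEL_GRANITERAPIDS, gnr_snc2_map, ARRAY_SIZE(gnr_snc2_map) },
};

/* Stand-in for strcmp_cpuid_str(): returns 1 if pattern matches within cpuid. */
static int cpuid_matches(const char *pattern, const char *cpuid)
{
	regex_t re;
	int matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;
	matched = regexec(&re, cpuid, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}

/* Return the table entry for cpuid, or NULL when the platform is not listed. */
static const struct snc_platform *snc_platform_find(const char *cpuid)
{
	size_t i;

	if (!cpuid)
		return NULL;
	for (i = 0; i < ARRAY_SIZE(snc_platforms); i++) {
		if (cpuid_matches(snc_platforms[i].cpuid_pattern, cpuid))
			return &snc_platforms[i];
	}
	return NULL;
}
```

In such a layout both the "is SNC supported" check and the SNC2 map selection could consult the same table, for example snc_platform_find("GenuineIntel-6-CF-2") for the support check and its snc2_map field for the lookup, so each pattern is spelled only once.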
* Re: [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms
  2026-04-07 20:38 [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms Chun-Tse Shao
  2026-04-07 20:38 ` [PATCH v5 2/2] perf pmu intel: Adjust cpumaks for sub-NUMA clusters on Emeraldrapids Chun-Tse Shao
@ 2026-04-10  4:39 ` Namhyung Kim
  2026-04-22 20:53   ` Chen, Zide
  1 sibling, 1 reply; 5+ messages in thread
From: Namhyung Kim @ 2026-04-10 4:39 UTC
  To: Chun-Tse Shao, zide.chen
  Cc: linux-kernel, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, james.clark,
	ravi.bangoria, linux-perf-users

Hello,

On Tue, Apr 07, 2026 at 01:38:42PM -0700, Chun-Tse Shao wrote:
> Prepare for supporting more Intel platforms with sub-NUMA clustering by
> generalizing the GNR specific logic.
>
> Signed-off-by: Chun-Tse Shao <ctshao@google.com>
> ---
> v5:
>   Split patch.
>
> v4: lore.kernel.org/20260402205300.1953706-1-ctshao@google.com
>   Rebase.
>
> v3: lore.kernel.org/20260212223942.3832857-1-ctshao@google.com
>   Fix a typo.
>
> v2: lore.kernel.org/20260205232220.1980168-1-ctshao@google.com
>   Split EMR and GNR in the SNC2 IMC cpu map.
>
> v1: lore.kernel.org/20260108184430.1210223-1-ctshao@google.com
>
>  tools/perf/arch/x86/util/pmu.c | 44 +++++++++++++++++++++-------------
>  1 file changed, 27 insertions(+), 17 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index 0661e0f0b02d..938be36ec0f7 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -23,20 +23,28 @@
>  #include "util/env.h"
>  #include "util/header.h"
>  
> -static bool x86__is_intel_graniterapids(void)
> +static bool x86__is_snc_supported(void)
>  {
> -	static bool checked_if_graniterapids;
> -	static bool is_graniterapids;
> +	static bool checked_if_snc_supported;
> +	static bool is_supported;
>  
> -	if (!checked_if_graniterapids) {
> -		const char *graniterapids_cpuid = "GenuineIntel-6-A[DE]";
> +	if (!checked_if_snc_supported) {
> +
> +		/* Graniterapids supports SNC configuration. */
> +		static const char *const supported_cpuids[] = {
> +			"GenuineIntel-6-A[DE]", /* Graniterapids */
> +		};
>  		char *cpuid = get_cpuid_str((struct perf_cpu){0});
>  
> -		is_graniterapids = cpuid && strcmp_cpuid_str(graniterapids_cpuid, cpuid) == 0;
> +		for (size_t i = 0; i < ARRAY_SIZE(supported_cpuids); i++) {
> +			is_supported = cpuid && strcmp_cpuid_str(supported_cpuids[i], cpuid) == 0;
> +			if (is_supported)
> +				break;
> +		}
>  		free(cpuid);
> -		checked_if_graniterapids = true;
> +		checked_if_snc_supported = true;
>  	}
> -	return is_graniterapids;
> +	return is_supported;
>  }
>  
>  static struct perf_cpu_map *read_sysfs_cpu_map(const char *sysfs_path)
> @@ -133,8 +141,8 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
>  	// Compute the IMC SNC using lookup tables.
>  	unsigned int imc_num;
>  	int snc_nodes = snc_nodes_per_l3_cache();
> -	const u8 snc2_map[] = {1, 1, 0, 0, 1, 1, 0, 0};
> -	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2, 1, 1, 0, 0, 2, 2};
> +	const u8 snc2_map[] = {1, 1, 0, 0};
> +	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2};
>  	const u8 *snc_map;
>  	size_t snc_map_len;
>  
> @@ -157,11 +165,12 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
>  		pr_warning("Unexpected: unable to compute IMC number '%s'\n", pmu->name);
>  		return 0;
>  	}
> -	if (imc_num >= snc_map_len) {
> +	if (imc_num >= snc_map_len * perf_cpu_map__nr(pmu->cpus)) {
>  		pr_warning("Unexpected IMC %d for SNC%d mapping\n", imc_num, snc_nodes);
>  		return 0;

Like sashiko said, I'm curious if it'd work well on 1-socket machine
which may have the same number of uncore IMC PMUs.

Zide, can you confirm?

Thanks,
Namhyung

>  	}
> -	return snc_map[imc_num];
> +
> +	return snc_map[imc_num % snc_map_len];
>  }
>  
>  static int uncore_cha_imc_compute_cpu_adjust(int pmu_snc)
> @@ -201,7 +210,7 @@ static int uncore_cha_imc_compute_cpu_adjust(int pmu_snc)
>  	return cpu_adjust[pmu_snc];
>  }
>  
> -static void gnr_uncore_cha_imc_adjust_cpumask_for_snc(struct perf_pmu *pmu, bool cha)
> +static void uncore_cha_imc_adjust_cpumask_for_snc(struct perf_pmu *pmu, bool cha)
>  {
>  	// With sub-NUMA clustering (SNC) there is a NUMA node per SNC in the
>  	// topology. For example, a two socket graniterapids machine may be set
> @@ -301,11 +310,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>  				pmu->mem_events = perf_mem_events_intel_aux;
>  			else
>  				pmu->mem_events = perf_mem_events_intel;
> -		} else if (x86__is_intel_graniterapids()) {
> +		} else if (x86__is_snc_supported()) {
>  			if (strstarts(pmu->name, "uncore_cha_"))
> -				gnr_uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/true);
> -			else if (strstarts(pmu->name, "uncore_imc_"))
> -				gnr_uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/false);
> +				uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/true);
> +			else if (strstarts(pmu->name, "uncore_imc_") &&
> +				 !strstarts(pmu->name, "uncore_imc_free_running"))
> +				uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/false);
>  		}
>  	}
>  }
> -- 
> 2.53.0.1213.gd9a14994de-goog
>

* Re: [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms
  2026-04-10  4:39 ` [PATCH v5 1/2] perf pmu intel: Generalize SNC cpumask adjustment for multiple platforms Namhyung Kim
@ 2026-04-22 20:53   ` Chen, Zide
  0 siblings, 0 replies; 5+ messages in thread
From: Chen, Zide @ 2026-04-22 20:53 UTC
  To: Namhyung Kim, Chun-Tse Shao
  Cc: linux-kernel, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, james.clark,
	ravi.bangoria, linux-perf-users

On 4/9/2026 9:39 PM, Namhyung Kim wrote:
> Hello,
>
> On Tue, Apr 07, 2026 at 01:38:42PM -0700, Chun-Tse Shao wrote:
>> Prepare for supporting more Intel platforms with sub-NUMA clustering by
>> generalizing the GNR specific logic.
>>
>> Signed-off-by: Chun-Tse Shao <ctshao@google.com>
>> ---
>> v5:
>>   Split patch.
>>
>> v4: lore.kernel.org/20260402205300.1953706-1-ctshao@google.com
>>   Rebase.
>>
>> v3: lore.kernel.org/20260212223942.3832857-1-ctshao@google.com
>>   Fix a typo.
>>
>> v2: lore.kernel.org/20260205232220.1980168-1-ctshao@google.com
>>   Split EMR and GNR in the SNC2 IMC cpu map.
>>
>> v1: lore.kernel.org/20260108184430.1210223-1-ctshao@google.com
>>
>>  tools/perf/arch/x86/util/pmu.c | 44 +++++++++++++++++++++-------------
>>  1 file changed, 27 insertions(+), 17 deletions(-)
>>
>> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
>> index 0661e0f0b02d..938be36ec0f7 100644
>> --- a/tools/perf/arch/x86/util/pmu.c
>> +++ b/tools/perf/arch/x86/util/pmu.c
>> @@ -23,20 +23,28 @@
>>  #include "util/env.h"
>>  #include "util/header.h"
>>  
>> -static bool x86__is_intel_graniterapids(void)
>> +static bool x86__is_snc_supported(void)
>>  {
>> -	static bool checked_if_graniterapids;
>> -	static bool is_graniterapids;
>> +	static bool checked_if_snc_supported;
>> +	static bool is_supported;
>>  
>> -	if (!checked_if_graniterapids) {
>> -		const char *graniterapids_cpuid = "GenuineIntel-6-A[DE]";
>> +	if (!checked_if_snc_supported) {
>> +
>> +		/* Graniterapids supports SNC configuration. */
>> +		static const char *const supported_cpuids[] = {
>> +			"GenuineIntel-6-A[DE]", /* Graniterapids */
>> +		};
>>  		char *cpuid = get_cpuid_str((struct perf_cpu){0});
>>  
>> -		is_graniterapids = cpuid && strcmp_cpuid_str(graniterapids_cpuid, cpuid) == 0;
>> +		for (size_t i = 0; i < ARRAY_SIZE(supported_cpuids); i++) {
>> +			is_supported = cpuid && strcmp_cpuid_str(supported_cpuids[i], cpuid) == 0;
>> +			if (is_supported)
>> +				break;
>> +		}
>>  		free(cpuid);
>> -		checked_if_graniterapids = true;
>> +		checked_if_snc_supported = true;
>>  	}
>> -	return is_graniterapids;
>> +	return is_supported;
>>  }
>>  
>>  static struct perf_cpu_map *read_sysfs_cpu_map(const char *sysfs_path)
>> @@ -133,8 +141,8 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
>>  	// Compute the IMC SNC using lookup tables.
>>  	unsigned int imc_num;
>>  	int snc_nodes = snc_nodes_per_l3_cache();
>> -	const u8 snc2_map[] = {1, 1, 0, 0, 1, 1, 0, 0};
>> -	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2, 1, 1, 0, 0, 2, 2};
>> +	const u8 snc2_map[] = {1, 1, 0, 0};
>> +	const u8 snc3_map[] = {1, 1, 0, 0, 2, 2};
>>  	const u8 *snc_map;
>>  	size_t snc_map_len;
>>  
>> @@ -157,11 +165,12 @@ static int uncore_imc_snc(struct perf_pmu *pmu)
>>  		pr_warning("Unexpected: unable to compute IMC number '%s'\n", pmu->name);
>>  		return 0;
>>  	}
>> -	if (imc_num >= snc_map_len) {
>> +	if (imc_num >= snc_map_len * perf_cpu_map__nr(pmu->cpus)) {
>>  		pr_warning("Unexpected IMC %d for SNC%d mapping\n", imc_num, snc_nodes);
>>  		return 0;
>
> Like sashiko said, I'm curious if it'd work well on 1-socket machine
> which may have the same number of uncore IMC PMUs.
>
> Zide, can you confirm?

On 1-socket, 1-die systems, the IMC controllers may be the same as on
multi-die systems. Theoretically, this could cause issues if NUMA is
enabled, but I am not aware of any platforms that support SNC on
single-die GNR/EMR SKUs.

I checked on a single-die GNR WS: SNC is not supported, so
snc_nodes == 1 and the cpumask adjustment is correctly skipped.

>
> Thanks,
> Namhyung
>
>>  	}
>> -	return snc_map[imc_num];
>> +
>> +	return snc_map[imc_num % snc_map_len];
>>  }
>>  
>>  static int uncore_cha_imc_compute_cpu_adjust(int pmu_snc)
>> @@ -201,7 +210,7 @@ static int uncore_cha_imc_compute_cpu_adjust(int pmu_snc)
>>  	return cpu_adjust[pmu_snc];
>>  }
>>  
>> -static void gnr_uncore_cha_imc_adjust_cpumask_for_snc(struct perf_pmu *pmu, bool cha)
>> +static void uncore_cha_imc_adjust_cpumask_for_snc(struct perf_pmu *pmu, bool cha)
>>  {
>>  	// With sub-NUMA clustering (SNC) there is a NUMA node per SNC in the
>>  	// topology. For example, a two socket graniterapids machine may be set
>> @@ -301,11 +310,12 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
>>  				pmu->mem_events = perf_mem_events_intel_aux;
>>  			else
>>  				pmu->mem_events = perf_mem_events_intel;
>> -		} else if (x86__is_intel_graniterapids()) {
>> +		} else if (x86__is_snc_supported()) {
>>  			if (strstarts(pmu->name, "uncore_cha_"))
>> -				gnr_uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/true);
>> -			else if (strstarts(pmu->name, "uncore_imc_"))
>> -				gnr_uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/false);
>> +				uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/true);
>> +			else if (strstarts(pmu->name, "uncore_imc_") &&
>> +				 !strstarts(pmu->name, "uncore_imc_free_running"))
>> +				uncore_cha_imc_adjust_cpumask_for_snc(pmu, /*cha=*/false);
>>  		}
>>  	}
>>  }
>> -- 
>> 2.53.0.1213.gd9a14994de-goog
>>
