From: Zhao Liu <zhao1.liu@linux.intel.com>
To: "Moger, Babu" <babu.moger@amd.com>
Cc: "Eduardo Habkost" <eduardo@habkost.net>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Yanan Wang" <wangyanan55@huawei.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
qemu-devel@nongnu.org, "Zhenyu Wang" <zhenyu.z.wang@intel.com>,
"Xiaoyao Li" <xiaoyao.li@intel.com>,
"Zhao Liu" <zhao1.liu@intel.com>,
"Zhuocheng Ding" <zhuocheng.ding@intel.com>
Subject: Re: [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo
Date: Fri, 4 Aug 2023 17:05:32 +0800
Message-ID: <ZMy/XAhuXJSZrwLk@liuzhao-OptiPlex-7080>
In-Reply-To: <cc472f47-2cb0-3cb8-f4c4-6f6db7bea782@amd.com>
Hi Babu,
On Wed, Aug 02, 2023 at 12:25:07PM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 12:25:07 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 08/17] i386: Support modules_per_die in
> X86CPUTopoInfo
>
> Hi Zhao,
>
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> >
> > Support module level in i386 cpu topology structure "X86CPUTopoInfo".
> >
> > Since x86 does not yet support the "clusters" parameter in "-smp",
> > X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the
> > module level width in APIC ID, which can be calculated by
> > "apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0
> > for now, so we can directly add APIC ID related helpers to support
> > module level parsing.
> >
> > At present, we don't expose module level in CPUID.1FH because currently
> > linux (v6.4-rc1) doesn't support module level. And exposing module and
> > die levels at the same time in CPUID.1FH will cause linux to calculate
> > the wrong die_id. The module level should not be exposed until a real
> > machine has the module level in CPUID.1FH.
> >
> > In addition, update topology structure in test-x86-topo.c.
> >
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > Changes since v1:
> > * Include module level related helpers (apicid_module_width() and
> > apicid_module_offset()) in this patch. (Yanan)
> > ---
> > hw/i386/x86.c | 3 ++-
> > include/hw/i386/topology.h | 22 +++++++++++++++----
> > target/i386/cpu.c | 12 ++++++----
> > tests/unit/test-x86-topo.c | 45 ++++++++++++++++++++------------------
> > 4 files changed, 52 insertions(+), 30 deletions(-)
> >
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index 4efc390905ff..a552ae8bb4a8 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -72,7 +72,8 @@ static void init_topo_info(X86CPUTopoInfo *topo_info,
> > MachineState *ms = MACHINE(x86ms);
> >
> > topo_info->dies_per_pkg = ms->smp.dies;
> > - topo_info->cores_per_die = ms->smp.cores;
> > + topo_info->modules_per_die = ms->smp.clusters;
>
> It is confusing. You said in the previous patch that using clusters for
> x86 is going to cause compatibility issues.
The compatibility issue means that the default L2 cache topology should stay
"1 L2 cache per core", and we shouldn't change this default setting.
If we want "1 L2 cache per module", then we need another way to do that
(which is what x-l2-cache-topo is for).
Since "cluster" was originally introduced into QEMU to help define the L2
cache topology, I explained there that we can't just change the default
topology level of the L2 cache.
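For example (just a sketch of the intended usage; the exact accepted
property values are defined in patch 17 of this series, so "cluster" here
is an assumption), the L2 cache topology could then be raised explicitly:

    -cpu host,x-l2-cache-topo=cluster

while the default, without the property, stays at "1 L2 cache per core".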
> Why is "clusters" used to initialize modules_per_die?
"cluster" v.s. "module" just like "socket" v.s. "package".
The former is the generic name in smp code, while the latter is the more
accurate naming in the i386 context.
>
> Why not define a new field "modules" (just like clusters) in smp and use it
> for x86? Is it going to be a problem?
In that case (just adding a new "modules" parameter to smp), the "clusters"
parameter of smp would not be useful for i386, and different architectures
would end up with different smp parameters, which is not general enough. I
think it's clearest to have a common topology hierarchy in QEMU.
"cluster" was originally introduced into QEMU by Arm. From Yanan's
explanation [1], it is a CPU topology level above the core level, and the L2
cache is often shared at this level as well.
This description is very similar to i386's module, so I think we can align
cluster with module instead of introducing a new "module" parameter in smp,
just like "socket" in smp is the same as "package" in i386.
[1]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
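As an illustration (a hypothetical command line; the machine type and the
counts are only examples), with this series applied a guest could be started
with:

    qemu-system-x86_64 -machine q35 \
        -smp 16,sockets=1,dies=1,clusters=2,cores=4,threads=2

With the init_topo_info() change above, this yields modules_per_die = 2 and
cores_per_module = 4, so apicid_module_width() is 1 bit and
apicid_core_width() is 2 bits; "clusters" feeds the module level in the same
way that "sockets" feeds the package level.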
> Maybe I am not clear here. I have yet to understand all the other changes.
>
I hope the explanation above answers your question.
Thanks,
Zhao
> Thanks
> Babu
>
> > + topo_info->cores_per_module = ms->smp.cores;
> > topo_info->threads_per_core = ms->smp.threads;
> > }
> >
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 5a19679f618b..c807d3811dd3 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -56,7 +56,8 @@ typedef struct X86CPUTopoIDs {
> >
> > typedef struct X86CPUTopoInfo {
> > unsigned dies_per_pkg;
> > - unsigned cores_per_die;
> > + unsigned modules_per_die;
> > + unsigned cores_per_module;
> > unsigned threads_per_core;
> > } X86CPUTopoInfo;
> >
> > @@ -77,7 +78,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> > /* Bit width of the Core_ID field */
> > static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> > {
> > - return apicid_bitwidth_for_count(topo_info->cores_per_die);
> > + return apicid_bitwidth_for_count(topo_info->cores_per_module);
> > +}
> > +
> > +/* Bit width of the Module_ID (cluster ID) field */
> > +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
> > +{
> > + return apicid_bitwidth_for_count(topo_info->modules_per_die);
> > }
> >
> > /* Bit width of the Die_ID field */
> > @@ -92,10 +99,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> > return apicid_smt_width(topo_info);
> > }
> >
> > +/* Bit offset of the Module_ID (cluster ID) field */
> > +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
> > +{
> > + return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> > +}
> > +
> > /* Bit offset of the Die_ID field */
> > static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
> > {
> > - return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> > + return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
> > }
> >
> > /* Bit offset of the Pkg_ID (socket ID) field */
> > @@ -127,7 +140,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> > X86CPUTopoIDs *topo_ids)
> > {
> > unsigned nr_dies = topo_info->dies_per_pkg;
> > - unsigned nr_cores = topo_info->cores_per_die;
> > + unsigned nr_cores = topo_info->cores_per_module *
> > + topo_info->modules_per_die;
> > unsigned nr_threads = topo_info->threads_per_core;
> >
> > topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 8a9fd5682efc..d6969813ee02 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -339,7 +339,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >
> > /* L3 is shared among multiple cores */
> > if (cache->level == 3) {
> > - l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> > + l3_threads = topo_info->modules_per_die *
> > + topo_info->cores_per_module *
> > + topo_info->threads_per_core;
> > *eax |= (l3_threads - 1) << 14;
> > } else {
> > *eax |= ((topo_info->threads_per_core - 1) << 14);
> > @@ -6012,10 +6014,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> > uint32_t cpus_per_pkg;
> >
> > topo_info.dies_per_pkg = env->nr_dies;
> > - topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> > + topo_info.modules_per_die = env->nr_modules;
> > + topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
> > topo_info.threads_per_core = cs->nr_threads;
> >
> > - cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
> > + cores_per_pkg = topo_info.cores_per_module * topo_info.modules_per_die *
> > + topo_info.dies_per_pkg;
> > cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
> >
> > /* Calculate & apply limits for different index ranges */
> > @@ -6286,7 +6290,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> > break;
> > case 1:
> > *eax = apicid_die_offset(&topo_info);
> > - *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> > + *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
> > *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> > break;
> > case 2:
> > diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
> > index 2b104f86d7c2..f21b8a5d95c2 100644
> > --- a/tests/unit/test-x86-topo.c
> > +++ b/tests/unit/test-x86-topo.c
> > @@ -30,13 +30,16 @@ static void test_topo_bits(void)
> > {
> > X86CPUTopoInfo topo_info = {0};
> >
> > - /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> > - topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > + /*
> > + * simple tests for 1 thread per core, 1 core per module,
> > + * 1 module per die, 1 die per package
> > + */
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
> > g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
> > g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> >
> > - topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > @@ -45,39 +48,39 @@ static void test_topo_bits(void)
> >
> > /* Test field width calculation for multiple values
> > */
> > - topo_info = (X86CPUTopoInfo) {1, 1, 2};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> > - topo_info = (X86CPUTopoInfo) {1, 1, 3};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > - topo_info = (X86CPUTopoInfo) {1, 1, 4};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> >
> > - topo_info = (X86CPUTopoInfo) {1, 1, 14};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > - topo_info = (X86CPUTopoInfo) {1, 1, 15};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > - topo_info = (X86CPUTopoInfo) {1, 1, 16};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > - topo_info = (X86CPUTopoInfo) {1, 1, 17};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
> >
> >
> > - topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> > g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > - topo_info = (X86CPUTopoInfo) {1, 31, 2};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
> > g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > - topo_info = (X86CPUTopoInfo) {1, 32, 2};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
> > g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > - topo_info = (X86CPUTopoInfo) {1, 33, 2};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
> > g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
> >
> > - topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> > g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> > - topo_info = (X86CPUTopoInfo) {2, 30, 2};
> > + topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
> > g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> > - topo_info = (X86CPUTopoInfo) {3, 30, 2};
> > + topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
> > g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> > - topo_info = (X86CPUTopoInfo) {4, 30, 2};
> > + topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
> > g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> >
> > /* build a weird topology and see if IDs are calculated correctly
> > @@ -85,18 +88,18 @@ static void test_topo_bits(void)
> >
> > /* This will use 2 bits for thread ID and 3 bits for core ID
> > */
> > - topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> > g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
> > g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
> > g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
> >
> > - topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> >
> > - topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > + topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
> > (1 << 2) | 0);
> > g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
>
> --
> Thanks
> Babu Moger