From: Zhao Liu <zhao1.liu@linux.intel.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Eduardo Habkost" <eduardo@habkost.net>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Yanan Wang" <wangyanan55@huawei.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Marcelo Tosatti" <mtosatti@redhat.com>,
qemu-devel@nongnu.org, kvm@vger.kernel.org,
"Babu Moger" <babu.moger@amd.com>,
"Xiaoyao Li" <xiaoyao.li@intel.com>,
"Zhenyu Wang" <zhenyu.z.wang@intel.com>,
"Zhuocheng Ding" <zhuocheng.ding@intel.com>,
"Yongwei Ma" <yongwei.ma@intel.com>,
"Zhao Liu" <zhao1.liu@intel.com>
Subject: Re: [PATCH v8 00/21] Introduce smp.modules for x86 in QEMU
Date: Fri, 16 Feb 2024 00:56:34 +0800
Message-ID: <Zc5CQlA20gTePwu6@intel.com>
In-Reply-To: <ZcUG0Uc8KylEQhUW@redhat.com>
Hi Daniel,
On Thu, Feb 08, 2024 at 04:52:33PM +0000, Daniel P. Berrangé wrote:
> Date: Thu, 8 Feb 2024 16:52:33 +0000
> From: "Daniel P. Berrangé" <berrange@redhat.com>
> Subject: Re: [PATCH v8 00/21] Introduce smp.modules for x86 in QEMU
>
> On Fri, Feb 02, 2024 at 12:10:58AM +0800, Zhao Liu wrote:
> > Hi Daniel,
> >
> > On Thu, Feb 01, 2024 at 09:21:48AM +0000, Daniel P. Berrangé wrote:
> > > Date: Thu, 1 Feb 2024 09:21:48 +0000
> > > From: "Daniel P. Berrangé" <berrange@redhat.com>
> > > Subject: Re: [PATCH v8 00/21] Introduce smp.modules for x86 in QEMU
> > >
> > > On Thu, Feb 01, 2024 at 10:57:32AM +0800, Zhao Liu wrote:
> > > > Hi Daniel,
> > > >
> > > > On Wed, Jan 31, 2024 at 10:28:42AM +0000, Daniel P. Berrangé wrote:
> > > > > Date: Wed, 31 Jan 2024 10:28:42 +0000
> > > > > From: "Daniel P. Berrangé" <berrange@redhat.com>
> > > > > Subject: Re: [PATCH v8 00/21] Introduce smp.modules for x86 in QEMU
> > > > >
> > > > > On Wed, Jan 31, 2024 at 06:13:29PM +0800, Zhao Liu wrote:
> > > > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > >
> > > > [snip]
> > > >
> > > > > > However, after digging deeper into the description and use cases of
> > > > > > cluster in the device tree [3], I realized that the essential
> > > > > > difference between clusters and modules is that cluster is an extremely
> > > > > > abstract concept:
> > > > > > * Cluster supports nesting though currently QEMU doesn't support
> > > > > > nested cluster topology. However, modules will not support nesting.
> > > > > > * Also due to nesting, there is great flexibility in sharing resources
> > > > > > on clusters, rather than narrowing cluster down to sharing L2 (and
> > > > > > L3 tags) as the lowest topology level that contains cores.
> > > > > > * Flexible nesting of cluster allows it to correspond to any level
> > > > > > between the x86 package and core.
> > > > > >
> > > > > > Based on the above considerations, and in order to eliminate the naming
> > > > > > confusion caused by the mapping between general cluster and x86 module
> > > > > > in v7, we now formally introduce smp.modules as the new topology level.
> > > > >
> > > > > What is the Linux kernel calling this topology level on x86 ?
> > > > > It will be pretty unfortunate if Linux and QEMU end up with
> > > > > different names for the same topology level.
> > > > >
> > > >
> > > > Now Intel's engineers in the Linux kernel are starting to use "module"
> > > > to refer to this layer of topology [4] to avoid confusion, where
> > > > previously the scheduler developers referred to the shared L2 hierarchy
> > > > collectively as "cluster".
> > > >
> > > > Looking at it this way, it makes more sense for QEMU to use the
> > > > "module" for x86.
> > >
> > > I was thinking specifically about what Linux calls this topology when
> > > exposing it in sysfs and /proc/cpuinfo. AFAICT, it looks like it is
> > > called 'clusters' in this context, and so this is the terminology that
> > > applications and users are going to expect.
> >
> > The cluster related topology information under "/sys/devices/system/cpu/
> > cpu*/topology" indicates the L2 cache topology (CPUID[0x4]), not module
> > level CPU topology (CPUID[0x1f]).
> >
> > So far, kernel hasn't exposed module topology related sysfs. But we will
> > add new "module" related information in sysfs. The relevant patches are
> > ready internally, but not posted yet.
> >
> > In the future, we will use "module" in sysfs to indicate module level CPU
> > topology, and "cluster" will be only used to refer to the l2 cache domain
> > as it is now.
>
> So, if they're distinct concepts both relevant to x86 CPUs, then from
> the QEMU POV, should this patch series be changing the -smp arg to
> allowing configuration of both 'clusters' and 'modules' for x86 ?
Although the previous versions used the "clusters" parameter, they, like
the current "modules" version, only add one CPU topology level to the
x86 CPU.
>
> An earlier version of this series just supported 'clusters', and this
> changed to 'modules', but your description of Linux reporting both
> suggests QEMU would need both.
>
As for cluster support on x86, i.e. L2 cache topology support, we want
to introduce a way of configuring cache topology that is separate from
CPU topology, and to avoid using "cluster" as a cache topology name.
This avoids confusion with -smp "clusters", which is a CPU topology
level: ARM also treats "cluster" in QEMU as a CPU topology level rather
than a cache topology level.
BTW, may I ask for your advice on cache topology? Currently, I can
think of 2 options:
1. Hack -smp as:
-smp cpus=4,sockets=2,cores=2,threads=1, \
l3-cache=socket,l2-cache=core,l1-i-cache=core,l1-d-cache=core
With this approach, I just parse the extended -smp and store the cache
topology in a structure like this:
typedef struct CacheTopology {
    CPUTopoLevel l1i;
    CPUTopoLevel l1d;
    CPUTopoLevel l2;
    CPUTopoLevel l3;
} CacheTopology;
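To make the parsing step concrete, here is a minimal sketch of how one
"lN-cache=<level>" pair from the extended -smp could be stored in that
structure. The CPUTopoLevel enumerators and the helper names
topo_level_from_name() and cache_topo_set_option() are made up for
illustration only; they are not taken from the series or from QEMU.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the topology level enum; values are assumptions. */
typedef enum CPUTopoLevel {
    CPU_TOPO_LEVEL_INVALID = -1,
    CPU_TOPO_LEVEL_THREAD,
    CPU_TOPO_LEVEL_CORE,
    CPU_TOPO_LEVEL_MODULE,
    CPU_TOPO_LEVEL_DIE,
    CPU_TOPO_LEVEL_SOCKET,
} CPUTopoLevel;

typedef struct CacheTopology {
    CPUTopoLevel l1i;
    CPUTopoLevel l1d;
    CPUTopoLevel l2;
    CPUTopoLevel l3;
} CacheTopology;

/* Map a level name such as "core" to its enum value (hypothetical helper). */
static CPUTopoLevel topo_level_from_name(const char *name)
{
    static const char *const names[] = {
        "thread", "core", "module", "die", "socket",
    };

    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcmp(name, names[i]) == 0) {
            return (CPUTopoLevel)i;
        }
    }
    return CPU_TOPO_LEVEL_INVALID;
}

/* True if the first klen bytes of opt are exactly the key `name`. */
static int key_is(const char *opt, size_t klen, const char *name)
{
    return strlen(name) == klen && strncmp(opt, name, klen) == 0;
}

/*
 * Apply one "key=value" pair, e.g. "l2-cache=core".
 * Returns 0 on success, -1 on an unknown key or level name.
 */
static int cache_topo_set_option(CacheTopology *topo, const char *opt)
{
    const char *eq = strchr(opt, '=');
    CPUTopoLevel level;
    size_t klen;

    if (!eq) {
        return -1;
    }
    klen = (size_t)(eq - opt);
    level = topo_level_from_name(eq + 1);
    if (level == CPU_TOPO_LEVEL_INVALID) {
        return -1;
    }

    if (key_is(opt, klen, "l1-i-cache")) {
        topo->l1i = level;
    } else if (key_is(opt, klen, "l1-d-cache")) {
        topo->l1d = level;
    } else if (key_is(opt, klen, "l2-cache")) {
        topo->l2 = level;
    } else if (key_is(opt, klen, "l3-cache")) {
        topo->l3 = level;
    } else {
        return -1;
    }
    return 0;
}
```

Calling cache_topo_set_option() once per comma-separated pair of the
example above would leave l3 at the socket level and the other three
caches at the core level.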
This approach only covers SMP (symmetric) cache topology. For
heterogeneous/hybrid cache topology, I think it can be expanded on top
of the QOM CPU topology [4] as:
-accel kvm -cpu host \
-device cpu-socket,id=sock0 \
-device cpu-die,id=die0,parent=sock0 \
-device cpu-module,id=module0,parent=die0 \
-device cpu-module,id=module1,parent=die0 \
-device cpu-core,id=core0,parent=module0,nr-threads=2 \
-device cpu-core,id=core1,parent=module1,nr-threads=1 \
-device cpu-core,id=core2,parent=module1,nr-threads=1 \
-device cache,id=cache0,parent=die0,level=3,type=unified \
-device cache,id=cache1,parent=core0,level=2,type=unified \
-device cache,id=cache2,parent=core0,level=1,type=data \
-device cache,id=cache3,parent=core0,level=1,type=inst \
-device cache,id=cache4,parent=module1,level=2,type=unified \
-device cache,id=cache5,parent=core1,level=1,type=data \
-device cache,id=cache6,parent=core1,level=1,type=inst \
-device cache,id=cache7,parent=core2,level=1,type=data \
-device cache,id=cache8,parent=core2,level=1,type=inst
In module0, the L2 (x86's cluster) is shared at the core level (by core0
only). In module1, the L2 is shared by core1 and core2 (at the module
level).
[4]: https://lore.kernel.org/qemu-devel/20231130144203.2307629-1-zhao1.liu@linux.intel.com/
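As a rough illustration of how the sharing domain falls out of the
"parent" links in the example above, the sketch below walks a small
parent table and counts the cores under a cache's parent node. The
TopoNode type, the table and both helpers are hypothetical and only
mirror the -device lines; none of this is code from the QOM topology
series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Node handles, in the same order as the table below. */
enum { SOCK0, DIE0, MODULE0, MODULE1, CORE0, CORE1, CORE2 };

typedef struct TopoNode {
    const char *id;
    int parent;     /* index of the parent node, -1 for the root */
} TopoNode;

/* Mirrors the cpu-socket/cpu-die/cpu-module/cpu-core devices above. */
static const TopoNode nodes[] = {
    [SOCK0]   = { "sock0",   -1 },
    [DIE0]    = { "die0",    SOCK0 },
    [MODULE0] = { "module0", DIE0 },
    [MODULE1] = { "module1", DIE0 },
    [CORE0]   = { "core0",   MODULE0 },
    [CORE1]   = { "core1",   MODULE1 },
    [CORE2]   = { "core2",   MODULE1 },
};

/* True if `node` is `ancestor` itself or sits below it in the tree. */
static bool node_is_under(int node, int ancestor)
{
    for (; node >= 0; node = nodes[node].parent) {
        if (node == ancestor) {
            return true;
        }
    }
    return false;
}

/* Number of cores that share a cache whose parent is `cache_parent`. */
static int count_cores_sharing(int cache_parent)
{
    static const int cores[] = { CORE0, CORE1, CORE2 };
    int n = 0;

    for (size_t i = 0; i < sizeof(cores) / sizeof(cores[0]); i++) {
        if (node_is_under(cores[i], cache_parent)) {
            n++;
        }
    }
    return n;
}
```

With this table, the L2 attached to core0 has a single sharer, the L2
attached to module1 has two (core1 and core2), and the L3 attached to
die0 is shared by all three cores.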
2. But recently I realized there may be another option: introduce a new
option "-cache", similar to "-numa", to configure cache topology.
In "-cache", we could accept the CPU list as the parameter:
-cache cache,cache-id=0,level=2,type=unified,cpus=0-1 \
-cache cache,cache-id=1,level=2,type=unified,cpus=2-3
or CPU topology ids as the parameters:
-cache cache,cache-id=0,level=2,type=unified \
-cache cache,cache-id=1,level=2,type=unified \
-cache cpu,cache-id=0,socket-id=0,die-id=0,module-id=0,core-id=0 \
-cache cpu,cache-id=1,socket-id=0,die-id=0,module-id=1
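For the CPU-list form, a minimal sketch of what parsing could look like
is below: each "cpus=A-B" value becomes a bitmask, and two caches of the
same level are rejected if their masks overlap. CacheEntry,
parse_cpu_range() and cache_entries_conflict() are names I made up for
this illustration, and the 64-CPU limit is an arbitrary simplification.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one "-cache cache,..." option. */
typedef struct CacheEntry {
    unsigned cache_id;
    unsigned level;
    uint64_t cpu_mask;   /* bit N set means CPU N uses this cache */
} CacheEntry;

/*
 * Parse "A" or "A-B" (decimal, A <= B < 64) into a bitmask.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_cpu_range(const char *s, uint64_t *mask)
{
    char *end;
    unsigned long first, last;

    if (*s < '0' || *s > '9') {
        return -1;
    }
    first = strtoul(s, &end, 10);
    last = first;
    if (*end == '-') {
        const char *p = end + 1;

        if (*p < '0' || *p > '9') {
            return -1;
        }
        last = strtoul(p, &end, 10);
    }
    if (*end != '\0' || first > last || last >= 64) {
        return -1;
    }

    *mask = 0;
    for (unsigned long i = first; i <= last; i++) {
        *mask |= UINT64_C(1) << i;
    }
    return 0;
}

/* Two caches of the same level must not claim the same CPU. */
static bool cache_entries_conflict(const CacheEntry *a, const CacheEntry *b)
{
    return a->level == b->level && (a->cpu_mask & b->cpu_mask) != 0;
}
```

For the two L2 entries above, "0-1" and "2-3" give disjoint masks, so
the check passes; giving both entries cpus=0-1 would be flagged as a
conflict.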
Hmmm, Daniel, which of the above two options do you prefer?
Thanks,
Zhao