* [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree
From: Zhao Liu @ 2024-09-19 6:11 UTC
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
Hi all,
This is our v2 RFC, which tries to introduce hybrid (aka heterogeneous)
CPU topology into QEMU. This series focuses on heterogeneous CPUs with
the same ISA, like the Intel client hybrid architecture.
Compared with v1 [1], v2 completely redesigns the topology architecture
and, based on the QOM (CPU) topology [2], allows the CPU topology tree
to be customized with -device from the CLI.
For example, a PC machine with 1 Intel Core (P-core) with 2 threads and
2 Intel Atoms (E-cores) with a single thread each can be defined like:
-smp maxsockets=1,maxdies=1,maxmodules=2,maxcores=2,maxthreads=2 \
-machine pc,custom-topo=on \
-device cpu-socket,id=sock0 \
-device cpu-die,id=die0,bus=sock0 \
-device cpu-module,id=mod0,bus=die0 \
-device cpu-module,id=mod1,bus=die0 \
-device x86-intel-core,id=core0,bus=mod0 \
-device x86-intel-atom,id=core1,bus=mod1 \
-device x86-intel-atom,id=core2,bus=mod1 \
-device host-x86_64-cpu,id=cpu0,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=0 \
-device host-x86_64-cpu,id=cpu1,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=1 \
-device host-x86_64-cpu,id=cpu2,socket-id=0,die-id=0,module-id=1,core-id=0,thread-id=0 \
-device host-x86_64-cpu,id=cpu3,socket-id=0,die-id=0,module-id=1,core-id=1,thread-id=0
The example above differs from the v1 qom-topo example [3] in a few ways:
* new max* parameters in -smp,
* a new custom-topo option in -machine, and
* no "parent" parameter to create child<>; instead, the bus option
specifies the bus of the parent topology device.
The design of this command line is tied to the machine/CPU
initialization process; the reasons are explained in more detail below
(please refer to section 2, "Design Overview").
This series is based on the previous v2 QOM topology series [2].
Feedback and comments are welcome!
1. Background
=============
For why we need hybrid CPU topology, please refer to the cover letter of
the QOM-topo v2 RFC [2], "What's the Problem?" :-).
With the CPU topology devices introduced by the QOM-topo v2 RFC [2], we
now have the chance to let the user customize the CPU topology from the
CLI.
There is no need to deliberately emphasize hybrid topology here: the
custom topology can be either SMP or hybrid, and custom-topo is generic
and flexible enough for both.
2. Design Overview
==================
2.1. How to Initialize possible_cpus[] for Custom Topology from CLI
===================================================================
At present (QEMU master and QOM topo v2 [2]), possible_cpus[] is
initialized with -smp parameters.
For user custom topology, a previous attempt (in QOM topo v1 [3]) created
topology devices (CPU/core/module/die...) from the CLI in advance, built
a complete topology tree, and then used the global topology information
(something similar to smp.max_cpus/threads/cores/sockets...) to create
possible_cpus[] and initialize archid (for x86, the APIC ID).
Figure 1: Previous attempt to create topology devices before
possible_cpus[] initialization (in QOM-topo v1 [3])
qmp_x_exit_preconfig()
│
├───(?)qemu_create_cli_base_devices()
│ │
│ └───(?)Create CPU topology devices
│ including CPUs
│
├─── qemu_init_board()
│ │
│ └── machine_run_board_init()
│ │
│ └─── machine_class->init(machine)
│ │
│ └─── x86_cpus_init()
│ │
│ └─── mc->possible_cpu_arch_ids(ms)
│
└─── qemu_create_cli_devices()
The "(?)" marked qemu_create_cli_base_devices() (added in the previous
approach) would create the topology devices.
But this approach has a drawback: once the topology tree is complete,
it is impossible to hotplug further topology devices at the levels
above possible_cpus[]. This is because the length of possible_cpus[] is
computed from those higher topology levels, and this length cannot
change at runtime.
This would prevent future support for, and exploration of,
larger-granularity hotplug.
Thus, in this RFC, we create topology devices after possible_cpus[]
creation.
But the question that arises is how to get the topology information
needed for the initialization of possible_cpus[] and its archid.
The current -smp parameters (cores/modules/clusters/dies/sockets/books/
drawers) require the machine to create a corresponding number of
topology instances for SMP systems.
This does not accommodate hybrid topologies. Therefore, we introduce
max* parameters: maxthreads/maxcores/maxmodules/maxdies/maxsockets
(for x86), to predefine the topology framework for the machine. These
parameters also constrain subsequent custom topologies, ensuring the
number of child devices under each parent device does not exceed the
specified max limits.
The actual number of child instances is determined by the user, who may
define either an SMP topology or a hybrid topology.
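To illustrate how a max limit can constrain the children of one parent,
here is a minimal sketch. ToyTopoParent and toy_topo_add_child() are
hypothetical names made up for this note; they are not the CPUSlot code
from this series, and the real check may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical parent node: counts the children plugged in so far. */
typedef struct ToyTopoParent {
    unsigned max_limit;     /* taken from the matching -smp max* value */
    unsigned num_children;  /* children attached so far */
} ToyTopoParent;

/* Try to attach one more child; refuse once the max limit is reached. */
static bool toy_topo_add_child(ToyTopoParent *parent)
{
    assert(parent);

    if (parent->num_children >= parent->max_limit) {
        return false;
    }
    parent->num_children++;
    return true;
}
```

With maxcores=2, a module in this sketch accepts two cores and rejects a
third, whether the cores are of the same type or not.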
Not only can the length of possible_cpus[] continue to be defined via
-smp, but its internal archid can also be set using max parameters. In
the case of x86, the bit width of the sub-topology ID in the APIC ID
will be determined by these max parameters. In fact, actual x86 hardware
uses a similar approach, including hybrid platforms.
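The relation between the max parameters and the APIC ID field widths can
be sketched as below. This is a simplified model under the assumption
that each level gets just enough bits to encode IDs 0..max-1 and that
the fields are packed thread, core, module, die, socket from the low
bits up. MaxTopo, bits_for_count() and toy_apic_id() are hypothetical
names, not helpers from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the -smp max* values (x86 levels only). */
typedef struct MaxTopo {
    unsigned maxthreads;  /* threads per core */
    unsigned maxcores;    /* cores per module */
    unsigned maxmodules;  /* modules per die */
    unsigned maxdies;     /* dies per socket */
} MaxTopo;

/* Bits needed to encode IDs 0..count-1 (0 bits when count == 1). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    count -= 1;
    while (count) {
        bits++;
        count >>= 1;
    }
    return bits;
}

/* Pack the per-level IDs into one ID, using widths derived from max*. */
static uint32_t toy_apic_id(const MaxTopo *m, unsigned socket, unsigned die,
                            unsigned module, unsigned core, unsigned thread)
{
    unsigned core_off = bits_for_count(m->maxthreads);
    unsigned module_off = core_off + bits_for_count(m->maxcores);
    unsigned die_off = module_off + bits_for_count(m->maxmodules);
    unsigned socket_off = die_off + bits_for_count(m->maxdies);

    /* Each ID must fit below its max limit. */
    assert(thread < m->maxthreads && core < m->maxcores);
    assert(module < m->maxmodules && die < m->maxdies);

    return (socket << socket_off) | (die << die_off) |
           (module << module_off) | (core << core_off) | thread;
}
```

With the limits from the example command line (maxthreads=2, maxcores=2,
maxmodules=2, maxdies=1), each of thread, core and module takes 1 bit
and die takes 0 bits, so the CPU at module-id=1, core-id=1, thread-id=0
gets ID 6 in this model. The widths depend only on the max values, not
on how many children the user actually creates.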
Setting SMP max limits for custom topologies is semantically meaningful.
Regardless of how heterogeneous the CPU topology is, there will always
be a corresponding superset in the SMP structure.
2.2. How to Address CPU Dependencies in Machine Initialization
==============================================================
The next question is whether the machine should still initialize the
default CPUs from "-smp cpus=*" when the user wants a custom topology
from the CLI.
In qom-topo v2, the machine creates a symmetric topology tree from -smp
by default, and it's clear that customizing again on top of an existing
topology tree won't work.
Therefore, once the user asks for a custom topology with "-machine
custom-topo=on", a machine that supports custom topology will skip
the default topology creation as well as the default CPU creation.
In the following figure, these are the "(*)" marked
machine_create_topo_tree() and x86_cpu_new() steps.
Figure 2: Original machine initialization process (in QOM-topo v2 [2])
qmp_x_exit_preconfig()
│
├─── qemu_init_board()
│ │
│ └── machine_run_board_init()
│ │
│ ├───(*)machine_create_topo_tree()
│ │
│ └─── machine_class->init(machine)
│ │
│ ├─── x86_cpus_init()
│ │ │
│ │ ├─── mc->possible_cpu_arch_ids(ms)
│ │ │
│ │ └───(*)x86_cpu_new()
│ │
│ └───(*)Other initialization steps
│ with CPU dependencies
│
└─── qemu_create_cli_devices()
However, machine initialization may have some followup steps with CPU
dependencies after the default CPU initialization. If the default CPU
creation is skipped, such CPU-dependent steps will fail.
Therefore, to address these annoying CPU dependencies, and to replace
the default topology tree creation (machine_create_topo_tree() and
x86_cpu_new()) with CPU topology creation from CLI, this series reorders
the machine initialization steps and topology device creation from CLI
for the custom topology case:
Figure 3: New machine initialization process (in this series)
qmp_x_exit_preconfig()
│
├─── qemu_init_board()
│ │
│ ┼──── machine_run_board_init()
│ │ │
│ │ ├───(X)machine_create_topo_tree()
│ │ │
│ │ └─── machine_class->init(machine)
│ │ │
│ │ ├─── x86_cpus_init()
│ │ │ │
│ │ │ ┼─── mc->possible_cpu_arch_ids(ms)
│ │ │ │
│ │ │ └───(X)x86_cpu_new()
│ │ │
│ │ └───(X)Other initialization steps
│ │ with CPU dependencies
│ │
│ ├────(*)qemu_add_cli_devices_early()
│ │ │
│ │ └───(*)Create CPU topology devices
│ │ including CPUs
│ │
│ └────(*)machine_run_board_post_init()
│ │
│ └───(*)machine_class->post_init(machine)
│ │
│ └───(*)Other initialization steps
│ with CPU dependencies
│
└─── qemu_create_cli_devices()
In the above figure, "(*)" indicates the new interfaces/hooks added in
this series:
* (For machines that support custom topology) split the CPU-dependent
initialization steps into machine_class->post_init().
- For example, in the q35 machine, all the logic after x86_cpu_new() is
placed in machine_class->post_init().
* Between machine_class->init() and machine_class->post_init(),
create the CPU topology devices (including CPUs) from the CLI early.
This effectively replaces the default CPU creation (as well as topology
tree creation) in the original initialization process with
qemu_add_cli_devices_early().
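The ordering described in this section can be modelled with a small toy
program. It is only a sketch: ToyMachine, ToyMachineClass and the toy_*
functions are hypothetical stand-ins that record the call order in a
string, and they do not reproduce the real QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyMachine {
    char trace[64];  /* records the order in which the steps ran */
} ToyMachine;

typedef struct ToyMachineClass {
    void (*init)(ToyMachine *m);
    void (*post_init)(ToyMachine *m);  /* optional, may be NULL */
} ToyMachineClass;

/* Append one step name to the trace, guarding against overflow. */
static void toy_trace(ToyMachine *m, const char *step)
{
    assert(strlen(m->trace) + strlen(step) < sizeof(m->trace));
    strcat(m->trace, step);
}

static void toy_init(ToyMachine *m) { toy_trace(m, "init;"); }
static void toy_post_init(ToyMachine *m) { toy_trace(m, "post_init;"); }

/* Stand-in for the early pass that creates CPU topology devices. */
static void toy_add_cli_devices_early(ToyMachine *m)
{
    toy_trace(m, "early-devices;");
}

/* Stand-in for the later pass over -device options. */
static void toy_create_cli_devices(ToyMachine *m)
{
    toy_trace(m, "cli-devices;");
}

/* init, then early CLI devices, then the optional post_init hook. */
static void toy_board_setup(const ToyMachineClass *mc, ToyMachine *m)
{
    mc->init(m);
    toy_add_cli_devices_early(m);
    if (mc->post_init) {
        mc->post_init(m);
    }
    toy_create_cli_devices(m);
}
```

The point of the model is that anything placed in post_init() runs after
the CPUs from the CLI exist, while a machine without a post_init hook
keeps its old single-step init.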
3. Patch Summary
================
Patch 01-03: Create topology device from CLI early.
Patch 04,11: Separate the part following CPU creation from the machine
initialization process into MachineClass.post_init().
Patch 05-08: Implement max parameters in -smp and use max limitations
to initialize possible_cpus[].
Patch 09-10: Add Intel hybrid CPU support.
Patch 12: Allow user to customize topology tree for x86 machines.
4. Reference
============
[1]: [RFC 00/52] Introduce hybrid CPU topology
https://lore.kernel.org/qemu-devel/20230213095035.158240-1-zhao1.liu@linux.intel.com/
[2]: [RFC v2 00/15] qom-topo: Abstract CPU Topology Level to Topology Device
https://lore.kernel.org/qemu-devel/20240919015533.766754-1-zhao1.liu@intel.com/
[3]: [RFC 00/41] qom-topo: Abstract Everything about CPU Topology
https://lore.kernel.org/qemu-devel/20231130144203.2307629-1-zhao1.liu@linux.intel.com/
Thanks and Best Regards,
Zhao
---
Zhao Liu (12):
qdev: Allow qdev_device_add() to add specific category device
qdev: Introduce new device category to cover basic topology device
system/vl: Create CPU topology devices from CLI early
hw/core/machine: Split machine initialization around
qemu_add_cli_devices_early()
hw/core/machine: Introduce custom CPU topology with max limitations
hw/cpu: Constrain CPU topology tree with max_limit
hw/core: Re-implement topology helpers to honor max limitations
hw/i386: Use get_max_topo_by_level() to get topology information
i386: Introduce x86 CPU core abstractions
i386/cpu: Support Intel hybrid CPUID
i386/machine: Split machine initialization after CPU creation into
post_init()
i386: Support custom topology for microvm, pc-i440fx and pc-q35
MAINTAINERS | 1 +
hw/core/machine-smp.c | 10 ++-
hw/core/machine.c | 47 ++++++++++
hw/core/meson.build | 2 +-
hw/cpu/cpu-slot.c | 168 ++++++++++++++++++++++++++++++++++++
hw/cpu/cpu-topology.c | 2 +-
hw/i386/microvm.c | 8 ++
hw/i386/pc_piix.c | 41 +++++----
hw/i386/pc_q35.c | 37 +++++---
hw/i386/x86-common.c | 25 ++++--
hw/i386/x86.c | 20 +++--
hw/net/virtio-net.c | 2 +-
hw/usb/xen-usb.c | 3 +-
include/hw/boards.h | 13 ++-
include/hw/cpu/cpu-slot.h | 12 +++
include/hw/i386/pc.h | 3 +
include/hw/qdev-core.h | 6 ++
include/monitor/qdev.h | 4 +-
qapi/machine.json | 22 ++++-
stubs/machine-stubs.c | 21 +++++
stubs/meson.build | 1 +
system/cpus.c | 2 +-
system/qdev-monitor.c | 13 ++-
system/vl.c | 59 ++++++++-----
target/i386/core.c | 56 ++++++++++++
target/i386/core.h | 53 ++++++++++++
target/i386/cpu.c | 58 +++++++++++++
target/i386/cpu.h | 5 ++
target/i386/meson.build | 1 +
tests/unit/test-smp-parse.c | 4 +-
30 files changed, 618 insertions(+), 81 deletions(-)
create mode 100644 stubs/machine-stubs.c
create mode 100644 target/i386/core.c
create mode 100644 target/i386/core.h
--
2.34.1
* [RFC v2 01/12] qdev: Allow qdev_device_add() to add specific category device
From: Zhao Liu @ 2024-09-19 6:11 UTC
Topology devices need to be created and realized before board
initialization.
Allow qdev_device_add() to specify category to help create topology
devices early.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
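The category check described in the commit message can be sketched in
isolation as follows. ToyCategory, ToyDeviceClass and
toy_category_matches() are hypothetical stand-ins for DeviceCategory,
DeviceClass and the new test in qdev_device_add_from_qdict(); a single
unsigned long replaces the real category bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few DeviceCategory values. */
enum ToyCategory {
    TOY_CATEGORY_MISC,
    TOY_CATEGORY_CPU,
    TOY_CATEGORY_CPU_DEF,
    TOY_CATEGORY_MAX
};

typedef struct ToyDeviceClass {
    unsigned long categories;  /* one bit per ToyCategory */
} ToyDeviceClass;

static bool toy_test_bit(long nr, unsigned long bitmap)
{
    assert(nr >= 0 && nr < TOY_CATEGORY_MAX);
    return (bitmap >> nr) & 1UL;
}

/*
 * Return true if a device of class @dc should be created in this pass.
 * A NULL @category means "no filter", which keeps existing callers
 * unchanged.
 */
static bool toy_category_matches(const ToyDeviceClass *dc,
                                 const long *category)
{
    if (category && !toy_test_bit(*category, dc->categories)) {
        return false;
    }
    return true;
}
```

Passing a pointer rather than a plain value lets NULL stand for "any
category", which is why the existing callers in the diff below simply
pass NULL.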
hw/net/virtio-net.c | 2 +-
hw/usb/xen-usb.c | 3 ++-
include/monitor/qdev.h | 4 ++--
system/qdev-monitor.c | 12 ++++++++----
system/vl.c | 4 ++--
5 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index fb84d142ee29..0d92e09e9076 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -935,7 +935,7 @@ static void failover_add_primary(VirtIONet *n, Error **errp)
return;
}
- dev = qdev_device_add_from_qdict(n->primary_opts,
+ dev = qdev_device_add_from_qdict(n->primary_opts, NULL,
n->primary_opts_from_json,
&err);
if (err) {
diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
index 13901625c0c8..e4168b1fec7e 100644
--- a/hw/usb/xen-usb.c
+++ b/hw/usb/xen-usb.c
@@ -766,7 +766,8 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
qdict_put_str(qdict, "hostport", portname);
opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict,
&error_abort);
- usbif->ports[port - 1].dev = USB_DEVICE(qdev_device_add(opts, &local_err));
+ usbif->ports[port - 1].dev = USB_DEVICE(
+ qdev_device_add(opts, NULL, &local_err));
if (!usbif->ports[port - 1].dev) {
qobject_unref(qdict);
xen_pv_printf(&usbif->xendev, 0,
diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
index 1d57bf657794..f5fd6e6c1ffc 100644
--- a/include/monitor/qdev.h
+++ b/include/monitor/qdev.h
@@ -8,8 +8,8 @@ void hmp_info_qdm(Monitor *mon, const QDict *qdict);
void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp);
int qdev_device_help(QemuOpts *opts);
-DeviceState *qdev_device_add(QemuOpts *opts, Error **errp);
-DeviceState *qdev_device_add_from_qdict(const QDict *opts,
+DeviceState *qdev_device_add(QemuOpts *opts, long *category, Error **errp);
+DeviceState *qdev_device_add_from_qdict(const QDict *opts, long *category,
bool from_json, Error **errp);
/**
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 457dfd05115e..fe120353fedc 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -632,7 +632,7 @@ const char *qdev_set_id(DeviceState *dev, char *id, Error **errp)
return prop->name;
}
-DeviceState *qdev_device_add_from_qdict(const QDict *opts,
+DeviceState *qdev_device_add_from_qdict(const QDict *opts, long *category,
bool from_json, Error **errp)
{
ERRP_GUARD();
@@ -655,6 +655,10 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
return NULL;
}
+ if (category && !test_bit(*category, dc->categories)) {
+ return NULL;
+ }
+
/* find bus */
path = qdict_get_try_str(opts, "bus");
if (path != NULL) {
@@ -767,12 +771,12 @@ err_del_dev:
}
/* Takes ownership of @opts on success */
-DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
+DeviceState *qdev_device_add(QemuOpts *opts, long *category, Error **errp)
{
QDict *qdict = qemu_opts_to_qdict(opts, NULL);
DeviceState *ret;
- ret = qdev_device_add_from_qdict(qdict, false, errp);
+ ret = qdev_device_add_from_qdict(qdict, category, false, errp);
if (ret) {
qemu_opts_del(opts);
}
@@ -897,7 +901,7 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
qemu_opts_del(opts);
return;
}
- dev = qdev_device_add(opts, errp);
+ dev = qdev_device_add(opts, NULL, errp);
if (!dev) {
/*
* Drain all pending RCU callbacks. This is done because
diff --git a/system/vl.c b/system/vl.c
index 193e7049ccbe..c40364e2f091 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1212,7 +1212,7 @@ static int device_init_func(void *opaque, QemuOpts *opts, Error **errp)
{
DeviceState *dev;
- dev = qdev_device_add(opts, errp);
+ dev = qdev_device_add(opts, NULL, errp);
if (!dev && *errp) {
error_report_err(*errp);
return -1;
@@ -2665,7 +2665,7 @@ static void qemu_create_cli_devices(void)
* from the start, so call qdev_device_add_from_qdict() directly for
* now.
*/
- dev = qdev_device_add_from_qdict(opt->opts, true, &error_fatal);
+ dev = qdev_device_add_from_qdict(opt->opts, NULL, true, &error_fatal);
object_unref(OBJECT(dev));
loc_pop(&opt->loc);
}
--
2.34.1
* [RFC v2 02/12] qdev: Introduce new device category to cover basic topology device
From: Zhao Liu @ 2024-09-19 6:11 UTC
Topology devices are used to define CPUs and need to be created and
realized earlier than current qemu_create_cli_devices().
Use this new category to identify such special devices, which allows
them to be created earlier in a subsequent change.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/cpu/cpu-topology.c | 2 +-
include/hw/qdev-core.h | 1 +
system/qdev-monitor.c | 1 +
3 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/cpu/cpu-topology.c b/hw/cpu/cpu-topology.c
index 3e8982ff7e6c..ce3da844a7d8 100644
--- a/hw/cpu/cpu-topology.c
+++ b/hw/cpu/cpu-topology.c
@@ -164,7 +164,7 @@ static void cpu_topo_class_init(ObjectClass *oc, void *data)
DeviceClass *dc = DEVICE_CLASS(oc);
CPUTopoClass *tc = CPU_TOPO_CLASS(oc);
- set_bit(DEVICE_CATEGORY_CPU, dc->categories);
+ set_bit(DEVICE_CATEGORY_CPU_DEF, dc->categories);
dc->realize = cpu_topo_realize;
/*
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 77223b28c788..ddcaa329e3ec 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -86,6 +86,7 @@ typedef enum DeviceCategory {
DEVICE_CATEGORY_SOUND,
DEVICE_CATEGORY_MISC,
DEVICE_CATEGORY_CPU,
+ DEVICE_CATEGORY_CPU_DEF,
DEVICE_CATEGORY_WATCHDOG,
DEVICE_CATEGORY_MAX
} DeviceCategory;
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index fe120353fedc..07863d4e650a 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -179,6 +179,7 @@ static void qdev_print_devinfos(bool show_no_user)
[DEVICE_CATEGORY_SOUND] = "Sound",
[DEVICE_CATEGORY_MISC] = "Misc",
[DEVICE_CATEGORY_CPU] = "CPU",
+ [DEVICE_CATEGORY_CPU_DEF] = "CPU Definition",
[DEVICE_CATEGORY_WATCHDOG]= "Watchdog",
[DEVICE_CATEGORY_MAX] = "Uncategorized",
};
--
2.34.1
* [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early
From: Zhao Liu @ 2024-09-19 6:11 UTC
Custom topology will allow the user to build the CPU topology entirely
from the CLI, and this replaces the machine's default CPU creation
process (*_init_cpus() in MachineClass.init()).
For the machine's initialization, there may be CPU dependencies in the
remaining initialization after the CPU creation.
To address such dependencies, create the CPU topology device (including
CPU devices) from the CLI earlier, so that the latter part of machine
initialization can be separated after qemu_add_cli_devices_early().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
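The two-pass idea (an early pass restricted to one category, then a
later pass over everything else) can be sketched as below. This is a toy
model only: ToyDeviceOpt and toy_add_devices() are hypothetical, the
category numbers are arbitrary, and the "created" flag is bookkeeping
assumed for this note to keep the later pass from revisiting early
devices; it is not taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one -device option. */
typedef struct ToyDeviceOpt {
    const char *driver;
    long category;  /* arbitrary toy category number */
    bool created;   /* assumed bookkeeping, not from the patch */
} ToyDeviceOpt;

/*
 * Walk all options and "create" those matching @category (NULL means
 * every remaining option). Returns how many were created in this pass.
 */
static size_t toy_add_devices(ToyDeviceOpt *opts, size_t n,
                              const long *category)
{
    size_t count = 0;

    assert(opts || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (opts[i].created) {
            continue;  /* already handled by an earlier pass */
        }
        if (category && opts[i].category != *category) {
            continue;  /* left for a later pass */
        }
        opts[i].created = true;
        count++;
    }
    return count;
}
```

Calling toy_add_devices() first with a pointer to the topology category
and later with NULL gives the early/late split, with the relative order
of options on the command line preserved within each pass.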
system/vl.c | 55 +++++++++++++++++++++++++++++++++++------------------
1 file changed, 36 insertions(+), 19 deletions(-)
diff --git a/system/vl.c b/system/vl.c
index c40364e2f091..8540454aa1c2 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1211,8 +1211,9 @@ static int device_help_func(void *opaque, QemuOpts *opts, Error **errp)
static int device_init_func(void *opaque, QemuOpts *opts, Error **errp)
{
DeviceState *dev;
+ long *category = opaque;
- dev = qdev_device_add(opts, NULL, errp);
+ dev = qdev_device_add(opts, category, errp);
if (!dev && *errp) {
error_report_err(*errp);
return -1;
@@ -2623,6 +2624,36 @@ static void qemu_init_displays(void)
}
}
+static void qemu_add_devices(long *category)
+{
+ DeviceOption *opt;
+
+ qemu_opts_foreach(qemu_find_opts("device"),
+ device_init_func, category, &error_fatal);
+ QTAILQ_FOREACH(opt, &device_opts, next) {
+ DeviceState *dev;
+ loc_push_restore(&opt->loc);
+ /*
+ * TODO Eventually we should call qmp_device_add() here to make sure it
+ * behaves the same, but QMP still has to accept incorrectly typed
+ * options until libvirt is fixed and we want to be strict on the CLI
+ * from the start, so call qdev_device_add_from_qdict() directly for
+ * now.
+ */
+ dev = qdev_device_add_from_qdict(opt->opts, category,
+ true, &error_fatal);
+ object_unref(OBJECT(dev));
+ loc_pop(&opt->loc);
+ }
+}
+
+static void qemu_add_cli_devices_early(void)
+{
+ long category = DEVICE_CATEGORY_CPU_DEF;
+
+ qemu_add_devices(&category);
+}
+
static void qemu_init_board(void)
{
/* process plugin before CPUs are created, but once -smp has been parsed */
@@ -2631,6 +2662,9 @@ static void qemu_init_board(void)
/* From here on we enter MACHINE_PHASE_INITIALIZED. */
machine_run_board_init(current_machine, mem_path, &error_fatal);
+ /* Create CPU topology device if any. */
+ qemu_add_cli_devices_early();
+
drive_check_orphaned();
realtime_init();
@@ -2638,8 +2672,6 @@ static void qemu_init_board(void)
static void qemu_create_cli_devices(void)
{
- DeviceOption *opt;
-
soundhw_init();
qemu_opts_foreach(qemu_find_opts("fw_cfg"),
@@ -2653,22 +2685,7 @@ static void qemu_create_cli_devices(void)
/* init generic devices */
rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
- qemu_opts_foreach(qemu_find_opts("device"),
- device_init_func, NULL, &error_fatal);
- QTAILQ_FOREACH(opt, &device_opts, next) {
- DeviceState *dev;
- loc_push_restore(&opt->loc);
- /*
- * TODO Eventually we should call qmp_device_add() here to make sure it
- * behaves the same, but QMP still has to accept incorrectly typed
- * options until libvirt is fixed and we want to be strict on the CLI
- * from the start, so call qdev_device_add_from_qdict() directly for
- * now.
- */
- dev = qdev_device_add_from_qdict(opt->opts, NULL, true, &error_fatal);
- object_unref(OBJECT(dev));
- loc_pop(&opt->loc);
- }
+ qemu_add_devices(NULL);
rom_reset_order_override();
}
--
2.34.1
* [RFC v2 04/12] hw/core/machine: Split machine initialization around qemu_add_cli_devices_early()
From: Zhao Liu @ 2024-09-19 6:11 UTC
Split machine initialization and machine_run_board_init() into two parts
around qemu_add_cli_devices_early(), allowing initialization to continue
after the CPU creation from the CLI.
This enables the machine to place the initialization steps with CPU
dependencies in post_init().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/core/machine.c | 10 ++++++++++
include/hw/boards.h | 2 ++
system/vl.c | 4 +++-
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 076bd365197b..7b4ac5ac52b2 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1645,6 +1645,16 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
accel_init_interfaces(ACCEL_GET_CLASS(machine->accelerator));
machine_class->init(machine);
+}
+
+void machine_run_board_post_init(MachineState *machine, Error **errp)
+{
+ MachineClass *machine_class = MACHINE_GET_CLASS(machine);
+
+ if (machine_class->post_init) {
+ machine_class->post_init(machine);
+ }
+
phase_advance(PHASE_MACHINE_INITIALIZED);
}
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a49677466ef6..9f706223e848 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -33,6 +33,7 @@ const char *machine_class_default_cpu_type(MachineClass *mc);
void machine_add_audiodev_property(MachineClass *mc);
void machine_run_board_init(MachineState *machine, const char *mem_path, Error **errp);
+void machine_run_board_post_init(MachineState *machine, Error **errp);
bool machine_usb(MachineState *machine);
int machine_phandle_start(MachineState *machine);
bool machine_dump_guest_core(MachineState *machine);
@@ -271,6 +272,7 @@ struct MachineClass {
const char *deprecation_reason;
void (*init)(MachineState *state);
+ void (*post_init)(MachineState *state);
void (*reset)(MachineState *state, ShutdownCause reason);
void (*wakeup)(MachineState *state);
int (*kvm_type)(MachineState *machine, const char *arg);
diff --git a/system/vl.c b/system/vl.c
index 8540454aa1c2..00370f7a52aa 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2659,12 +2659,14 @@ static void qemu_init_board(void)
/* process plugin before CPUs are created, but once -smp has been parsed */
qemu_plugin_load_list(&plugin_list, &error_fatal);
- /* From here on we enter MACHINE_PHASE_INITIALIZED. */
machine_run_board_init(current_machine, mem_path, &error_fatal);
/* Create CPU topology device if any. */
qemu_add_cli_devices_early();
+ /* From here on we enter MACHINE_PHASE_INITIALIZED. */
+ machine_run_board_post_init(current_machine, &error_fatal);
+
drive_check_orphaned();
realtime_init();
--
2.34.1
* [RFC v2 05/12] hw/core/machine: Introduce custom CPU topology with max limitations
From: Zhao Liu @ 2024-09-19 6:11 UTC
Custom topology allows the user to create the CPU topology entirely via
-device from the CLI.
Once custom topology is enabled, the machine will skip the default CPU
creation and expect the user's CPU topology tree to be built from the
CLI.
With custom topology, any CPU topology, whether symmetric or hybrid
(aka, heterogeneous), can be created naturally.
However, custom topology also needs to be restricted because
possible_cpus[] requires some preliminary topology information for
initialization, which is the max limitation (the new max parameters in
-smp). Custom topology will be subject to this max limitation.
Max limitations are necessary because creating custom topology before
initializing possible_cpus[] would compromise future hotplug scalability.
Max limitations are placed in -smp, even though custom topology can be
defined as hybrid. From an implementation perspective, any hybrid
topology can be considered a subset of a complete SMP structure.
Therefore, semantically, using max limitations to constrain hybrid
topology is consistent.
Introduce custom CPU topology related properties in MachineClass. At the
same time, add and parse max parameters from -smp, and store the max
limitations in CPUSlot.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
MAINTAINERS | 1 +
hw/core/machine-smp.c | 2 +
hw/core/machine.c | 33 +++++++++++
hw/core/meson.build | 2 +-
hw/cpu/cpu-slot.c | 118 ++++++++++++++++++++++++++++++++++++++
include/hw/boards.h | 2 +
include/hw/cpu/cpu-slot.h | 9 +++
qapi/machine.json | 22 ++++++-
stubs/machine-stubs.c | 21 +++++++
stubs/meson.build | 1 +
10 files changed, 209 insertions(+), 2 deletions(-)
create mode 100644 stubs/machine-stubs.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 4608c3c6db8c..5ea739f12857 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1901,6 +1901,7 @@ F: include/hw/cpu/die.h
F: include/hw/cpu/module.h
F: include/hw/cpu/socket.h
F: include/sysemu/numa.h
+F: stubs/machine-stubs.c
F: tests/functional/test_cpu_queries.py
F: tests/functional/test_empty_cpu_model.py
F: tests/unit/test-smp-parse.c
diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 9a281946762f..d3be4352267d 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -259,6 +259,8 @@ void machine_parse_smp_config(MachineState *ms,
mc->name, mc->max_cpus);
return;
}
+
+ machine_parse_custom_topo_config(ms, config, errp);
}
static bool machine_check_topo_support(MachineState *ms,
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 7b4ac5ac52b2..dedabd75c825 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -966,6 +966,30 @@ static void machine_set_smp_cache(Object *obj, Visitor *v, const char *name,
qapi_free_SmpCachePropertiesList(caches);
}
+static bool machine_get_custom_topo(Object *obj, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ if (!ms->topo) {
+ error_setg(errp, "machine doesn't support custom topology");
+ return false;
+ }
+
+ return ms->topo->custom_topo_enabled;
+}
+
+static void machine_set_custom_topo(Object *obj, bool value, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ if (!ms->topo) {
+ error_setg(errp, "machine doesn't support custom topology");
+ return;
+ }
+
+ ms->topo->custom_topo_enabled = value;
+}
+
static void machine_get_boot(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
@@ -1240,6 +1264,15 @@ static void machine_initfn(Object *obj)
}
ms->topo = NULL;
+ if (mc->smp_props.topo_tree_supported &&
+ mc->smp_props.custom_topo_supported) {
+ object_property_add_bool(obj, "custom-topo",
+ machine_get_custom_topo,
+ machine_set_custom_topo);
+ object_property_set_description(obj, "custom-topo",
+ "Set on/off to enable/disable "
+ "user custom CPU topology tree");
+ }
machine_copy_boot_config(ms, &(BootConfiguration){ 0 });
}
diff --git a/hw/core/meson.build b/hw/core/meson.build
index a3d9bab9f42a..f70d6104a00d 100644
--- a/hw/core/meson.build
+++ b/hw/core/meson.build
@@ -13,7 +13,6 @@ hwcore_ss.add(files(
))
common_ss.add(files('cpu-common.c'))
-common_ss.add(files('machine-smp.c'))
system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
system_ss.add(when: 'CONFIG_GENERIC_LOADER', if_true: files('generic-loader.c'))
system_ss.add(when: 'CONFIG_GUEST_LOADER', if_true: files('guest-loader.c'))
@@ -33,6 +32,7 @@ system_ss.add(files(
'loader.c',
'machine-hmp-cmds.c',
'machine-qmp-cmds.c',
+ 'machine-smp.c',
'machine.c',
'nmi.c',
'null-machine.c',
diff --git a/hw/cpu/cpu-slot.c b/hw/cpu/cpu-slot.c
index 1cc3b32ed675..2d16a2729501 100644
--- a/hw/cpu/cpu-slot.c
+++ b/hw/cpu/cpu-slot.c
@@ -165,6 +165,11 @@ void machine_plug_cpu_slot(MachineState *ms)
set_bit(CPU_TOPOLOGY_LEVEL_DIE, slot->supported_levels);
}
+ /* Initialize max_limit to 1, as members of CpuTopology. */
+ for (int i = 0; i < CPU_TOPOLOGY_LEVEL__MAX; i++) {
+ slot->stat.entries[i].max_limit = 1;
+ }
+
ms->topo = slot;
object_property_add_child(container_get(OBJECT(ms), "/peripheral"),
"cpu-slot", OBJECT(ms->topo));
@@ -295,6 +300,11 @@ bool machine_create_topo_tree(MachineState *ms, Error **errp)
return false;
}
+ /* User will customize topology tree. */
+ if (slot->custom_topo_enabled) {
+ return true;
+ }
+
/*
* Don't support full topology tree.
* Just use slot to collect topology device.
@@ -325,3 +335,111 @@ bool machine_create_topo_tree(MachineState *ms, Error **errp)
return true;
}
+
+int get_max_topo_by_level(const MachineState *ms, CpuTopologyLevel level)
+{
+ if (!ms->topo || !ms->topo->custom_topo_enabled) {
+ return get_smp_info_by_level(&ms->smp, level);
+ }
+ return ms->topo->stat.entries[level].max_limit;
+}
+
+bool machine_parse_custom_topo_config(MachineState *ms,
+ const SMPConfiguration *config,
+ Error **errp)
+{
+ MachineClass *mc = MACHINE_GET_CLASS(ms);
+ CPUSlot *slot = ms->topo;
+ bool is_valid;
+ int maxcpus;
+
+ if (!slot) {
+ return true;
+ }
+
+ is_valid = config->has_maxsockets && config->maxsockets;
+ if (mc->smp_props.custom_topo_supported) {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
+ is_valid ? config->maxsockets : ms->smp.sockets;
+ } else if (is_valid) {
+ error_setg(errp, "maxsockets > 0 not supported "
+ "by this machine's CPU topology");
+ return false;
+ } else {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
+ ms->smp.sockets;
+ }
+
+ is_valid = config->has_maxdies && config->maxdies;
+ if (mc->smp_props.custom_topo_supported &&
+ mc->smp_props.dies_supported) {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_DIE].max_limit =
+ is_valid ? config->maxdies : ms->smp.dies;
+ } else if (is_valid) {
+ error_setg(errp, "maxdies > 0 not supported "
+ "by this machine's CPU topology");
+ return false;
+ } else {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_DIE].max_limit =
+ ms->smp.dies;
+ }
+
+ is_valid = config->has_maxmodules && config->maxmodules;
+ if (mc->smp_props.custom_topo_supported &&
+ mc->smp_props.modules_supported) {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_MODULE].max_limit =
+ is_valid ? config->maxmodules : ms->smp.modules;
+ } else if (is_valid) {
+ error_setg(errp, "maxmodules > 0 not supported "
+ "by this machine's CPU topology");
+ return false;
+ } else {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_MODULE].max_limit =
+ ms->smp.modules;
+ }
+
+ is_valid = config->has_maxcores && config->maxcores;
+ if (mc->smp_props.custom_topo_supported) {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_CORE].max_limit =
+ is_valid ? config->maxcores : ms->smp.cores;
+ } else if (is_valid) {
+ error_setg(errp, "maxcores > 0 not supported "
+ "by this machine's CPU topology");
+ return false;
+ } else {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_CORE].max_limit =
+ ms->smp.cores;
+ }
+
+ is_valid = config->has_maxthreads && config->maxthreads;
+ if (mc->smp_props.custom_topo_supported) {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_THREAD].max_limit =
+ is_valid ? config->maxthreads : ms->smp.threads;
+ } else if (is_valid) {
+ error_setg(errp, "maxthreads > 0 not supported "
+ "by this machine's CPU topology");
+ return false;
+ } else {
+ slot->stat.entries[CPU_TOPOLOGY_LEVEL_THREAD].max_limit =
+ ms->smp.threads;
+ }
+
+ maxcpus = 1;
+ /* maxcpus is the product of max_limit across all topology levels. */
+ for (int i = 0; i < CPU_TOPOLOGY_LEVEL__MAX; i++) {
+ maxcpus *= slot->stat.entries[i].max_limit;
+ }
+
+ if (!config->has_maxcpus) {
+ ms->smp.max_cpus = maxcpus;
+ } else {
+ if (maxcpus != ms->smp.max_cpus) {
+ error_setg(errp, "maxcpus (%d) should be equal to "
+ "the product of the remaining max parameters (%d)",
+ ms->smp.max_cpus, maxcpus);
+ return false;
+ }
+ }
+
+ return true;
+}
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 9f706223e848..6ef4ea322590 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -157,6 +157,7 @@ typedef struct {
* @topo_tree_supported - whether QOM topology tree is supported by the
* machine
* @arch_id_topo_level - topology granularity for possible_cpus[]
+ * @custom_topo_supported - whether custom topology tree is supported
*/
typedef struct {
bool prefer_sockets;
@@ -169,6 +170,7 @@ typedef struct {
bool cache_supported[CACHE_LEVEL_AND_TYPE__MAX];
bool topo_tree_supported;
CpuTopologyLevel arch_id_topo_level;
+ bool custom_topo_supported;
} SMPCompatProps;
/**
diff --git a/include/hw/cpu/cpu-slot.h b/include/hw/cpu/cpu-slot.h
index 1838e8c0c3f9..8d7e35aa1851 100644
--- a/include/hw/cpu/cpu-slot.h
+++ b/include/hw/cpu/cpu-slot.h
@@ -24,10 +24,13 @@
* that are currently inserted in CPU slot
* @max_instances: Maximum number of topological instances at the same level
* under the parent topological container
+ * @max_limit: Maximum limitation of topological instances at the same level
+ * under the parent topological container
*/
typedef struct CPUTopoStatEntry {
int total_instances;
int max_instances;
+ int max_limit;
} CPUTopoStatEntry;
/**
@@ -54,6 +57,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(CPUSlot, CPU_SLOT)
* @stat: Topological statistics for topology tree.
* @bus: CPU bus to add the children topology device.
* @supported_levels: Supported topology levels for topology tree.
+ * @custom_topo_enabled: Whether the user creates a custom topology tree.
* @listener: Hooks to listen realize() and unrealize() of topology
* device.
*/
@@ -65,6 +69,7 @@ struct CPUSlot {
CPUBusState bus;
CPUTopoStat stat;
DECLARE_BITMAP(supported_levels, CPU_TOPOLOGY_LEVEL__MAX);
+ bool custom_topo_enabled;
DeviceListener listener;
};
@@ -75,5 +80,9 @@ struct CPUSlot {
void machine_plug_cpu_slot(MachineState *ms);
bool machine_create_topo_tree(MachineState *ms, Error **errp);
+int get_max_topo_by_level(const MachineState *ms, CpuTopologyLevel level);
+bool machine_parse_custom_topo_config(MachineState *ms,
+ const SMPConfiguration *config,
+ Error **errp);
#endif /* CPU_SLOT_H */
diff --git a/qapi/machine.json b/qapi/machine.json
index a6b8795b09ed..2d5c6e4becd1 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1695,6 +1695,21 @@
#
# @threads: number of threads per core
#
+# @maxsockets: maximum number of sockets allowed to be created per
+# parent container in custom CPU topology tree (since 10.0)
+#
+# @maxdies: maximum number of dies allowed to be created per parent
+# container in custom CPU topology tree (since 10.0)
+#
+# @maxmodules: maximum number of modules allowed to be created per
+# parent container in custom CPU topology tree (since 10.0)
+#
+# @maxcores: maximum number of cores allowed to be created per parent
+# container in custom CPU topology tree (since 10.0)
+#
+# @maxthreads: maximum number of threads allowed to be created per
+# parent container in custom CPU topology tree (since 10.0)
+#
# Since: 6.1
##
{ 'struct': 'SMPConfiguration', 'data': {
@@ -1707,7 +1722,12 @@
'*modules': 'int',
'*cores': 'int',
'*threads': 'int',
- '*maxcpus': 'int' } }
+ '*maxcpus': 'int',
+ '*maxsockets': 'int',
+ '*maxdies': 'int',
+ '*maxmodules': 'int',
+ '*maxcores': 'int',
+ '*maxthreads': 'int' } }
##
# @x-query-irq:
diff --git a/stubs/machine-stubs.c b/stubs/machine-stubs.c
new file mode 100644
index 000000000000..e592504fef6b
--- /dev/null
+++ b/stubs/machine-stubs.c
@@ -0,0 +1,21 @@
+/*
+ * Machine stubs
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Author: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/boards.h"
+
+bool machine_parse_custom_topo_config(MachineState *ms,
+ const SMPConfiguration *config,
+ Error **errp)
+{
+ return true;
+}
diff --git a/stubs/meson.build b/stubs/meson.build
index 772a3e817df2..406a7efc5bcb 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -66,6 +66,7 @@ if have_system
stub_ss.add(files('dump.c'))
stub_ss.add(files('cmos.c'))
stub_ss.add(files('fw_cfg.c'))
+ stub_ss.add(files('machine-stubs.c'))
stub_ss.add(files('target-get-monitor-def.c'))
stub_ss.add(files('target-monitor-defs.c'))
stub_ss.add(files('win32-kbd-hook.c'))
--
2.34.1
* [RFC v2 06/12] hw/cpu: Constrain CPU topology tree with max_limit
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
Apply max_limit to the CPU topology tree and prevent the number of
topology devices from exceeding the max limitation configured by the user.
Additionally, ensure that CPUs created from the CLI via custom topology
meet at least the requirements of smp.cpus. This guarantees that custom
topology will always have CPUs.
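
The intended behaviour can be summarised with the toy model below. This
is a sketch under simplifying assumptions, not the QEMU implementation:
ToyBus, toy_bus_plug(), toy_bus_unplug() and toy_validate() are
hypothetical names that only mimic the realize/unrealize listener and
the final CPU count check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a parent bus in the topology tree. */
typedef struct {
    int num_children;
    bool full;
} ToyBus;

/* Plug one child; refuse if the bus has already been marked full. */
static bool toy_bus_plug(ToyBus *bus, int max_limit)
{
    if (bus->full) {
        return false;
    }
    bus->num_children++;
    /* Reaching the per-parent limit closes the bus for further children. */
    if (bus->num_children == max_limit) {
        bus->full = true;
    }
    return true;
}

/* Unplug one child; the bus can accept children again. */
static void toy_bus_unplug(ToyBus *bus)
{
    bus->num_children--;
    bus->full = false;
}

/* The custom tree must provide at least smp.cpus thread-level devices. */
static bool toy_validate(int total_threads, int smp_cpus)
{
    return total_threads >= smp_cpus;
}
```

In this model a parent with a limit of 2 accepts two children, rejects a
third, and accepts one again after an unplug.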
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/core/machine.c | 4 ++++
hw/cpu/cpu-slot.c | 32 ++++++++++++++++++++++++++++++++
include/hw/cpu/cpu-slot.h | 1 +
include/hw/qdev-core.h | 5 +++++
4 files changed, 42 insertions(+)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index dedabd75c825..54fca9eb7265 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1684,6 +1684,10 @@ void machine_run_board_post_init(MachineState *machine, Error **errp)
{
MachineClass *machine_class = MACHINE_GET_CLASS(machine);
+ if (!machine_validate_topo_tree(machine, errp)) {
+ return;
+ }
+
if (machine_class->post_init) {
machine_class->post_init(machine);
}
diff --git a/hw/cpu/cpu-slot.c b/hw/cpu/cpu-slot.c
index 2d16a2729501..f2b9c412926f 100644
--- a/hw/cpu/cpu-slot.c
+++ b/hw/cpu/cpu-slot.c
@@ -47,6 +47,7 @@ static void cpu_slot_device_realize(DeviceListener *listener,
{
CPUSlot *slot = container_of(listener, CPUSlot, listener);
CPUTopoState *topo;
+ int max_children;
if (!object_dynamic_cast(OBJECT(dev), TYPE_CPU_TOPO)) {
return;
@@ -54,6 +55,13 @@ static void cpu_slot_device_realize(DeviceListener *listener,
topo = CPU_TOPO(dev);
cpu_slot_add_topo_info(slot, topo);
+
+ if (dev->parent_bus) {
+ max_children = slot->stat.entries[GET_CPU_TOPO_LEVEL(topo)].max_limit;
+ if (dev->parent_bus->num_children == max_children) {
+ qbus_mark_full(dev->parent_bus);
+ }
+ }
}
static void cpu_slot_del_topo_info(CPUSlot *slot, CPUTopoState *topo)
@@ -79,6 +87,10 @@ static void cpu_slot_device_unrealize(DeviceListener *listener,
topo = CPU_TOPO(dev);
cpu_slot_del_topo_info(slot, topo);
+
+ if (dev->parent_bus) {
+ qbus_mask_full(dev->parent_bus);
+ }
}
DeviceListener cpu_slot_device_listener = {
@@ -443,3 +455,23 @@ bool machine_parse_custom_topo_config(MachineState *ms,
return true;
}
+
+bool machine_validate_topo_tree(MachineState *ms, Error **errp)
+{
+ int cpus;
+
+ if (!ms->topo || !ms->topo->custom_topo_enabled) {
+ return true;
+ }
+
+ cpus = ms->topo->stat.entries[CPU_TOPOLOGY_LEVEL_THREAD].total_instances;
+ if (cpus < ms->smp.cpus) {
+ error_setg(errp, "machine requires at least %d online CPUs, "
+ "but only %d CPUs are present",
+ ms->smp.cpus, cpus);
+ return false;
+ }
+
+ /* TODO: Add checks for other levels to honor more -smp parameters. */
+ return true;
+}
diff --git a/include/hw/cpu/cpu-slot.h b/include/hw/cpu/cpu-slot.h
index 8d7e35aa1851..f56a0b08dca4 100644
--- a/include/hw/cpu/cpu-slot.h
+++ b/include/hw/cpu/cpu-slot.h
@@ -84,5 +84,6 @@ int get_max_topo_by_level(const MachineState *ms, CpuTopologyLevel level);
bool machine_parse_custom_topo_config(MachineState *ms,
const SMPConfiguration *config,
Error **errp);
+bool machine_validate_topo_tree(MachineState *ms, Error **errp);
#endif /* CPU_SLOT_H */
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index ddcaa329e3ec..3f2117e08774 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -1063,6 +1063,11 @@ static inline void qbus_mark_full(BusState *bus)
bus->full = true;
}
+static inline void qbus_mask_full(BusState *bus)
+{
+ bus->full = false;
+}
+
void device_listener_register(DeviceListener *listener);
void device_listener_unregister(DeviceListener *listener);
--
2.34.1
* [RFC v2 07/12] hw/core: Re-implement topology helpers to honor max limitations
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
In the custom topology case, valid and reliable topology information
must be obtained from the topology max limitations.
Therefore, re-implement machine_topo_get_cores_per_socket() and
machine_topo_get_threads_per_socket() to consider the custom topology
case. And further, use the wrapped helper to set CPUState.nr_threads/
nr_cores, avoiding topology mismatches in custom topology scenarios.
Additionally, since test-smp-parse needs more stubs to compile with
cpu-slot.c, keep the old helpers for test-smp-parse's use for now. The
old helpers will be cleaned up when full compilation support is
added later on.
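
As a rough illustration of what the re-implemented helpers compute
(a standalone sketch; the enum and function names here are hypothetical
and the level ordering is an assumption that mirrors innermost to
outermost), cores per socket is the product of the per-level limits
from the core level up to, but not including, the socket level:

```c
#include <assert.h>

/* Hypothetical level ordering, from innermost to outermost. */
enum {
    LVL_THREAD, LVL_CORE, LVL_MODULE, LVL_CLUSTER, LVL_DIE, LVL_SOCKET,
    LVL_MAX
};

/* Multiply the limits of every level between core and socket. */
static unsigned int cores_per_socket(const int limit[LVL_MAX])
{
    unsigned int cores = 1;

    for (int i = LVL_CORE; i < LVL_SOCKET; i++) {
        cores *= limit[i];
    }
    return cores;
}

/* Threads per socket adds the thread level on top of the core count. */
static unsigned int threads_per_socket(const int limit[LVL_MAX])
{
    return limit[LVL_THREAD] * cores_per_socket(limit);
}
```

With 2 threads, 2 cores and 2 modules as limits and all other levels
at 1, this gives 4 cores and 8 threads per socket.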
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/core/machine-smp.c | 8 +++++---
hw/cpu/cpu-slot.c | 18 ++++++++++++++++++
include/hw/boards.h | 9 +++++++--
include/hw/cpu/cpu-slot.h | 2 ++
system/cpus.c | 2 +-
tests/unit/test-smp-parse.c | 4 ++--
6 files changed, 35 insertions(+), 8 deletions(-)
diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index d3be4352267d..2965b042fd92 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -376,14 +376,16 @@ bool machine_parse_smp_cache(MachineState *ms,
return true;
}
-unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)
+unsigned int machine_topo_get_cores_per_socket_old(const MachineState *ms)
{
+ assert(!ms->topo);
return ms->smp.cores * ms->smp.modules * ms->smp.clusters * ms->smp.dies;
}
-unsigned int machine_topo_get_threads_per_socket(const MachineState *ms)
+unsigned int machine_topo_get_threads_per_socket_old(const MachineState *ms)
{
- return ms->smp.threads * machine_topo_get_cores_per_socket(ms);
+ assert(!ms->topo);
+ return ms->smp.threads * machine_topo_get_cores_per_socket_old(ms);
}
CpuTopologyLevel machine_get_cache_topo_level(const MachineState *ms,
diff --git a/hw/cpu/cpu-slot.c b/hw/cpu/cpu-slot.c
index f2b9c412926f..8c0d55e835e2 100644
--- a/hw/cpu/cpu-slot.c
+++ b/hw/cpu/cpu-slot.c
@@ -204,6 +204,8 @@ static int get_smp_info_by_level(const CpuTopology *smp_info,
return smp_info->cores;
case CPU_TOPOLOGY_LEVEL_MODULE:
return smp_info->modules;
+ case CPU_TOPOLOGY_LEVEL_CLUSTER:
+ return smp_info->clusters;
case CPU_TOPOLOGY_LEVEL_DIE:
return smp_info->dies;
case CPU_TOPOLOGY_LEVEL_SOCKET:
@@ -356,6 +358,22 @@ int get_max_topo_by_level(const MachineState *ms, CpuTopologyLevel level)
return ms->topo->stat.entries[level].max_limit;
}
+unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)
+{
+ int cores = 1, i;
+
+ for (i = CPU_TOPOLOGY_LEVEL_CORE; i < CPU_TOPOLOGY_LEVEL_SOCKET; i++) {
+ cores *= get_max_topo_by_level(ms, i);
+ }
+ return cores;
+}
+
+unsigned int machine_topo_get_threads_per_socket(const MachineState *ms)
+{
+ return get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_THREAD) *
+ machine_topo_get_cores_per_socket(ms);
+}
+
bool machine_parse_custom_topo_config(MachineState *ms,
const SMPConfiguration *config,
Error **errp)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 6ef4ea322590..faf7859debdd 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -48,8 +48,13 @@ void machine_parse_smp_config(MachineState *ms,
bool machine_parse_smp_cache(MachineState *ms,
const SmpCachePropertiesList *caches,
Error **errp);
-unsigned int machine_topo_get_cores_per_socket(const MachineState *ms);
-unsigned int machine_topo_get_threads_per_socket(const MachineState *ms);
+/*
+ * TODO: Drop these old helpers when cpu-slot.c can be compiled for
+ * test-smp-parse. Please use machine_topo_get_cores_per_socket() and
+ * machine_topo_get_threads_per_socket() instead.
+ */
+unsigned int machine_topo_get_cores_per_socket_old(const MachineState *ms);
+unsigned int machine_topo_get_threads_per_socket_old(const MachineState *ms);
CpuTopologyLevel machine_get_cache_topo_level(const MachineState *ms,
CacheLevelAndType cache);
void machine_memory_devices_init(MachineState *ms, hwaddr base, uint64_t size);
diff --git a/include/hw/cpu/cpu-slot.h b/include/hw/cpu/cpu-slot.h
index f56a0b08dca4..230309b67fe1 100644
--- a/include/hw/cpu/cpu-slot.h
+++ b/include/hw/cpu/cpu-slot.h
@@ -81,6 +81,8 @@ struct CPUSlot {
void machine_plug_cpu_slot(MachineState *ms);
bool machine_create_topo_tree(MachineState *ms, Error **errp);
int get_max_topo_by_level(const MachineState *ms, CpuTopologyLevel level);
+unsigned int machine_topo_get_cores_per_socket(const MachineState *ms);
+unsigned int machine_topo_get_threads_per_socket(const MachineState *ms);
bool machine_parse_custom_topo_config(MachineState *ms,
const SMPConfiguration *config,
Error **errp);
diff --git a/system/cpus.c b/system/cpus.c
index 1c818ff6828c..53e7cfb8a55f 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -667,7 +667,7 @@ void qemu_init_vcpu(CPUState *cpu)
MachineState *ms = MACHINE(qdev_get_machine());
cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
- cpu->nr_threads = ms->smp.threads;
+ cpu->nr_threads = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_THREAD);
cpu->stopped = true;
cpu->random_seed = qemu_guest_random_seed_thread_part1();
diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c
index f9bccb56abc7..44d2213a7163 100644
--- a/tests/unit/test-smp-parse.c
+++ b/tests/unit/test-smp-parse.c
@@ -801,8 +801,8 @@ static void check_parse(MachineState *ms, const SMPConfiguration *config,
/* call the generic parser */
machine_parse_smp_config(ms, config, &err);
- ms_threads_per_socket = machine_topo_get_threads_per_socket(ms);
- ms_cores_per_socket = machine_topo_get_cores_per_socket(ms);
+ ms_threads_per_socket = machine_topo_get_threads_per_socket_old(ms);
+ ms_cores_per_socket = machine_topo_get_cores_per_socket_old(ms);
output_topo_str = cpu_topology_to_string(&ms->smp,
ms_threads_per_socket,
ms_cores_per_socket,
--
2.34.1
* [RFC v2 08/12] hw/i386: Use get_max_topo_by_level() to get topology information
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
To honor the custom topology case and generate correct APIC IDs for
hybrid CPU topology, use get_max_topo_by_level() to get topology
information instead of accessing MachineState.smp directly.
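
To show why the max limits matter here, the sketch below packs topology
IDs into an APIC ID with each level occupying just enough bits to encode
its per-parent count. It is a simplified standalone model, not QEMU's
topology code: id_width(), ToyTopoInfo, ToyTopoIDs and toy_apic_id()
are hypothetical names. Because the field widths must be identical for
every CPU, they have to come from the per-level maximum rather than from
the number of children a particular module or core happens to have.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to encode IDs in the range [0, count). */
static unsigned int id_width(unsigned int count)
{
    unsigned int width = 0;

    assert(count >= 1);
    count -= 1;
    while (count) {
        width++;
        count >>= 1;
    }
    return width;
}

/* Hypothetical per-parent max limits for each level below the package. */
typedef struct {
    unsigned int dies, modules, cores, threads;
} ToyTopoInfo;

/* Hypothetical topology IDs of one logical CPU. */
typedef struct {
    unsigned int pkg, die, module, core, thread;
} ToyTopoIDs;

/* Pack the IDs, innermost level in the lowest bits. */
static uint32_t toy_apic_id(const ToyTopoInfo *t, const ToyTopoIDs *ids)
{
    unsigned int core_off = id_width(t->threads);
    unsigned int module_off = core_off + id_width(t->cores);
    unsigned int die_off = module_off + id_width(t->modules);
    unsigned int pkg_off = die_off + id_width(t->dies);

    return (ids->pkg << pkg_off) |
           (ids->die << die_off) |
           (ids->module << module_off) |
           (ids->core << core_off) |
           ids->thread;
}
```

For the cover letter's example limits (1 die, 2 modules, 2 cores,
2 threads), the single-thread Atom cores still get a one-bit thread
field, so their IDs do not collide with those of the two-thread P-core.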
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/i386/x86-common.c | 19 +++++++++++++------
hw/i386/x86.c | 20 +++++++++++++-------
2 files changed, 26 insertions(+), 13 deletions(-)
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index 75d4b2f3d43a..58591e015569 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -202,11 +202,15 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
static void x86_fixup_topo_ids(MachineState *ms, X86CPU *cpu)
{
+ int max_modules, max_dies;
+
+ max_modules = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_MODULE);
+ max_dies = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_DIE);
/*
* die-id was optional in QEMU 4.0 and older, so keep it optional
* if there's only one die per socket.
*/
- if (cpu->module_id < 0 && ms->smp.modules == 1) {
+ if (cpu->module_id < 0 && max_modules == 1) {
cpu->module_id = 0;
}
@@ -214,7 +218,7 @@ static void x86_fixup_topo_ids(MachineState *ms, X86CPU *cpu)
* module-id was optional in QEMU 9.0 and older, so keep it optional
* if there's only one module per die.
*/
- if (cpu->die_id < 0 && ms->smp.dies == 1) {
+ if (cpu->die_id < 0 && max_dies == 1) {
cpu->die_id = 0;
}
}
@@ -393,6 +397,7 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
MachineState *ms = MACHINE(hotplug_dev);
X86MachineState *x86ms = X86_MACHINE(hotplug_dev);
X86CPUTopoInfo topo_info;
+ int max_modules, max_dies;
if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -413,13 +418,15 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
init_topo_info(&topo_info, x86ms);
- if (ms->smp.modules > 1) {
- env->nr_modules = ms->smp.modules;
+ max_modules = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_MODULE);
+ if (max_modules > 1) {
+ env->nr_modules = max_modules;
set_bit(CPU_TOPOLOGY_LEVEL_MODULE, env->avail_cpu_topo);
}
- if (ms->smp.dies > 1) {
- env->nr_dies = ms->smp.dies;
+ max_dies = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_DIE);
+ if (max_dies > 1) {
+ env->nr_dies = max_dies;
set_bit(CPU_TOPOLOGY_LEVEL_DIE, env->avail_cpu_topo);
}
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index cdf7b81ad0e3..55904b545d84 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -44,16 +44,20 @@ void init_topo_info(X86CPUTopoInfo *topo_info,
{
MachineState *ms = MACHINE(x86ms);
- topo_info->dies_per_pkg = ms->smp.dies;
+ topo_info->dies_per_pkg =
+ get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_DIE);
/*
* Though smp.modules means the number of modules in one cluster,
* i386 doesn't support cluster level so that the smp.clusters
* always defaults to 1, therefore using smp.modules directly is
* fine here.
*/
- topo_info->modules_per_die = ms->smp.modules;
- topo_info->cores_per_module = ms->smp.cores;
- topo_info->threads_per_core = ms->smp.threads;
+ topo_info->modules_per_die =
+ get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_MODULE);
+ topo_info->cores_per_module =
+ get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_CORE);
+ topo_info->threads_per_core =
+ get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_THREAD);
}
/*
@@ -103,7 +107,7 @@ static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
X86MachineState *x86ms = X86_MACHINE(ms);
unsigned int max_cpus = ms->smp.max_cpus;
X86CPUTopoInfo topo_info;
- int i;
+ int i, max_dies, max_modules;
if (ms->possible_cpus) {
/*
@@ -120,6 +124,8 @@ static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
init_topo_info(&topo_info, x86ms);
+ max_dies = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_DIE);
+ max_modules = get_max_topo_by_level(ms, CPU_TOPOLOGY_LEVEL_MODULE);
for (i = 0; i < ms->possible_cpus->len; i++) {
X86CPUTopoIDs topo_ids;
@@ -131,11 +137,11 @@ static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
&topo_info, &topo_ids);
ms->possible_cpus->cpus[i].props.has_socket_id = true;
ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
- if (ms->smp.dies > 1) {
+ if (max_dies > 1) {
ms->possible_cpus->cpus[i].props.has_die_id = true;
ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
}
- if (ms->smp.modules > 1) {
+ if (max_modules > 1) {
ms->possible_cpus->cpus[i].props.has_module_id = true;
ms->possible_cpus->cpus[i].props.module_id = topo_ids.module_id;
}
--
2.34.1
* [RFC v2 09/12] i386: Introduce x86 CPU core abstractions
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
Abstract 3 core types for i386: common core, Intel Core (P-core) and
Intel Atom (E-core). This is in preparation for creating the hybrid
topology from the CLI.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
target/i386/core.c | 56 +++++++++++++++++++++++++++++++++++++++++
target/i386/core.h | 53 ++++++++++++++++++++++++++++++++++++++
target/i386/meson.build | 1 +
3 files changed, 110 insertions(+)
create mode 100644 target/i386/core.c
create mode 100644 target/i386/core.h
diff --git a/target/i386/core.c b/target/i386/core.c
new file mode 100644
index 000000000000..d76186a6a070
--- /dev/null
+++ b/target/i386/core.c
@@ -0,0 +1,56 @@
+/*
+ * x86 CPU core
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Author: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "core.h"
+
+static void x86_common_core_class_init(ObjectClass *oc, void *data)
+{
+ X86CPUCoreClass *cc = X86_CPU_CORE_CLASS(oc);
+
+ cc->core_type = COMMON_CORE;
+}
+
+static void x86_intel_atom_class_init(ObjectClass *oc, void *data)
+{
+ X86CPUCoreClass *cc = X86_CPU_CORE_CLASS(oc);
+
+ cc->core_type = INTEL_ATOM;
+}
+
+static void x86_intel_core_class_init(ObjectClass *oc, void *data)
+{
+ X86CPUCoreClass *cc = X86_CPU_CORE_CLASS(oc);
+
+ cc->core_type = INTEL_CORE;
+}
+
+static const TypeInfo x86_cpu_core_infos[] = {
+ {
+ .name = TYPE_X86_CPU_CORE,
+ .parent = TYPE_CPU_CORE,
+ .class_size = sizeof(X86CPUCoreClass),
+ .class_init = x86_common_core_class_init,
+ .instance_size = sizeof(X86CPUCore),
+ },
+ {
+ .parent = TYPE_X86_CPU_CORE,
+ .name = X86_CPU_CORE_TYPE_NAME("intel-atom"),
+ .class_init = x86_intel_atom_class_init,
+ },
+ {
+ .parent = TYPE_X86_CPU_CORE,
+ .name = X86_CPU_CORE_TYPE_NAME("intel-core"),
+ .class_init = x86_intel_core_class_init,
+ },
+};
+
+DEFINE_TYPES(x86_cpu_core_infos)
diff --git a/target/i386/core.h b/target/i386/core.h
new file mode 100644
index 000000000000..b942153b2c0d
--- /dev/null
+++ b/target/i386/core.h
@@ -0,0 +1,53 @@
+/*
+ * x86 CPU core header
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Author: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ */
+
+#include "hw/cpu/core.h"
+#include "hw/cpu/cpu-topology.h"
+#include "qom/object.h"
+
+#ifndef I386_CORE_H
+#define I386_CORE_H
+
+#ifdef TARGET_X86_64
+#define TYPE_X86_PREFIX "x86-"
+#else
+#define TYPE_X86_PREFIX "i386-"
+#endif
+
+#define TYPE_X86_CPU_CORE TYPE_X86_PREFIX "core"
+
+OBJECT_DECLARE_TYPE(X86CPUCore, X86CPUCoreClass, X86_CPU_CORE)
+
+typedef enum {
+ COMMON_CORE = 0,
+ INTEL_ATOM,
+ INTEL_CORE,
+} X86CoreType;
+
+struct X86CPUCoreClass {
+ /*< private >*/
+ CPUTopoClass parent_class;
+
+ /*< public >*/
+ DeviceRealize parent_realize;
+ X86CoreType core_type;
+};
+
+struct X86CPUCore {
+ /*< private >*/
+ CPUCore parent_obj;
+
+ /*< public >*/
+};
+
+#define X86_CPU_CORE_TYPE_NAME(core_type_str) (TYPE_X86_PREFIX core_type_str)
+
+#endif /* I386_CORE_H */
diff --git a/target/i386/meson.build b/target/i386/meson.build
index 075117989b9d..80a32526d98b 100644
--- a/target/i386/meson.build
+++ b/target/i386/meson.build
@@ -18,6 +18,7 @@ i386_system_ss.add(files(
'arch_memory_mapping.c',
'machine.c',
'monitor.c',
+ 'core.c',
'cpu-apic.c',
'cpu-sysemu.c',
))
--
2.34.1
* [RFC v2 10/12] i386/cpu: Support Intel hybrid CPUID
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu, Zhuocheng Ding
For hybrid CPU topology, Intel enumerates the following through CPUID [1]:
1. CPUID.07H.0H:EDX.Hybrid[bit 15]. When this bit is set to 1, the
processor is identified as a hybrid part.
2. The CPUID.1AH leaf. CPUID.1AH:EAX reports the core type (bits 31-24)
and the native model ID (bits 23-0). The native model ID is currently
not useful to software, so it is not emulated.
For the hybrid-related CPUIDs, especially CPUID.07H.0H:EDX.Hybrid[bit 15],
there is no need to expose the feature in feature_word_info[] for the
user to set directly, because the hybrid features depend on the core
type of each CPU, and that information has to be gathered together with
the hybrid CPU topology.
[1]: SDM, vol.2, Ch.3, 3.2 Instructions (A-L), CPUID-CPU Identification
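As an illustration of the two enumerations described above, here is a minimal
sketch of how software might decode them. The helper names are hypothetical
and not part of this series; the bit positions and the 0x20 (Atom) / 0x40
(Core) values are the ones this patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as used by this patch; the helper names are illustrative only. */
#define HYBRID_EDX_BIT    (1U << 15)  /* CPUID.07H.0H:EDX.Hybrid */
#define HYBRID_TYPE_ATOM  0x20U       /* CPUID.1AH:EAX[31:24] */
#define HYBRID_TYPE_CORE  0x40U

/* True if CPUID.07H.0H:EDX marks the processor as a hybrid part. */
static inline bool hybrid_part(uint32_t leaf7_edx)
{
    return (leaf7_edx & HYBRID_EDX_BIT) != 0;
}

/* The core type lives in the top byte of CPUID.1AH:EAX. */
static inline uint32_t hybrid_core_type(uint32_t leaf1a_eax)
{
    return leaf1a_eax >> 24;
}

/* The native model ID is the low 24 bits; this series leaves it at 0. */
static inline uint32_t hybrid_native_model_id(uint32_t leaf1a_eax)
{
    return leaf1a_eax & 0x00FFFFFFU;
}

/* Build EAX the way the emulation does: core type only, model ID = 0. */
static inline uint32_t hybrid_make_eax(uint32_t core_type)
{
    assert(core_type <= 0xFFU);  /* must fit in bits 31-24 */
    return core_type << 24;
}
```

A guest would check hybrid_part() first and only then read CPUID.1AH on each
CPU, since the core type can differ between CPUs of the same package.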
Co-Developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
target/i386/cpu.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
target/i386/cpu.h | 5 ++++
2 files changed, 63 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fb54c2c100a0..2f0e7f3d5ad7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -22,6 +22,7 @@
#include "qemu/cutils.h"
#include "qemu/qemu-print.h"
#include "qemu/hw-version.h"
+#include "core.h"
#include "cpu.h"
#include "tcg/helper-tcg.h"
#include "sysemu/hvf.h"
@@ -743,6 +744,10 @@ static CPUCacheInfo legacy_l3_cache = {
#define INTEL_AMX_TMUL_MAX_K 0x10
#define INTEL_AMX_TMUL_MAX_N 0x40
+/* CPUID Leaf 0x1A constants: */
+#define INTEL_HYBRID_TYPE_ATOM 0x20
+#define INTEL_HYBRID_TYPE_CORE 0x40
+
void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
uint32_t vendor2, uint32_t vendor3)
{
@@ -6580,6 +6585,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
*ecx |= CPUID_7_0_ECX_OSPKE;
}
*edx = env->features[FEAT_7_0_EDX]; /* Feature flags */
+
+ if (env->parent_core_type != COMMON_CORE &&
+ (IS_INTEL_CPU(env) || !cpu->vendor_cpuid_only)) {
+ *edx |= CPUID_7_0_EDX_HYBRID;
+ }
} else if (count == 1) {
*eax = env->features[FEAT_7_1_EAX];
*edx = env->features[FEAT_7_1_EDX];
@@ -6800,6 +6810,31 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
}
break;
}
+ case 0x1A:
+ /* Hybrid Information Enumeration */
+ *eax = 0;
+ *ebx = 0;
+ *ecx = 0;
+ *edx = 0;
+ if (env->parent_core_type != COMMON_CORE &&
+ (IS_INTEL_CPU(env) || !cpu->vendor_cpuid_only)) {
+ /*
+ * CPUID.1AH:EAX.[bits 23-0] indicates "native model ID of the
+ * core". Since this field currently is useless for software,
+ * no need to emulate.
+ */
+ switch (env->parent_core_type) {
+ case INTEL_ATOM:
+ *eax = INTEL_HYBRID_TYPE_ATOM << 24;
+ break;
+ case INTEL_CORE:
+ *eax = INTEL_HYBRID_TYPE_CORE << 24;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+ }
+ break;
case 0x1D: {
/* AMX TILE, for now hardcoded for Sapphire Rapids*/
*eax = 0;
@@ -7459,6 +7494,14 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
}
}
+ /*
+ * Intel CPU topology with hybrid cores support requires CPUID.1AH.
+ */
+ if (env->parent_core_type != COMMON_CORE &&
+ (IS_INTEL_CPU(env) || !cpu->vendor_cpuid_only)) {
+ x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x1A);
+ }
+
/*
* Intel CPU topology with multi-dies support requires CPUID[0x1F].
* For AMD Rome/Milan, cpuid level is 0x10, and guest OS should detect
@@ -7650,6 +7693,20 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
return;
}
+ /*
+ * TODO: Introduce parent_pre_realize to make sure topology device
+ * can realize first.
+ */
+ if (dev->parent_bus && dev->parent_bus->parent) {
+ DeviceState *parent = dev->parent_bus->parent;
+ X86CPUCore *core =
+ (X86CPUCore *)object_dynamic_cast(OBJECT(parent),
+ TYPE_X86_CPU_CORE);
+ if (core) {
+ env->parent_core_type = X86_CPU_CORE_GET_CLASS(core)->core_type;
+ }
+ }
+
/*
* Process Hyper-V enlightenments.
* Note: this currently has to happen before the expansion of CPU features.
@@ -8048,6 +8105,7 @@ static void x86_cpu_initfn(Object *obj)
CPUX86State *env = &cpu->env;
x86_cpu_init_default_topo(cpu);
+ env->parent_core_type = COMMON_CORE;
object_property_add(obj, "feature-words", "X86CPUFeatureWordInfo",
x86_cpu_get_feature_words,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index afe2b5fd3382..38236df547e6 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -21,6 +21,7 @@
#define I386_CPU_H
#include "sysemu/tcg.h"
+#include "core.h"
#include "cpu-qom.h"
#include "kvm/hyperv-proto.h"
#include "exec/cpu-defs.h"
@@ -920,6 +921,8 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
#define CPUID_7_0_EDX_AVX512_VP2INTERSECT (1U << 8)
/* SERIALIZE instruction */
#define CPUID_7_0_EDX_SERIALIZE (1U << 14)
+/* Hybrid */
+#define CPUID_7_0_EDX_HYBRID (1U << 15)
/* TSX Suspend Load Address Tracking instruction */
#define CPUID_7_0_EDX_TSX_LDTRK (1U << 16)
/* Architectural LBRs */
@@ -1996,6 +1999,8 @@ typedef struct CPUArchState {
/* Bitmap of available CPU topology levels for this CPU. */
DECLARE_BITMAP(avail_cpu_topo, CPU_TOPOLOGY_LEVEL__MAX);
+
+ X86CoreType parent_core_type;
} CPUX86State;
struct kvm_msrs;
--
2.34.1
* [RFC v2 11/12] i386/machine: Split machine initialization after CPU creation into post_init()
2024-09-19 6:11 [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Zhao Liu
` (9 preceding siblings ...)
2024-09-19 6:11 ` [RFC v2 10/12] i386/cpu: Support Intel hybrid CPUID Zhao Liu
@ 2024-09-19 6:11 ` Zhao Liu
2024-09-19 6:11 ` [RFC v2 12/12] i386: Support custom topology for microvm, pc-i440fx and pc-q35 Zhao Liu
2024-10-08 10:30 ` [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Jonathan Cameron via
12 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
Custom topology will allow a machine to skip the default CPU creation
and accept the CPUs that the user creates from the CLI.
Therefore, for microvm, pc-i440fx and pc-q35, split machine
initialization at x86_cpus_init() and move the part that follows it into
post_init(), which can then run after the CPUs have been created from
the CLI.
This satisfies the CPU dependency of the initialization steps that
follow x86_cpus_init().
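The ordering this split is meant to allow can be sketched with a toy model.
Everything below is hypothetical (ToyMachine, toy_boot() and friends are not
QEMU types), and it assumes that post_init() is invoked only after the
CLI-created CPUs exist; the call site itself is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyMachine ToyMachine;

/* Toy stand-in for a machine class with the two hooks discussed above. */
typedef struct {
    void (*init)(ToyMachine *m);       /* runs up to default CPU creation */
    void (*post_init)(ToyMachine *m);  /* runs the CPU-dependent remainder */
} ToyMachineClass;

struct ToyMachine {
    const ToyMachineClass *mc;
    bool custom_topo;       /* models -machine custom-topo=on */
    int default_cpus;       /* CPUs the board would create by itself */
    int cpus;               /* CPUs that exist right now */
    int cpus_seen_by_devs;  /* what the CPU-dependent setup observed */
};

static void toy_init(ToyMachine *m)
{
    /* Memory setup and similar CPU-independent work would go here. */
    if (!m->custom_topo) {
        m->cpus = m->default_cpus;  /* default CPU creation */
    }
}

static void toy_post_init(ToyMachine *m)
{
    /* Device setup that needs the CPUs to exist already. */
    m->cpus_seen_by_devs = m->cpus;
}

static const ToyMachineClass toy_class = {
    .init = toy_init,
    .post_init = toy_post_init,
};

/* Boot order: init(), then CPUs from the CLI, then post_init(). */
static int toy_boot(bool custom_topo, int default_cpus, int cli_cpus)
{
    ToyMachine m = { .mc = &toy_class, .custom_topo = custom_topo,
                     .default_cpus = default_cpus };

    assert(m.mc->init && m.mc->post_init);
    m.mc->init(&m);
    if (custom_topo) {
        m.cpus += cli_cpus;  /* user-created CPUs */
    }
    m.mc->post_init(&m);
    return m.cpus_seen_by_devs;
}
```

In both modes the CPU-dependent part observes a fully populated CPU set, which
is the property the split is after.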
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/i386/microvm.c | 7 +++++++
hw/i386/pc_piix.c | 40 +++++++++++++++++++++++++---------------
hw/i386/pc_q35.c | 36 ++++++++++++++++++++++--------------
include/hw/i386/pc.h | 3 +++
4 files changed, 57 insertions(+), 29 deletions(-)
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 49a897db50fc..dc9b21a34230 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -463,6 +463,11 @@ static void microvm_machine_state_init(MachineState *machine)
microvm_memory_init(mms);
x86_cpus_init(x86ms, CPU_VERSION_LATEST);
+}
+
+static void microvm_machine_state_post_init(MachineState *machine)
+{
+ MicrovmMachineState *mms = MICROVM_MACHINE(machine);
microvm_devices_init(mms);
}
@@ -665,6 +670,8 @@ static void microvm_class_init(ObjectClass *oc, void *data)
/* Machine class handlers */
mc->reset = microvm_machine_reset;
+ mc->post_init = microvm_machine_state_post_init;
+
/* hotplug (for cpu coldplug) */
mc->get_hotplug_handler = microvm_get_hotplug_handler;
hc->pre_plug = microvm_device_pre_plug_cb;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 2bf6865d405e..c1db2f3129cf 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -105,19 +105,9 @@ static void pc_init1(MachineState *machine, const char *pci_type)
PCMachineState *pcms = PC_MACHINE(machine);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
X86MachineState *x86ms = X86_MACHINE(machine);
- MemoryRegion *system_memory = get_system_memory();
- MemoryRegion *system_io = get_system_io();
- Object *phb = NULL;
- ISABus *isa_bus;
- Object *piix4_pm = NULL;
- qemu_irq smi_irq;
- GSIState *gsi_state;
- MemoryRegion *ram_memory;
- MemoryRegion *pci_memory = NULL;
- MemoryRegion *rom_memory = system_memory;
ram_addr_t lowmem;
- uint64_t hole64_size = 0;
+ pcms->pci_type = pci_type;
/*
* Calculate ram split, for memory below and above 4G. It's a bit
* complicated for backward compatibility reasons ...
@@ -150,9 +140,9 @@ static void pc_init1(MachineState *machine, const char *pci_type)
* qemu -M pc,max-ram-below-4g=4G -m 3968M -> 3968M low (=4G-128M)
*/
if (xen_enabled()) {
- xen_hvm_init_pc(pcms, &ram_memory);
+ xen_hvm_init_pc(pcms, &pcms->pre_config_ram);
} else {
- ram_memory = machine->ram;
+ pcms->pre_config_ram = machine->ram;
if (!pcms->max_ram_below_4g) {
pcms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
}
@@ -182,6 +172,23 @@ static void pc_init1(MachineState *machine, const char *pci_type)
pc_machine_init_sgx_epc(pcms);
x86_cpus_init(x86ms, pcmc->default_cpu_version);
+}
+
+static void pc_post_init1(MachineState *machine)
+{
+ PCMachineState *pcms = PC_MACHINE(machine);
+ PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+ X86MachineState *x86ms = X86_MACHINE(machine);
+ MemoryRegion *system_memory = get_system_memory();
+ MemoryRegion *system_io = get_system_io();
+ Object *phb = NULL;
+ ISABus *isa_bus;
+ Object *piix4_pm = NULL;
+ qemu_irq smi_irq;
+ GSIState *gsi_state;
+ MemoryRegion *pci_memory = NULL;
+ MemoryRegion *rom_memory = system_memory;
+ uint64_t hole64_size = 0;
if (kvm_enabled()) {
kvmclock_create(pcmc->kvmclock_create_always);
@@ -195,7 +202,7 @@ static void pc_init1(MachineState *machine, const char *pci_type)
phb = OBJECT(qdev_new(TYPE_I440FX_PCI_HOST_BRIDGE));
object_property_add_child(OBJECT(machine), "i440fx", phb);
object_property_set_link(phb, PCI_HOST_PROP_RAM_MEM,
- OBJECT(ram_memory), &error_fatal);
+ OBJECT(pcms->pre_config_ram), &error_fatal);
object_property_set_link(phb, PCI_HOST_PROP_PCI_MEM,
OBJECT(pci_memory), &error_fatal);
object_property_set_link(phb, PCI_HOST_PROP_SYSTEM_MEM,
@@ -206,7 +213,7 @@ static void pc_init1(MachineState *machine, const char *pci_type)
x86ms->below_4g_mem_size, &error_fatal);
object_property_set_uint(phb, PCI_HOST_ABOVE_4G_MEM_SIZE,
x86ms->above_4g_mem_size, &error_fatal);
- object_property_set_str(phb, I440FX_HOST_PROP_PCI_TYPE, pci_type,
+ object_property_set_str(phb, I440FX_HOST_PROP_PCI_TYPE, pcms->pci_type,
&error_fatal);
sysbus_realize_and_unref(SYS_BUS_DEVICE(phb), &error_fatal);
@@ -413,6 +420,7 @@ static void pc_set_south_bridge(Object *obj, int value, Error **errp)
static void pc_init_isa(MachineState *machine)
{
pc_init1(machine, NULL);
+ pc_post_init1(machine);
}
#endif
@@ -423,6 +431,7 @@ static void pc_xen_hvm_init_pci(MachineState *machine)
TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : TYPE_I440FX_PCI_DEVICE;
pc_init1(machine, pci_type);
+ pc_post_init1(machine);
}
static void pc_xen_hvm_init(MachineState *machine)
@@ -463,6 +472,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
m->default_nic = "e1000";
m->no_floppy = !module_object_class_by_name(TYPE_ISA_FDC);
m->no_parallel = !module_object_class_by_name(TYPE_ISA_PARALLEL);
+ m->post_init = pc_post_init1;
machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8319b6d45ee3..9ce3e65d7182 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -129,21 +129,7 @@ static void pc_q35_init(MachineState *machine)
PCMachineState *pcms = PC_MACHINE(machine);
PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
X86MachineState *x86ms = X86_MACHINE(machine);
- Object *phb;
- PCIDevice *lpc;
- DeviceState *lpc_dev;
- MemoryRegion *system_memory = get_system_memory();
- MemoryRegion *system_io = get_system_io();
- MemoryRegion *pci_memory = g_new(MemoryRegion, 1);
- GSIState *gsi_state;
- ISABus *isa_bus;
- int i;
ram_addr_t lowmem;
- DriveInfo *hd[MAX_SATA_PORTS];
- MachineClass *mc = MACHINE_GET_CLASS(machine);
- bool acpi_pcihp;
- bool keep_pci_slot_hpc;
- uint64_t pci_hole64_size = 0;
assert(pcmc->pci_enabled);
@@ -188,6 +174,27 @@ static void pc_q35_init(MachineState *machine)
pc_machine_init_sgx_epc(pcms);
x86_cpus_init(x86ms, pcmc->default_cpu_version);
+}
+
+static void pc_q35_post_init(MachineState *machine)
+{
+ PCMachineState *pcms = PC_MACHINE(machine);
+ PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+ X86MachineState *x86ms = X86_MACHINE(machine);
+ Object *phb;
+ PCIDevice *lpc;
+ DeviceState *lpc_dev;
+ MemoryRegion *system_memory = get_system_memory();
+ MemoryRegion *system_io = get_system_io();
+ MemoryRegion *pci_memory = g_new(MemoryRegion, 1);
+ GSIState *gsi_state;
+ ISABus *isa_bus;
+ int i;
+ DriveInfo *hd[MAX_SATA_PORTS];
+ MachineClass *mc = MACHINE_GET_CLASS(machine);
+ bool acpi_pcihp;
+ bool keep_pci_slot_hpc;
+ uint64_t pci_hole64_size = 0;
if (kvm_enabled()) {
kvmclock_create(pcmc->kvmclock_create_always);
@@ -348,6 +355,7 @@ static void pc_q35_machine_options(MachineClass *m)
m->no_floppy = 1;
m->max_cpus = 4096;
m->no_parallel = !module_object_class_by_name(TYPE_ISA_PARALLEL);
+ m->post_init = pc_q35_post_init;
machine_class_allow_dynamic_sysbus_dev(m, TYPE_AMD_IOMMU_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 14ee06287da3..14534781e8fb 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -58,6 +58,9 @@ typedef struct PCMachineState {
SGXEPCState sgx_epc;
CXLState cxl_devices_state;
+
+ MemoryRegion *pre_config_ram;
+ const char *pci_type;
} PCMachineState;
#define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
--
2.34.1
* [RFC v2 12/12] i386: Support custom topology for microvm, pc-i440fx and pc-q35
2024-09-19 6:11 [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Zhao Liu
` (10 preceding siblings ...)
2024-09-19 6:11 ` [RFC v2 11/12] i386/machine: Split machine initialization after CPU creation into post_init() Zhao Liu
@ 2024-09-19 6:11 ` Zhao Liu
2024-10-08 10:30 ` [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Jonathan Cameron via
12 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-09-19 6:11 UTC (permalink / raw)
To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell
Cc: qemu-devel, kvm, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma,
Zhao Liu
With custom topology enabled, the user can configure a hybrid CPU
topology from the CLI.
For example, create an Intel Core (P-core) with 2 threads and 2 Intel
Atoms (E-cores) with a single thread each for the PC machine:
-smp maxsockets=1,maxdies=1,maxmodules=2,maxcores=2,maxthreads=2 \
-machine pc,custom-topo=on \
-device cpu-socket,id=sock0 \
-device cpu-die,id=die0,bus=sock0 \
-device cpu-module,id=mod0,bus=die0 \
-device cpu-module,id=mod1,bus=die0 \
-device x86-intel-core,id=core0,bus=mod0 \
-device x86-intel-atom,id=core1,bus=mod1 \
-device x86-intel-atom,id=core2,bus=mod1 \
-device host-x86_64-cpu,id=cpu0,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=0 \
-device host-x86_64-cpu,id=cpu1,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=1 \
-device host-x86_64-cpu,id=cpu2,socket-id=0,die-id=0,module-id=1,core-id=0,thread-id=0 \
-device host-x86_64-cpu,id=cpu3,socket-id=0,die-id=0,module-id=1,core-id=1,thread-id=0
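To make the relation between the max parameters and the CPU IDs in the command
line above concrete, here is a small sketch. It assumes each ID is zero-based
within its parent and must be smaller than the matching max value; the types
and helpers are hypothetical and not taken from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-level maximums, as given by the max* parameters of -smp. */
typedef struct {
    unsigned sockets, dies, modules, cores, threads;
} TopoMax;

/* Topology IDs of one CPU, as given by the *-id properties. */
typedef struct {
    unsigned socket, die, module, core, thread;
} CpuTopoId;

/* Size of the complete SMP structure that the limits describe. */
static unsigned topo_max_cpus(const TopoMax *max)
{
    return max->sockets * max->dies * max->modules *
           max->cores * max->threads;
}

/* A CPU fits if every ID is below the limit of its level. */
static bool cpu_id_within_max(const TopoMax *max, const CpuTopoId *id)
{
    return id->socket < max->sockets && id->die < max->dies &&
           id->module < max->modules && id->core < max->cores &&
           id->thread < max->threads;
}

/* Check a whole set of CPUs against the limits. */
static bool all_within_max(const TopoMax *max, const CpuTopoId *ids, size_t n)
{
    assert(max && (ids || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (!cpu_id_within_max(max, &ids[i])) {
            return false;
        }
    }
    return true;
}
```

With the limits from the example, the complete structure holds up to 8 CPUs,
and the four CPUs given (two threads on the P-core, one thread on each E-core)
occupy a subset of it.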
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
hw/i386/microvm.c | 1 +
hw/i386/pc_piix.c | 1 +
hw/i386/pc_q35.c | 1 +
hw/i386/x86-common.c | 6 ++++++
4 files changed, 9 insertions(+)
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index dc9b21a34230..bd03b6946e6c 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -671,6 +671,7 @@ static void microvm_class_init(ObjectClass *oc, void *data)
mc->reset = microvm_machine_reset;
mc->post_init = microvm_machine_state_post_init;
+ mc->smp_props.custom_topo_supported = true;
/* hotplug (for cpu coldplug) */
mc->get_hotplug_handler = microvm_get_hotplug_handler;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index c1db2f3129cf..9c696a226858 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -473,6 +473,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
m->no_floppy = !module_object_class_by_name(TYPE_ISA_FDC);
m->no_parallel = !module_object_class_by_name(TYPE_ISA_PARALLEL);
m->post_init = pc_post_init1;
+ m->smp_props.custom_topo_supported = true;
machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 9ce3e65d7182..9241366ff351 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -356,6 +356,7 @@ static void pc_q35_machine_options(MachineClass *m)
m->max_cpus = 4096;
m->no_parallel = !module_object_class_by_name(TYPE_ISA_PARALLEL);
m->post_init = pc_q35_post_init;
+ m->smp_props.custom_topo_supported = true;
machine_class_allow_dynamic_sysbus_dev(m, TYPE_AMD_IOMMU_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index 58591e015569..2995eed5d670 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -195,6 +195,12 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
}
possible_cpus = mc->possible_cpu_arch_ids(ms);
+
+ /* Leave user to add CPUs. */
+ if (ms->topo->custom_topo_enabled) {
+ return;
+ }
+
for (i = 0; i < ms->smp.cpus; i++) {
x86_cpu_new(x86ms, i, possible_cpus->cpus[i].arch_id, &error_fatal);
}
--
2.34.1
* Re: [RFC v2 01/12] qdev: Allow qdev_device_add() to add specific category device
2024-09-19 6:11 ` [RFC v2 01/12] qdev: Allow qdev_device_add() to add specific category device Zhao Liu
@ 2024-10-08 9:14 ` Jonathan Cameron via
2024-10-09 6:09 ` Zhao Liu
0 siblings, 1 reply; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-08 9:14 UTC (permalink / raw)
To: Zhao Liu
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma
On Thu, 19 Sep 2024 14:11:17 +0800
Zhao Liu <zhao1.liu@intel.com> wrote:
> Topology devices need to be created and realized before board
> initialization.
>
> Allow qdev_device_add() to specify category to help create topology
> devices early.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
It's not immediately obvious what the category parameter is.
Can you use DeviceCategory rather than long?
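A rough sketch of what a typed filter could look like, using a toy enum in
place of QEMU's DeviceCategory so that the example stands alone. The names
below are hypothetical; only the "NULL means no filtering" convention is taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for DeviceCategory; the members are illustrative. */
typedef enum {
    TOY_CATEGORY_BRIDGE,
    TOY_CATEGORY_NETWORK,
    TOY_CATEGORY_CPU_DEF,
    TOY_CATEGORY_MAX
} ToyDeviceCategory;

/*
 * categories is a bitmask with one bit per ToyDeviceCategory.
 * A NULL filter accepts every device, as in the patch.
 */
static bool toy_category_allowed(const ToyDeviceCategory *filter,
                                 unsigned long categories)
{
    if (filter == NULL) {
        return true;
    }
    assert(*filter < TOY_CATEGORY_MAX);
    return (categories >> *filter) & 1UL;
}
```

With an enum-typed pointer the signature documents what the argument means,
and passing an unrelated integer becomes easier to spot in review.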
> ---
> hw/net/virtio-net.c | 2 +-
> hw/usb/xen-usb.c | 3 ++-
> include/monitor/qdev.h | 4 ++--
> system/qdev-monitor.c | 12 ++++++++----
> system/vl.c | 4 ++--
> 5 files changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index fb84d142ee29..0d92e09e9076 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -935,7 +935,7 @@ static void failover_add_primary(VirtIONet *n, Error **errp)
> return;
> }
>
> - dev = qdev_device_add_from_qdict(n->primary_opts,
> + dev = qdev_device_add_from_qdict(n->primary_opts, NULL,
> n->primary_opts_from_json,
> &err);
> if (err) {
> diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
> index 13901625c0c8..e4168b1fec7e 100644
> --- a/hw/usb/xen-usb.c
> +++ b/hw/usb/xen-usb.c
> @@ -766,7 +766,8 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
> qdict_put_str(qdict, "hostport", portname);
> opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict,
> &error_abort);
> - usbif->ports[port - 1].dev = USB_DEVICE(qdev_device_add(opts, &local_err));
> + usbif->ports[port - 1].dev = USB_DEVICE(
> + qdev_device_add(opts, NULL, &local_err));
> if (!usbif->ports[port - 1].dev) {
> qobject_unref(qdict);
> xen_pv_printf(&usbif->xendev, 0,
> diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
> index 1d57bf657794..f5fd6e6c1ffc 100644
> --- a/include/monitor/qdev.h
> +++ b/include/monitor/qdev.h
> @@ -8,8 +8,8 @@ void hmp_info_qdm(Monitor *mon, const QDict *qdict);
> void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp);
>
> int qdev_device_help(QemuOpts *opts);
> -DeviceState *qdev_device_add(QemuOpts *opts, Error **errp);
> -DeviceState *qdev_device_add_from_qdict(const QDict *opts,
> +DeviceState *qdev_device_add(QemuOpts *opts, long *category, Error **errp);
> +DeviceState *qdev_device_add_from_qdict(const QDict *opts, long *category,
> bool from_json, Error **errp);
>
> /**
> diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
> index 457dfd05115e..fe120353fedc 100644
> --- a/system/qdev-monitor.c
> +++ b/system/qdev-monitor.c
> @@ -632,7 +632,7 @@ const char *qdev_set_id(DeviceState *dev, char *id, Error **errp)
> return prop->name;
> }
>
> -DeviceState *qdev_device_add_from_qdict(const QDict *opts,
> +DeviceState *qdev_device_add_from_qdict(const QDict *opts, long *category,
> bool from_json, Error **errp)
> {
> ERRP_GUARD();
> @@ -655,6 +655,10 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
> return NULL;
> }
>
> + if (category && !test_bit(*category, dc->categories)) {
> + return NULL;
> + }
> +
> /* find bus */
> path = qdict_get_try_str(opts, "bus");
> if (path != NULL) {
> @@ -767,12 +771,12 @@ err_del_dev:
> }
>
> /* Takes ownership of @opts on success */
> -DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
> +DeviceState *qdev_device_add(QemuOpts *opts, long *category, Error **errp)
> {
> QDict *qdict = qemu_opts_to_qdict(opts, NULL);
> DeviceState *ret;
>
> - ret = qdev_device_add_from_qdict(qdict, false, errp);
> + ret = qdev_device_add_from_qdict(qdict, category, false, errp);
> if (ret) {
> qemu_opts_del(opts);
> }
> @@ -897,7 +901,7 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
> qemu_opts_del(opts);
> return;
> }
> - dev = qdev_device_add(opts, errp);
> + dev = qdev_device_add(opts, NULL, errp);
> if (!dev) {
> /*
> * Drain all pending RCU callbacks. This is done because
> diff --git a/system/vl.c b/system/vl.c
> index 193e7049ccbe..c40364e2f091 100644
> --- a/system/vl.c
> +++ b/system/vl.c
> @@ -1212,7 +1212,7 @@ static int device_init_func(void *opaque, QemuOpts *opts, Error **errp)
> {
> DeviceState *dev;
>
> - dev = qdev_device_add(opts, errp);
> + dev = qdev_device_add(opts, NULL, errp);
> if (!dev && *errp) {
> error_report_err(*errp);
> return -1;
> @@ -2665,7 +2665,7 @@ static void qemu_create_cli_devices(void)
> * from the start, so call qdev_device_add_from_qdict() directly for
> * now.
> */
> - dev = qdev_device_add_from_qdict(opt->opts, true, &error_fatal);
> + dev = qdev_device_add_from_qdict(opt->opts, NULL, true, &error_fatal);
> object_unref(OBJECT(dev));
> loc_pop(&opt->loc);
> }
* Re: [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early
2024-09-19 6:11 ` [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early Zhao Liu
@ 2024-10-08 9:50 ` Jonathan Cameron via
2024-10-09 6:31 ` Zhao Liu
2024-10-08 9:55 ` Jonathan Cameron via
1 sibling, 1 reply; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-08 9:50 UTC (permalink / raw)
To: Zhao Liu
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma
On Thu, 19 Sep 2024 14:11:19 +0800
Zhao Liu <zhao1.liu@intel.com> wrote:
> Custom topology will allow user to build CPU topology from CLI totally,
> and this replaces machine's default CPU creation process (*_init_cpus()
> in MachineClass.init()).
>
> For the machine's initialization, there may be CPU dependencies in the
> remaining initialization after the CPU creation.
>
> To address such dependencies, create the CPU topology device (including
> CPU devices) from the CLI earlier, so that the latter part of machine
> initialization can be separated after qemu_add_cli_devices_early().
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Other than the question about the type of category from the previous
patch, this looks fine to me.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
However, needs review from others more familiar with this code!
> ---
> system/vl.c | 55 +++++++++++++++++++++++++++++++++++------------------
> 1 file changed, 36 insertions(+), 19 deletions(-)
>
> diff --git a/system/vl.c b/system/vl.c
> index c40364e2f091..8540454aa1c2 100644
> --- a/system/vl.c
> +++ b/system/vl.c
> @@ -1211,8 +1211,9 @@ static int device_help_func(void *opaque, QemuOpts *opts, Error **errp)
> static int device_init_func(void *opaque, QemuOpts *opts, Error **errp)
> {
> DeviceState *dev;
> + long *category = opaque;
>
> - dev = qdev_device_add(opts, NULL, errp);
> + dev = qdev_device_add(opts, category, errp);
> if (!dev && *errp) {
> error_report_err(*errp);
> return -1;
> @@ -2623,6 +2624,36 @@ static void qemu_init_displays(void)
> }
> }
>
> +static void qemu_add_devices(long *category)
> +{
> + DeviceOption *opt;
> +
> + qemu_opts_foreach(qemu_find_opts("device"),
> + device_init_func, category, &error_fatal);
> + QTAILQ_FOREACH(opt, &device_opts, next) {
> + DeviceState *dev;
> + loc_push_restore(&opt->loc);
> + /*
> + * TODO Eventually we should call qmp_device_add() here to make sure it
> + * behaves the same, but QMP still has to accept incorrectly typed
> + * options until libvirt is fixed and we want to be strict on the CLI
> + * from the start, so call qdev_device_add_from_qdict() directly for
> + * now.
> + */
> + dev = qdev_device_add_from_qdict(opt->opts, category,
> + true, &error_fatal);
> + object_unref(OBJECT(dev));
> + loc_pop(&opt->loc);
> + }
> +}
> +
> +static void qemu_add_cli_devices_early(void)
> +{
> + long category = DEVICE_CATEGORY_CPU_DEF;
> +
> + qemu_add_devices(&category);
> +}
> +
> static void qemu_init_board(void)
> {
> /* process plugin before CPUs are created, but once -smp has been parsed */
> @@ -2631,6 +2662,9 @@ static void qemu_init_board(void)
> /* From here on we enter MACHINE_PHASE_INITIALIZED. */
> machine_run_board_init(current_machine, mem_path, &error_fatal);
>
> + /* Create CPU topology device if any. */
> + qemu_add_cli_devices_early();
> +
> drive_check_orphaned();
>
> realtime_init();
> @@ -2638,8 +2672,6 @@ static void qemu_init_board(void)
>
> static void qemu_create_cli_devices(void)
> {
> - DeviceOption *opt;
> -
> soundhw_init();
>
> qemu_opts_foreach(qemu_find_opts("fw_cfg"),
> @@ -2653,22 +2685,7 @@ static void qemu_create_cli_devices(void)
>
> /* init generic devices */
> rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
> - qemu_opts_foreach(qemu_find_opts("device"),
> - device_init_func, NULL, &error_fatal);
> - QTAILQ_FOREACH(opt, &device_opts, next) {
> - DeviceState *dev;
> - loc_push_restore(&opt->loc);
> - /*
> - * TODO Eventually we should call qmp_device_add() here to make sure it
> - * behaves the same, but QMP still has to accept incorrectly typed
> - * options until libvirt is fixed and we want to be strict on the CLI
> - * from the start, so call qdev_device_add_from_qdict() directly for
> - * now.
> - */
> - dev = qdev_device_add_from_qdict(opt->opts, NULL, true, &error_fatal);
> - object_unref(OBJECT(dev));
> - loc_pop(&opt->loc);
> - }
> + qemu_add_devices(NULL);
> rom_reset_order_override();
> }
>
* Re: [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early
2024-09-19 6:11 ` [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early Zhao Liu
2024-10-08 9:50 ` Jonathan Cameron via
@ 2024-10-08 9:55 ` Jonathan Cameron via
2024-10-09 6:11 ` Zhao Liu
1 sibling, 1 reply; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-08 9:55 UTC (permalink / raw)
To: Zhao Liu
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma
> +
> +static void qemu_add_cli_devices_early(void)
> +{
> + long category = DEVICE_CATEGORY_CPU_DEF;
> +
> + qemu_add_devices(&category);
> +}
> +
> static void qemu_init_board(void)
> {
> /* process plugin before CPUs are created, but once -smp has been parsed */
> @@ -2631,6 +2662,9 @@ static void qemu_init_board(void)
> /* From here on we enter MACHINE_PHASE_INITIALIZED. */
> machine_run_board_init(current_machine, mem_path, &error_fatal);
>
> + /* Create CPU topology device if any. */
> + qemu_add_cli_devices_early();
I wonder if this is too generic a name?
There are various other things we might want to do early.
Maybe qemu_add_cli_cpu_def()
> +
> drive_check_orphaned();
>
> realtime_init();
> @@ -2638,8 +2672,6 @@ static void qemu_init_board(void)
>
* Re: [RFC v2 05/12] hw/core/machine: Introduce custom CPU topology with max limitations
2024-09-19 6:11 ` [RFC v2 05/12] hw/core/machine: Introduce custom CPU topology with max limitations Zhao Liu
@ 2024-10-08 10:16 ` Jonathan Cameron via
0 siblings, 0 replies; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-08 10:16 UTC (permalink / raw)
To: Zhao Liu
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma
On Thu, 19 Sep 2024 14:11:21 +0800
Zhao Liu <zhao1.liu@intel.com> wrote:
> Custom topology allows user to create CPU topology totally via -device
> from CLI.
>
> Once custom topology is enabled, machine will stop the default CPU
> creation and expect user's CPU topology tree built from CLI.
>
> With custom topology, any CPU topology, whether symmetric or hybrid
> (aka, heterogeneous), can be created naturally.
>
> However, custom topology also needs to be restricted because
> possible_cpus[] requires some preliminary topology information for
> initialization, which is the max limitation (the new max parameters in
> -smp). Custom topology will be subject to this max limitation.
>
> Max limitations are necessary because creating custom topology before
> initializing possible_cpus[] would compromise future hotplug scalability.
>
> Max limitations are placed in -smp, even though custom topology can be
> defined as hybrid. From an implementation perspective, any hybrid
> topology can be considered a subset of a complete SMP structure.
> Therefore, semantically, using max limitations to constrain hybrid
> topology is consistent.
>
> Introduce custom CPU topology related properties in MachineClass. At the
> same time, add and parse max parameters from -smp, and store the max
> limitations in CPUSlot.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
A few code style comments inline.
J
> diff --git a/hw/cpu/cpu-slot.c b/hw/cpu/cpu-slot.c
> index 1cc3b32ed675..2d16a2729501 100644
> --- a/hw/cpu/cpu-slot.c
> +++ b/hw/cpu/cpu-slot.c
> +
> +bool machine_parse_custom_topo_config(MachineState *ms,
> + const SMPConfiguration *config,
> + Error **errp)
> +{
> + MachineClass *mc = MACHINE_GET_CLASS(ms);
> + CPUSlot *slot = ms->topo;
> + bool is_valid;
> + int maxcpus;
> +
> + if (!slot) {
> + return true;
> + }
> +
> + is_valid = config->has_maxsockets && config->maxsockets;
> + if (mc->smp_props.custom_topo_supported) {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
> + is_valid ? config->maxsockets : ms->smp.sockets;
> + } else if (is_valid) {
> + error_setg(errp, "maxsockets > 0 not supported "
> + "by this machine's CPU topology");
> + return false;
> + } else {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
> + ms->smp.sockets;
> + }
Having the error condition in the middle is rather confusing to read,
at least to my eyes. Playing with equivalents, I wonder what works best.
    if (!is_valid) {
        slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
            ms->smp.sockets;
    } else if (mc->smp_props.custom_topo_supported) {
        slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
            config->maxsockets;
    } else {
        error_setg(...);
        return false;
    }
Or take the bad case out first. Maybe this is a little obscure
though (assuming I even got it right), as it relies on the fact
that is_valid must be false for the legacy path.
    if (!mc->smp_props.custom_topo_supported && is_valid) {
        error_setg(...);
        return false;
    }
    slot->stat.entries[CPU_TOPOLOGY_LEVEL_SOCKET].max_limit =
        is_valid ? config->maxsockets : ms->smp.sockets;
Similar for other cases.
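To illustrate how the "bad case first" form could be shared across all levels, here is a minimal self-contained sketch. It is not QEMU code: the enum, the struct and the set_max_limit() helper are hypothetical stand-ins, and a message on stderr takes the place of error_setg().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the QEMU topology levels and CPUSlot limits. */
enum topo_level {
    LEVEL_THREAD,
    LEVEL_CORE,
    LEVEL_MODULE,
    LEVEL_DIE,
    LEVEL_SOCKET,
    LEVEL__MAX
};

struct topo_limits {
    unsigned int max_limit[LEVEL__MAX];
};

/*
 * Set the max limit for one level, rejecting the bad case first.
 * level_supported: the machine accepts a max* parameter for this level.
 * has_max/max:     whether the user passed the parameter, and its value.
 * smp_default:     fallback taken from the plain -smp value.
 */
static bool set_max_limit(struct topo_limits *limits, enum topo_level level,
                          bool level_supported, bool has_max,
                          unsigned int max, unsigned int smp_default,
                          const char *name)
{
    bool is_valid = has_max && max;

    if (!level_supported && is_valid) {
        fprintf(stderr, "%s > 0 not supported by this machine's "
                "CPU topology\n", name);
        return false;
    }

    limits->max_limit[level] = is_valid ? max : smp_default;
    return true;
}
```

With a helper along these lines, each level becomes a single call, and the die and module levels only differ in how level_supported is computed.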
> +
> + is_valid = config->has_maxdies && config->maxdies;
> + if (mc->smp_props.custom_topo_supported &&
> + mc->smp_props.dies_supported) {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_DIE].max_limit =
> + is_valid ? config->maxdies : ms->smp.dies;
> + } else if (is_valid) {
> + error_setg(errp, "maxdies > 0 not supported "
> + "by this machine's CPU topology");
> + return false;
> + } else {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_DIE].max_limit =
> + ms->smp.dies;
> + }
> +
> + is_valid = config->has_maxmodules && config->maxmodules;
> + if (mc->smp_props.custom_topo_supported &&
> + mc->smp_props.modules_supported) {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_MODULE].max_limit =
> + is_valid ? config->maxmodules : ms->smp.modules;
> + } else if (is_valid) {
> + error_setg(errp, "maxmodules > 0 not supported "
> + "by this machine's CPU topology");
> + return false;
> + } else {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_MODULE].max_limit =
> + ms->smp.modules;
> + }
> +
> + is_valid = config->has_maxcores && config->maxcores;
> + if (mc->smp_props.custom_topo_supported) {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_CORE].max_limit =
> + is_valid ? config->maxcores : ms->smp.cores;
> + } else if (is_valid) {
> + error_setg(errp, "maxcores > 0 not supported "
> + "by this machine's CPU topology");
> + return false;
> + } else {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_CORE].max_limit =
> + ms->smp.cores;
> + }
> +
> + is_valid = config->has_maxthreads && config->maxthreads;
> + if (mc->smp_props.custom_topo_supported) {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_THREAD].max_limit =
> + is_valid ? config->maxthreads : ms->smp.threads;
> + } else if (is_valid) {
> + error_setg(errp, "maxthreads > 0 not supported "
> + "by this machine's CPU topology");
> + return false;
> + } else {
> + slot->stat.entries[CPU_TOPOLOGY_LEVEL_THREAD].max_limit =
> + ms->smp.threads;
> + }
> +
> + maxcpus = 1;
> + /* Initialize max_limit to 1, as members of CpuTopology. */
> + for (int i = 0; i < CPU_TOPOLOGY_LEVEL__MAX; i++) {
> + maxcpus *= slot->stat.entries[i].max_limit;
> + }
> +
> + if (!config->has_maxcpus) {
> + ms->smp.max_cpus = maxcpus;
Maybe an early return here to get rid of the need for the else?
> + } else {
> + if (maxcpus != ms->smp.max_cpus) {
Unless this is going to get more complex later, an else if is probably
appropriate here (if you don't drop the else above).
> + error_setg(errp, "maxcpus (%d) should be equal to "
> + "the product of the remaining max parameters (%d)",
> + ms->smp.max_cpus, maxcpus);
> + return false;
> + }
> + }
> +
> + return true;
> +}
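For reference, the early-return variant of the maxcpus check could look roughly like the sketch below. It is self-contained rather than QEMU code: topo_max_product() and resolve_maxcpus() are hypothetical names, and stderr stands in for error_setg().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Multiply the per-level max limits to get the implied maximum CPU count. */
static unsigned int topo_max_product(const unsigned int *limits, size_t n)
{
    unsigned int product = 1;

    for (size_t i = 0; i < n; i++) {
        product *= limits[i];
    }
    return product;
}

/*
 * If maxcpus was not given, derive it from the product and return early.
 * Otherwise the user-supplied value must match the product exactly.
 */
static bool resolve_maxcpus(bool has_maxcpus, unsigned int *max_cpus,
                            unsigned int product)
{
    if (!has_maxcpus) {
        *max_cpus = product;
        return true;
    }

    if (product != *max_cpus) {
        fprintf(stderr, "maxcpus (%u) should be equal to the product of "
                "the remaining max parameters (%u)\n", *max_cpus, product);
        return false;
    }

    return true;
}
```

For the cover letter example (maxsockets=1, maxdies=1, maxmodules=2, maxcores=2, maxthreads=2) the product is 8, so an explicit maxcpus would have to be 8 as well.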
* Re: [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree
2024-09-19 6:11 [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Zhao Liu
` (11 preceding siblings ...)
2024-09-19 6:11 ` [RFC v2 12/12] i386: Support custom topology for microvm, pc-i440fx and pc-q35 Zhao Liu
@ 2024-10-08 10:30 ` Jonathan Cameron via
2024-10-09 6:01 ` Zhao Liu
2024-10-09 6:51 ` Zhao Liu
12 siblings, 2 replies; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-08 10:30 UTC (permalink / raw)
To: Zhao Liu
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma
On Thu, 19 Sep 2024 14:11:16 +0800
Zhao Liu <zhao1.liu@intel.com> wrote:
> -smp maxsockets=1,maxdies=1,maxmodules=2,maxcores=2,maxthreads=2
> -machine pc,custom-topo=on \
> -device cpu-socket,id=sock0 \
> -device cpu-die,id=die0,bus=sock0 \
> -device cpu-module,id=mod0,bus=die0 \
> -device cpu-module,id=mod1,bus=die0 \
> -device x86-intel-core,id=core0,bus=mod0 \
> -device x86-intel-atom,id=core1,bus=mod1 \
> -device x86-intel-atom,id=core2,bus=mod1 \
> -device host-x86_64-cpu,id=cpu0,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=0 \
> -device host-x86_64-cpu,id=cpu1,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=1 \
> -device host-x86_64-cpu,id=cpu2,socket-id=0,die-id=0,module-id=1,core-id=0,thread-id=0 \
> -device host-x86_64-cpu,id=cpu3,socket-id=0,die-id=0,module-id=1,core-id=1,thread-id=0
I quite like this as a way of doing the configuration but that needs
some review from others.
Peter, Alex, do you think this scheme is flexible enough to ultimately
allow us to support this for arm?
>
> This does not accommodate hybrid topologies. Therefore, we introduce
> max* parameters: maxthreads/maxcores/maxmodules/maxdies/maxsockets
> (for x86), to predefine the topology framework for the machine. These
> parameters also constrain subsequent custom topologies, ensuring the
> number of child devices under each parent device does not exceed the
> specified max limits.
To my thinking this seems like a good solution even though it's a
bunch more smp parameters.
What does this actually mean for hotplug of CPUs? What cases work
with this setup?
> Therefore, once user wants to customize topology by "-machine
> custom-topo=on", the machine, that supports custom topology, will skip
> the default topology creation as well as the default CPU creation.
Seems sensible to me.
Jonathan
* Re: [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree
2024-10-08 10:30 ` [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Jonathan Cameron via
@ 2024-10-09 6:01 ` Zhao Liu
2024-10-09 6:51 ` Zhao Liu
1 sibling, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-09 6:01 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu
Hi Jonathan,
Thank you for looking at this!
On Tue, Oct 08, 2024 at 11:30:38AM +0100, Jonathan Cameron wrote:
> Date: Tue, 8 Oct 2024 11:30:38 +0100
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom
> Topology Tree
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
>
> On Thu, 19 Sep 2024 14:11:16 +0800
> Zhao Liu <zhao1.liu@intel.com> wrote:
>
>
> > -smp maxsockets=1,maxdies=1,maxmodules=2,maxcores=2,maxthreads=2
> > -machine pc,custom-topo=on \
> > -device cpu-socket,id=sock0 \
> > -device cpu-die,id=die0,bus=sock0 \
> > -device cpu-module,id=mod0,bus=die0 \
> > -device cpu-module,id=mod1,bus=die0 \
> > -device x86-intel-core,id=core0,bus=mod0 \
> > -device x86-intel-atom,id=core1,bus=mod1 \
> > -device x86-intel-atom,id=core2,bus=mod1 \
> > -device host-x86_64-cpu,id=cpu0,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=0 \
> > -device host-x86_64-cpu,id=cpu1,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=1 \
> > -device host-x86_64-cpu,id=cpu2,socket-id=0,die-id=0,module-id=1,core-id=0,thread-id=0 \
> > -device host-x86_64-cpu,id=cpu3,socket-id=0,die-id=0,module-id=1,core-id=1,thread-id=0
>
> I quite like this as a way of doing the configuration but that needs
> some review from others.
Thanks!
> Peter, Alex, do you think this scheme is flexible enough to ultimately
> allow us to support this for arm?
I also hope this is generic enough to benefit ARM.
> >
> > This does not accommodate hybrid topologies. Therefore, we introduce
> > max* parameters: maxthreads/maxcores/maxmodules/maxdies/maxsockets
> > (for x86), to predefine the topology framework for the machine. These
> > parameters also constrain subsequent custom topologies, ensuring the
> > number of child devices under each parent device does not exceed the
> > specified max limits.
>
> To my thinking this seems like a good solution even though it's a
> bunch more smp parameters.
>
> What does this actually mean for hotplug of CPUs? What cases work
> with this setup?
My solution does not change the current CPU hotplug, because the current
CPU hotplug only needs to consider smp.cpus and smp.maxcpus.
But when a CPU is plugged in, the machine needs to make sure that plugging
it into the core doesn't exceed the maxthreads limit. Similarly, anyone who
wants to support hotplug at socket/die/core granularity will need to make
sure that the new topology stays within the limits set by the max
parameters. Those limits are the equivalent of preemptively leaving some
empty holes that hotplug can later fill.
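The maxthreads check described above can be pictured with a small self-contained sketch. The core_slot struct and both helpers are hypothetical names for illustration only; the series itself tracks limits through CPUSlot.

```c
#include <stdbool.h>

/* Hypothetical per-core bookkeeping, not a QEMU structure. */
struct core_slot {
    unsigned int max_threads;   /* limit taken from -smp maxthreads */
    unsigned int nr_threads;    /* threads currently plugged into the core */
};

/* A thread may be plugged only while the core still has an empty hole. */
static bool core_can_plug_thread(const struct core_slot *core)
{
    return core->nr_threads < core->max_threads;
}

/* Plug one thread; refuse without side effects when the limit is reached. */
static bool core_plug_thread(struct core_slot *core)
{
    if (!core_can_plug_thread(core)) {
        return false;
    }
    core->nr_threads++;
    return true;
}
```

In the cover letter example, an Atom core created with one thread under maxthreads=2 would have exactly one such hole left.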
> > Therefore, once user wants to customize topology by "-machine
> > custom-topo=on", the machine, that supports custom topology, will skip
> > the default topology creation as well as the default CPU creation.
>
> Seems sensible to me.
Thank you! Glad to have your support.
Regards,
Zhao
* Re: [RFC v2 01/12] qdev: Allow qdev_device_add() to add specific category device
2024-10-08 9:14 ` Jonathan Cameron via
@ 2024-10-09 6:09 ` Zhao Liu
0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-09 6:09 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu
On Tue, Oct 08, 2024 at 10:14:25AM +0100, Jonathan Cameron wrote:
> Date: Tue, 8 Oct 2024 10:14:25 +0100
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC v2 01/12] qdev: Allow qdev_device_add() to add specific
> category device
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
>
> On Thu, 19 Sep 2024 14:11:17 +0800
> Zhao Liu <zhao1.liu@intel.com> wrote:
>
> > Topology devices need to be created and realized before board
> > initialization.
> >
> > Allow qdev_device_add() to specify category to help create topology
> > devices early.
> >
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> It's not immediately obvious what the category parameter is.
> Can you use DeviceCategory rather than long?
...
> > -DeviceState *qdev_device_add_from_qdict(const QDict *opts,
> > +DeviceState *qdev_device_add_from_qdict(const QDict *opts, long *category,
> > bool from_json, Error **errp)
> > {
> > ERRP_GUARD();
> > @@ -655,6 +655,10 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
> > return NULL;
> > }
> >
> > + if (category && !test_bit(*category, dc->categories)) {
> > + return NULL;
> > + }
> > +
The category parameter is a single bit index, not a bitmap, so yes,
DeviceCategory fits.
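The distinction between a bit index and a bitmap can be sketched as follows. This is a self-contained illustration, not the QEMU implementation: the Category enum, cat_set(), cat_test() and category_matches() are hypothetical stand-ins for DeviceCategory, the categories bitmap and the filter added to qdev_device_add_from_qdict().

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical category enum: each value is a bit index, not a mask. */
typedef enum {
    CAT_BRIDGE,
    CAT_USB,
    CAT_CPU,
    CAT_CPU_DEF,
    CAT_MAX
} Category;

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define CAT_WORDS ((CAT_MAX + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Mark a category as present in a device's category bitmap. */
static void cat_set(unsigned long *map, Category c)
{
    map[c / BITS_PER_LONG] |= 1UL << (c % BITS_PER_LONG);
}

/* Test whether the bit for one category is set in the bitmap. */
static bool cat_test(const unsigned long *map, Category c)
{
    return (map[c / BITS_PER_LONG] >> (c % BITS_PER_LONG)) & 1UL;
}

/* A NULL filter accepts every device; otherwise the bit must be set. */
static bool category_matches(const Category *wanted,
                             const unsigned long *device_cats)
{
    return !wanted || cat_test(device_cats, *wanted);
}
```

Typing the filter as an enum pointer makes it clear at the call site that one category is being selected, which is the readability point raised in the review.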
Thanks,
Zhao
* Re: [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early
2024-10-08 9:55 ` Jonathan Cameron via
@ 2024-10-09 6:11 ` Zhao Liu
0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-09 6:11 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu
On Tue, Oct 08, 2024 at 10:55:45AM +0100, Jonathan Cameron wrote:
> Date: Tue, 8 Oct 2024 10:55:45 +0100
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC v2 03/12] system/vl: Create CPU topology devices from CLI
> early
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
>
>
> > +
> > +static void qemu_add_cli_devices_early(void)
> > +{
> > + long category = DEVICE_CATEGORY_CPU_DEF;
> > +
> > + qemu_add_devices(&category);
> > +}
> > +
> > static void qemu_init_board(void)
> > {
> > /* process plugin before CPUs are created, but once -smp has been parsed */
> > @@ -2631,6 +2662,9 @@ static void qemu_init_board(void)
> > /* From here on we enter MACHINE_PHASE_INITIALIZED. */
> > machine_run_board_init(current_machine, mem_path, &error_fatal);
> >
> > + /* Create CPU topology device if any. */
> > + qemu_add_cli_devices_early();
> I wonder if this is too generic a name?
>
> There are various other things we might want to do early.
> Maybe qemu_add_cli_cpu_def()
Sure, it makes sense.
* Re: [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early
2024-10-08 9:50 ` Jonathan Cameron via
@ 2024-10-09 6:31 ` Zhao Liu
0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-09 6:31 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu
On Tue, Oct 08, 2024 at 10:50:53AM +0100, Jonathan Cameron wrote:
> Date: Tue, 8 Oct 2024 10:50:53 +0100
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC v2 03/12] system/vl: Create CPU topology devices from CLI
> early
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
>
> On Thu, 19 Sep 2024 14:11:19 +0800
> Zhao Liu <zhao1.liu@intel.com> wrote:
>
> > Custom topology will allow user to build CPU topology from CLI totally,
> > and this replaces machine's default CPU creation process (*_init_cpus()
> > in MachineClass.init()).
> >
> > For the machine's initialization, there may be CPU dependencies in the
> > remaining initialization after the CPU creation.
> >
> > To address such dependencies, create the CPU topology device (including
> > CPU devices) from the CLI earlier, so that the latter part of machine
> > initialization can be separated after qemu_add_cli_devices_early().
> >
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Other than question of type of category from previous patch this looks
> fine to me.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> However, needs review from others more familiar with this code!
Thanks!
-Zhao
* Re: [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree
2024-10-08 10:30 ` [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Jonathan Cameron via
2024-10-09 6:01 ` Zhao Liu
@ 2024-10-09 6:51 ` Zhao Liu
1 sibling, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-09 6:51 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
Sergio Lopez, Jason Wang, Stefano Stabellini, Anthony PERARD,
Paul Durrant, Edgar E . Iglesias, Eric Blake, Markus Armbruster,
Alex Bennée, Peter Maydell, qemu-devel, kvm, qemu-arm,
Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu
Hi Jonathan,
On Tue, Oct 08, 2024 at 11:30:38AM +0100, Jonathan Cameron wrote:
> Date: Tue, 8 Oct 2024 11:30:38 +0100
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom
> Topology Tree
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
>
> On Thu, 19 Sep 2024 14:11:16 +0800
> Zhao Liu <zhao1.liu@intel.com> wrote:
>
>
> > -smp maxsockets=1,maxdies=1,maxmodules=2,maxcores=2,maxthreads=2
> > -machine pc,custom-topo=on \
> > -device cpu-socket,id=sock0 \
> > -device cpu-die,id=die0,bus=sock0 \
> > -device cpu-module,id=mod0,bus=die0 \
> > -device cpu-module,id=mod1,bus=die0 \
> > -device x86-intel-core,id=core0,bus=mod0 \
> > -device x86-intel-atom,id=core1,bus=mod1 \
> > -device x86-intel-atom,id=core2,bus=mod1 \
> > -device host-x86_64-cpu,id=cpu0,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=0 \
> > -device host-x86_64-cpu,id=cpu1,socket-id=0,die-id=0,module-id=0,core-id=0,thread-id=1 \
> > -device host-x86_64-cpu,id=cpu2,socket-id=0,die-id=0,module-id=1,core-id=0,thread-id=0 \
> > -device host-x86_64-cpu,id=cpu3,socket-id=0,die-id=0,module-id=1,core-id=1,thread-id=0
>
> I quite like this as a way of doing the configuration but that needs
> some review from others.
>
> Peter, Alex, do you think this scheme is flexible enough to ultimately
> allow us to support this for arm?
BTW, this series requires a preliminary RFC [*] to first convert all the
topology layers into devices.
If you're interested in that as well, your comments are welcome. :)
[*]: [RFC v2 00/15] qom-topo: Abstract CPU Topology Level to Topology Device
https://lore.kernel.org/qemu-devel/20240919015533.766754-1-zhao1.liu@intel.com/
Regards,
Zhao
Thread overview: 23+ messages
2024-09-19 6:11 [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Zhao Liu
2024-09-19 6:11 ` [RFC v2 01/12] qdev: Allow qdev_device_add() to add specific category device Zhao Liu
2024-10-08 9:14 ` Jonathan Cameron via
2024-10-09 6:09 ` Zhao Liu
2024-09-19 6:11 ` [RFC v2 02/12] qdev: Introduce new device category to cover basic topology device Zhao Liu
2024-09-19 6:11 ` [RFC v2 03/12] system/vl: Create CPU topology devices from CLI early Zhao Liu
2024-10-08 9:50 ` Jonathan Cameron via
2024-10-09 6:31 ` Zhao Liu
2024-10-08 9:55 ` Jonathan Cameron via
2024-10-09 6:11 ` Zhao Liu
2024-09-19 6:11 ` [RFC v2 04/12] hw/core/machine: Split machine initialization around qemu_add_cli_devices_early() Zhao Liu
2024-09-19 6:11 ` [RFC v2 05/12] hw/core/machine: Introduce custom CPU topology with max limitations Zhao Liu
2024-10-08 10:16 ` Jonathan Cameron via
2024-09-19 6:11 ` [RFC v2 06/12] hw/cpu: Constrain CPU topology tree with max_limit Zhao Liu
2024-09-19 6:11 ` [RFC v2 07/12] hw/core: Re-implement topology helpers to honor max limitations Zhao Liu
2024-09-19 6:11 ` [RFC v2 08/12] hw/i386: Use get_max_topo_by_level() to get topology information Zhao Liu
2024-09-19 6:11 ` [RFC v2 09/12] i386: Introduce x86 CPU core abstractions Zhao Liu
2024-09-19 6:11 ` [RFC v2 10/12] i386/cpu: Support Intel hybrid CPUID Zhao Liu
2024-09-19 6:11 ` [RFC v2 11/12] i386/machine: Split machine initialization after CPU creation into post_init() Zhao Liu
2024-09-19 6:11 ` [RFC v2 12/12] i386: Support custom topology for microvm, pc-i440fx and pc-q35 Zhao Liu
2024-10-08 10:30 ` [RFC v2 00/12] Introduce Hybrid CPU Topology via Custom Topology Tree Jonathan Cameron via
2024-10-09 6:01 ` Zhao Liu
2024-10-09 6:51 ` Zhao Liu