From: Pierre Morel <pmorel@linux.ibm.com>
To: qemu-s390x@nongnu.org
Cc: thuth@redhat.com, ehabkost@redhat.com, david@redhat.com,
cohuck@redhat.com, richard.henderson@linaro.org,
qemu-devel@nongnu.org, armbru@redhat.com, pasic@linux.ibm.com,
borntraeger@de.ibm.com, pbonzini@redhat.com, eblake@redhat.com
Subject: [PATCH v1 0/9] s390x: CPU Topology
Date: Wed, 14 Jul 2021 18:53:07 +0200
Message-ID: <1626281596-31061-1-git-send-email-pmorel@linux.ibm.com>
Hi,
This series is the first part of the implementation of CPU topology
for S390.
====================
A short introduction
====================
CPU Topology is described in the S390 POP, essentially through the
description of two instructions:
PTF Perform Topology Function, used to poll for topology changes and
to set the polarization; the polarization part is outside the scope of
this item.
STSI Store System Information, with the SYSIB 15.1.x providing the
Topology configuration.
S390 Topology is a 6-level hierarchical topology with up to 5 levels
of containers. The last topology level specifies the CPU cores.
To get information on the topology, S390 provides the STSI
instruction, which stores a structure, the SYSIB, listing the
containers used in the Machine topology.
A selector within the STSI instruction allows choosing how many topology
levels will be provided in the SYSIB.
Using the Topology List Entries (TLE) provided inside the SYSIB,
the Linux kernel is able to compute the cache distance between
two cores and can use this information to make scheduling
decisions.
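To illustrate the idea of a distance derived from the container
hierarchy, here is a minimal sketch in Python. It is not the kernel's
algorithm; CorePosition and shared_level are hypothetical names, and the
sketch only assumes that two cores are "closer" when the innermost
container they share is lower in the hierarchy.

```python
from typing import NamedTuple


class CorePosition(NamedTuple):
    """Hypothetical container IDs locating one core (illustrative only)."""
    drawer: int
    book: int
    socket: int
    core: int


def shared_level(a: CorePosition, b: CorePosition) -> str:
    """Return the innermost topology level that both cores have in common."""
    if a == b:
        return "core"
    if (a.drawer, a.book, a.socket) == (b.drawer, b.book, b.socket):
        return "socket"
    if (a.drawer, a.book) == (b.drawer, b.book):
        return "book"
    if a.drawer == b.drawer:
        return "drawer"
    return "node"


# Two cores in the same book but in different sockets.
x = CorePosition(drawer=0, book=1, socket=0, core=3)
y = CorePosition(drawer=0, book=1, socket=1, core=7)
print(shared_level(x, y))
```

A scheduler could then prefer migrating a thread between cores whose
shared level is "socket" over cores whose shared level is "drawer".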
We first provide a simple topology for the case where QEMU does not
use NUMA, by extending the -smp argument, and then provide a more
fine-grained topology for the case where QEMU uses the -numa arguments.
Note:
-----
Z15 reports 3 levels of containers (drawers, books, sockets) as
Container-TLEs above the core description inside CPU-TLEs.
The Topology can be seen in several places inside zLinux:
- sysfs: /sys/devices/system/cpu/cpuX/topology
- procfs: /proc/sysinfo and /proc/cpuinfo
- lscpu -e : gives topology information
The different Topology levels have names:
- Node - Drawer - Book - Socket (or physical package) - Core
Threads:
Multithreading is not part of the topology as described by the
SYSIB 15.1.x.
The guest is interested in knowing the CPU topology in order to
optimise load balancing and the migration of threads.
KVM will have the same interest concerning vCPU scheduling and cache
optimisation.
====================
The design
====================
1) To be ready for hotplug, I chose an object-oriented design
of the topology containers:
- A node is a bridge on the SYSBUS and defines a "node bus"
- A drawer is hotplugged on the "node bus"
- A book on the "drawer bus"
- A socket on the "book bus"
- And the CPU Topology List Entry (CPU-TLE) sits on the "socket bus".
These objects will be enhanced with the cache information when
NUMA is implemented.
This also allows for easy retrieval when building the different SYSIBs
for Store Topology System Information (STSI).
2) The Perform Topology Function (PTF) instruction is made available to
the guest with a new KVM capability and intercepted in QEMU, allowing
the guest to poll for topology changes.
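The container hierarchy of point 1) can be pictured with the following
minimal sketch in Python. It is not the QOM code of this series:
Container and walk are hypothetical names, the real objects sit on QEMU
buses, and the output is only an ordered list of entries, not the binary
SYSIB layout.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Container:
    """Hypothetical topology container: node, drawer, book or socket."""
    level: str
    ident: int
    children: List["Container"] = field(default_factory=list)
    cpus: List[int] = field(default_factory=list)  # only filled on sockets


def walk(container: Container) -> Iterator[Tuple[str, int]]:
    """Yield (kind, id) depth first: a container, its CPUs, then its children."""
    yield (container.level, container.ident)
    for cpu in container.cpus:
        yield ("cpu", cpu)
    for child in container.children:
        yield from walk(child)


# One node holding one drawer, one book and two sockets.
socket0 = Container("socket", 0, cpus=[0, 1])
socket1 = Container("socket", 1, cpus=[12])
book = Container("book", 0, children=[socket0, socket1])
drawer = Container("drawer", 0, children=[book])
node = Container("node", 0, children=[drawer])

entries = list(walk(node))
for kind, ident in entries:
    print(kind, ident)
```

Because every container only knows its direct children, plugging a new
drawer, book or socket is a local operation, and a walk from the node
visits every entry in container order.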
======================
Current implementation
======================
* The QEMU command line is extended with the new topology levels.
Here is a comparison with X86:
for X86:
-smp [cpus=]n[,cores=cores][,threads=threads][,dies=dies][,sockets=sockets][,maxcpus=maxcpus]
old S390:
-smp [cpus=]n[,sockets=sockets][,cores=cores][,maxcpus=maxcpus]
new S390:
-smp [cpus=]n[,drawers=drawers][,books=books][,sockets=sockets][,cores=cores][,maxcpus=maxcpus]
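As a quick sanity check of such an -smp line, the sketch below (Python,
hypothetical helper check_smp) assumes that maxcpus is the product of
all container counts and that cpus must not exceed it. This matches the
values used in the example that follows (5 * 2 * 2 * 12 = 240), but the
actual validation done by QEMU may differ.

```python
def check_smp(cpus: int, drawers: int, books: int, sockets: int,
              cores: int, maxcpus: int) -> None:
    """Raise ValueError if the values are inconsistent under the assumed rules."""
    # Assumption: every drawer, book and socket is fully populated.
    slots = drawers * books * sockets * cores
    if maxcpus != slots:
        raise ValueError(
            f"maxcpus={maxcpus} but the topology has {slots} core slots")
    if not 0 < cpus <= maxcpus:
        raise ValueError(
            f"cpus={cpus} must be between 1 and maxcpus={maxcpus}")


# Values taken from: -smp 15,drawers=5,books=2,sockets=2,cores=12,maxcpus=240
check_smp(cpus=15, drawers=5, books=2, sockets=2, cores=12, maxcpus=240)
```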
Example:
--------
Here we want to use all cores of book 3, all cores of socket-id 2, and cores 0 and 1:
/usr/local/bin/qemu-system-s390x \
-machine s390-ccw-virtio,accel=kvm \
-enable-kvm \
-m 10G \
\
-smp 15,drawers=5,books=2,sockets=2,cores=12,maxcpus=240 \
\
-object memory-backend-ram,id=mem0,size=2G \
-object memory-backend-ram,id=mem1,size=2G \
-object memory-backend-ram,id=mem2,size=2G \
-object memory-backend-ram,id=mem3,size=2G \
-object memory-backend-ram,id=mem4,size=2G \
\
-numa node,nodeid=0,memdev=mem0 \
-numa node,nodeid=1,memdev=mem1 \
-numa node,nodeid=2,memdev=mem2 \
-numa node,nodeid=3,memdev=mem3 \
-numa node,nodeid=4,memdev=mem4 \
\
-numa cpu,node-id=1,core-id=0 \
-numa cpu,node-id=4,core-id=1 \
-numa cpu,node-id=2,socket-id=2 \
-numa cpu,node-id=3,book-id=3 \
\
-netdev tap,id=hn0,queues=1 \
\
-device virtio-net-ccw,netdev=hn0,mq=on \
-kernel /boot/vmlinuz-${YOUR_KERNEL} \
-initrd ${YOUR_ROOTFS} \
-append "loglevel=8 selinux=0 root=/dev/ram earlyprintk=1" \
-nographic
=====================
Features and TBD list
=====================
- using a default memory device
- There is a warning that all CPUs should be described in the NUMA config
and that a partial description will be obsolete in the future; I will need
help to know how to handle this.
- There is no direct match between the IDs:
- shown by lscpu (unrelated numbered list),
- used by -numa cpu (numbered list related to topology),
- stored in SYSIB 15.1.x (topology ID).
In the example above, socket-id=6 is shown in SYSIB 15.1.x as
socket ID 0 of Book ID 3, since a Book has 2 sockets, and appears
in lscpu as socket 2.
- The CPU number, the left column of lscpu, is used by Linux tools
to reference a CPU, while the CPU address is used by QEMU for hotplug.
- To plug a new CPU into the topology, one can simply use the CPU
address, as in:
(qemu) device_add host-s390x-cpu,core-id=8
CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
0   0    0      0    0      0    0:0:0:0         yes    yes        horizontal   0
1   0    0      1    1      1    1:1:1:1         yes    yes        horizontal   24
2   0    0      1    1      2    2:2:2:2         yes    yes        horizontal   25
....
36  0    1      3    4      36   36:36:36:36     yes    yes        horizontal   94
37  0    1      3    4      37   37:37:37:37     yes    yes        horizontal   95
38  -    -      -    -      -    :::             no     yes        horizontal   8
# chcpu -e 38
- Documentation will come with the next iteration
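The ID mapping mentioned in the list above (a flat socket-id versus the
book ID and socket ID inside the SYSIB) can be reproduced with a small
sketch in Python. The helper split_socket_id is hypothetical; it only
assumes that sockets are numbered consecutively book after book, which
is consistent with the socket-id=6 case quoted above.

```python
from typing import Tuple


def split_socket_id(socket_id: int, sockets_per_book: int) -> Tuple[int, int]:
    """Split a flat socket-id into (book ID, socket ID inside that book)."""
    # Assumption: sockets are numbered consecutively, book after book.
    book_id, socket_in_book = divmod(socket_id, sockets_per_book)
    return book_id, socket_in_book


# A book has 2 sockets, so flat socket-id 6 is socket 0 of book 3.
book_id, socket_in_book = split_socket_id(6, sockets_per_book=2)
print(book_id, socket_in_book)
```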
Regards,
Pierre
Pierre Morel (9):
s390x: smp: s390x dedicated smp parsing
s390x: toplogy: adding drawers and books to smp parsing
s390x: cpu topology: CPU topology objects and structures
s390x: Topology list entries and SYSIB 15.x.x
s390x: topology: implementating Store Topology System Information
s390x: kvm: topology: interception of PTF instruction
s390x: SCLP: reporting the maximum nested topology entries
s390x: numa: define drawers and books for NUMA
s390x: numa: implement NUMA for S390x
hw/core/machine.c | 18 +
hw/s390x/cpu-topology.c | 595 +++++++++++++++++++++++++++++
hw/s390x/meson.build | 1 +
hw/s390x/s390-virtio-ccw.c | 107 +++++-
hw/s390x/sclp.c | 1 +
include/hw/s390x/cpu-topology.h | 99 +++++
include/hw/s390x/s390-virtio-ccw.h | 3 +
include/hw/s390x/sclp.h | 4 +-
qapi/machine.json | 2 +
softmmu/vl.c | 6 +
target/s390x/cpu.c | 4 +
target/s390x/cpu.h | 45 +++
target/s390x/kvm/kvm.c | 273 +++++++++++++
13 files changed, 1150 insertions(+), 8 deletions(-)
create mode 100644 hw/s390x/cpu-topology.c
create mode 100644 include/hw/s390x/cpu-topology.h
--
2.25.1