qemu-devel.nongnu.org archive mirror
From: Pierre Morel <pmorel@linux.ibm.com>
To: qemu-s390x@nongnu.org
Cc: thuth@redhat.com, david@redhat.com, cohuck@redhat.com,
	richard.henderson@linaro.org, qemu-devel@nongnu.org,
	pasic@linux.ibm.com, borntraeger@de.ibm.com
Subject: Re: [PATCH v4 0/5] s390x: CPU Topology
Date: Thu, 9 Dec 2021 16:09:22 +0100	[thread overview]
Message-ID: <a104b68e-f1a5-a176-a0e2-e816d2ed427f@linux.ibm.com> (raw)
In-Reply-To: <20211117164848.310952-1-pmorel@linux.ibm.com>

Hi,

This series has been superseded by a v5 series that adds documentation 
and NUMA extensions.
Some of the patches contained in this series have also been changed.

Regards,
Pierre

On 11/17/21 17:48, Pierre Morel wrote:
> Hi,
> 
> This series is the first part of the CPU topology implementation
> for S390, greatly reduced from the first spin.
> 
> In particular, we reduced the scope to the S390x specifics, removing
> all code touching SMP or NUMA, with the goals to:
> - facilitate review and acceptance
> - defer the SMP part, which is currently under active discussion upstream
> - still be able, despite the reduction in code, to handle CPU topology
>    for S390 using the current S390 topology provided by QEMU, with
>    cores and sockets only.
> 
> To use these patches, you will need the Linux series version 4.
> You find it there:
> https://lkml.org/lkml/2021/9/16/576
> 
> Currently this code is for KVM only; I do not know whether a TCG
> implementation would be of interest. If so, it will be done in another
> series.
> 
> A short introduction
> ====================
> 
> CPU topology is described in the S390 PoP essentially through two
> instructions:
> 
> PTF  Perform Topology Function, used to poll for topology changes and
>      to set the polarization; the polarization part is outside the
>      scope of this series.
> 
> STSI Store System Information, whose SYSIB 15.1.x provides the
>      topology configuration.
> 
> The S390 topology is a six-level hierarchical topology, with up to
>      five levels of containers; the last level specifies the CPU cores.
> 
>      This patch series only uses the two lowest levels: sockets and
>      cores.
> 
>      To get information on the topology, S390 provides the STSI
>      instruction, which stores a structure describing the list of
>      containers used in the machine topology: the SYSIB.
>      A selector within the STSI instruction allows choosing how many
>      topology levels will be provided in the SYSIB.
> 
>      Using the Topology List Entries (TLEs) provided inside the SYSIB,
>      the Linux kernel can compute the cache distance between two cores
>      and use this information to make scheduling decisions.
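As an illustration, the two kinds of TLE can be sketched as C structures modeled after the layout the Linux kernel parses; the field names here are illustrative and not taken from this series:

```c
#include <stdint.h>

/* Illustrative sketch of the SYSIB 15.1.x topology list entries. */

/* Container TLE: one per container level above the cores (8 bytes). */
struct topology_container {
    uint8_t nl;            /* nesting level (1..5) */
    uint8_t reserved[6];
    uint8_t id;            /* container identifier */
} __attribute__((packed));

/* CPU TLE: describes up to 64 cores with identical attributes (16 bytes). */
struct topology_core {
    uint8_t  nl;           /* nesting level, 0 for cores                */
    uint8_t  reserved[3];
    uint8_t  flags;        /* dedication and polarization bits          */
    uint8_t  type;         /* CPU type                                  */
    uint16_t origin;       /* core address of the first bit in the mask */
    uint64_t mask;         /* bit i set => core origin+i is present     */
} __attribute__((packed));
```

The STSI selector then determines how many nesting levels of container TLEs precede the CPU TLEs in the returned SYSIB.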
> 
> Note:
> -----
>       A z15 reports three levels of containers (drawers, books,
>       sockets) as container TLEs above the core descriptions inside
>       CPU TLEs.
> 
> The topology can be seen in several places inside Linux on Z:
>      - sysfs: /sys/devices/system/cpu/cpuX/topology
>      - procfs: /proc/sysinfo and /proc/cpuinfo
>      - lscpu -e: gives topology information
> 
> The different topology levels have names:
>      - node - drawer - book - socket (or physical package) - core
> 
> Threads:
>      Multithreading is not part of the topology as described by the
>      SYSIB 15.1.x.
> 
> The guest's interest in knowing the CPU topology is obviously to
> optimize load balancing and thread migration.
> KVM has the same interest for vCPU scheduling and cache optimization.
> 
> 
> The design
> ==========
> 
> 1) To be ready for hotplug, I chose an object-oriented design
> of the topology containers:
> - A node is a bridge on the SYSBUS and defines a "node bus"
> - A drawer is hotplugged on the "node bus"
> - A book on the "drawer bus"
> - A socket on the "book bus"
> - And the CPU Topology List Entry (CPU-TLE) sits on the socket bus.
> These objects will be extended with cache information when
> NUMA is implemented.
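The containment chain above can be sketched as follows (illustrative only, not the actual QOM code from the series):

```c
/* Illustrative sketch of the bus containment chain: each level plugs
 * onto the bus exposed by the level above it. */
enum topo_level {
    TOPO_CORE,    /* CPU-TLE, sits on the socket bus */
    TOPO_SOCKET,  /* plugs onto the book bus         */
    TOPO_BOOK,    /* plugs onto the drawer bus       */
    TOPO_DRAWER,  /* plugs onto the node bus         */
    TOPO_NODE     /* bridge on the SYSBUS            */
};

/* Level whose bus a device of level 'lvl' plugs into. */
static enum topo_level parent_level(enum topo_level lvl)
{
    switch (lvl) {
    case TOPO_CORE:   return TOPO_SOCKET;
    case TOPO_SOCKET: return TOPO_BOOK;
    case TOPO_BOOK:   return TOPO_DRAWER;
    default:          return TOPO_NODE;  /* the node bridges the SYSBUS */
    }
}
```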
> 
> This also allows for easy retrieval when building the different SYSIBs
> for Store Topology System Information (STSI).
> 
> 2) The Perform Topology Function (PTF) instruction is made available
> to the guest with a new KVM capability and is intercepted in QEMU,
> allowing the guest to poll for topology changes.
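The polling logic can be sketched like this (the function-code values follow the PoP; the handler itself is hypothetical, not the KVM/QEMU code from this series):

```c
#include <stdint.h>

/* PTF function codes as defined in the PoP. */
enum {
    PTF_REQ_HORIZONTAL = 0,   /* request horizontal polarization */
    PTF_REQ_VERTICAL   = 1,   /* request vertical polarization   */
    PTF_CHECK          = 2    /* check topology-change report    */
};

/* Hypothetical intercept handler returning the condition code seen by
 * the guest: for PTF_CHECK, cc 1 means a topology change is pending
 * (and the report is cleared), cc 0 means no change. */
static int handle_ptf(uint8_t fc, int *topology_change_pending)
{
    switch (fc) {
    case PTF_CHECK:
        if (*topology_change_pending) {
            *topology_change_pending = 0;  /* consume the report */
            return 1;
        }
        return 0;
    case PTF_REQ_HORIZONTAL:
    case PTF_REQ_VERTICAL:
    default:
        return 2;  /* polarization changes rejected in this sketch */
    }
}
```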
> 
> 
> Features and TBD list
> =====================
> 
> - There is no direct match between the IDs shown by:
>      - lscpu (an unrelated numbered list),
>      - SYSIB 15.1.x (topology ID).
> 
> - The CPU number, in the left column of lscpu, is used by Linux tools
>      to reference a CPU, while the CPU address is used by QEMU for
>      hotplug.
> 
> - Effect of -smp parsing on the topology, with an example:
>      -smp 9,sockets=4,cores=4,maxcpus=16
> 
>      We have 4 sockets, each holding 4 cores, so we have a maximum of
>      16 CPUs, 9 of them active at boot.
> 
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
>    0    0      0    0      0    0 0:0:0:0            yes yes        horizontal   0
>    1    0      0    0      0    1 1:1:1:1            yes yes        horizontal   1
>    2    0      0    0      0    2 2:2:2:2            yes yes        horizontal   2
>    3    0      0    0      0    3 3:3:3:3            yes yes        horizontal   3
>    4    0      0    0      1    4 4:4:4:4            yes yes        horizontal   4
>    5    0      0    0      1    5 5:5:5:5            yes yes        horizontal   5
>    6    0      0    0      1    6 6:6:6:6            yes yes        horizontal   6
>    7    0      0    0      1    7 7:7:7:7            yes yes        horizontal   7
>    8    0      0    0      2    8 8:8:8:8            yes yes        horizontal   8
> #
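The socket placement visible in the lscpu output above follows from simple division, since cores fill the sockets in order (a sketch, not QEMU's parsing code):

```c
/* With -smp 9,sockets=4,cores=4,maxcpus=16 and no further topology
 * information, cores fill the sockets in order, so the socket of a
 * core follows directly from its core-id. */
static int socket_of(int core_id, int cores_per_socket)
{
    return core_id / cores_per_socket;
}
```

This matches the output above: cores 0-3 land in socket 0, cores 4-7 in socket 1, and so on.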
> 
> 
> - To plug a new CPU into the topology, one can simply use the CPU
>      address, as in:
> 
> (qemu) device_add host-s390x-cpu,core-id=12
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
>    0    0      0    0      0    0 0:0:0:0            yes yes        horizontal   0
>    1    0      0    0      0    1 1:1:1:1            yes yes        horizontal   1
>    2    0      0    0      0    2 2:2:2:2            yes yes        horizontal   2
>    3    0      0    0      0    3 3:3:3:3            yes yes        horizontal   3
>    4    0      0    0      1    4 4:4:4:4            yes yes        horizontal   4
>    5    0      0    0      1    5 5:5:5:5            yes yes        horizontal   5
>    6    0      0    0      1    6 6:6:6:6            yes yes        horizontal   6
>    7    0      0    0      1    7 7:7:7:7            yes yes        horizontal   7
>    8    0      0    0      2    8 8:8:8:8            yes yes        horizontal   8
>    9    -      -    -      -    - :::                 no yes        horizontal   12
> # chcpu -e 9
> CPU 9 enabled
> # lscpu -e
> CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED POLARIZATION ADDRESS
>    0    0      0    0      0    0 0:0:0:0            yes yes        horizontal   0
>    1    0      0    0      0    1 1:1:1:1            yes yes        horizontal   1
>    2    0      0    0      0    2 2:2:2:2            yes yes        horizontal   2
>    3    0      0    0      0    3 3:3:3:3            yes yes        horizontal   3
>    4    0      0    0      1    4 4:4:4:4            yes yes        horizontal   4
>    5    0      0    0      1    5 5:5:5:5            yes yes        horizontal   5
>    6    0      0    0      1    6 6:6:6:6            yes yes        horizontal   6
>    7    0      0    0      1    7 7:7:7:7            yes yes        horizontal   7
>    8    0      0    0      2    8 8:8:8:8            yes yes        horizontal   8
>    9    0      0    0      3    9 9:9:9:9            yes yes        horizontal   12
> #
> 
> It is up to the administration layer, libvirt for example, to pin the
> right CPU to the right vCPU. But as we can see, without NUMA, choosing
> separate sockets for CPUs is not easy without hotplug: lacking that
> information, the code assigns the vCPUs by filling the sockets one
> after the other.
> Note that this is also the default behavior on an LPAR.
> 
> Conclusion
> ==========
> 
> This patch series, together with the associated KVM patches, makes it
> possible to provide CPU topology information to the guest.
> Currently, only dedicated vCPUs and CPUs are supported, and a NUMA
> topology can only be handled using CPU hotplug inside the guest.
> 
> The next extensions will:
> - add the book and drawer levels
> - support NUMA using the -numa QEMU parameter
> - report topology information changes for shared CPUs
> 
> Regards,
> Pierre
> 
> Pierre Morel (5):
>    linux-headers update
>    s390x: topology: CPU topology objects and structures
>    s390x: topology: implementating Store Topology System Information
>    s390x: CPU topology: CPU topology migration
>    s390x: kvm: topology: interception of PTF instruction
> 
>   hw/s390x/cpu-topology.c             | 361 ++++++++++++++++++++++++++++
>   hw/s390x/meson.build                |   1 +
>   hw/s390x/s390-virtio-ccw.c          |  54 +++++
>   include/hw/s390x/cpu-topology.h     |  74 ++++++
>   include/hw/s390x/s390-virtio-ccw.h  |   6 +
>   linux-headers/linux/kvm.h           |   1 +
>   target/s390x/cpu.h                  |  50 ++++
>   target/s390x/cpu_features_def.h.inc |   1 +
>   target/s390x/cpu_models.c           |   2 +
>   target/s390x/cpu_topology.c         | 113 +++++++++
>   target/s390x/gen-features.c         |   3 +
>   target/s390x/kvm/kvm.c              |  26 ++
>   target/s390x/machine.c              |  48 ++++
>   target/s390x/meson.build            |   1 +
>   14 files changed, 741 insertions(+)
>   create mode 100644 hw/s390x/cpu-topology.c
>   create mode 100644 include/hw/s390x/cpu-topology.h
>   create mode 100644 target/s390x/cpu_topology.c
> 

-- 
Pierre Morel
IBM Lab Boeblingen

