From: Pierre Morel <pmorel@linux.ibm.com>
To: Janis Schoetterl-Glausch <scgl@linux.ibm.com>, qemu-s390x@nongnu.org
Cc: qemu-devel@nongnu.org, borntraeger@de.ibm.com,
pasic@linux.ibm.com, richard.henderson@linaro.org,
david@redhat.com, thuth@redhat.com, cohuck@redhat.com,
mst@redhat.com, pbonzini@redhat.com, kvm@vger.kernel.org,
ehabkost@redhat.com, marcel.apfelbaum@gmail.com,
eblake@redhat.com, armbru@redhat.com, seiden@linux.ibm.com,
nrb@linux.ibm.com, frankja@linux.ibm.com
Subject: Re: [PATCH v8 00/12] s390x: CPU Topology
Date: Mon, 18 Jul 2022 14:32:37 +0200 [thread overview]
Message-ID: <080e0f9b-7d75-24f4-6cb2-aa46df86dd58@linux.ibm.com> (raw)
In-Reply-To: <a5f10560-7243-358d-dabe-3ad8ac8a9e80@linux.ibm.com>
On 7/15/22 20:28, Janis Schoetterl-Glausch wrote:
> On 7/15/22 15:47, Pierre Morel wrote:
>>
>>
>> On 7/15/22 11:31, Janis Schoetterl-Glausch wrote:
>>> On 7/14/22 22:05, Pierre Morel wrote:
>>>>
>>>>
>>>> On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
>>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>>> Hi,
>>>>>>
>>>>>> This new spin is essentially for coherence with the last Linux CPU
>>>>>> Topology patch, function testing and coding style modifications.
>>>>>>
>>>>>> Foreword
>>>>>> ========
>>>>>>
>>>>>> The goal of this series is to implement CPU topology for S390. It
>>>>>> improves the preceding series with the implementation of books and
>>>>>> drawers, of non-uniform CPU topology, and with documentation.
>>>>>>
>>>>>> To use these patches, you will need the Linux series version 10.
>>>>>> You find it there:
>>>>>> https://lkml.org/lkml/2022/6/20/590
>>>>>>
>>>>>> Currently this code is for KVM only; I do not know whether a TCG
>>>>>> implementation would be of interest. If so, it will be done in
>>>>>> another series.
>>>>>>
>>>>>> To get a better understanding of the S390x CPU topology and its
>>>>>> implementation in QEMU, you can have a look at the documentation in
>>>>>> the last patch or follow the short introduction below.
>>>>>>
>>>>>> A short introduction
>>>>>> ====================
>>>>>>
>>>>>> CPU topology is described in the S390 POP essentially through two
>>>>>> instructions:
>>>>>>
>>>>>> PTF   Perform Topology Function, used to poll for topology changes
>>>>>> and to set the polarization; the polarization part is outside the
>>>>>> scope of this series.
>>>>>>
>>>>>> STSI  Store System Information, with the SYSIB 15.1.x providing the
>>>>>> topology configuration.
>>>>>>
>>>>>> The S390 topology is a hierarchical topology with six levels, of
>>>>>> which up to five are container levels; the last level specifies the
>>>>>> CPU cores.
>>>>>>
>>>>>> This patch series only uses the two lowest levels, sockets and cores.
>>>>>> To retrieve the topology information, S390 provides the STSI
>>>>>> instruction, which stores a structure, the SYSIB, providing the list
>>>>>> of the containers used in the machine topology.
>>>>>> A selector within the STSI instruction allows choosing how many
>>>>>> topology levels will be provided in the SYSIB.
>>>>>>
>>>>>> Using the Topology List Entries (TLE) provided inside the SYSIB,
>>>>>> the Linux kernel is able to compute the cache distance between two
>>>>>> cores and can use this information to make scheduling decisions.
>>>>>
>>>>> Do the socket, book, ... metaphors and looking at STSI from the existing
>>>>> smp infrastructure even make sense?
>>>>
>>>> Sorry, I do not understand.
>>>> I admit the cover letter is old and I have not rewritten it properly since the first patch series.
>>>>
>>>> What we do is:
>>>> Compute the STSI data from the QEMU SMP, NUMA, and device parameters.
>>>>
>>>>>
>>>>> STSI 15.1.x reports the topology to the guest and for a virtual machine,
>>>>> this topology can be very dynamic. So a CPU can move from one topology
>>>>> container to another, but the socket of a cpu changing while it's running seems
>>>>> a bit strange. And this isn't supported by this patch series as far as I understand,
>>>>> the only topology changes are on hotplug.
>>>>
>>>> A CPU changing from one socket to another is, together with the case of a new CPU being plugged in, the only situation in which the PTF instruction reports a change in the topology.
>>>
>>> Can a CPU actually change between sockets right now?
>>
>> To be exact, what I understand is that a shared CPU can be scheduled to another real CPU exactly as a guest vCPU can be scheduled by the host to another host CPU.
>
> Ah, ok, this is what I'm forgetting, and what made communication harder,
> there are two ways by which the topology can change:
> 1. the host topology changes
> 2. the vCPU threads are scheduled on another host CPU
>
> I've been only thinking about the 2.
> I assumed some outside entity (libvirt?) pins vCPU threads, and so it would
> be the responsibility of that entity to set the topology which then is
> reported to the guest. So if you pin vCPUs for the whole lifetime of the vm
> then you could do that by specifying the topology up front with -devices.
> If you want to support migration, then the outside entity would need a way
> to tell qemu the updated topology.
Yes
>
>>
>>> The socket-id is computed from the core-id, so it's fixed, is it not?
>>
>> the virtual socket-id is computed from the virtual core-id
>
> Meaning cpu.env.core_id, correct? (which is the same as cpu.cpu_index which is the same as
> ms->possible_cpus->cpus[core_id].props.core_id)
> And a cpu's core id doesn't change during the lifetime of the vm, right?
right
> And so it's socket id doesn't either.
Yes
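To make the point above concrete, here is a minimal sketch of such a fixed
mapping, assuming the virtual topology is derived from the -smp geometry
(the exact formula used by the series may differ):

```c
/* Hypothetical fixed mapping from a core-id to its containers.
 * Because the ids are pure functions of the core-id and of the -smp
 * geometry, they cannot change during the lifetime of the VM. */
static int socket_id(int core_id, int cores_per_socket)
{
    return core_id / cores_per_socket;
}

static int book_id(int core_id, int cores_per_socket, int sockets_per_book)
{
    return socket_id(core_id, cores_per_socket) / sockets_per_book;
}
```

E.g. with -smp cores=4,sockets=2, core 5 always sits in virtual socket 1.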
>
>>
>>>
>>>> It is not expected to happen often, but it does happen.
>>>> The code has been removed from the kernel in spin 10 for 2 reasons:
>>>> 1) we decided to first support only dedicated and pinned CPUs
>>>> 2) Christian fears it may happen too often due to Linux host
>>>> scheduling and could be a performance problem
>>>
>>> This seems sensible, but now it seems too static.
>>> For example after migration, you cannot tell the guest which CPUs are in the same socket, book, ...,
>>> unless I'm misunderstanding something.
>>
>> No, to do this we would need to ask the kernel about it.
>
> You mean polling /sys/devices/system/cpu/cpu*/topology/*_id ?
> That should work if it isn't done too frequently, right?
> And if it's done by the entity doing the pinning it could judge if the host topology change
> is relevant to the guest and if so tell qemu how to update it.
Yes, I guess we will need to change the core-id, which may be complicated,
or find another way to link the vCPU topology with the host CPU topology.
First I wanted to have something directly from the kernel, as we have
all the information on the vCPU and host topology there.
That is why I had a struct in the UAPI.
Viktor, as you say here, is in favor of something in userland only.
For the moment I would like to keep a fixed topology for this patch
series, set by the admin; updating the topology can come later.
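For reference, the kind of sysfs polling mentioned above could look like
this in userland. This is only a sketch: which entity would do the polling
and how it would feed the result back to QEMU are exactly the open
questions discussed here.

```c
#include <stdio.h>

/* Read one topology id of a host CPU from sysfs, e.g.
 * attr = "physical_package_id" for the socket or "core_id" for the core.
 * Returns the id, or -1 if the attribute cannot be read. */
static int host_topology_id(int cpu, const char *attr)
{
    char path[128];
    int id = -1;
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, attr);
    f = fopen(path, "r");
    if (!f) {
        return -1;
    }
    if (fscanf(f, "%d", &id) != 1) {
        id = -1;
    }
    fclose(f);
    return id;
}
```

The pinning entity could poll this per vCPU thread and only notify QEMU
when a host change is actually relevant to the guest topology.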
>>
>>> And migration is rare, but something you'd want to be able to react to.
>>> And I could imagine that the vCPUs are pinned most of the time, but the pinning changes occasionally.
>>
>> I think on migration we should just do a kvm_set_mtcr on post_load, as Nico suggested; everything else seems complicated for a questionable benefit.
>
> But what is the point? The result of STSI reported to the guest doesn't actually change, does it?
> Since the same CPUs with the same calculated socket-ids, ..., exist.
> You cannot migrate to a vm with a different virtual topology, since the CPUs get matched via the cpu_index
> as far as I can tell, which is the same as the core_id, or am I misunderstanding something?
> Migrating the MTCR bit is correct: if it is 1, then there was a CPU hotplug that the guest did not yet observe,
> but setting it to 1 after migration would be wrong if the STSI result stays the same.
That is a good point. IIUC it follows that:
- a CPU hotplug cannot be done during the migration.
- migration cannot be started while a CPU is being hot plugged.
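The invariant we are converging on can be modeled in a few lines. This is a
toy model for illustration, not QEMU code; `migrate` here just stands for
the MTCR bit being part of the migrated state rather than being set
unconditionally on the destination.

```c
#include <stdbool.h>

/* Toy model of the Modified-Topology-Change-Report (MTCR) bit. */
struct topo_state {
    bool mtcr; /* set on CPU hotplug, cleared when the guest polls via PTF */
};

static void cpu_hotplug(struct topo_state *s)    { s->mtcr = true; }
static void guest_ptf_poll(struct topo_state *s) { s->mtcr = false; }

/* Migration copies the bit verbatim: setting it to 1 unconditionally on
 * the destination would report a phantom topology change to the guest,
 * while copying it preserves a pending, unobserved hotplug report. */
static struct topo_state migrate(const struct topo_state *src)
{
    return *src;
}
```

With the bit migrated verbatim, an observed hotplug stays observed after
migration, and an unobserved one stays pending.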
--
Pierre Morel
IBM Lab Boeblingen