qemu-devel.nongnu.org archive mirror
From: Pierre Morel <pmorel@linux.ibm.com>
To: Janis Schoetterl-Glausch <scgl@linux.ibm.com>, qemu-s390x@nongnu.org
Cc: qemu-devel@nongnu.org, borntraeger@de.ibm.com,
	pasic@linux.ibm.com, richard.henderson@linaro.org,
	david@redhat.com, thuth@redhat.com, cohuck@redhat.com,
	mst@redhat.com, pbonzini@redhat.com, kvm@vger.kernel.org,
	ehabkost@redhat.com, marcel.apfelbaum@gmail.com,
	eblake@redhat.com, armbru@redhat.com, seiden@linux.ibm.com,
	nrb@linux.ibm.com, frankja@linux.ibm.com
Subject: Re: [PATCH v8 00/12] s390x: CPU Topology
Date: Mon, 18 Jul 2022 14:32:37 +0200	[thread overview]
Message-ID: <080e0f9b-7d75-24f4-6cb2-aa46df86dd58@linux.ibm.com> (raw)
In-Reply-To: <a5f10560-7243-358d-dabe-3ad8ac8a9e80@linux.ibm.com>



On 7/15/22 20:28, Janis Schoetterl-Glausch wrote:
> On 7/15/22 15:47, Pierre Morel wrote:
>>
>>
>> On 7/15/22 11:31, Janis Schoetterl-Glausch wrote:
>>> On 7/14/22 22:05, Pierre Morel wrote:
>>>>
>>>>
>>>> On 7/14/22 20:43, Janis Schoetterl-Glausch wrote:
>>>>> On 6/20/22 16:03, Pierre Morel wrote:
>>>>>> Hi,
>>>>>>
>>>>>> This new spin is essentially for coherence with the last Linux CPU
>>>>>> Topology patch, function testing and coding style modifications.
>>>>>>
>>>>>> Foreword
>>>>>> ========
>>>>>>
>>>>>> The goal of this series is to implement CPU topology for S390. It
>>>>>> improves on the preceding series by implementing books and drawers,
>>>>>> non-uniform CPU topology, and documentation.
>>>>>>
>>>>>> To use these patches, you will need version 10 of the Linux series.
>>>>>> You can find it here:
>>>>>> https://lkml.org/lkml/2022/6/20/590
>>>>>>
>>>>>> Currently this code is for KVM only; I do not know whether a TCG
>>>>>> implementation would be of interest. If so, it will be done in a
>>>>>> separate series.
>>>>>>
>>>>>> To get a better understanding of the s390x CPU topology and its
>>>>>> implementation in QEMU, you can have a look at the documentation in
>>>>>> the last patch or follow the introduction below.
>>>>>>
>>>>>> A short introduction
>>>>>> ====================
>>>>>>
>>>>>> CPU topology is described in the S390 POP essentially through two
>>>>>> instructions:
>>>>>>
>>>>>> PTF   Perform Topology Function, used to poll for topology changes
>>>>>>        and to set the polarization (the polarization part is outside
>>>>>>        the scope of this item).
>>>>>>
>>>>>> STSI  Store System Information, with the SYSIB 15.1.x providing the
>>>>>>        topology configuration.
>>>>>>
>>>>>> S390 topology is a 6-level hierarchical topology with up to 5 levels
>>>>>>        of containers; the last topology level specifies the CPU cores.
>>>>>>
>>>>>>        This patch series only uses the two lowest levels: sockets
>>>>>>        and cores.
>>>>>>
>>>>>>        To get the topology information, S390 provides the STSI
>>>>>>        instruction, which stores a structure listing the containers
>>>>>>        used in the machine topology: the SYSIB.
>>>>>>        A selector within the STSI instruction allows choosing how
>>>>>>        many topology levels will be provided in the SYSIB.
>>>>>>
>>>>>>        Using the Topology List Entries (TLEs) provided inside the
>>>>>>        SYSIB, the Linux kernel can compute the cache distance
>>>>>>        between two cores and use this information to make
>>>>>>        scheduling decisions.
>>>>>
>>>>> Do the socket, book, ... metaphors and looking at STSI from the existing
>>>>> smp infrastructure even make sense?
>>>>
>>>> Sorry, I do not understand.
>>>> I admit the cover letter is old and I have not properly rewritten it since the first patch series.
>>>>
>>>> What we do is:
>>>> Compute the STSI data from the QEMU SMP + NUMA + device parameters.
>>>>
>>>>>
>>>>> STSI 15.1.x reports the topology to the guest and for a virtual machine,
>>>>> this topology can be very dynamic. So a CPU can move from one topology
>>>>> container to another, but the socket of a cpu changing while it's running seems
>>>>> a bit strange. And this isn't supported by this patch series as far as I understand,
>>>>> the only topology changes are on hotplug.
>>>>
>>>> A CPU moving from one socket to another and a new CPU being plugged in are the only cases in which the PTF instruction reports a change in the topology.
>>>
>>> Can a CPU actually change between sockets right now?
>>
>> To be exact, what I understand is that a shared CPU can be scheduled to another real CPU exactly as a guest vCPU can be scheduled by the host to another host CPU.
> 
> Ah, ok, this is what I'm forgetting, and what made communication harder,
> there are two ways by which the topology can change:
> 1. the host topology changes
> 2. the vCPU threads are scheduled on another host CPU
> 
> I've been only thinking about the 2.
> I assumed some outside entity (libvirt?) pins vCPU threads, and so it would
> be the responsibility of that entity to set the topology which then is
> reported to the guest. So if you pin vCPUs for the whole lifetime of the vm
> then you could do that by specifying the topology up front with -devices.
> If you want to support migration, then the outside entity would need a way
> to tell qemu the updated topology.

Yes

>   
>>
>>> The socket-id is computed from the core-id, so it's fixed, is it not?
>>
>> the virtual socket-id is computed from the virtual core-id
> 
> Meaning cpu.env.core_id, correct? (which is the same as cpu.cpu_index which is the same as
> ms->possible_cpus->cpus[core_id].props.core_id)
> And a cpu's core id doesn't change during the lifetime of the vm, right?

right

> And so its socket id doesn't either.

Yes

> 
>>
>>>
>>>> It is not expected to appear often but it does appear.
>>>> The code was removed from the kernel in spin 10 for two reasons:
>>>> 1) we decided to first support only dedicated and pinned CPUs
>>>> 2) Christian fears it may happen too often due to Linux host scheduling
>>>>    and could be a performance problem
>>>
>>> This seems sensible, but now it seems too static.
>>> For example after migration, you cannot tell the guest which CPUs are in the same socket, book, ...,
>>> unless I'm misunderstanding something.
>>
>> No, to do this we would need to ask the kernel about it.
> 
> You mean polling /sys/devices/system/cpu/cpu*/topology/*_id ?
> That should work if it isn't done too frequently, right?
> And if it's done by the entity doing the pinning it could judge if the host topology change
> is relevant to the guest and if so tell qemu how to update it.

Yes, I guess we will need to change the core-id, which may be complicated, 
or find another way to link the vCPU topology with the host CPU topology.
Initially I wanted to get this directly from the kernel, as it has all 
the information on both the vCPU and the host topology.
That is why I had a struct in the UAPI.

Viktor, as you say here, is in favor of a userland-only solution.

For the moment I would like this patch series to stick to a fixed 
topology set by the admin; updating the topology can come later.


>>
>>> And migration is rare, but something you'd want to be able to react to.
>>> And I could imagine that the vCPUs are pinned most of the time, but the pinning changes occasionally.
>>
>> I think on migration we should just do a kvm_set_mtcr on post_load, like Nico suggested; everything else seems complicated for questionable benefit.
> 
> But what is the point? The result of STSI reported to the guest doesn't actually change, does it?
> Since the same CPUs with the same calculated socket-ids, ..., exist.
> You cannot migrate to a vm with a different virtual topology, since the CPUs get matched via the cpu_index
> as far as I can tell, which is the same as the core_id, or am I misunderstanding something?
> Migrating the MTCR bit is correct: if it is 1, then there was a CPU hotplug that the guest did not yet observe,
> but setting it to 1 after migration would be wrong if the STSI result stayed the same.

That is a good point. IIUC it follows that:
- a CPU hotplug cannot be done during migration.
- migration cannot be started while a CPU is being hot-plugged.


-- 
Pierre Morel
IBM Lab Boeblingen



Thread overview: 49+ messages
2022-06-20 14:03 [PATCH v8 00/12] s390x: CPU Topology Pierre Morel
2022-06-20 14:03 ` [PATCH v8 01/12] Update Linux Headers Pierre Morel
2022-06-20 14:03 ` [PATCH v8 02/12] s390x/cpu_topology: CPU topology objects and structures Pierre Morel
2022-06-27 13:31   ` Janosch Frank
2022-06-28 11:08     ` Pierre Morel
2022-06-29 15:25     ` Pierre Morel
2022-07-04 11:47       ` Janosch Frank
2022-07-04 14:51         ` Pierre Morel
2022-07-12 15:40   ` Janis Schoetterl-Glausch
2022-07-13 14:59     ` Pierre Morel
2022-07-14 10:38       ` Janis Schoetterl-Glausch
2022-07-14 11:25         ` Pierre Morel
2022-07-14 12:50           ` Janis Schoetterl-Glausch
2022-07-14 19:26             ` Pierre Morel
2022-08-23 13:30   ` Thomas Huth
2022-08-23 16:30     ` Pierre Morel
2022-08-23 17:41     ` Pierre Morel
2022-08-24  7:30       ` Thomas Huth
2022-08-24  8:41         ` Pierre Morel
2022-06-20 14:03 ` [PATCH v8 03/12] s390x/cpu_topology: implementating Store Topology System Information Pierre Morel
2022-06-27 14:26   ` Janosch Frank
2022-06-28 11:03     ` Pierre Morel
2022-07-20 19:34   ` Janis Schoetterl-Glausch
2022-07-21 11:23     ` Pierre Morel
2022-06-20 14:03 ` [PATCH v8 04/12] s390x/cpu_topology: Adding books to CPU topology Pierre Morel
2022-06-20 14:03 ` [PATCH v8 05/12] s390x/cpu_topology: Adding books to STSI Pierre Morel
2022-06-20 14:03 ` [PATCH v8 06/12] s390x/cpu_topology: Adding drawers to CPU topology Pierre Morel
2022-06-20 14:03 ` [PATCH v8 07/12] s390x/cpu_topology: Adding drawers to STSI Pierre Morel
2022-06-20 14:03 ` [PATCH v8 08/12] s390x/cpu_topology: implementing numa for the s390x topology Pierre Morel
2022-07-14 14:57   ` Janis Schoetterl-Glausch
2022-07-14 20:17     ` Pierre Morel
2022-07-15  9:11       ` Janis Schoetterl-Glausch
2022-07-15 13:07         ` Pierre Morel
2022-07-20 17:24           ` Janis Schoetterl-Glausch
2022-07-21  7:58             ` Pierre Morel
2022-07-21  8:16               ` Janis Schoetterl-Glausch
2022-07-21 11:41                 ` Pierre Morel
2022-07-22 12:08                   ` Janis Schoetterl-Glausch
2022-08-23 16:25                     ` Pierre Morel
2022-06-20 14:03 ` [PATCH v8 09/12] target/s390x: interception of PTF instruction Pierre Morel
2022-06-20 14:03 ` [PATCH v8 10/12] s390x/cpu_topology: resetting the Topology-Change-Report Pierre Morel
2022-06-20 14:03 ` [PATCH v8 11/12] s390x/cpu_topology: CPU topology migration Pierre Morel
2022-06-20 14:03 ` [PATCH v8 12/12] s390x/cpu_topology: activating CPU topology Pierre Morel
2022-07-14 18:43 ` [PATCH v8 00/12] s390x: CPU Topology Janis Schoetterl-Glausch
2022-07-14 20:05   ` Pierre Morel
2022-07-15  9:31     ` Janis Schoetterl-Glausch
2022-07-15 13:47       ` Pierre Morel
2022-07-15 18:28         ` Janis Schoetterl-Glausch
2022-07-18 12:32           ` Pierre Morel [this message]
