From: David Gibson <david@gibson.dropbear.id.au>
To: Christian Borntraeger <borntraeger@de.ibm.com>, g@voom.redhat.com
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Bharata B Rao" <bharata@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org, "Alexander Graf" <agraf@suse.de>,
	"Jason J. Herne" <jjherne@linux.vnet.ibm.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Cornelia Huck" <cornelia.huck@de.ibm.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Andreas Färber" <afaerber@suse.de>
Subject: Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
Date: Thu, 23 Apr 2015 17:32:33 +1000
Message-ID: <20150423073233.GB26536@voom.redhat.com>
In-Reply-To: <5523D0FF.7090609@de.ibm.com>


On Tue, Apr 07, 2015 at 02:43:43PM +0200, Christian Borntraeger wrote:
> We had a call and I was asked to write a summary about our conclusion.
> 
> The more I wrote, the more I became uncertain if we really came to a
> conclusion and became more certain that we want to define the QMP/HMP/CLI
> interfaces first (or quite early in the process)
> 
> As discussed I will provide an initial document as a discussion starter
> 
> So here is my current understanding with each piece of information on one line, so 
> that everybody can correct me or make additions:
> 
> current wrap-up of architecture support
> -------------------
> x86
> - Topology possible
>    - can be hierarchical
>    - interfaces to query topology
> - SMT: fanout in host, guest uses host threads to back guest vCPUs
> - supports cpu hotplug via cpu_add
> 
> power
> - Topology possible
>    - interfaces to query topology?

For Power, topology information is communicated via the
"ibm,associativity" (and related) properties in the device tree.  This
can encode hierarchical topologies, but it is *not* bound to the
socket/core/thread hierarchy.  On the guest side in Power there's no
real notion of "socket", just cores with specified proximities to
various memory nodes.

> - SMT: Power8: no threads in host and full core passed in due to HW design
>        may change in the future
> 
> s/390
> - Topology possible
>     - can be hierarchical
>     - interfaces to query topology
> - always virtualized via PR/SM LPAR
>     - host topology from LPAR can be heterogenous (e.g. 3 cpus in 1st socket, 4 in 2nd)
> - SMT: fanout in host, guest uses host threads to back guest vCPUs
> 
> 
> Current downsides of CPU definitions/hotplug
> -----------------------------------------------
> - smp, sockets=,cores=,threads= builds only homogeneous topology
> - cpu_add does not tell where to add
> - artificial icc bus construct on x86 for several reasons (link, sysbus not hotpluggable..)

Artificial though it may be, I think having a "cpus" pseudo-bus is not
such a bad idea.

> discussions
> -------------------
> - we want to be able to (most important question, IMHO)
>  - hotplug CPUs on power/x86/s390 and maybe others
>  - define topology information
>  - bind the guest topology to the host topology in some way
>     - to host nodes
>     - maybe also for gang scheduling of threads (might face reluctance from
>       the linux scheduler folks)
>     - not really deeply outlined in this call
> - QOM links must be allocated at boot time, but can be set later on
>     - nothing that we want to expose to users
>     - Machine provides QOM links that the device_add hotplug mechanism can use to add
>       new CPUs into preallocated slots. "CPUs" can be groups of cores and/or threads. 
> - hotplug and initial config should use same semantics
> - cpu and memory topology might be somewhat independent
> --> - define nodes
>     - map CPUs to nodes
>     - map memory to nodes
> 
> - hotplug per
>     - socket
>     - core
>     - thread
>     ?
> Now comes the part where I am not sure if we came to a conclusion or not:
> - hotplug/definition per core (but not per thread) seems to handle all cases
>     - core might have multiple threads ( and thus multiple cpustates)
>     - as device statement (or object?)
> - mapping of cpus to nodes or defining the topology not really
>   outlined in this call
> 
> To be defined:
> - QEMU command line for initial setup
> - QEMU hmp/qmp interfaces for dynamic setup

So, I can't say I've entirely got my head around this, but here are my
thoughts so far.

I think the basic problem here is that the fixed socket -> core ->
thread hierarchy is something from x86 land that's become integrated
into qemu's generic code where it doesn't entirely make sense.

Ignoring NUMA topology (I'll come back to that in a moment) qemu
should really only care about two things:

  a) the unit of execution scheduling (a vCPU or "thread")
  b) the unit of plug/unplug
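
To make the distinction concrete, here is a minimal Python sketch.  It
is not QEMU code: PlugUnit, Machine and the way thread ids are derived
are all made up for illustration.  The only point is that threads are
what gets scheduled, while they appear and disappear solely as part of
a plug unit.

```python
from dataclasses import dataclass, field


@dataclass
class PlugUnit:
    """Hypothetical unit of plug/unplug: holds one or more vCPU threads."""
    unit_id: int
    thread_ids: list


@dataclass
class Machine:
    """Hypothetical machine that only tracks plug units and their threads."""
    threads_per_unit: int
    units: dict = field(default_factory=dict)

    def plug(self, unit_id):
        if unit_id in self.units:
            raise ValueError(f"unit {unit_id} is already plugged")
        # Thread ids are derived from the unit id purely for illustration.
        first = unit_id * self.threads_per_unit
        ids = list(range(first, first + self.threads_per_unit))
        unit = PlugUnit(unit_id, ids)
        self.units[unit_id] = unit
        return unit

    def unplug(self, unit_id):
        # Removing the unit removes all of its threads at once.
        return self.units.pop(unit_id)

    def schedulable_threads(self):
        # The units of execution scheduling currently present.
        return sorted(t for u in self.units.values() for t in u.thread_ids)


machine = Machine(threads_per_unit=4)
machine.plug(0)
machine.plug(1)
print(machine.schedulable_threads())
machine.unplug(0)
print(machine.schedulable_threads())
```

With threads_per_unit=1 the two units coincide, which is the case where
a single thread is also the thing that gets plugged.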

Now, returning to NUMA topology.  What the guest, and therefore qemu,
really needs to know is the relative proximity of each thread to each
block of memory.  That usually forms some sort of node hierarchy,
but it doesn't necessarily correspond to a socket->core->thread
hierarchy you can see in physical units.

On Power, an arbitrary NUMA node hierarchy can be described in the
device tree without reference to "cores" or "sockets", so really qemu
has no business even talking about such units.
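
As a rough illustration of proximity without sockets or cores, the
Python sketch below tags each thread and each memory block with a path
of nested domain ids and derives proximity from the shared prefix.
This is only loosely modelled on the idea behind associativity lists;
it is not the actual device tree encoding, and all names and values
in it are assumptions.

```python
def shared_levels(path_a, path_b):
    """Count how many leading hierarchy levels two paths have in common."""
    count = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        count += 1
    return count


# Hypothetical placement: each thread and each memory block is tagged with a
# path of domain ids, outermost domain first. The levels carry no names such
# as "socket" or "core"; they are just nested proximity domains.
thread_paths = {
    "thread0": (0, 0),
    "thread1": (0, 1),
    "thread2": (1, 0),
}
memory_paths = {
    "mem0": (0, 0),
    "mem1": (1, 0),
}


def proximity_table(threads, memories):
    """Map (thread, memory block) to shared levels; higher means closer."""
    return {
        (t, m): shared_levels(tp, mp)
        for t, tp in threads.items()
        for m, mp in memories.items()
    }


table = proximity_table(thread_paths, memory_paths)
for key in sorted(table):
    print(key, table[key])
```

Nothing in the table depends on how many threads share a core or how
many cores share a package, only on which domains a thread and a memory
block have in common.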

IIUC, on x86 the NUMA topology is bound up to the socket->core->thread
hierarchy so it needs to have a notion of those layers, but ideally
that would be specific to the pc machine type.

So, here's what I'd propose:

1) I think we really need some better terminology to refer to the unit
of plug/unplug.  Until someone comes up with something better, I'm
going to use "CPU Module" (CM), to distinguish from the NUMA baggage
of "socket" and also to refer more clearly to the thing that goes into
the socket, rather than the socket itself.

2) A Virtual CPU Module (vCM) need not correspond to a real physical
object.  For machine types which we want to faithfully represent a
specific physical machine, it would.  For generic or pure virtual
machines, the vCMs would be as small as possible.  So for current
Power, they'd be one virtual core, for future Power (maybe) or s390 a
single virtual thread.  For x86 I'm not sure what they'd be.

3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
which would contain the vCMs (also QOM objects).  Their existence
would be generic, though we'd almost certainly use arch and/or machine
specific subtypes.

4) There would be a (generic) way of finding the vCPUs (threads) in a
vCM and the vCM for a specific vCPU.

5) A vCM *might* have internal subdivisions into "cores" or "nodes" or
"chips" or "MCMs" or whatever, but that would be up to the machine
type specific code, and not represented in the QOM hierarchy.

6) Obviously we'd need some backwards compat goo to sort out existing
command line options referring to cores and sockets into the new
representation.  This will need machine type specific hooks - so for
x86 it would need to set up the right vCM subdivisions and make sure
the right NUMA topology info goes into ACPI.  For -machine pseries I'm
thinking that "-smp sockets=2,cores=1,threads=4" and "-smp
sockets=1,cores=2,threads=4" should result in exactly the same thing
internally.
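
A minimal Python sketch of points 4 and 6, under stated assumptions:
VCM, build_vcms_pseries_like and vcm_for_vcpu are hypothetical names,
not existing QEMU interfaces, and the compat rule shown (one vCM per
virtual core, sockets only contributing to the core count) is just one
way a pseries-like hook could behave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VCM:
    """Hypothetical virtual CPU module: the unit of plug/unplug."""
    index: int
    vcpus: tuple


def build_vcms_pseries_like(sockets, cores, threads):
    """Assumed compat hook: one vCM per virtual core. Sockets matter only
    through their contribution to the total core count."""
    total_cores = sockets * cores
    return [
        VCM(index=c, vcpus=tuple(range(c * threads, (c + 1) * threads)))
        for c in range(total_cores)
    ]


def vcm_for_vcpu(vcms, vcpu):
    """Generic lookup from a vCPU (thread) to the vCM that contains it."""
    for vcm in vcms:
        if vcpu in vcm.vcpus:
            return vcm
    raise KeyError(f"no vCM contains vCPU {vcpu}")


# The two -smp spellings from point 6, expressed as keyword arguments.
a = build_vcms_pseries_like(sockets=2, cores=1, threads=4)
b = build_vcms_pseries_like(sockets=1, cores=2, threads=4)
print(a == b)
print(vcm_for_vcpu(a, 5).index)
```

Both calls produce the same list of two vCMs with four vCPUs each, so
anything built on top of the vCM list cannot tell the two command lines
apart.  An x86 hook would instead have to preserve the socket and core
subdivisions, since they feed into the topology information the guest
sees.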


Thoughts?


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
