From: "Daniel P. Berrange" <berrange@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	peter.maydell@linaro.org, pkrempa@redhat.com, cohuck@redhat.com,
	qemu-devel@nongnu.org, armbru@redhat.com, pbonzini@redhat.com,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init() from HMP/QMP
Date: Wed, 18 Oct 2017 15:49:36 +0100
Message-ID: <20171018144936.GJ9719@redhat.com>
In-Reply-To: <20171018164435.5290db6a@nial.brq.redhat.com>

On Wed, Oct 18, 2017 at 04:44:35PM +0200, Igor Mammedov wrote:
> On Wed, 18 Oct 2017 10:59:11 -0200
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Tue, Oct 17, 2017 at 06:18:59PM +0200, Igor Mammedov wrote:
> > > On Tue, 17 Oct 2017 17:09:26 +0100
> > > "Daniel P. Berrange" <berrange@redhat.com> wrote:
> > >   
> > > > On Tue, Oct 17, 2017 at 06:06:35PM +0200, Igor Mammedov wrote:  
> > > > > On Tue, 17 Oct 2017 16:07:59 +0100
> > > > > "Daniel P. Berrange" <berrange@redhat.com> wrote:
> > > > >     
> > > > > > On Tue, Oct 17, 2017 at 09:27:02AM +0200, Igor Mammedov wrote:    
> > > > > > > On Mon, 16 Oct 2017 17:36:36 +0100
> > > > > > > "Daniel P. Berrange" <berrange@redhat.com> wrote:
> > > > > > >       
> > > > > > > > On Mon, Oct 16, 2017 at 06:22:50PM +0200, Igor Mammedov wrote:      
> > > > > > > > > Series allows to configure NUMA mapping at runtime using QMP/HMP
> > > > > > > > > interface. For that to happen it introduces a new '-paused' CLI option
> > > > > > > > > which allows to pause QEMU before machine_init() is run and
> > > > > > > > > > adds new set-numa-node HMP/QMP commands which in conjunction with
> > > > > > > > > info hotpluggable-cpus/query-hotpluggable-cpus allow to configure
> > > > > > > > > NUMA mapping for cpus.        
> > > > > > > > 
> > > > > > > > > What's the problem we're seeking to solve here compared to what we currently
> > > > > > > > do for NUMA configuration ?      
> > > > > > > From RHBZ1382425
> > > > > > > "
> > > > > > > Current -numa CLI interface is quite limited in terms that allow map
> > > > > > > CPUs to NUMA nodes as it requires to provide cpu_index values which 
> > > > > > > are non obvious and depend on machine/arch. As result libvirt has to
> > > > > > > assume/re-implement cpu_index allocation logic to provide valid 
> > > > > > > values for -numa cpus=... QEMU CLI option.      
> > > > > > 
> > > > > > In broad terms, this problem applies to every device / object libvirt
> > > > > > > asks QEMU to create. For everything else libvirt is able to assign an
> > > > > > > "id" string, which it can then use to identify the thing later. The
> > > > > > > CPU stuff is different because libvirt isn't able to provide 'id'
> > > > > > > strings for each CPU - QEMU generates a pseudo-id internally which
> > > > > > libvirt has to infer. The latter is the same problem we had with
> > > > > > devices before '-device' was introduced allowing 'id' naming.
> > > > > > 
> > > > > > IMHO we should take the same approach with CPUs and start modelling 
> > > > > > the individual CPUs as something we can explicitly create with -object
> > > > > > or -device. That way libvirt can assign names and does not have to 
> > > > > > care about CPU index values, and it all works just the same way as
> > > > > > any other devices / object we create
> > > > > > 
> > > > > > ie instead of:
> > > > > > 
> > > > > >   -smp 8,sockets=4,cores=2,threads=1
> > > > > >   -numa node,nodeid=0,cpus=0-3
> > > > > >   -numa node,nodeid=1,cpus=4-7
> > > > > > 
> > > > > > we could do:
> > > > > > 
> > > > > >   -object numa-node,id=numa0
> > > > > >   -object numa-node,id=numa1
> > > > > >   -object cpu,id=cpu0,node=numa0,socket=0,core=0,thread=0
> > > > > >   -object cpu,id=cpu1,node=numa0,socket=0,core=1,thread=0
> > > > > >   -object cpu,id=cpu2,node=numa0,socket=1,core=0,thread=0
> > > > > >   -object cpu,id=cpu3,node=numa0,socket=1,core=1,thread=0
> > > > > >   -object cpu,id=cpu4,node=numa1,socket=2,core=0,thread=0
> > > > > >   -object cpu,id=cpu5,node=numa1,socket=2,core=1,thread=0
> > > > > >   -object cpu,id=cpu6,node=numa1,socket=3,core=0,thread=0
> > > > > >   -object cpu,id=cpu7,node=numa1,socket=3,core=1,thread=0    
> > > > > the follow up question would be where do "socket=3,core=1,thread=0"
> > > > > come from, currently these options are the function of
> > > > > > (-M foo -smp ...) and can be queried via query-hotpluggable-cpus at
> > > > > runtime after qemu parses -M and -smp options.    
> > > >   
> > 
> > Also, note that in the case of NUMA, having identifiers for CPU
> > objects themselves won't be enough. NUMA settings need
> > identifiers for CPU slots (even if they are still empty), and
> > those slots are provided by the machine, not created by the user.
> > 
> > 
> > > > The sockets/cores/threads topology of CPUs is something that comes from
> > > > the libvirt guest XML config  
> > > in this case things for libvirt to implement would be to know following details:
> > >    1: which machine/machine version support which set of attributes
> > >    2: valid values for these properties depending on machine/machine version/cpu type  
> > 
> > The big assumption in this series is that libvirt doesn't know in
> > advance how the possible slots for CPUs will look like on each
> > machine-type, and need to query them using
> > query-hotpluggable-cpus.
> yep, that's true and it started with introduction of 'device_add cpu'
> where libvirt didn't know what to specify as options for new cpu,
> hence query-hotpluggable-cpus were added to provide that information.
> 
> 
> > But if this assumption was really true, it would be impossible
> > for the user to even decide how the NUMA topology will look like,
> > wouldn't it?
> > 
> > Igor, are you able to give one example of how the user input
> > (libvirt XML) for configuring NUMA CPU binding could look like if
> > the user didn't know yet what the available sockets/cores/threads
> > are?
> not sure I parse question but looking at libvirt's domain docs
> it mentions
>   <numa>
>     <cell id='0' cpus='0-3' memory='512000' unit='KiB'/>
>     <cell id='1' cpus='4-7' memory='512000' unit='KiB' memAccess='shared'/>
>   </numa>
> 
> here libvirt assumes that there are cpus with cpu-index in range 0-7
> /and probably duplicates logic that calculates cpu-index/
> If libvirt would continue to duplicate logic we could skip on
> implementing early runtime QMP in QEMU and also drop support for
> query-hotpluggable-cpus as libvirt would be able to compute
> properties/values on its own.

From the POV of the XML, these CPU numbers are *not* required to be
the same as any QEMU CPU index. This is just saying that we've got
a <vcpu>8</vcpu> element, and we want the first 4 CPUs in one node
and the second 4 in the second node.

If QEMU assigns CPU indexes 70-77 internally, that's not relevant to
the XML POV, which uses 0-7 regardless. If there ever was such a
disjoint representation of CPU indexes, libvirt would have to remap
what's in the XML to match what's in QEMU.
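
To make that remapping concrete, here is a minimal Python sketch. It is
purely illustrative and is not libvirt code: parse_cpu_range(),
remap_cells() and the qemu_index table are hypothetical names, and the
70-77 values are just the made-up indexes from the paragraph above. It
assumes libvirt has somehow learned which QEMU CPU index corresponds to
each XML vCPU number.

```python
def parse_cpu_range(spec):
    """Expand a CPU list such as '0-3,6' into a sorted list of ints."""
    cpus = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        # A single number like '6' has no upper bound; treat it as '6-6'.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def remap_cells(cells, qemu_index):
    """Translate per-cell XML vCPU numbers into QEMU CPU indexes.

    cells      -- {cell id: cpus string as written in the XML}
    qemu_index -- hypothetical table: qemu_index[n] is the QEMU-side
                  index of XML vCPU number n
    """
    return {
        cell: [qemu_index[n] for n in parse_cpu_range(spec)]
        for cell, spec in cells.items()
    }


# XML side: <cell id='0' cpus='0-3'/> and <cell id='1' cpus='4-7'/>
cells = {0: "0-3", 1: "4-7"}
# Assumed QEMU side: indexes 70-77 for the same eight vCPUs.
qemu_index = list(range(70, 78))

mapping = remap_cells(cells, qemu_index)
print(mapping)
```

The XML numbering stays 0-7 throughout; only the lookup table would
change if QEMU's internal numbering did. The sketch handles plain
numbers and ranges only, nothing more elaborate.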

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
