Date: Mon, 17 Feb 2014 11:43:41 +0100
From: Igor Mammedov
Message-ID: <20140217114341.64520c83@thinkpad>
In-Reply-To: <1392632649.11208.24.camel@G08FNSTD131468>
References: <602453e20a4b721f596cd727b981e9e698c83159.1389685621.git.chen.fan.fnst@cn.fujitsu.com>
 <20140114114003.6001b070@nial.usersys.redhat.com>
 <1389788641.2726.13.camel@G08FNSTD131468>
 <20140115153704.08a47320@thinkpad>
 <20140117191355.GB2221@otherpad.lan.raisama.net>
 <20140120132901.57999b8c@nial.usersys.redhat.com>
 <1390288365.15515.23.camel@G08FNSTD131468>
 <20140121103144.28cd6f47@nial.usersys.redhat.com>
 <1390297864.15515.39.camel@G08FNSTD131468>
 <52DE4796.4060802@suse.de>
 <1392272048.31130.21.camel@G08FNSTD131468>
 <20140213104403.7bf30f31@nial.usersys.redhat.com>
 <1392632649.11208.24.camel@G08FNSTD131468>
Subject: Re: [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
To: Chen Fan
Cc: libvir-list@redhat.com, qemu-devel@nongnu.org, Eduardo Habkost, Andreas Färber

On Mon, 17 Feb 2014 18:24:09 +0800
Chen Fan wrote:

> On Thu, 2014-02-13 at 10:44 +0100, Igor Mammedov wrote:
> > On Thu, 13 Feb 2014 14:14:08 +0800
> > Chen Fan wrote:
> >
> > > On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote:
> > > > On 21.01.2014 10:51, Chen Fan wrote:
> > > > > On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote:
> > > > >> On Tue, 21 Jan 2014 15:12:45 +0800
> > > > >> Chen Fan wrote:
> > > > >>> On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote:
> > > > >>>> On Fri, 17 Jan 2014 17:13:55 -0200
> > > > >>>> Eduardo Habkost wrote:
> > > > >>>>> On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote:
> > > > >>>>>> I recall there were objections to it since APIC ID contains topology
> > > > >>>>>> information and it's not trivial for user to get it right.
> > > > >>>>>> The last idea that was discussed to fix it was not expose APIC ID to
> > > > >>>>>> user but rather introduce QOM hierarchy like:
> > > > >>>>>> /machine/node/N/socket/X/core/Y/thread/Z
> > > > >>>>>> and use it in user interface as a means to specify an arbitrary CPU
> > > > >>>>>> and let QEMU calculate APIC ID based on this path.
> > > > >>>>>>
> > > > >>>>>> But nobody took on implementing it yet.
> > > > >>>>>
> > > > >>>>> We're taking so long to get a decent interface implemented, that part of
> > > > >>>>> me is considering exposing the APIC ID directly like suggested before,
> > > > >>>>> and requiring libvirt to calculate topology-aware APIC IDs[1] to
> > > > >>>>> properly implement CPU hotplug (and possibly for other tasks).
> > > > >>>> If you are speaking about
> > > > >>>> 'qemu will core dump with "-smp 254, sockets=2, cores=3, threads=2"'
> > > > >>>> http://patchwork.ozlabs.org/patch/301272/
> > > > >>>> bug then it's limitation of ACPI implementation,
> > > > >>>> I'm going to refactor it to use full APIC ids instead of using bitmap,
> > > > >>>> so that we won't ever run into issue regardless of cpu supported CPU count.
> > > > >>>>
> > > > >>>>>
> > > > >>>>> Another part of me is hoping that the libvirt developers ask us to
> > > > >>>>> please not do that, so I can use it as argument against exposing the
> > > > >>>>> APIC IDs directly the next time we discuss this. :)
> > > > >>>>
> > > > >>>> why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first.
> > > > >>>> It will benefit not only cpu hotplug but also '-numa' and topology
> > > > >>>> description in general.
> > > > >>>>
> > > > >>> have there been any plan/model of the idea? Need to add a new option to
> > > > >>> qemu command?
> > > > >> I suppose we can start with internal default implementation first.
> > > > >>
> > > > >> one way could be
> > > > >>  1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z
> > > > >>  2. add node, socket, core, thread properties to CPU and link CPU into respective
> > > > >>     link created by #1
> > > > >>
> > > > > Thanks, I hope I can take some time to make some patches to implement
> > > > > it.
> > > >
> > > > Please give us a few hours to reply. :)
> > > >
> > > > /machine/node seems too broad a term to me.
> > > > You can't prebuild the full tree, you can only prepare the nodes.
> > > > core[Y]/thread[Z] was previously discussed as syntax.
> > > >
> > > > The important part to decide on will be what is going to be child<> and
> > > > what link<>. Has anyone played with the Intel Quark platform for
> > > > instance? (Galileo board or upcoming Edison card) On a regular
> > > > mainboard, we would have socket[X] as a link, which might
> > > > point to a child /machine/memory-node[W]/cpu[X]. But if we do so we
> > > > can't reassign it to another memory node - acceptable? With Quark (or
> > > > Qseven modules etc.) there would be a container object rather than the
> > > > /machine itself that has a child instead of a link.
> > > > I guess the memory nodes could still be on the /machine though.
> > > > The other point of discussion between Anthony and me was whether core[Y]
> > > > should be a link<> or child<>, same for thread. I believe a child<> is
> > > > better as it enforces that unrealizing the CPU will unrealize all its
> > > > cores and all its threads in the future.
> > > >
> > > > More issues may pop up when thinking about it longer than a few minutes.
> > > > But yes, we need to start investigating this, and so far I had other
> > > > priorities like getting the CPUState mess I created cleaned up.
> > > Hi, Igor, Andreas,
> > >
> > > In addition, I want to know what way user could use to specify an
> > > arbitrary CPU if using /machine/node/N/socket/X/core/Y/thread/Z idea?
> > > -device qemu64,socket=X,core=Y,thread=Z? or add a new optional command
> > > line?
> > Definitely not another CLI option.
> > I see a couple of options,
> >  1. as you suggest with additional 'numa=N' so that QEMU could know
> >     where to attach a new CPU.
> >  2. add 'parent' like option tied to link property and specify full QOM path
> >     on CLI: -device cpufoo,parent=/machine/node[N]/socket[X]/...
> >
> Hi, Igor,
> Currently, we know, after adding an arbitrary CPU then do migration,
> on target, there will be not aware that which CPU have been added.
> in order to notify target of the cpu topo, can we specify full QOM
> path that you mentioned 2th point on target? if we can simply make smp
> n + 1 work as well at target to be better, but target how to know the
> cpu topo on source side?
Regarding topology, I suppose the -smp & -numa options should match on both
sides; that should give the same topology on target as on source.

As for specifying present CPUs on target, there is work in progress
to make CPUs work with -device/device_add (x86 cpu properties, x86 CPU
subclasses).

> > > Thanks,
> > > Chen
> > >
> > > >
> > > > Regards,
> > > > Andreas

-- 
Regards,
  Igor
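
For reference, a rough sketch of what "let QEMU calculate APIC ID based on
this path" could look like. It assumes the usual x86 packing, where each
topology level gets just enough bits to hold its largest index, and it uses
the bracketed path form from this thread. The path syntax was only a proposal,
and every function name below is made up for illustration; none of this is
existing QEMU code. The node component is parsed but not encoded, and socket[X]
is assumed to be a machine-wide package index.

```python
import re

# Hypothetical path syntax, modelled on the bracket form proposed in the thread.
_CPU_PATH = re.compile(
    r"^/machine/node\[(\d+)\]/socket\[(\d+)\]/core\[(\d+)\]/thread\[(\d+)\]$"
)


def bit_width(count):
    """Bits needed to encode IDs 0..count-1 (0 bits when count == 1)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return (count - 1).bit_length()


def apic_id_from_topology(socket, core, thread, cores_per_socket, threads_per_core):
    """Pack socket/core/thread indexes into one ID, thread in the low bits."""
    if socket < 0:
        raise ValueError("socket index must not be negative")
    if not 0 <= core < cores_per_socket:
        raise ValueError("core index out of range")
    if not 0 <= thread < threads_per_core:
        raise ValueError("thread index out of range")
    thread_bits = bit_width(threads_per_core)
    core_bits = bit_width(cores_per_socket)
    return (socket << (core_bits + thread_bits)) | (core << thread_bits) | thread


def apic_id_from_path(path, cores_per_socket, threads_per_core):
    """Parse a (hypothetical) QOM-style CPU path and compute its APIC ID."""
    match = _CPU_PATH.match(path)
    if match is None:
        raise ValueError("unrecognized CPU path: %r" % path)
    # Assumption: the node index does not take part in the APIC ID.
    _node, socket, core, thread = (int(group) for group in match.groups())
    return apic_id_from_topology(
        socket, core, thread, cores_per_socket, threads_per_core
    )


if __name__ == "__main__":
    # Topology from the bug report quoted above: cores=3, threads=2.
    for socket in range(2):
        for core in range(3):
            for thread in range(2):
                path = "/machine/node[0]/socket[%d]/core[%d]/thread[%d]" % (
                    socket, core, thread)
                print(path, apic_id_from_path(path, 3, 2))
```

With cores=3 and threads=2 the core field needs 2 bits and the thread field
1 bit, so socket 0 gets IDs 0-5 and socket 1 starts at 8; IDs 6 and 7 are
never used. That gap is the reason a plain CPU index cannot stand in for the
APIC ID, and why getting the ID right by hand is easy to get wrong.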