From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 24 Feb 2016 21:51:19 +1100
From: David Gibson
Message-ID: <20160224105119.GN2808@voom.fritz.box>
References: <878u2lhi8i.fsf@blackfin.pond.sub.org>
 <20160216113655.2bbb9988@nial.brq.redhat.com>
 <20160218033952.GG15224@voom.fritz.box>
 <20160218113739.64b02461@nial.brq.redhat.com>
 <20160219043848.GZ15224@voom.fritz.box>
 <87h9h5uiy8.fsf@blackfin.pond.sub.org>
 <20160222023228.GC2808@voom.fritz.box>
 <87h9h1i07h.fsf@blackfin.pond.sub.org>
 <20160224015711.GG2808@voom.fritz.box>
 <87povm1ov1.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="rTTzD97p2CCc/LGU"
Content-Disposition: inline
In-Reply-To: <87povm1ov1.fsf@blackfin.pond.sub.org>
Subject: Re: [Qemu-devel] [RFC] QMP: add query-hotpluggable-cpus
To: Markus Armbruster
Cc: lvivier@redhat.com, thuth@redhat.com, ehabkost@redhat.com, aik@ozlabs.ru,
 qemu-devel@nongnu.org, agraf@suse.de, pbonzini@redhat.com,
 abologna@redhat.com, bharata@linux.vnet.ibm.com, Igor Mammedov,
 afaerber@suse.de

--rTTzD97p2CCc/LGU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Feb 24, 2016 at 09:42:10AM +0100, Markus Armbruster wrote:
> David Gibson writes:
>
> > On Mon, Feb 22, 2016 at 10:05:54AM +0100, Markus
Armbruster wrote:
> >> David Gibson writes:
> >>
> >> > On Fri, Feb 19, 2016 at 10:51:11AM +0100, Markus Armbruster wrote:
> >> >> David Gibson writes:
> >> >>
> >> >> > On Thu, Feb 18, 2016 at 11:37:39AM +0100, Igor Mammedov wrote:
> >> >> >> On Thu, 18 Feb 2016 14:39:52 +1100
> >> >> >> David Gibson wrote:
> >> >> >>
> >> >> >> > On Tue, Feb 16, 2016 at 11:36:55AM +0100, Igor Mammedov wrote:
> >> >> >> > > On Mon, 15 Feb 2016 20:43:41 +0100
> >> >> >> > > Markus Armbruster wrote:
> >> >> >> > >
> >> >> >> > > > Igor Mammedov writes:
> >> >> >> > > >
> >> >> >> > > > > it will allow mgmt to query present and possible to hotplug CPUs
> >> >> >> > > > > it is required from a target platform that wishes to support
> >> >> >> > > > > command to set board specific MachineClass.possible_cpus() hook,
> >> >> >> > > > > which will return a list of possible CPUs with options
> >> >> >> > > > > that would be needed for hotplugging possible CPUs.
> >> >> >> > > > >
> >> >> >> > > > > For RFC there are:
> >> >> >> > > > >    'arch_id': 'int' - mandatory unique CPU number,
> >> >> >> > > > >                       for x86 it's APIC ID for ARM it's MPIDR
> >> >> >> > > > >    'type': 'str' - CPU object type for usage with device_add
> >> >> >> > > > >
> >> >> >> > > > > and a set of optional fields that would allow mgmt tools
> >> >> >> > > > > to know at what granularity and where a new CPU could be
> >> >> >> > > > > hotplugged;
> >> >> >> > > > > [node],[socket],[core],[thread]
> >> >> >> > > > > Hopefully that should cover needs for CPU hotplug purposes for
> >> >> >> > > > > major targets and we can extend structure in future adding
> >> >> >> > > > > more fields if it will be needed.
> >> >> >> > > > >
> >> >> >> > > > > also for present CPUs there is a 'cpu_link' field which
> >> >> >> > > > > would allow mgmt to inspect whatever object/abstraction
> >> >> >> > > > > the target platform considers as CPU object.
> >> >> >> > > > >
> >> >> >> > > > > For RFC purposes implements only for x86 target so far.
> >> >> >> > > >
> >> >> >> > > > Adding ad hoc queries as we go won't scale.  Could this be solved by a
> >> >> >> > > > generic introspection interface?
> >> >> >> > > Do you mean generic QOM introspection?
> >> >> >> > >
> >> >> >> > > Using QOM we could have '/cpus' container and create QOM links
> >> >> >> > > for existing (populated links) and possible (empty links) CPUs.
> >> >> >> > > However in that case link's name will need to have a special format
> >> >> >> > > that will convey the information necessary for mgmt to hotplug
> >> >> >> > > a CPU object, at least:
> >> >> >> > >   - where: [node],[socket],[core],[thread] options
> >> >> >> > >   - optionally what CPU object to use with device_add command
> >> >> >> >
> >> >> >> > Hmm.. is it not enough to follow the link and get the topology
> >> >> >> > information by examining the target?
> >> >> >> One can't follow a link if it's an empty one, hence
> >> >> >> CPU placement information should be provided somehow,
> >> >> >> either:
> >> >> >
> >> >> > Ah, right, so the issue is determining the socket/core/thread
> >> >> > addresses that cpus which aren't yet present will have.
> >> >> >
> >> >> >> * by precreating cpu-package objects with properties that
> >> >> >>   would describe it /could be inspected via QOM/
> >> >> >
> >> >> > So, we could do this, but I think the natural way would be to have the
> >> >> > information for each potential thread in the package.  Just putting
> >> >> > say "core number" in the package itself assumes more than I'd like
> >> >> > about how packages sit in the hierarchy.  Plus, it means that
> >> >> > management has a bunch of cases to deal with: package has all the
> >> >> > information, package has just a core id, package has just a socket id,
> >> >> > and so forth.
> >> >> >
> >> >> > It is a bit clunky that when the package is plugged, this information
> >> >> > will have to sit parallel to the array of actual thread links.
> >> >> >
> >> >> > Markus or Andreas is there a natural way to present a list of (node,
> >> >> > socket, core, thread) tuples in the package object?  Preferably
> >> >> > without having to create a whole bunch of "potential thread" objects
> >> >> > just for the purpose.
> >> >>
> >> >> I'm just a dabbler when it comes to QOM, but I can try.
> >> >>
> >> >> I view a concrete cpu-package device (subtype of the abstract
> >> >> cpu-package device) as a composite device containing stuff like actual
> >> >> cores.
> >> >
> >> > So.. the idea is it's a bit more abstract than that.  My intention is
> >> > that the package lists - in some manner - each of the threads
> >> > (i.e. vcpus) it contains / can contain.  Depending on the platform it
> >> > *might* also have internal structure such as cores / sockets, but it
> >> > doesn't have to.  Either way, the contained threads will be listed in
> >> > a common way, as a flat array.
> >> >
> >> >> To create a composite device, you start with the outer shell, then plug
> >> >> in components one by one.  Components can be nested arbitrarily deep.
> >> >>
> >> >> Perhaps you can define the concrete cpu-package shell in a way that lets
> >> >> you query what you need to know from a mere shell (no components
> >> >> plugged).
> >> >
> >> > Right.. that's exactly what I'm suggesting, but I don't know enough
> >> > about the presentation of basic data in QOM to know quite how to
> >> > accomplish it.
> >> >
> >> >> >> or
> >> >> >> * via QMP/HMP command that would provide the same information
> >> >> >>   only without need to precreate anything. The only difference
> >> >> >>   is that it allows to use -device/device_add for new CPUs.
> >> >> >
> >> >> > I'd be ok with that option as well.
I'd be thinking it would be
> >> >> > implemented via a class method on the package object which returns the
> >> >> > addresses that its contained threads will have, whether or not they're
> >> >> > present right now.  Does that make sense?
> >> >>
> >> >> If you model CPU packages as composite cpu-package devices, then you
> >> >> should be able to plug and unplug these with device_add, unless plugging
> >> >> them requires complex wiring that can't be done in qdev / device_add,
> >> >> yet.
> >> >
> >> > There's a whole bunch of issues raised by allowing device_add of
> >> > cpus.  Although they're certainly interesting and probably useful, I'd
> >> > really like to punt on them for the time being, so we can get some
> >> > sort of cpu hotplug working on Power (and s390 and others).
> >>
> >> If you make it a device, you can still set
> >> cannot_instantiate_with_device_add_yet to disable -device / device_add
> >> for now, and unset it later, when you're ready for it.
> >
> > Yes, that was the plan.
> >
> >> > The idea of the cpu packages is that - at least for now - the user
> >> > can't control their contents apart from the single "present" bit.
> >> > They already know what they can contain.
> >>
> >> Composite devices commonly do.  They're not general containers.
> >>
> >> The "present" bit sounds like you propose to "pre-plug" all the possible
> >> CPU packages, and thus reduce CPU hot plug/unplug to enabling/disabling
> >> pre-plugged CPU packages.
> >
> > Yes.
>
> I'm concerned this might suffer combinatorial explosion.
>
> qemu-system-x86_64 --cpu help shows more than two dozen CPUs.  They can
> be configured in numerous arrangements of sockets, cores, threads.  Many
> of these wouldn't be physically possible with older CPUs.
Guest
> software might work even with physically impossible configurations, but
> arranging virtual models of physical hardware in physically impossible
> configurations invites trouble, and should best be avoided.
>
> I'm afraid I'm still in the guess-what-you-mean stage because I lack
> concrete examples to go with the abstract description.  Can you
> enumerate the pre-plugged CPU packages for a board of your choice to
> give us a better idea of what your proposal would look like in practice?
> Then describe briefly what a management application would need to know
> about them, and what it would do with the knowledge?
>
> Perhaps a PC board would be the most useful, because PCs are probably
> second to none in random complexity :)

Well, it may be moot at this point, since Andreas has objected
strongly to Bharata's draft for reasons I have yet to really figure
out.  But I think the answer below will clarify this.

> >> What if a board can take different kinds of CPU packages?  Do we
> >> pre-plug all combinations?  Then some combinations are non-sensical.
> >> How would we reject them?
> >
> > I'm not trying to solve all cases with the present bit handling - just
> > the currently common case of a machine with fixed maximum number of
> > slots which are expected to contain identical processor units.
> >
> >> For instance, PC machines support a wide range of CPUs in various
> >> arrangements, but you generally need to use a single kind of CPU, and
> >> the kind of CPU restricts the possible arrangements.  How would you
> >> model that?
> >
> > The idea is that the available slots are determined by the machine,
> > possibly using machine or global options.  So for PC, -cpu and -smp
> > would determine the number of slots and what can go into them.
>
> Do these CPU packages come with "soldered-in" CPUs?  Or do they provide
> slots where CPUs can be plugged in?
From what I've read, I guess it's
> the latter, together with a "thou shalt not plug in different CPUs"
> commandment.  Correct?

No, they do in fact come with "soldered in" CPUs.  Once the package is
constructed it is either absent, or supplies exactly one set of cpu
threads (and possibly other bits and pieces); there is no further
configuration.

So:
    qemu-system-x86_64 -machine pc -cpu Haswell -smp 2,maxcpus=8

Would give you 8 cpu packages.  2 would initially be present, the rest
would be absent.  If you toggle an absent one to present, another
single-thread Haswell would appear in the guest.

    qemu-system-x86_64 -machine pc -cpu Haswell \
        -smp 2,threads=2,cores=2,sockets=2,maxcpus=8

Would be basically the same (because thread granularity hotplug is
allowed on x86).  2 present (pkg0, pkg1) and 6 (pkg2..pkg7) absent cpu
packages.  If you toggled on pkg2, socket 0, core 1, thread 0 would
appear.  If you toggled on pkg7, socket 1, core 1, thread 1 would
appear.

In contrast, pseries only allows per-core hotplug, so:

    qemu-system-ppc64 -machine pseries -cpu POWER8 \
        -smp 8,threads=8,cores=2,sockets=1,maxcpus=16

Would give you 2 cpu packages, 1 present, 1 absent.  Toggling on the
second package would make a second POWER8 core with 8 threads appear.

Clearer?

> If yes, then the CPU the board comes with would determine what you can
> plug into the slots.
>
> Conversely, the CPU the board comes with helps determine the CPU
> packages.

Either, potentially.  The machine type code would determine what
packages are constructed, and may use machine specific or global
options to determine this.  Or it can (as now) declare that that's not
a possible set of CPUs for this board.

> >> > There are a bunch of potential use cases this doesn't address, but I
> >> > think it *does* address a useful subset of currently interesting
> >> > cases, without precluding more flexible extensions in future.
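
To make the package arithmetic in the -smp examples earlier in this
message concrete, here is a rough sketch in Python.  It is not QEMU code
and nothing in it is an existing QEMU interface: the names Smp and
packages() are made up for illustration, and the linear numbering of
threads (thread fastest, then core, then socket) is an assumption
inferred from the pkg2 and pkg7 cases above.  For the pseries case the
initial vcpu count is taken as 8, so that one of the two cores starts
out present, as the text describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Smp:
    """Hypothetical stand-in for the -smp values; not a QEMU structure."""
    cpus: int         # threads present at startup (first -smp argument)
    maxcpus: int      # upper bound on threads, present or not
    sockets: int = 1
    cores: int = 1
    threads: int = 1


def packages(smp, threads_per_package):
    """List (present, [(socket, core, thread), ...]) for every package.

    threads_per_package is the hotplug granularity the machine allows:
    1 for per-thread hotplug, smp.threads for per-core hotplug.
    """
    total = smp.maxcpus // threads_per_package
    present = smp.cpus // threads_per_package
    result = []
    for pkg in range(total):
        first = pkg * threads_per_package
        addrs = []
        for t in range(first, first + threads_per_package):
            # Assumed linear layout: thread varies fastest, socket slowest.
            thread = t % smp.threads
            core = (t // smp.threads) % smp.cores
            socket = t // (smp.threads * smp.cores)
            addrs.append((socket, core, thread))
        result.append((pkg < present, addrs))
    return result


# x86, per-thread granularity: -smp 2,threads=2,cores=2,sockets=2,maxcpus=8
pc = packages(Smp(cpus=2, maxcpus=8, sockets=2, cores=2, threads=2), 1)
print(len(pc), pc[2], pc[7])

# pseries, per-core granularity: -smp 8,threads=8,cores=2,sockets=1,maxcpus=16
ps = packages(Smp(cpus=8, maxcpus=16, sockets=1, cores=2, threads=8), 8)
print(len(ps), [present for present, _ in ps])
```

The point of the flat list is that management only ever sees packages
and the thread addresses each one would supply; whether a package is a
single thread or a whole core is decided by the machine type.
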
> >> >
> >> >> If that's the case, a general solution for "device needs complex wiring"
> >> >> would be more useful than a one-off for CPU packages.
> >> >>
> >> >> [...]
> >> >>
> >>
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--rTTzD97p2CCc/LGU
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJWzYsnAAoJEGw4ysog2bOSKtIP/0csG4NksjuzSwJyh5VUkXHD
W8rIN0W0KtnX0XSx3J2aU5N5Jx56O1DevYaQeByCHsqtHBu9SYfHWjsUoQeGgVbM
zduwZpbvvA49Z2iD2VOcxGHq+wyhj87v9oT/lfbae62563ISVPzYamA+hBo/6i99
mc3aEcnQQBnFtJHQjTp3NbjqwZI0p6Hyu6m73xUvlcXBjzM6461y5LU4rFo+Ii5x
Afs0R+72h6EQUXp4aSDXPecFcYwDStA8Ou+P6GzQdxf5mMzi3h0VodIUho0GIz+4
7Rxj3F34EBfmgwOHnaeOm1cJZ7efttjaarUyXEBrcRRoijE0IAAku2I9LBVwbIy+
93ZR2REgN1RlvJJNYYkD78t7j+yGnn6jO669Ki+lLQZ1ecdWJmn4LLJ39kVycEL1
z+VHZ8A/KE0fUM5WrNU4bIL7Fa0F9ctr6QjgXHCkXLSENJ2cJxP4j3gR3xf8TQM6
AcUwt9hTeXZszfq0bhrJYTlZWdyyTsJwI3wL0ojHuD3QovL4AlDJw39UfeuRfkLE
G1o6ScTAy1SFo6K5uFWoXy5JA28NCTRmvFMxo6yw0SK2E8LPMie6QKCMR4VzjZUB
KcFFlr5QVQjpSAop3vcY5omqqL0qGW3SQKwT5u8AZb0kunS1I/9iPeYFXrDXkN8z
jX0ZXPjxMTmUfNRANCso
=6QQZ
-----END PGP SIGNATURE-----

--rTTzD97p2CCc/LGU--