From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alexander Graf <agraf@suse.de>
Cc: Erlon Cruz <erlon.cruz@br.flextronics.com>,
qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] Enablig DLPAR capacity on QEMU pSeries
Date: Thu, 13 Sep 2012 07:48:25 +1000
Message-ID: <1347486505.2276.13.camel@pasglop>
In-Reply-To: <5050B005.9080500@suse.de>
On Wed, 2012-09-12 at 17:53 +0200, Alexander Graf wrote:
> On 09/12/2012 04:54 PM, Erlon Cruz wrote:
> > Hi all,
> >
> > We are planning to implement DLPAR capacity on QEMU pSeries. As we
>
> What is DLPAR? Hotplug support?
Yes.
> > lack experience in the internals of the arch we would like you guys
> > to give us some design directions
> > and confirm if we are going in the right direction. Our first idea is:
> >
> > 1 - to patch 'spapr.c' so it can dynamically insert/remove basic
> > items into the device tree.
>
> What exactly would you like to patch into it? We already do have support
> for dynamic dt creation with the spapr target.
No we don't. We don't have the necessary bits and pieces to pass the DT
updates down to the guest. PAPR defines a mechanism using RTAS calls
which we need to implement, but there are some issues remaining:
- We don't have a way to "initiate" a DLPAR operation. This is
currently done by proprietary tools that communicate with the HMC. We
want to invent some kind of hotplug "interrupt" (using existing RTAS
event facilities). All it needs to do is indicate the DT path (i.e. the
connector) where something is to be plugged or unplugged, which can
then trigger the relevant configure-connector calls to retrieve the DT
bits.
- We have a problem with PCI. Currently, the content of the PCI
bus(es) is discovered by SLOF running inside the guest, not by qemu.
It's SLOF that assigns the BARs and creates the device-tree nodes for the
various PCI devices. However, with hotplug, the guest expects to get
fully populated DT nodes for hotplugged PCI devices and fully assigned
BARs. Under pHyp that works because under the hood, RTAS contains an OFW
implementation which does all the assignment before passing the stuff to
the OS, but under qemu, RTAS is actually in qemu. This means we'll
probably have to move the PCI device node creation and resource
assignment back to qemu (like it was in the very early versions of the
spapr support).
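
To make the resource-assignment point concrete, here is a minimal sketch in
Python (not qemu code) of what assigning BARs on the host side could look
like. Everything named here (`Bar`, `assign_bars`, the window base and size)
is a hypothetical illustration. It relies only on the general PCI rule that a
BAR size is a power of two and that a BAR is aligned to its own size.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bar:
    """One PCI BAR awaiting an address (hypothetical helper type)."""
    index: int
    size: int                      # bytes, must be a power of two
    address: Optional[int] = None  # filled in by assign_bars()


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def assign_bars(bars: List[Bar], window_base: int, window_size: int) -> List[Bar]:
    """Assign naturally aligned addresses to BARs inside one MMIO window.

    Largest BARs are placed first so that alignment padding stays small.
    Raises ValueError if a size is invalid or the window is too small.
    """
    cursor = window_base
    limit = window_base + window_size
    for bar in sorted(bars, key=lambda b: b.size, reverse=True):
        if bar.size <= 0 or bar.size & (bar.size - 1):
            raise ValueError(f"BAR {bar.index}: size is not a power of two")
        address = align_up(cursor, bar.size)
        if address + bar.size > limit:
            raise ValueError(f"BAR {bar.index}: does not fit in the window")
        bar.address = address
        cursor = address + bar.size
    return bars


if __name__ == "__main__":
    # Hypothetical device with a 4 KiB and a 64 KiB BAR in a 1 MiB window.
    device_bars = [Bar(0, 0x1000), Bar(1, 0x10000)]
    for bar in assign_bars(device_bars, 0x80000000, 0x100000):
        print(f"BAR{bar.index}: {bar.address:#x} (+{bar.size:#x})")
```

With the assignment done on the host, the values could then be written into
the device-tree node handed to the guest, rather than left for SLOF to compute.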
> > 2 - create a host side device that will be used with a guest side
> > driver to perform guest side operations and communicate changes from
> > host to the guest (like DynamicRM does in PowerVM LPARs). We are not
>
> Why not just use hypercalls?
Actually there are existing RTAS calls to use for the actual passing of
the device-tree bits; the problem is purely how to "initiate" an
operation to trigger the guest code that will then perform the
appropriate calls.
qemu-ga is an option. But I was thinking more along the lines of adding
some new RTAS events, maybe EPOW style, a bit like ACPI does.
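
As a rough illustration of that flow, the sketch below models it in Python:
the host queues an event naming a connector path, and the guest reacts by
repeatedly asking for the device-tree bits of that connector. This is only a
toy model under stated assumptions. `Host`, `check_event`,
`configure_connector`, and the example path and properties are all
hypothetical, and the real RTAS calls have their own calling conventions that
are not reproduced here.

```python
from collections import deque


class Host:
    """Toy stand-in for the hypervisor side (hypothetical API)."""

    def __init__(self):
        self.events = deque()  # pending ("plug" | "unplug", connector) events
        self.pending = {}      # connector path -> properties still to hand out

    def plug(self, connector, properties):
        """Stage a device's properties and queue a hotplug event for it."""
        self.pending[connector] = list(properties)
        self.events.append(("plug", connector))

    def unplug(self, connector):
        """Queue an unplug event for a connector."""
        self.events.append(("unplug", connector))

    def check_event(self):
        """Return the next queued event, or None when there is none."""
        return self.events.popleft() if self.events else None

    def configure_connector(self, connector):
        """Hand out one (name, value) property per call, then None when done."""
        queue = self.pending.get(connector)
        if not queue:
            self.pending.pop(connector, None)
            return None
        return queue.pop(0)


def guest_handle_events(host, device_tree):
    """Guest side: drain events and update the local device-tree copy."""
    while (event := host.check_event()) is not None:
        action, connector = event
        if action == "plug":
            node = device_tree.setdefault(connector, {})
            # Keep calling until the host reports that nothing is left.
            while (prop := host.configure_connector(connector)) is not None:
                name, value = prop
                node[name] = value
        elif action == "unplug":
            device_tree.pop(connector, None)


if __name__ == "__main__":
    host = Host()
    tree = {}
    # Hypothetical connector path and properties, for illustration only.
    host.plug("/vdevice/example@1", [("device_type", "example"), ("reg", 1)])
    guest_handle_events(host, tree)
    print(tree)
```

The point of the event is only to carry the connector path. The guest pulls
everything else, so no userspace daemon is needed in the guest for the trigger
itself.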
> > planning to use powerpc-tools and want to make resource management
> > transparent (i.e. no need to run daemons or userspace programs in the
> > guest, only this kernel driver).
> > 3 - create bindings to support adding/removing ibmvscsi devices
> > 4 - create bindings to support adding/removing ibmveth devices
> > 5 - create bindings to support adding/removing PCI devices
> > 6 - create bindings to support adding/removing memory
Large parts of the necessary bits already exist in the kernel, using
RTAS (in recent kernels, that is; older stuff really needed it all done
in userspace). It is mostly the trigger that is missing.
> This is going to be the hardest part. I don't think QEMU supports memory
> hotplug yet.
Missing from the above list is also CPU hotplug.
> > - Do we need to do this the way PowerVM does? We have tested
> > virtio ballooning and it works with a few endianness corrections.
>
> I don't know how PowerVM works. But if normal ballooning is all you
> need, you should certainly just enable virtio-balloon.
Does virtio-balloon need endian fixes? We thought it was just working!
Feel free to submit patches :)
> > 7 - create bindings to support adding/removal CPUs
> > - is SMP supported already? I tried to run SMP on an x86 host
> > and the guest got stuck when SMP was enabled
>
> SMP should work just fine, yes. Where exactly does it get stuck?
Right, it works fine as far as I can tell.
> > - would it be possible to work on this without a P7 baremetal
> > machine?
>
> At least for device hotplug, it should be perfectly possible to use an
> old G5 with PR KVM. I haven't gotten around to patching all the pieces of
> the puzzle to make -M pseries work with PR KVM when it's running on top
> of pHyp yet, so that won't work.
>
> > We have a P7 8205-E6B, is it possible to kick PHYP out?
>
> Ben?
Probably not. You need a 7R2.
> > Any idea on how much effort (time/people) the whole thing would take?
> > Any consideration about this is much appreciated :)
>
> Phew. It's hard to tell. Depends heavily on how good your people are :).
>
Cheers,
Ben.
Thread overview: 14+ messages
2012-09-12 14:54 [Qemu-devel] Enablig DLPAR capacity on QEMU pSeries Erlon Cruz
2012-09-12 15:53 ` Alexander Graf
2012-09-12 20:56 ` Erlon Cruz
2012-09-12 21:42 ` Alexander Graf
2012-09-12 21:48 ` Benjamin Herrenschmidt [this message]
2012-09-13 15:15 ` Erlon Cruz
2012-09-13 21:45 ` Benjamin Herrenschmidt
2012-10-05 14:08 ` Erlon Cruz
2012-10-05 14:42 ` Anthony Liguori
2012-10-05 15:26 ` Erlon Cruz
2012-10-05 20:56 ` Benjamin Herrenschmidt
2012-10-05 20:49 ` Benjamin Herrenschmidt
2012-10-06 14:54 ` David Gibson
2012-10-06 19:39 ` Benjamin Herrenschmidt