From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras <paulus@samba.org>,
kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
Subject: Re: [PATCH v3 14/17] KVM: PPC: Book3S HV: XIVE: add passthrough support
Date: Tue, 19 Mar 2019 16:22:27 +1100
Message-ID: <20190319052227.GD31018@umbus.fritz.box>
In-Reply-To: <20190315120609.25910-15-clg@kaod.org>
On Fri, Mar 15, 2019 at 01:06:06PM +0100, Cédric Le Goater wrote:
> The KVM XICS-over-XIVE device and the proposed KVM XIVE native device
> implement an IRQ space for the guest using the generic IPI interrupts
> of the XIVE IC controller. These interrupts are allocated at the OPAL
> level and "mapped" into the guest IRQ number space in the range 0-0x1FFF.
> Interrupt management is performed in the XIVE way: using loads and
> stores on the addresses of the XIVE IPI interrupt ESB pages.
>
> Both KVM devices share the same internal structure caching information
> on the interrupts, among which the xive_irq_data struct containing the
> addresses of the IPI ESB pages and an extra one in case of pass-through.
> The latter contains the addresses of the ESB pages of the underlying HW
> controller interrupts, PHB4 in all cases for now.
>
> A guest, when running in the XICS legacy interrupt mode, lets the KVM
> XICS-over-XIVE device "handle" interrupt management, that is to
> perform the loads and stores on the addresses of the ESB pages of the
> guest interrupts. However, when running in XIVE native exploitation
> mode, the KVM XIVE native device exposes the interrupt ESB pages to
> the guest and lets the guest perform the loads and stores directly.
>
> The VMA exposing the ESB pages makes use of a custom VM fault handler
> whose role is to populate the VMA with appropriate pages. When a fault
> occurs, the guest IRQ number is deduced from the offset, and the ESB
> pages of the associated XIVE IPI interrupt are inserted in the VMA (using
> the internal structure caching information on the interrupts).
>
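To make the offset arithmetic concrete, here is a minimal user-space
sketch of the mapping between a guest IRQ number and its offset in the
ESB VMA. The stride of two pages per guest IRQ is taken from the
"irq * (2ull << PAGE_SHIFT)" expression used further down in this patch.
The 64K page size and every sketch_ name are assumptions made for
illustration; none of them are kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64K pages, a common configuration on POWER9 hosts. */
#define SKETCH_PAGE_SHIFT 16

/* Two ESB pages per guest IRQ, mirroring "2ull << PAGE_SHIFT" in the patch. */
#define SKETCH_ESB_STRIDE (2ull << SKETCH_PAGE_SHIFT)

_Static_assert(SKETCH_ESB_STRIDE == 0x20000ull, "two 64K pages per guest IRQ");

/* Byte offset of the first ESB page of a guest IRQ inside the mapping. */
static inline uint64_t sketch_irq_to_esb_offset(uint64_t girq)
{
	return girq * SKETCH_ESB_STRIDE;
}

/* Guest IRQ number owning the byte at the given offset. */
static inline uint64_t sketch_esb_offset_to_irq(uint64_t offset)
{
	return offset / SKETCH_ESB_STRIDE;
}

/* Index (0 or 1) of the page within the IRQ's pair of ESB pages. */
static inline unsigned int sketch_esb_offset_to_page(uint64_t offset)
{
	return (unsigned int)((offset >> SKETCH_PAGE_SHIFT) & 1);
}
```

With these assumptions, guest IRQ 1 starts at offset 0x20000, and any
offset from 0x20000 through 0x3ffff resolves back to guest IRQ 1.
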
> Supporting device passthrough in the guest running in XIVE native
> exploitation mode adds some extra refinements because the ESB pages
> of a different HW controller (PHB4) need to be exposed to the guest
> along with the initial IPI ESB pages of the XIVE IC controller. But
> the overall mechanism is the same.
>
> When the device HW irqs are mapped into or unmapped from the guest
> IRQ number space, the passthru_irq helpers, kvmppc_xive_set_mapped()
> and kvmppc_xive_clr_mapped(), are called to record or clear the
> passthrough interrupt information and to perform the switch.
>
> The approach taken by this patch is to clear the ESB pages of the
> guest IRQ number being mapped and let the VM fault handler repopulate.
> The handler will insert the ESB page corresponding to the HW interrupt
> of the device being passed-through or the initial IPI ESB page if the
> device is being removed.
>
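The "clear, then let the fault handler repopulate" approach can be
modelled in a few lines of user-space C. This is only a sketch under
stated assumptions: the struct, its fields and the sketch_ functions are
hypothetical stand-ins for the per-IRQ state, the VM fault handler and
the set_mapped/clr_mapped helpers, and a pointer to a string stands in
for an ESB page. The ordering of the reset relative to the update of the
passthrough information follows the two book3s_xive.c hunks below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-IRQ state; a string pointer stands in for an ESB page. */
struct sketch_irq_state {
	const char *ipi_page;	/* ESB page of the IPI backing the guest IRQ */
	const char *pt_page;	/* ESB page of the passed-through HW IRQ, or NULL */
	const char *mapped;	/* page currently visible to the guest, or NULL */
};

/* Models the fault handler: populate on demand from the current state. */
static const char *sketch_fault(struct sketch_irq_state *s)
{
	assert(s->ipi_page != NULL);	/* every guest IRQ has an IPI behind it */
	if (!s->mapped)
		s->mapped = s->pt_page ? s->pt_page : s->ipi_page;
	return s->mapped;
}

/* Models reset_mapped: only drops the mapping, never picks a new page. */
static void sketch_reset_mapped(struct sketch_irq_state *s)
{
	s->mapped = NULL;
}

/* Models set_mapped: reset first, then record the passthrough page. */
static void sketch_set_mapped(struct sketch_irq_state *s, const char *hw_page)
{
	sketch_reset_mapped(s);
	s->pt_page = hw_page;
}

/* Models clr_mapped: forget the passthrough page, then reset. */
static void sketch_clr_mapped(struct sketch_irq_state *s)
{
	s->pt_page = NULL;
	sketch_reset_mapped(s);
}
```

The point of the design is that the map/unmap helpers never decide which
page the guest sees; they only invalidate, and the next guest access
selects the page from whatever state is current at fault time.
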
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>
> Changes since v2 :
>
> - extra comment in documentation
>
> arch/powerpc/kvm/book3s_xive.h | 9 +++++
> arch/powerpc/kvm/book3s_xive.c | 15 ++++++++
> arch/powerpc/kvm/book3s_xive_native.c | 41 ++++++++++++++++++++++
> Documentation/virtual/kvm/devices/xive.txt | 19 ++++++++++
> 4 files changed, 84 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
> index 622f594d93e1..e011622dc038 100644
> --- a/arch/powerpc/kvm/book3s_xive.h
> +++ b/arch/powerpc/kvm/book3s_xive.h
> @@ -94,6 +94,11 @@ struct kvmppc_xive_src_block {
> struct kvmppc_xive_irq_state irq_state[KVMPPC_XICS_IRQ_PER_ICS];
> };
>
> +struct kvmppc_xive;
> +
> +struct kvmppc_xive_ops {
> + int (*reset_mapped)(struct kvm *kvm, unsigned long guest_irq);
> +};
>
> struct kvmppc_xive {
> struct kvm *kvm;
> @@ -132,6 +137,10 @@ struct kvmppc_xive {
>
> /* Flags */
> u8 single_escalation;
> +
> + struct kvmppc_xive_ops *ops;
> + struct address_space *mapping;
> + struct mutex mapping_lock;
> };
>
> #define KVMPPC_XIVE_Q_COUNT 8
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index c1b7aa7dbc28..480a3fc6b9fd 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -937,6 +937,13 @@ int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
> /* Turn the IPI hard off */
> xive_vm_esb_load(&state->ipi_data, XIVE_ESB_SET_PQ_01);
>
> + /*
> + * Reset ESB guest mapping. Needed when ESB pages are exposed
> + * to the guest in XIVE native mode
> + */
> + if (xive->ops && xive->ops->reset_mapped)
> + xive->ops->reset_mapped(kvm, guest_irq);
> +
> /* Grab info about irq */
> state->pt_number = hw_irq;
> state->pt_data = irq_data_get_irq_handler_data(host_data);
> @@ -1022,6 +1029,14 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
> state->pt_number = 0;
> state->pt_data = NULL;
>
> + /*
> + * Reset ESB guest mapping. Needed when ESB pages are exposed
> + * to the guest in XIVE native mode
> + */
> + if (xive->ops && xive->ops->reset_mapped) {
> + xive->ops->reset_mapped(kvm, guest_irq);
> + }
> +
> /* Reconfigure the IPI */
> xive_native_configure_irq(state->ipi_number,
> kvmppc_xive_vp(xive, state->act_server),
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index e465d4c53f5c..67a1bb26a4cc 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -14,6 +14,7 @@
> #include <linux/delay.h>
> #include <linux/percpu.h>
> #include <linux/cpumask.h>
> +#include <linux/file.h>
> #include <asm/uaccess.h>
> #include <asm/kvm_book3s.h>
> #include <asm/kvm_ppc.h>
> @@ -170,6 +171,35 @@ int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
> return rc;
> }
>
> +/*
> + * Device passthrough support
> + */
> +static int kvmppc_xive_native_reset_mapped(struct kvm *kvm, unsigned long irq)
> +{
> + struct kvmppc_xive *xive = kvm->arch.xive;
> +
> + if (irq >= KVMPPC_XIVE_NR_IRQS)
> + return -EINVAL;
> +
> + /*
> + * Clear the ESB pages of the IRQ number being mapped (or
> + * unmapped) into the guest and let the VM fault handler
> + * repopulate with the appropriate ESB pages (device or IC)
> + */
> + pr_debug("clearing esb pages for girq 0x%lx\n", irq);
> + mutex_lock(&xive->mapping_lock);
> + if (xive->mapping)
> + unmap_mapping_range(xive->mapping,
> + irq * (2ull << PAGE_SHIFT),
> + 2ull << PAGE_SHIFT, 1);
> + mutex_unlock(&xive->mapping_lock);
> + return 0;
> +}
> +
> +static struct kvmppc_xive_ops kvmppc_xive_native_ops = {
> + .reset_mapped = kvmppc_xive_native_reset_mapped,
> +};
> +
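For readers less familiar with the optional-hook pattern used by
"xive->ops && xive->ops->reset_mapped", the following user-space sketch
shows the same dispatch in isolation. All sketch_ names are hypothetical.
A device that never installs an ops table (here, ops left NULL) simply
skips the hook, which is how I read the behaviour for a device that does
not set xive->ops. The const qualifiers are a choice made in this sketch,
not something the patch does.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dev;

/* Hypothetical ops table with a single optional hook. */
struct sketch_ops {
	int (*reset_mapped)(struct sketch_dev *dev, unsigned long girq);
};

struct sketch_dev {
	const struct sketch_ops *ops;	/* NULL when no hooks are installed */
	unsigned long last_reset;	/* last guest IRQ seen by the hook */
};

/* Stand-in for the native device's hook: just records the IRQ number. */
static int sketch_native_reset_mapped(struct sketch_dev *dev, unsigned long girq)
{
	dev->last_reset = girq;
	return 0;
}

static const struct sketch_ops sketch_native_ops = {
	.reset_mapped = sketch_native_reset_mapped,
};

/* Shared helper: returns 1 if a hook ran, 0 if it was skipped. */
static int sketch_call_reset(struct sketch_dev *dev, unsigned long girq)
{
	assert(dev != NULL);
	if (dev->ops && dev->ops->reset_mapped) {
		dev->ops->reset_mapped(dev, girq);
		return 1;
	}
	return 0;
}
```

Checking both the table pointer and the function pointer lets the shared
code in book3s_xive.c stay unaware of which KVM device it is serving.
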
> static int xive_native_esb_fault(struct vm_fault *vmf)
> {
> struct vm_area_struct *vma = vmf->vma;
> @@ -247,6 +277,8 @@ static const struct vm_operations_struct xive_native_tima_vmops = {
> static int kvmppc_xive_native_mmap(struct kvm_device *dev,
> struct vm_area_struct *vma)
> {
> + struct kvmppc_xive *xive = dev->private;
> +
> /* We only allow mappings at fixed offset for now */
> if (vma->vm_pgoff == KVM_XIVE_TIMA_PAGE_OFFSET) {
> if (vma_pages(vma) > 4)
> @@ -262,6 +294,13 @@ static int kvmppc_xive_native_mmap(struct kvm_device *dev,
>
> vma->vm_flags |= VM_IO | VM_PFNMAP;
> vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot);
> +
> + /*
> + * Grab the KVM device file address_space to be able to clear
> + * the ESB pages mapping when a device is passed-through into
> + * the guest.
> + */
> + xive->mapping = vma->vm_file->f_mapping;
> return 0;
> }
>
> @@ -959,6 +998,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
> xive->dev = dev;
> xive->kvm = kvm;
> kvm->arch.xive = xive;
> + mutex_init(&xive->mapping_lock);
>
> /*
> * Allocate a bunch of VPs. KVM_MAX_VCPUS is a large value for
> @@ -972,6 +1012,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
> ret = -ENXIO;
>
> xive->single_escalation = xive_native_has_single_escalation();
> + xive->ops = &kvmppc_xive_native_ops;
>
> if (ret)
> kfree(xive);
> diff --git a/Documentation/virtual/kvm/devices/xive.txt b/Documentation/virtual/kvm/devices/xive.txt
> index 686cca450f9f..9aa48efca1cb 100644
> --- a/Documentation/virtual/kvm/devices/xive.txt
> +++ b/Documentation/virtual/kvm/devices/xive.txt
> @@ -43,6 +43,25 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
> manage the source: to trigger, to EOI, to turn off the source for
> instance.
>
> + 3. Device pass-through
> +
> + When a device is passed-through into the guest, the source
> + interrupts are from a different HW controller (PHB4) and the ESB
> + pages exposed to the guest should accommodate this change.
> +
> + The passthru_irq helpers, kvmppc_xive_set_mapped() and
> + kvmppc_xive_clr_mapped(), are called when the device HW irqs are
> + mapped into or unmapped from the guest IRQ number space. The KVM
> + device extends these helpers to clear the ESB pages of the guest IRQ
> + number being mapped and then lets the VM fault handler repopulate.
> + The handler will insert the ESB page corresponding to the HW
> + interrupt of the device being passed-through or the initial IPI ESB
> + page if the device has been removed.
> +
> + The ESB remapping is fully transparent to the guest and the OS
> + device driver. All handling is done within VFIO and the above
> + helpers in KVM-PPC.
> +
> * Groups:
>
> 1. KVM_DEV_XIVE_GRP_CTRL
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson