From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 14/16] KVM: PPC: Book3S HV: XIVE: add passthrough support
Date: Mon, 25 Feb 2019 15:13:15 +1100
Message-ID: <20190225041315.GR7668@umbus.fritz.box>
In-Reply-To: <20190222112840.25000-15-clg@kaod.org>
On Fri, Feb 22, 2019 at 12:28:38PM +0100, Cédric Le Goater wrote:
> The KVM XICS-over-XIVE device and the proposed KVM XIVE native device
> implement an IRQ space for the guest using the generic IPI interrupts
> of the XIVE IC controller. These interrupts are allocated at the OPAL
> level and "mapped" into the guest IRQ number space in the range 0-0x1FFF.
> Interrupt management is performed in the XIVE way: using loads and
> stores on the addresses of the XIVE IPI interrupt ESB pages.
>
> Both KVM devices share the same internal structure caching information
> on the interrupts, among which the xive_irq_data struct containing the
> addresses of the IPI ESB pages and an extra one in case of passthrough.
> The latter contains the addresses of the ESB pages of the underlying HW
> controller interrupts, PHB4 in all cases for now.
>
> A guest, when running in the XICS legacy interrupt mode, lets the KVM
> XICS-over-XIVE device "handle" interrupt management, that is to
> perform the loads and stores on the addresses of the ESB pages of the
> guest interrupts. However, when running in XIVE native exploitation
> mode, the KVM XIVE native device exposes the interrupt ESB pages to
> the guest and lets the guest perform directly the loads and stores.
>
> The VMA exposing the ESB pages makes use of a custom VM fault handler
> whose role is to populate the VMA with appropriate pages. When a fault
> occurs, the guest IRQ number is deduced from the offset, and the ESB
> pages of associated XIVE IPI interrupt are inserted in the VMA (using
> the internal structure caching information on the interrupts).
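
To make the offset arithmetic concrete, here is a minimal userspace sketch of
how a guest IRQ number could relate to an offset in the ESB mapping. The
two-pages-per-IRQ stride is taken from the unmap_mapping_range() call in this
patch (irq * (2ull << PAGE_SHIFT)). Everything else is an assumption made for
illustration: the 64K page size, the sketch_* names, and the idea that the
fault handler simply inverts the same stride.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64K pages; the real value is the host's PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 16
/* Two ESB pages per guest IRQ, the same stride the patch uses. */
#define SKETCH_ESB_STRIDE (2ull << SKETCH_PAGE_SHIFT)

/* Byte offset of the first ESB page of a guest IRQ in the mapping. */
static uint64_t sketch_irq_to_offset(uint64_t girq)
{
	return girq * SKETCH_ESB_STRIDE;
}

/* Guest IRQ number owning the ESB page at a given page offset. */
static uint64_t sketch_pgoff_to_irq(uint64_t pgoff)
{
	return pgoff / 2;
}

/* 0 for the first page of the pair, 1 for the second. */
static unsigned int sketch_pgoff_to_page_index(uint64_t pgoff)
{
	return (unsigned int)(pgoff & 1);
}
```

With these assumptions, guest IRQ 1 starts at byte offset 0x20000, and page
offsets 2 and 3 both resolve back to guest IRQ 1.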
>
> Supporting device passthrough in the guest running in XIVE native
> exploitation mode adds some extra refinements because the ESB pages
> of a different HW controller (PHB4) need to be exposed to the guest
> along with the initial IPI ESB pages of the XIVE IC controller. But
> the overall mechanic is the same.
>
> When the device HW irqs are mapped into or unmapped from the guest
> IRQ number space, the passthru_irq helpers, kvmppc_xive_set_mapped()
> and kvmppc_xive_clr_mapped(), are called to record or clear the
> passthrough interrupt information and to perform the switch.
>
> The approach taken by this patch is to clear the ESB pages of the
> guest IRQ number being mapped and let the VM fault handler repopulate.
> The handler will insert the ESB page corresponding to the HW interrupt
> of the device being passed-through or the initial IPI ESB page if the
> device is being removed.
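
The clear-then-refault approach described above can be modelled in a few lines
of userspace C. This is only a sketch of the pattern, not the kernel code: the
sketch_* names, the fixed table size, and the use of plain integers in place
of ESB page addresses are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_IRQS 4

struct sketch_irq {
	uint64_t ipi_esb;	/* IPI ESB page, always available */
	uint64_t pt_esb;	/* passthrough ESB page, 0 when none */
	uint64_t mapped;	/* page currently visible to the guest, 0 = none */
};

static struct sketch_irq sketch_irqs[SKETCH_NR_IRQS];

/* Drop the current mapping so that the next access faults. */
static int sketch_reset_mapped(size_t girq)
{
	if (girq >= SKETCH_NR_IRQS)
		return -1;
	sketch_irqs[girq].mapped = 0;
	return 0;
}

/* Fault path: repopulate with the device page if present, else the IPI page. */
static uint64_t sketch_fault(size_t girq)
{
	struct sketch_irq *s;

	assert(girq < SKETCH_NR_IRQS);
	s = &sketch_irqs[girq];
	if (!s->mapped)
		s->mapped = s->pt_esb ? s->pt_esb : s->ipi_esb;
	return s->mapped;
}

/* Passthrough setup: invalidate, then record the device page. */
static int sketch_set_mapped(size_t girq, uint64_t hw_esb)
{
	if (sketch_reset_mapped(girq))
		return -1;
	sketch_irqs[girq].pt_esb = hw_esb;
	return 0;
}

/* Passthrough teardown: forget the device page, then invalidate. */
static int sketch_clr_mapped(size_t girq)
{
	if (girq >= SKETCH_NR_IRQS)
		return -1;
	sketch_irqs[girq].pt_esb = 0;
	return sketch_reset_mapped(girq);
}
```

The point of the pattern is that neither the set nor the clear path has to
install a page itself; both only invalidate, and the choice of page is made in
one place, the fault path.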
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> arch/powerpc/kvm/book3s_xive.h | 9 +++++
> arch/powerpc/kvm/book3s_xive.c | 15 ++++++++
> arch/powerpc/kvm/book3s_xive_native.c | 41 ++++++++++++++++++++++
> Documentation/virtual/kvm/devices/xive.txt | 15 ++++++++
> 4 files changed, 80 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
> index 6660d138c6b7..d1f832a53811 100644
> --- a/arch/powerpc/kvm/book3s_xive.h
> +++ b/arch/powerpc/kvm/book3s_xive.h
> @@ -94,6 +94,11 @@ struct kvmppc_xive_src_block {
> struct kvmppc_xive_irq_state irq_state[KVMPPC_XICS_IRQ_PER_ICS];
> };
>
> +struct kvmppc_xive;
> +
> +struct kvmppc_xive_ops {
> + int (*reset_mapped)(struct kvm *kvm, unsigned long guest_irq);
> +};
>
> struct kvmppc_xive {
> struct kvm *kvm;
> @@ -132,6 +137,10 @@ struct kvmppc_xive {
>
> /* Flags */
> u8 single_escalation;
> +
> + struct kvmppc_xive_ops *ops;
> + struct address_space *mapping;
> + struct mutex mapping_lock;
> };
>
> #define KVMPPC_XIVE_Q_COUNT 8
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index 7431e31bc541..7a14512b8944 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -942,6 +942,13 @@ int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
> /* Turn the IPI hard off */
> xive_vm_esb_load(&state->ipi_data, XIVE_ESB_SET_PQ_01);
>
> + /*
> + * Reset ESB guest mapping. Needed when ESB pages are exposed
> + * to the guest in XIVE native mode
> + */
> + if (xive->ops && xive->ops->reset_mapped)
> + xive->ops->reset_mapped(kvm, guest_irq);
> +
> /* Grab info about irq */
> state->pt_number = hw_irq;
> state->pt_data = irq_data_get_irq_handler_data(host_data);
> @@ -1027,6 +1034,14 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
> state->pt_number = 0;
> state->pt_data = NULL;
>
> + /*
> + * Reset ESB guest mapping. Needed when ESB pages are exposed
> + * to the guest in XIVE native mode
> + */
> + if (xive->ops && xive->ops->reset_mapped) {
> + xive->ops->reset_mapped(kvm, guest_irq);
> + }
> +
> /* Reconfigure the IPI */
> xive_native_configure_irq(state->ipi_number,
> xive_vp(xive, state->act_server),
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 92cab6409e8e..bf60870144f1 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -14,6 +14,7 @@
> #include <linux/delay.h>
> #include <linux/percpu.h>
> #include <linux/cpumask.h>
> +#include <linux/file.h>
> #include <asm/uaccess.h>
> #include <asm/kvm_book3s.h>
> #include <asm/kvm_ppc.h>
> @@ -176,6 +177,35 @@ int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
> return rc;
> }
>
> +/*
> + * Device passthrough support
> + */
> +static int kvmppc_xive_native_reset_mapped(struct kvm *kvm, unsigned long irq)
> +{
> + struct kvmppc_xive *xive = kvm->arch.xive;
> +
> + if (irq >= KVMPPC_XIVE_NR_IRQS)
> + return -EINVAL;
> +
> + /*
> + * Clear the ESB pages of the IRQ number being mapped (or
> + * unmapped) into the guest and let the VM fault handler
> + * repopulate with the appropriate ESB pages (device or IC)
> + */
> + pr_debug("clearing esb pages for girq 0x%lx\n", irq);
> + mutex_lock(&xive->mapping_lock);
> + if (xive->mapping)
> + unmap_mapping_range(xive->mapping,
> + irq * (2ull << PAGE_SHIFT),
> + 2ull << PAGE_SHIFT, 1);
> + mutex_unlock(&xive->mapping_lock);
> + return 0;
> +}
> +
> +static struct kvmppc_xive_ops kvmppc_xive_native_ops = {
> + .reset_mapped = kvmppc_xive_native_reset_mapped,
> +};
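
For readers unfamiliar with the optional-hook idiom used here, the following
standalone sketch shows the same shape: a table of function pointers that one
device flavour fills in and the shared code calls only when present. All
sketch_* names are hypothetical, and unlike the callers in this patch, which
ignore the return value of reset_mapped, the sketch propagates it.

```c
#include <assert.h>
#include <stddef.h>

/* Optional per-device hooks; any member may be left NULL. */
struct sketch_ops {
	int (*reset_mapped)(unsigned long girq);
};

struct sketch_dev {
	const struct sketch_ops *ops;	/* NULL for a device with no hooks */
};

/* Records the last IRQ passed to the hook, for demonstration only. */
static unsigned long sketch_last_reset = (unsigned long)-1;

static int sketch_native_reset(unsigned long girq)
{
	sketch_last_reset = girq;
	return 0;
}

static const struct sketch_ops sketch_native_ops = {
	.reset_mapped = sketch_native_reset,
};

/* Shared path: call the hook if one is installed, otherwise do nothing. */
static int sketch_call_reset(const struct sketch_dev *dev, unsigned long girq)
{
	if (dev->ops && dev->ops->reset_mapped)
		return dev->ops->reset_mapped(girq);
	return 0;
}
```

A device created without an ops table goes through sketch_call_reset() as a
no-op, which mirrors how the XICS-over-XIVE device would be unaffected by the
new hook.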
> +
> static int xive_native_esb_fault(struct vm_fault *vmf)
> {
> struct vm_area_struct *vma = vmf->vma;
> @@ -253,6 +283,8 @@ static const struct vm_operations_struct xive_native_tima_vmops = {
> static int kvmppc_xive_native_mmap(struct kvm_device *dev,
> struct vm_area_struct *vma)
> {
> + struct kvmppc_xive *xive = dev->private;
> +
> /* We only allow mappings at fixed offset for now */
> if (vma->vm_pgoff == KVM_XIVE_TIMA_PAGE_OFFSET) {
> if (vma_pages(vma) > 4)
> @@ -268,6 +300,13 @@ static int kvmppc_xive_native_mmap(struct kvm_device *dev,
>
> vma->vm_flags |= VM_IO | VM_PFNMAP;
> vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot);
> +
> + /*
> + * Grab the KVM device file address_space to be able to clear
> + * the ESB pages mapping when a device is passed-through into
> + * the guest.
> + */
> + xive->mapping = vma->vm_file->f_mapping;
> return 0;
> }
>
> @@ -913,6 +952,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
> xive->dev = dev;
> xive->kvm = kvm;
> kvm->arch.xive = xive;
> + mutex_init(&xive->mapping_lock);
>
> /* We use the default queue size set by the host */
> xive->q_order = xive_native_default_eq_shift();
> @@ -933,6 +973,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
> ret = -ENOMEM;
>
> xive->single_escalation = xive_native_has_single_escalation();
> + xive->ops = &kvmppc_xive_native_ops;
>
> if (ret)
> kfree(xive);
> diff --git a/Documentation/virtual/kvm/devices/xive.txt b/Documentation/virtual/kvm/devices/xive.txt
> index be5000b2eb5a..7a242cb07e7c 100644
> --- a/Documentation/virtual/kvm/devices/xive.txt
> +++ b/Documentation/virtual/kvm/devices/xive.txt
> @@ -43,6 +43,21 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
> manage the source: to trigger, to EOI, to turn off the source for
> instance.
>
> + 3. Device passthrough
> +
> + When a device is passed-through into the guest, the source
> + interrupts are from a different HW controller (PHB4) and the ESB
> + pages exposed to the guest should accommodate this change.
> +
> + The passthru_irq helpers, kvmppc_xive_set_mapped() and
> + kvmppc_xive_clr_mapped() are called when the device HW irqs are
> + mapped into or unmapped from the guest IRQ number space. The KVM
> + device extends these helpers to clear the ESB pages of the guest IRQ
> + number being mapped and then lets the VM fault handler repopulate.
> + The handler will insert the ESB page corresponding to the HW
> + interrupt of the device being passed-through or the initial IPI ESB
> + page if the device has been removed.
I think it might be worth emphasizing that this all happens within KVM,
and that userspace / the guest don't need to do anything about this
remapping. Really this is an informational aside, not something a
user of the device actually needs to know.
> * Groups:
>
> 1. KVM_DEV_XIVE_GRP_CTRL
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson