linuxppc-dev.lists.ozlabs.org archive mirror
From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
	Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 14/16] KVM: PPC: Book3S HV: XIVE: add passthrough support
Date: Mon, 25 Feb 2019 15:13:15 +1100
Message-ID: <20190225041315.GR7668@umbus.fritz.box>
In-Reply-To: <20190222112840.25000-15-clg@kaod.org>

On Fri, Feb 22, 2019 at 12:28:38PM +0100, Cédric Le Goater wrote:
> The KVM XICS-over-XIVE device and the proposed KVM XIVE native device
> implement an IRQ space for the guest using the generic IPI interrupts
> of the XIVE IC controller. These interrupts are allocated at the OPAL
> level and "mapped" into the guest IRQ number space in the range 0-0x1FFF.
> Interrupt management is performed in the XIVE way: using loads and
> stores on the addresses of the XIVE IPI interrupt ESB pages.
> 
> Both KVM devices share the same internal structure caching information
> on the interrupts, among which the xive_irq_data struct containing the
> addresses of the IPI ESB pages and an extra one in case of passthrough.
> The latter contains the addresses of the ESB pages of the underlying HW
> controller interrupts, PHB4 in all cases for now.
> 
> A guest, when running in the XICS legacy interrupt mode, lets the KVM
> XICS-over-XIVE device "handle" interrupt management, that is to
> perform the loads and stores on the addresses of the ESB pages of the
> guest interrupts. However, when running in XIVE native exploitation
> mode, the KVM XIVE native device exposes the interrupt ESB pages to
> the guest and lets the guest perform directly the loads and stores.
> 
> The VMA exposing the ESB pages makes use of a custom VM fault handler
> whose role is to populate the VMA with appropriate pages. When a fault
> occurs, the guest IRQ number is deduced from the offset, and the ESB
> pages of associated XIVE IPI interrupt are inserted in the VMA (using
> the internal structure caching information on the interrupts).
> 
> Supporting device passthrough in the guest running in XIVE native
> exploitation mode adds some extra refinements because the ESB pages
> of a different HW controller (PHB4) need to be exposed to the guest
> along with the initial IPI ESB pages of the XIVE IC controller. But
> the overall mechanism is the same.
> 
> When the device HW irqs are mapped into or unmapped from the guest
> IRQ number space, the passthru_irq helpers, kvmppc_xive_set_mapped()
> and kvmppc_xive_clr_mapped(), are called to record or clear the
> passthrough interrupt information and to perform the switch.
> 
> The approach taken by this patch is to clear the ESB pages of the
> guest IRQ number being mapped and let the VM fault handler repopulate.
> The handler will insert the ESB page corresponding to the HW interrupt
> of the device being passed-through or the initial IPI ESB page if the
> device is being removed.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  arch/powerpc/kvm/book3s_xive.h             |  9 +++++
>  arch/powerpc/kvm/book3s_xive.c             | 15 ++++++++
>  arch/powerpc/kvm/book3s_xive_native.c      | 41 ++++++++++++++++++++++
>  Documentation/virtual/kvm/devices/xive.txt | 15 ++++++++
>  4 files changed, 80 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
> index 6660d138c6b7..d1f832a53811 100644
> --- a/arch/powerpc/kvm/book3s_xive.h
> +++ b/arch/powerpc/kvm/book3s_xive.h
> @@ -94,6 +94,11 @@ struct kvmppc_xive_src_block {
>  	struct kvmppc_xive_irq_state irq_state[KVMPPC_XICS_IRQ_PER_ICS];
>  };
>  
> +struct kvmppc_xive;
> +
> +struct kvmppc_xive_ops {
> +	int (*reset_mapped)(struct kvm *kvm, unsigned long guest_irq);
> +};
>  
>  struct kvmppc_xive {
>  	struct kvm *kvm;
> @@ -132,6 +137,10 @@ struct kvmppc_xive {
>  
>  	/* Flags */
>  	u8	single_escalation;
> +
> +	struct kvmppc_xive_ops *ops;
> +	struct address_space   *mapping;
> +	struct mutex mapping_lock;
>  };
>  
>  #define KVMPPC_XIVE_Q_COUNT	8
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index 7431e31bc541..7a14512b8944 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -942,6 +942,13 @@ int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
>  	/* Turn the IPI hard off */
>  	xive_vm_esb_load(&state->ipi_data, XIVE_ESB_SET_PQ_01);
>  
> +	/*
> +	 * Reset ESB guest mapping. Needed when ESB pages are exposed
> +	 * to the guest in XIVE native mode
> +	 */
> +	if (xive->ops && xive->ops->reset_mapped)
> +		xive->ops->reset_mapped(kvm, guest_irq);
> +
>  	/* Grab info about irq */
>  	state->pt_number = hw_irq;
>  	state->pt_data = irq_data_get_irq_handler_data(host_data);
> @@ -1027,6 +1034,14 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
>  	state->pt_number = 0;
>  	state->pt_data = NULL;
>  
> +	/*
> +	 * Reset ESB guest mapping. Needed when ESB pages are exposed
> +	 * to the guest in XIVE native mode
> +	 */
> +	if (xive->ops && xive->ops->reset_mapped) {
> +		xive->ops->reset_mapped(kvm, guest_irq);
> +	}
> +
>  	/* Reconfigure the IPI */
>  	xive_native_configure_irq(state->ipi_number,
>  				  xive_vp(xive, state->act_server),
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 92cab6409e8e..bf60870144f1 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -14,6 +14,7 @@
>  #include <linux/delay.h>
>  #include <linux/percpu.h>
>  #include <linux/cpumask.h>
> +#include <linux/file.h>
>  #include <asm/uaccess.h>
>  #include <asm/kvm_book3s.h>
>  #include <asm/kvm_ppc.h>
> @@ -176,6 +177,35 @@ int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
>  	return rc;
>  }
>  
> +/*
> + * Device passthrough support
> + */
> +static int kvmppc_xive_native_reset_mapped(struct kvm *kvm, unsigned long irq)
> +{
> +	struct kvmppc_xive *xive = kvm->arch.xive;
> +
> +	if (irq >= KVMPPC_XIVE_NR_IRQS)
> +		return -EINVAL;
> +
> +	/*
> +	 * Clear the ESB pages of the IRQ number being mapped (or
> +	 * unmapped) into the guest and let the VM fault handler
> +	 * repopulate with the appropriate ESB pages (device or IC)
> +	 */
> +	pr_debug("clearing esb pages for girq 0x%lx\n", irq);
> +	mutex_lock(&xive->mapping_lock);
> +	if (xive->mapping)
> +		unmap_mapping_range(xive->mapping,
> +				    irq * (2ull << PAGE_SHIFT),
> +				    2ull << PAGE_SHIFT, 1);
> +	mutex_unlock(&xive->mapping_lock);
> +	return 0;
> +}
> +
> +static struct kvmppc_xive_ops kvmppc_xive_native_ops =  {
> +	.reset_mapped = kvmppc_xive_native_reset_mapped,
> +};
> +
>  static int xive_native_esb_fault(struct vm_fault *vmf)
>  {
>  	struct vm_area_struct *vma = vmf->vma;
> @@ -253,6 +283,8 @@ static const struct vm_operations_struct xive_native_tima_vmops = {
>  static int kvmppc_xive_native_mmap(struct kvm_device *dev,
>  				   struct vm_area_struct *vma)
>  {
> +	struct kvmppc_xive *xive = dev->private;
> +
>  	/* We only allow mappings at fixed offset for now */
>  	if (vma->vm_pgoff == KVM_XIVE_TIMA_PAGE_OFFSET) {
>  		if (vma_pages(vma) > 4)
> @@ -268,6 +300,13 @@ static int kvmppc_xive_native_mmap(struct kvm_device *dev,
>  
>  	vma->vm_flags |= VM_IO | VM_PFNMAP;
>  	vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot);
> +
> +	/*
> +	 * Grab the KVM device file address_space to be able to clear
> +	 * the ESB pages mapping when a device is passed-through into
> +	 * the guest.
> +	 */
> +	xive->mapping = vma->vm_file->f_mapping;
>  	return 0;
>  }
>  
> @@ -913,6 +952,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
>  	xive->dev = dev;
>  	xive->kvm = kvm;
>  	kvm->arch.xive = xive;
> +	mutex_init(&xive->mapping_lock);
>  
>  	/* We use the default queue size set by the host */
>  	xive->q_order = xive_native_default_eq_shift();
> @@ -933,6 +973,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
>  		ret = -ENOMEM;
>  
>  	xive->single_escalation = xive_native_has_single_escalation();
> +	xive->ops = &kvmppc_xive_native_ops;
>  
>  	if (ret)
>  		kfree(xive);
> diff --git a/Documentation/virtual/kvm/devices/xive.txt b/Documentation/virtual/kvm/devices/xive.txt
> index be5000b2eb5a..7a242cb07e7c 100644
> --- a/Documentation/virtual/kvm/devices/xive.txt
> +++ b/Documentation/virtual/kvm/devices/xive.txt
> @@ -43,6 +43,21 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
>    manage the source: to trigger, to EOI, to turn off the source for
>    instance.
>  
> +  3. Device passthrough
> +
> +  When a device is passed-through into the guest, the source
> +  interrupts are from a different HW controller (PHB4) and the ESB
> +  pages exposed to the guest should accommodate this change.
> +
> +  The passthru_irq helpers, kvmppc_xive_set_mapped() and
> +  kvmppc_xive_clr_mapped() are called when the device HW irqs are
> +  mapped into or unmapped from the guest IRQ number space. The KVM
> +  device extends these helpers to clear the ESB pages of the guest IRQ
> +  number being mapped and then lets the VM fault handler repopulate.
> +  The handler will insert the ESB page corresponding to the HW
> +  interrupt of the device being passed-through or the initial IPI ESB
> +  page if the device is being removed.

I think it might be worth emphasizing that this all happens within
KVM, and that userspace / the guest doesn't need to do anything about
this remapping.  Really this is an informational aside, not something
a user of the device actually needs to know.
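
For readers following the arithmetic in
kvmppc_xive_native_reset_mapped(), here is a minimal user-space sketch
of how a guest IRQ number maps to the byte range passed to
unmap_mapping_range(), i.e. irq * (2ull << PAGE_SHIFT) with a length of
2ull << PAGE_SHIFT.  The sketch_* names and the 64K page size are
assumptions for illustration only.  The inverse helper assumes the
fault handler recovers the IRQ as the page offset divided by two, which
is implied by the description above but not shown in this patch.

```c
#include <stdint.h>

/* Assumption: 64K pages; the kernel code uses PAGE_SHIFT instead. */
#define SKETCH_PAGE_SHIFT 16

/* Each guest IRQ owns two consecutive ESB pages in the device mapping. */
#define SKETCH_ESB_BYTES_PER_IRQ (2ull << SKETCH_PAGE_SHIFT)

/* Byte offset of the first ESB page of a guest IRQ (hypothetical helper). */
static uint64_t sketch_esb_range_start(uint64_t girq)
{
	return girq * SKETCH_ESB_BYTES_PER_IRQ;
}

/* Length in bytes of the range to clear for one guest IRQ. */
static uint64_t sketch_esb_range_len(void)
{
	return SKETCH_ESB_BYTES_PER_IRQ;
}

/*
 * Inverse direction, as a fault handler would need it: recover the
 * guest IRQ from a page offset within the mapping (assumed layout).
 */
static uint64_t sketch_girq_from_pgoff(uint64_t page_offset)
{
	return page_offset / 2;
}
```

With these assumptions, guest IRQ 1 covers bytes 0x20000 to 0x3ffff of
the device mapping, so clearing it leaves the pages of IRQ 0 and IRQ 2
untouched.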

>  * Groups:
>  
>    1. KVM_DEV_XIVE_GRP_CTRL

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
