From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras <paulus@samba.org>,
Alex Williamson <alex.williamson@redhat.com>,
kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH kernel 8/9] KVM: PPC: Add in-kernel handling for VFIO
Date: Wed, 9 Mar 2016 19:46:47 +1100
Message-ID: <56DFE2F7.80300@ozlabs.ru>
In-Reply-To: <20160308110812.GC22546@voom.fritz.box>
On 03/08/2016 10:08 PM, David Gibson wrote:
> On Mon, Mar 07, 2016 at 02:41:16PM +1100, Alexey Kardashevskiy wrote:
>> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
>> and H_STUFF_TCE requests targeted at an IOMMU TCE table used for VFIO
>> without passing them to user space, which saves time on switching
>> to user space and back.
>>
>> Both real and virtual modes are supported. The kernel tries to
>> handle a TCE request in real mode; if that fails, it passes the request
>> to virtual mode to complete the operation. If the virtual mode
>> handler fails, the request is passed to user space; this is not expected
>> to happen ever though.
>
> Well... not expected to happen with a qemu which uses this. Presumably
> it will fall back to userspace routinely if you have an old qemu that
> doesn't add the liobn mappings.
Ah. Ok, thanks, I'll add this to the commit log.
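To spell out the fallback order, here is a minimal userspace sketch. It assumes that a handler signals "cannot handle this here" by returning H_TOO_HARD and that any other value is final. The handler type, the stub handlers and model_put_tce() are made up for illustration; they are not the code in this patch.

```c
/* Hypercall return codes, values as in arch/powerpc/include/asm/hvcall.h. */
#define H_SUCCESS	0
#define H_PARAMETER	(-4)
#define H_TOO_HARD	9999

/* Hypothetical handler signature, for illustration only. */
typedef long (*model_tce_handler_t)(unsigned long liobn, unsigned long ioba,
		unsigned long tce);

enum model_handled_by { MODEL_REAL_MODE, MODEL_VIRT_MODE, MODEL_USERSPACE };

/* Stub handlers standing in for the real and virtual mode paths. */
static long model_ok(unsigned long liobn, unsigned long ioba, unsigned long tce)
{
	(void)liobn; (void)ioba; (void)tce;
	return H_SUCCESS;
}

static long model_too_hard(unsigned long liobn, unsigned long ioba,
		unsigned long tce)
{
	(void)liobn; (void)ioba; (void)tce;
	return H_TOO_HARD;
}

static long model_bad_param(unsigned long liobn, unsigned long ioba,
		unsigned long tce)
{
	(void)liobn; (void)ioba; (void)tce;
	return H_PARAMETER;
}

/*
 * Try real mode first, then virtual mode. Only H_TOO_HARD moves the
 * request to the next level; any other code goes back to the guest.
 * If both levels give up, the request is left for user space.
 */
static enum model_handled_by model_put_tce(model_tce_handler_t rm,
		model_tce_handler_t virt, unsigned long liobn,
		unsigned long ioba, unsigned long tce, long *ret)
{
	*ret = rm(liobn, ioba, tce);
	if (*ret != H_TOO_HARD)
		return MODEL_REAL_MODE;

	*ret = virt(liobn, ioba, tce);
	if (*ret != H_TOO_HARD)
		return MODEL_VIRT_MODE;

	return MODEL_USERSPACE;
}
```

With an old qemu that never registers the LIOBN, both levels would behave like model_too_hard() here and every request would end up in user space.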
>> The first user of this is VFIO on POWER. Trampolines to the VFIO external
>> user API functions are required for this patch.
>
> I'm not sure what you mean by "trampoline" here.
For example, look at kvm_vfio_group_get_external_user(). It calls
symbol_get(vfio_group_get_external_user) and then calls the function through
the returned pointer, so KVM has no link-time dependency on the VFIO module.
Is there a better word for this?
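To make the pattern concrete, here is a userspace model of it. The kernel's symbol_get()/symbol_put() are replaced by a tiny hand-written symbol table, and every name below (model_symbol_get, model_symbol_put, provider_double, call_provider) is hypothetical. It only shows the shape: look the symbol up at call time, fail gracefully if the provider is absent, call through the pointer, then drop the reference.

```c
#include <stddef.h>
#include <string.h>

typedef int (*ext_fn_t)(int);

/* One entry of the model symbol table. */
struct model_symbol {
	const char *name;
	ext_fn_t fn;
	int loaded;	/* is the providing "module" present? */
	int refs;	/* references currently held on the symbol */
};

/* Function exported by the optional provider. */
static int provider_double(int x)
{
	return 2 * x;
}

static struct model_symbol model_symtab[] = {
	{ "provider_double", provider_double, 1, 0 },
	{ "provider_missing", NULL, 0, 0 },
};

static struct model_symbol *model_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(model_symtab) / sizeof(model_symtab[0]); i++)
		if (!strcmp(model_symtab[i].name, name))
			return &model_symtab[i];
	return NULL;
}

/* Stand-in for symbol_get(): returns NULL if the provider is not there. */
static ext_fn_t model_symbol_get(const char *name)
{
	struct model_symbol *s = model_find(name);

	if (!s || !s->loaded)
		return NULL;
	s->refs++;
	return s->fn;
}

/* Stand-in for symbol_put(): drops the reference taken above. */
static void model_symbol_put(const char *name)
{
	struct model_symbol *s = model_find(name);

	if (s && s->refs > 0)
		s->refs--;
}

/* The "trampoline": resolve at call time, call, release. */
static int call_provider(const char *name, int arg, int *out)
{
	ext_fn_t fn = model_symbol_get(name);

	if (!fn)
		return -1;

	*out = fn(arg);
	model_symbol_put(name);
	return 0;
}
```

The caller never references provider_double directly, which is the property I care about.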
>> This uses a VFIO KVM device to associate a logical bus number (LIOBN)
>> with a VFIO IOMMU group fd and enable in-kernel handling of map/unmap
>> requests.
>
> Group fd? Or container fd? The group fd wouldn't make a lot of
> sense.
Group. KVM has no idea about containers.
>> To make use of the feature, the user space has to create a guest view
>> of the TCE table via KVM_CAP_SPAPR_TCE/KVM_CAP_SPAPR_TCE_64 and
>> then associate a LIOBN with this table via VFIO KVM device,
>> a KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE_LIOBN property (which is added in
>> the next patch).
>>
>> Tests show that this patch increases transmission speed from 220MB/s
>> to 750..1020MB/s on 10Gb network (Chelsio CXGB3 10Gb ethernet card).
>
> Is that with or without DDW (i.e. with or without a 64-bit DMA window)?
Without DDW; I should have mentioned this. The patch dates from the time when
there was no DDW :(
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> arch/powerpc/kvm/book3s_64_vio.c | 184 +++++++++++++++++++++++++++++++++++
>> arch/powerpc/kvm/book3s_64_vio_hv.c | 186 ++++++++++++++++++++++++++++++++++++
>> 2 files changed, 370 insertions(+)
>>
>> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
>> index 7965fc7..9417d12 100644
>> --- a/arch/powerpc/kvm/book3s_64_vio.c
>> +++ b/arch/powerpc/kvm/book3s_64_vio.c
>> @@ -33,6 +33,7 @@
>> #include <asm/kvm_ppc.h>
>> #include <asm/kvm_book3s.h>
>> #include <asm/mmu-hash64.h>
>> +#include <asm/mmu_context.h>
>> #include <asm/hvcall.h>
>> #include <asm/synch.h>
>> #include <asm/ppc-opcode.h>
>> @@ -317,11 +318,161 @@ fail:
>>  	return ret;
>>  }
>>
>> +static long kvmppc_tce_iommu_mapped_dec(struct iommu_table *tbl,
>> +		unsigned long entry)
>> +{
>> +	struct mm_iommu_table_group_mem_t *mem = NULL;
>> +	const unsigned long pgsize = 1ULL << tbl->it_page_shift;
>> +	unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry);
>> +
>> +	if (!pua)
>> +		return H_HARDWARE;
>> +
>> +	mem = mm_iommu_lookup(*pua, pgsize);
>> +	if (!mem)
>> +		return H_HARDWARE;
>> +
>> +	mm_iommu_mapped_dec(mem);
>> +
>> +	*pua = 0;
>> +
>> +	return H_SUCCESS;
>> +}
>> +
>> +static long kvmppc_tce_iommu_unmap(struct iommu_table *tbl,
>> +		unsigned long entry)
>> +{
>> +	enum dma_data_direction dir = DMA_NONE;
>> +	unsigned long hpa = 0;
>> +
>> +	if (iommu_tce_xchg(tbl, entry, &hpa, &dir))
>> +		return H_HARDWARE;
>> +
>> +	if (dir == DMA_NONE)
>> +		return H_SUCCESS;
>> +
>> +	return kvmppc_tce_iommu_mapped_dec(tbl, entry);
>> +}
>> +
>> +long kvmppc_tce_iommu_map(struct kvm *kvm, struct iommu_table *tbl,
>> +		unsigned long entry, unsigned long gpa,
>> +		enum dma_data_direction dir)
>> +{
>> +	long ret;
>> +	unsigned long hpa, ua, *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry);
>> +	struct mm_iommu_table_group_mem_t *mem;
>> +
>> +	if (!pua)
>> +		return H_HARDWARE;
>
> H_HARDWARE? Or H_PARAMETER? This essentially means the guest has
> supplied a bad physical address, doesn't it?
Well, maybe. I'll change it. If it is not H_TOO_HARD, it does not make any
difference after all :)
>> +	if (kvmppc_gpa_to_ua(kvm, gpa, &ua, NULL))
>> +		return H_HARDWARE;
>> +
>> +	mem = mm_iommu_lookup(ua, 1ULL << tbl->it_page_shift);
>> +	if (!mem)
>> +		return H_HARDWARE;
>> +
>> +	if (mm_iommu_ua_to_hpa(mem, ua, &hpa))
>> +		return H_HARDWARE;
>> +
>> +	if (mm_iommu_mapped_inc(mem))
>> +		return H_HARDWARE;
>> +
>> +	ret = iommu_tce_xchg(tbl, entry, &hpa, &dir);
>> +	if (ret) {
>> +		mm_iommu_mapped_dec(mem);
>> +		return H_TOO_HARD;
>> +	}
>> +
>> +	if (dir != DMA_NONE)
>> +		kvmppc_tce_iommu_mapped_dec(tbl, entry);
>> +
>> +	*pua = ua;
>
> IIUC this means you have a copy of the UA for every group attached to
> the TCE table, but they'll all be the same. Any way to avoid that
> duplication?
It is per container, not per group. On P8, I allow multiple groups to
share the same container. A container then has one or two iommu_table
structs, and each iommu_table has its own "ua" array. Since the tables
differ (window size, page size, content), their "ua" arrays differ too.
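A rough userspace model of what I mean, assuming one array slot per TCE entry, indexed relative to the start of the window. struct model_table and the model_* helpers are hypothetical names for this sketch, not the kernel's struct iommu_table or IOMMU_TABLE_USERSPACE_ENTRY().

```c
#include <stdlib.h>

/* Simplified stand-in for one DMA window of a container. */
struct model_table {
	unsigned long it_offset;	/* first entry number of the window */
	unsigned long it_size;		/* number of entries in the window */
	unsigned int it_page_shift;	/* IOMMU page size is 1 << shift */
	unsigned long *ua;		/* one userspace address per entry */
};

/* Allocate a zeroed "ua" array sized for this window only. */
static int model_table_init(struct model_table *tbl, unsigned long offset,
		unsigned long size, unsigned int page_shift)
{
	tbl->it_offset = offset;
	tbl->it_size = size;
	tbl->it_page_shift = page_shift;
	tbl->ua = calloc(size, sizeof(*tbl->ua));
	return tbl->ua ? 0 : -1;
}

static void model_table_free(struct model_table *tbl)
{
	free(tbl->ua);
	tbl->ua = NULL;
}

/* Return the slot for an entry, or NULL if it is outside the window. */
static unsigned long *model_ua_entry(struct model_table *tbl,
		unsigned long entry)
{
	if (!tbl->ua || entry < tbl->it_offset ||
			entry - tbl->it_offset >= tbl->it_size)
		return NULL;
	return &tbl->ua[entry - tbl->it_offset];
}
```

Two windows in one container (say a 4K one and a 64K one) each get their own array from model_table_init(), and an entry number valid in one window can be out of range in the other, so the arrays cannot be shared.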
--
Alexey