From: Gleb Natapov <gleb@redhat.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: kvm@vger.kernel.org, Alexander Graf <agraf@suse.de>,
kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, Paul Mackerras <paulus@samba.org>,
Paolo Bonzini <pbonzini@redhat.com>,
linuxppc-dev@lists.ozlabs.org,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling
Date: Fri, 6 Sep 2013 09:01:47 +0300 [thread overview]
Message-ID: <20130906060147.GF13021@redhat.com> (raw)
In-Reply-To: <522915ED.7050106@ozlabs.ru>
On Fri, Sep 06, 2013 at 09:38:21AM +1000, Alexey Kardashevskiy wrote:
> On 09/06/2013 04:10 AM, Gleb Natapov wrote:
> > On Wed, Sep 04, 2013 at 02:01:28AM +1000, Alexey Kardashevskiy wrote:
> >> On 09/03/2013 08:53 PM, Gleb Natapov wrote:
> >>> On Mon, Sep 02, 2013 at 01:14:29PM +1000, Alexey Kardashevskiy wrote:
> >>>> On 09/01/2013 10:06 PM, Gleb Natapov wrote:
> >>>>> On Wed, Aug 28, 2013 at 06:50:41PM +1000, Alexey Kardashevskiy wrote:
> >>>>>> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
> >>>>>> and H_STUFF_TCE requests targeted an IOMMU TCE table without passing
> >>>>>> them to user space which saves time on switching to user space and back.
> >>>>>>
> >>>>>> Both real and virtual modes are supported. The kernel tries to
> >>>>>> handle a TCE request in the real mode, if fails it passes the request
> >>>>>> to the virtual mode to complete the operation. If it a virtual mode
> >>>>>> handler fails, the request is passed to user space.
> >>>>>>
> >>>>>> The first user of this is VFIO on POWER. Trampolines to the VFIO external
> >>>>>> user API functions are required for this patch.
> >>>>>>
> >>>>>> This adds a "SPAPR TCE IOMMU" KVM device to associate a logical bus
> >>>>>> number (LIOBN) with an VFIO IOMMU group fd and enable in-kernel handling
> >>>>>> of map/unmap requests. The device supports a single attribute which is
> >>>>>> a struct with LIOBN and IOMMU fd. When the attribute is set, the device
> >>>>>> establishes the connection between KVM and VFIO.
> >>>>>>
> >>>>>> Tests show that this patch increases transmission speed from 220MB/s
> >>>>>> to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).
> >>>>>>
> >>>>>> Signed-off-by: Paul Mackerras <paulus@samba.org>
> >>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> Changes:
> >>>>>> v9:
> >>>>>> * KVM_CAP_SPAPR_TCE_IOMMU ioctl to KVM replaced with "SPAPR TCE IOMMU"
> >>>>>> KVM device
> >>>>>> * release_spapr_tce_table() is not shared between different TCE types
> >>>>>> * reduced the patch size by moving VFIO external API
> >>>>>> trampolines to separate patche
> >>>>>> * moved documentation from Documentation/virtual/kvm/api.txt to
> >>>>>> Documentation/virtual/kvm/devices/spapr_tce_iommu.txt
> >>>>>>
> >>>>>> v8:
> >>>>>> * fixed warnings from check_patch.pl
> >>>>>>
> >>>>>> 2013/07/11:
> >>>>>> * removed multiple #ifdef IOMMU_API as IOMMU_API is always enabled
> >>>>>> for KVM_BOOK3S_64
> >>>>>> * kvmppc_gpa_to_hva_and_get also returns host phys address. Not much sense
> >>>>>> for this here but the next patch for hugepages support will use it more.
> >>>>>>
> >>>>>> 2013/07/06:
> >>>>>> * added realmode arch_spin_lock to protect TCE table from races
> >>>>>> in real and virtual modes
> >>>>>> * POWERPC IOMMU API is changed to support real mode
> >>>>>> * iommu_take_ownership and iommu_release_ownership are protected by
> >>>>>> iommu_table's locks
> >>>>>> * VFIO external user API use rewritten
> >>>>>> * multiple small fixes
> >>>>>>
> >>>>>> 2013/06/27:
> >>>>>> * tce_list page is referenced now in order to protect it from accident
> >>>>>> invalidation during H_PUT_TCE_INDIRECT execution
> >>>>>> * added use of the external user VFIO API
> >>>>>>
> >>>>>> 2013/06/05:
> >>>>>> * changed capability number
> >>>>>> * changed ioctl number
> >>>>>> * update the doc article number
> >>>>>>
> >>>>>> 2013/05/20:
> >>>>>> * removed get_user() from real mode handlers
> >>>>>> * kvm_vcpu_arch::tce_tmp usage extended. Now real mode handler puts there
> >>>>>> translated TCEs, tries realmode_get_page() on those and if it fails, it
> >>>>>> passes control over the virtual mode handler which tries to finish
> >>>>>> the request handling
> >>>>>> * kvmppc_lookup_pte() now does realmode_get_page() protected by BUSY bit
> >>>>>> on a page
> >>>>>> * The only reason to pass the request to user mode now is when the user mode
> >>>>>> did not register TCE table in the kernel, in all other cases the virtual mode
> >>>>>> handler is expected to do the job
> >>>>>> ---
> >>>>>> .../virtual/kvm/devices/spapr_tce_iommu.txt | 37 +++
> >>>>>> arch/powerpc/include/asm/kvm_host.h | 4 +
> >>>>>> arch/powerpc/kvm/book3s_64_vio.c | 310 ++++++++++++++++++++-
> >>>>>> arch/powerpc/kvm/book3s_64_vio_hv.c | 122 ++++++++
> >>>>>> arch/powerpc/kvm/powerpc.c | 1 +
> >>>>>> include/linux/kvm_host.h | 1 +
> >>>>>> virt/kvm/kvm_main.c | 5 +
> >>>>>> 7 files changed, 477 insertions(+), 3 deletions(-)
> >>>>>> create mode 100644 Documentation/virtual/kvm/devices/spapr_tce_iommu.txt
> >>>>>>
> >>>>>> diff --git a/Documentation/virtual/kvm/devices/spapr_tce_iommu.txt b/Documentation/virtual/kvm/devices/spapr_tce_iommu.txt
> >>>>>> new file mode 100644
> >>>>>> index 0000000..4bc8fc3
> >>>>>> --- /dev/null
> >>>>>> +++ b/Documentation/virtual/kvm/devices/spapr_tce_iommu.txt
> >>>>>> @@ -0,0 +1,37 @@
> >>>>>> +SPAPR TCE IOMMU device
> >>>>>> +
> >>>>>> +Capability: KVM_CAP_SPAPR_TCE_IOMMU
> >>>>>> +Architectures: powerpc
> >>>>>> +
> >>>>>> +Device type supported: KVM_DEV_TYPE_SPAPR_TCE_IOMMU
> >>>>>> +
> >>>>>> +Groups:
> >>>>>> + KVM_DEV_SPAPR_TCE_IOMMU_ATTR_LINKAGE
> >>>>>> + Attributes: single attribute with pair { LIOBN, IOMMU fd}
> >>>>>> +
> >>>>>> +This is completely made up device which provides API to link
> >>>>>> +logical bus number (LIOBN) and IOMMU group. The user space has
> >>>>>> +to create a new SPAPR TCE IOMMU device per a logical bus.
> >>>>>> +
> >>>>> Why not have one device that can handle multimple links?
> >>>>
> >>>>
> >>>> I can do that. If I make it so, it won't even look as a device at all, just
> >>>> some weird interface to KVM but ok. What bothers me is it is just a
> >>> May be I do not understand usage pattern here. Why do you feel that device
> >>> that can handle multiple links is worse than device per link? How many logical
> >>> buses is there usually? How often they created/destroyed? I am not insisting
> >>> on the change, just trying to understand why you do not like it.
> >>
> >>
> >> Is it usually one PCI host bus adapter per IOMMU group which is usually
> >> one PCI card or 2-3 cards if it is a legacy PCI-X, and they are created
> >> when QEMU-KVM starts. Not many. And they live till KVM ends.
> >>
> >> My point is why would I want to put all links to one device? It all is just
> >> a matter of taste and nothing more. Or I am missing something but I do not
> >> see what. If it is all about making thing to be kosher/halal/orthodox, then
> >> I have more stuff to do, like reworking the emulated TCEs. But if is it for
> >> (I do not know, just guessing) performance or something like that - then
> >> I'll fix it, I just need to know what I am fixing.
> >>
> > Each device creates an fd, if you can have a lot of them eventually this
> > will be a bottleneck. You are saying this is not the case, so lets go
> > with proposed interface.
>
>
> Did you decide not to answer the email which Ben sent yesterday or you just
> did not see it? Just checking :)
>
Haven't seen it. Which one?
--
Gleb.
next prev parent reply other threads:[~2013-09-06 6:01 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-28 8:37 [PATCH v9 00/13] KVM: PPC: IOMMU in-kernel handling of VFIO Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 01/13] KVM: PPC: POWERNV: move iommu_add_device earlier Alexey Kardashevskiy
2013-10-30 5:33 ` Bhushan Bharat-R65777
2013-08-28 8:37 ` [PATCH v9 02/13] hashtable: add hash_for_each_possible_rcu_notrace() Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 03/13] KVM: PPC: reserve a capability number for multitce support Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 04/13] KVM: PPC: reserve a capability and KVM device type for realmode VFIO Alexey Kardashevskiy
2013-09-01 11:27 ` Gleb Natapov
2013-09-01 11:39 ` Alexey Kardashevskiy
2013-09-01 12:04 ` Gleb Natapov
2013-08-28 8:37 ` [PATCH v9 05/13] powerpc: Prepare to support kernel handling of IOMMU map/unmap Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 06/13] powerpc: add real mode support for dma operations on powernv Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 07/13] KVM: PPC: enable IOMMU_API for KVM_BOOK3S_64 permanently Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 08/13] KVM: PPC: Add support for multiple-TCE hcalls Alexey Kardashevskiy
2013-08-28 8:37 ` [PATCH v9 09/13] powerpc/iommu: rework to support realmode Alexey Kardashevskiy
2013-08-28 8:49 ` [PATCH v9 10/13] KVM: PPC: remove warning from kvmppc_core_destroy_vm Alexey Kardashevskiy
2013-08-28 8:50 ` [PATCH v9 11/13] KVM: PPC: add trampolines for VFIO external API Alexey Kardashevskiy
2013-08-28 8:50 ` [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling Alexey Kardashevskiy
2013-09-01 12:06 ` Gleb Natapov
2013-09-02 3:14 ` Alexey Kardashevskiy
2013-09-03 10:53 ` Gleb Natapov
2013-09-03 16:01 ` Alexey Kardashevskiy
2013-09-05 18:10 ` Gleb Natapov
2013-09-05 23:38 ` Alexey Kardashevskiy
2013-09-06 6:01 ` Gleb Natapov [this message]
2013-09-06 6:06 ` Alexey Kardashevskiy
2013-09-05 4:05 ` Benjamin Herrenschmidt
2013-09-06 6:57 ` Gleb Natapov
2013-09-06 7:04 ` Alexey Kardashevskiy
2013-09-06 10:45 ` Alexey Kardashevskiy
2013-09-06 10:40 ` [PATCH v10 " Alexey Kardashevskiy
2013-09-08 14:12 ` Gleb Natapov
2013-08-28 8:51 ` [PATCH v9 13/13] KVM: PPC: Add hugepage " Alexey Kardashevskiy
2013-08-30 10:26 ` [PATCH v9 00/13] KVM: PPC: IOMMU in-kernel handling of VFIO Alexey Kardashevskiy
2013-08-30 17:58 ` Gleb Natapov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130906060147.GF13021@redhat.com \
--to=gleb@redhat.com \
--cc=agraf@suse.de \
--cc=aik@ozlabs.ru \
--cc=david@gibson.dropbear.id.au \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@samba.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).