From: Jike Song <jike.song@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>,
pbonzini@redhat.com, kraxel@redhat.com, cjia@nvidia.com,
qemu-devel@nongnu.org, kvm@vger.kernel.org, kevin.tian@intel.com,
shuai.ruan@intel.com, zhiyuan.lv@intel.com
Subject: Re: [Qemu-devel] [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu
Date: Thu, 05 May 2016 14:55:46 +0800 [thread overview]
Message-ID: <572AEE72.90008@intel.com> (raw)
In-Reply-To: <20160503164306.6a699fe3@t450s.home>
On 05/04/2016 06:43 AM, Alex Williamson wrote:
> On Tue, 3 May 2016 00:10:41 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>> +
>> +/*
>> + * Pin a set of guest PFNs and return their associated host PFNs for vGPU.
>> + * @vaddr [in]: array of guest PFNs
>> + * @npage [in]: count of array elements
>> + * @prot [in] : protection flags
>> + * @pfn_base[out] : array of host PFNs
>> + */
>> +int vfio_pin_pages(void *iommu_data, dma_addr_t *vaddr, long npage,
>> + int prot, dma_addr_t *pfn_base)
>> +{
>> + struct vfio_iommu *iommu = iommu_data;
>> + struct vfio_domain *domain = NULL, *domain_vgpu = NULL;
>> + int i = 0, ret = 0;
>> + long retpage;
>> + dma_addr_t remote_vaddr = 0;
>> + dma_addr_t *pfn = pfn_base;
>> + struct vfio_dma *dma;
>> +
>> + if (!iommu || !pfn_base)
>> + return -EINVAL;
>> +
>> + if (list_empty(&iommu->domain_list)) {
>> + ret = -EINVAL;
>> + goto pin_done;
>> + }
>> +
>> + get_first_domains(iommu, &domain, &domain_vgpu);
>> +
>> + // Return error if vGPU domain doesn't exist
>
> No c++ style comments please.
>
>> + if (!domain_vgpu) {
>> + ret = -EINVAL;
>> + goto pin_done;
>> + }
>> +
>> + for (i = 0; i < npage; i++) {
>> + struct vfio_vgpu_pfn *p;
>> + struct vfio_vgpu_pfn *lpfn;
>> + unsigned long tpfn;
>> + dma_addr_t iova;
>> +
>> + mutex_lock(&iommu->lock);
>> +
>> + iova = vaddr[i] << PAGE_SHIFT;
>> +
>> + dma = vfio_find_dma(iommu, iova, 0 /* size */);
>> + if (!dma) {
>> + mutex_unlock(&iommu->lock);
>> + ret = -EINVAL;
>> + goto pin_done;
>> + }
>> +
>> + remote_vaddr = dma->vaddr + iova - dma->iova;
>> +
>> + retpage = vfio_pin_pages_internal(domain_vgpu, remote_vaddr,
>> + (long)1, prot, &tpfn);
>> + mutex_unlock(&iommu->lock);
>> + if (retpage <= 0) {
>> + WARN_ON(!retpage);
>> + ret = (int)retpage;
>> + goto pin_done;
>> + }
>> +
>> + pfn[i] = tpfn;
>> +
>> + mutex_lock(&domain_vgpu->lock);
>> +
>> + // search if pfn exist
>> + if ((p = vfio_find_vgpu_pfn(domain_vgpu, tpfn))) {
>> + atomic_inc(&p->ref_count);
>> + mutex_unlock(&domain_vgpu->lock);
>> + continue;
>> + }
>
> The only reason I can come up with for why we'd want to integrate an
> api-only domain into the existing type1 code would be to avoid page
> accounting issues where we count locked pages once for a normal
> assigned device and again for a vgpu, but that's not what we're doing
> here. We're not only locking the pages again regardless of them
> already being locked, we're counting every time we lock them through
> this new interface. So there's really no point at all to making type1
> become this unsupportable. In that case we should be pulling out the
> common code that we want to share from type1 and making a new type1
> compatible vfio iommu backend rather than conditionalizing everything
> here.
>
>> +
>> + // add to pfn_list
>> + lpfn = kzalloc(sizeof(*lpfn), GFP_KERNEL);
>> + if (!lpfn) {
>> + ret = -ENOMEM;
>> + mutex_unlock(&domain_vgpu->lock);
>> + goto pin_done;
>> + }
>> + lpfn->vmm_va = remote_vaddr;
>> + lpfn->iova = iova;
>> + lpfn->pfn = pfn[i];
>> + lpfn->npage = 1;
>> + lpfn->prot = prot;
>> + atomic_inc(&lpfn->ref_count);
>> + vfio_link_vgpu_pfn(domain_vgpu, lpfn);
>> + mutex_unlock(&domain_vgpu->lock);
>> + }
>> +
>> + ret = i;
>> +
>> +pin_done:
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(vfio_pin_pages);
>> +
IIUC, an api-only domain is a VFIO domain *without* underlying IOMMU
hardware. It just, as you said in another mail, "rather than
programming them into an IOMMU for a device, it simply stores the
translations for use by later requests".
That imposes a constraint on gfx driver: hardware IOMMU must be disabled.
Otherwise, if IOMMU is present, the gfx driver eventually programs
the hardware IOMMU with IOVA returned by pci_map_page or dma_map_page;
Meanwhile, the IOMMU backend for vgpu only maintains GPA <-> HPA
translations without any knowledge about hardware IOMMU, how is the
device model supposed to do to get an IOVA for a given GPA (thereby HPA
by the IOMMU backend here)?
If things go as guessed above, as vfio_pin_pages() indicates, it
pin & translate vaddr to PFN, then it will be very difficult for the
device model to figure out:
1, for a given GPA, how to avoid calling dma_map_page multiple times?
2, for which page to call dma_unmap_page?
--
Thanks,
Jike
next prev parent reply other threads:[~2016-05-05 6:57 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-02 18:40 [Qemu-devel] [RFC PATCH v3 0/3] Add vGPU support Kirti Wankhede
2016-05-02 18:40 ` [Qemu-devel] [RFC PATCH v3 1/3] vGPU Core driver Kirti Wankhede
2016-05-03 22:43 ` Alex Williamson
2016-05-04 2:45 ` Tian, Kevin
2016-05-04 16:57 ` Alex Williamson
2016-05-05 8:58 ` Tian, Kevin
2016-05-04 2:58 ` Tian, Kevin
2016-05-12 8:22 ` Tian, Kevin
2016-05-04 13:31 ` Kirti Wankhede
2016-05-05 9:06 ` Tian, Kevin
2016-05-05 10:44 ` Kirti Wankhede
2016-05-05 12:07 ` Tian, Kevin
2016-05-05 12:57 ` Kirti Wankhede
2016-05-11 6:37 ` Tian, Kevin
2016-05-06 12:14 ` Jike Song
2016-05-06 16:16 ` Kirti Wankhede
2016-05-09 12:12 ` Jike Song
2016-05-02 18:40 ` [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device Kirti Wankhede
2016-05-03 22:43 ` Alex Williamson
2016-05-04 3:23 ` Tian, Kevin
2016-05-04 17:06 ` Alex Williamson
2016-05-04 21:14 ` Neo Jia
2016-05-05 4:42 ` Kirti Wankhede
2016-05-05 9:24 ` Tian, Kevin
2016-05-05 20:27 ` Neo Jia
2016-05-11 6:45 ` Tian, Kevin
2016-05-11 20:10 ` Alex Williamson
2016-05-12 0:59 ` Tian, Kevin
2016-05-04 16:25 ` Kirti Wankhede
2016-05-02 18:40 ` [Qemu-devel] [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu Kirti Wankhede
2016-05-03 10:40 ` Jike Song
2016-05-03 22:43 ` Alex Williamson
2016-05-04 3:39 ` Tian, Kevin
2016-05-05 6:55 ` Jike Song [this message]
2016-05-05 9:27 ` Tian, Kevin
2016-05-10 7:52 ` Jike Song
2016-05-10 16:02 ` Neo Jia
2016-05-11 9:15 ` Jike Song
2016-05-11 22:06 ` Alex Williamson
2016-05-12 4:11 ` Jike Song
2016-05-12 19:49 ` Neo Jia
2016-05-13 2:41 ` Tian, Kevin
2016-05-13 6:22 ` Jike Song
2016-05-13 6:43 ` Neo Jia
2016-05-13 7:30 ` Jike Song
2016-05-13 7:42 ` Neo Jia
2016-05-13 7:45 ` Tian, Kevin
2016-05-13 8:31 ` Neo Jia
2016-05-13 9:23 ` Jike Song
2016-05-13 15:50 ` Neo Jia
2016-05-16 6:57 ` Jike Song
2016-05-13 6:08 ` Jike Song
2016-05-13 6:41 ` Neo Jia
2016-05-13 7:13 ` Tian, Kevin
2016-05-13 7:38 ` Neo Jia
2016-05-13 8:02 ` Tian, Kevin
2016-05-13 8:41 ` Neo Jia
2016-05-12 8:00 ` Tian, Kevin
2016-05-12 19:05 ` Alex Williamson
2016-05-12 20:12 ` Neo Jia
2016-05-13 9:46 ` Jike Song
2016-05-13 15:48 ` Neo Jia
2016-05-16 2:27 ` Jike Song
2016-05-13 3:55 ` Tian, Kevin
2016-05-13 16:16 ` Alex Williamson
2016-05-13 7:10 ` Dong Jia
2016-05-13 7:24 ` Neo Jia
2016-05-13 8:39 ` Dong Jia
2016-05-13 9:05 ` Neo Jia
2016-05-19 7:28 ` Dong Jia
2016-05-20 3:21 ` Tian, Kevin
2016-06-06 6:59 ` Dong Jia
2016-06-07 2:47 ` Tian, Kevin
2016-06-07 7:04 ` Dong Jia
2016-05-05 7:51 ` Kirti Wankhede
2016-05-04 1:05 ` [Qemu-devel] [RFC PATCH v3 0/3] Add vGPU support Tian, Kevin
2016-05-04 6:17 ` Neo Jia
2016-05-04 17:07 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=572AEE72.90008@intel.com \
--to=jike.song@intel.com \
--cc=alex.williamson@redhat.com \
--cc=cjia@nvidia.com \
--cc=kevin.tian@intel.com \
--cc=kraxel@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kwankhede@nvidia.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=shuai.ruan@intel.com \
--cc=zhiyuan.lv@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).