From: Jike Song
Date: Fri, 30 Sep 2016 11:10:57 +0800
Subject: Re: [Qemu-devel] [PATCH v7 3/4] vfio iommu: Add support for mediated devices
To: Kirti Wankhede
Cc: alex.williamson@redhat.com, pbonzini@redhat.com, kraxel@redhat.com,
    cjia@nvidia.com, qemu-devel@nongnu.org, kvm@vger.kernel.org,
    kevin.tian@intel.com, bjsdjshi@linux.vnet.ibm.com, "Xiao, Guangrong"
Message-ID: <57EDD7C1.3070109@intel.com>
In-Reply-To: <57EDD4CF.2080806@intel.com>
References: <1472097235-6332-1-git-send-email-kwankhede@nvidia.com>
 <1472097235-6332-4-git-send-email-kwankhede@nvidia.com>
 <57EC79B3.2060305@intel.com>
 <57EDD4CF.2080806@intel.com>

On 09/30/2016 10:58 AM, Jike Song wrote:
> On 09/29/2016 11:06 PM, Kirti Wankhede wrote:
>>
>>
>> On 9/29/2016 7:47 AM, Jike Song wrote:
>>> +Guangrong
>>>
>>> On 08/25/2016 11:53 AM, Kirti Wankhede wrote:
>>
>> ...
>>
>>>> +static long vfio_iommu_type1_pin_pages(void *iommu_data,
>>>> +                                       unsigned long *user_pfn,
>>>> +                                       long npage, int prot,
>>>> +                                       unsigned long *phys_pfn)
>>>> +{
>>>> +	struct vfio_iommu *iommu = iommu_data;
>>>> +	struct vfio_domain *domain;
>>>> +	int i, j, ret;
>>>> +	long retpage;
>>>> +	unsigned long remote_vaddr;
>>>> +	unsigned long *pfn = phys_pfn;
>>>> +	struct vfio_dma *dma;
>>>> +	bool do_accounting = false;
>>>> +
>>>> +	if (!iommu || !user_pfn || !phys_pfn)
>>>> +		return -EINVAL;
>>>> +
>>>> +	mutex_lock(&iommu->lock);
>>>> +
>>>> +	if (!iommu->local_domain) {
>>>> +		ret = -EINVAL;
>>>> +		goto pin_done;
>>>> +	}
>>>> +
>>>> +	domain = iommu->local_domain;
>>>> +
>>>> +	/*
>>>> +	 * If iommu capable domain exists in the container then all pages are
>>>> +	 * already pinned and accounted. Accounting should be done if there is
>>>> +	 * no iommu capable domain in the container.
>>>> +	 */
>>>> +	do_accounting = !IS_IOMMU_CAPABLE_DOMAIN_IN_CONTAINER(iommu);
>>>> +
>>>> +	for (i = 0; i < npage; i++) {
>>>> +		struct vfio_pfn *p;
>>>> +		dma_addr_t iova;
>>>> +
>>>> +		iova = user_pfn[i] << PAGE_SHIFT;
>>>> +
>>>> +		dma = vfio_find_dma(iommu, iova, 0);
>>>> +		if (!dma) {
>>>> +			ret = -EINVAL;
>>>> +			goto pin_unwind;
>>>> +		}
>>>> +
>>>> +		remote_vaddr = dma->vaddr + iova - dma->iova;
>>>> +
>>>> +		retpage = __vfio_pin_pages_local(domain, remote_vaddr, prot,
>>>> +						 &pfn[i], do_accounting);
>>>
>>> Hi Kirti,
>>>
>>> Here you call __vfio_pin_pages_local() -> vaddr_get_pfn() -> GUP regardless
>>> of whether the vaddr is already pinned or not. That probably means, if the
>>> caller calls vfio_pin_pages() with the same GPA multiple times, you get
>>> memory leaks.
>>>
>>> GUP always increases the page refcnt.
>>>
>>> FWIW, I would like to have the pfn_list implemented with key == iova,
>>> so you can always try to find the PFN for a given iova, and pin it only if
>>> not found.
>>>
>>
>> I didn't get how there would be a memory leak.
>>
>> Right, GUP increases refcnt, so if vfio_pin_pages() is called
>> multiple times for the same GPA, refcnt would be incremented. In
>> vfio_iommu_type1_pin_pages() the pinned pages list is maintained with
>> a ref_count. If the pfn is already in the list, ref_count is incremented
>> and the same is used while unpinning pages.
>>
>
> Let's have a close look at vfio_unpin_pfn:
>
> static int vfio_unpin_pfn(struct vfio_domain *domain,
>                           struct vfio_pfn *vpfn, bool do_accounting)
> {
>         __vfio_unpin_pages_for_mdev(domain, vpfn->pfn, vpfn->prot,
>                                     do_accounting);
>
>         if (atomic_dec_and_test(&vpfn->ref_count))
>                 vfio_remove_from_pfn_list(domain, vpfn);
>
>         return 1;
> }
>
> Here you didn't call __vfio_unpin_pages_for_mdev() -- and thereby put_page()
> -- vpfn->ref_count times. If page->_refcount was increased by GUP (N) times,
> here you only set it back to (N-1).
>

What's more, since all pinned {iova, pfn} pairs are already saved, it's better
to consult that list before calling GUP, which will get_page() unconditionally.
Two rough, untested sketches of what I have in mind are below.
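
On the pin side, the loop body could look up the iova first and only GUP on
the first pin. This is only a sketch, not tested at all: vfio_find_vpfn_by_iova()
is a made-up helper that searches the per-domain pfn list with iova as the key,
and I'm assuming vfio_add_to_pfn_list() inserts a new entry with ref_count == 1;
the rest is taken from your patch:

	for (i = 0; i < npage; i++) {
		struct vfio_pfn *p;
		dma_addr_t iova;

		iova = user_pfn[i] << PAGE_SHIFT;

		dma = vfio_find_dma(iommu, iova, 0);
		if (!dma) {
			ret = -EINVAL;
			goto pin_unwind;
		}

		/* Already pinned?  Just take another logical reference. */
		p = vfio_find_vpfn_by_iova(domain, iova);
		if (p) {
			atomic_inc(&p->ref_count);
			pfn[i] = p->pfn;
			continue;
		}

		/* First pin of this iova: GUP exactly once. */
		remote_vaddr = dma->vaddr + iova - dma->iova;
		retpage = __vfio_pin_pages_local(domain, remote_vaddr, prot,
						 &pfn[i], do_accounting);
		if (retpage <= 0) {
			ret = retpage ? (int)retpage : -EFAULT;
			goto pin_unwind;
		}

		/* Record {iova, pfn} so later calls find it above. */
		ret = vfio_add_to_pfn_list(domain, iova, pfn[i], prot);
		if (ret) {
			__vfio_unpin_pages_for_mdev(domain, pfn[i], prot,
						    do_accounting);
			goto pin_unwind;
		}
	}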
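
With that in place, vfio_unpin_pfn() would drop the real page reference only
when the last logical pin goes away -- again untested, just to illustrate:

static int vfio_unpin_pfn(struct vfio_domain *domain,
			  struct vfio_pfn *vpfn, bool do_accounting)
{
	/*
	 * put_page() (via __vfio_unpin_pages_for_mdev) only when the last
	 * logical reference is dropped, matching the single GUP done at
	 * first-pin time.
	 */
	if (atomic_dec_and_test(&vpfn->ref_count)) {
		__vfio_unpin_pages_for_mdev(domain, vpfn->pfn, vpfn->prot,
					    do_accounting);
		vfio_remove_from_pfn_list(domain, vpfn);
	}

	return 1;
}

That way page->_refcount is raised exactly once per pinned pfn, no matter how
many times the same GPA is pinned and unpinned.

--
Thanks,
Jike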