Date: Fri, 13 May 2016 08:48:53 -0700
From: Neo Jia
Message-ID: <20160513154853.GA11236@nvidia.com>
References: <572AEE72.90008@intel.com> <5731933B.90508@intel.com>
 <20160510160257.GA4125@nvidia.com> <5732F823.3090409@intel.com>
 <20160511160628.690876f9@t450s.home> <20160512130552.08974076@t450s.home>
 <20160512201258.GB24334@nvidia.com> <5735A269.5080909@intel.com>
In-Reply-To: <5735A269.5080909@intel.com>
Subject: Re: [Qemu-devel] [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to
 support with iommu and without iommu
To: Jike Song
Cc: Alex Williamson, "Tian, Kevin", Kirti Wankhede, "pbonzini@redhat.com",
 "kraxel@redhat.com", "qemu-devel@nongnu.org", "kvm@vger.kernel.org",
 "Ruan, Shuai", "Lv, Zhiyuan"

On Fri, May 13, 2016 at 05:46:17PM +0800, Jike Song wrote:
> On 05/13/2016 04:12 AM, Neo Jia wrote:
> > On Thu, May 12, 2016 at 01:05:52PM -0600, Alex Williamson wrote:
> >>
> >> If you're trying to equate the scale of what we need to track vs what
> >> type1 currently tracks, they're significantly different.  Possible
> >> things we need to track include the pfn, the iova, and possibly a
> >> reference count or some sort of pinned page map.
> >> In the pin-all model
> >> we can assume that every page is pinned on map and unpinned on unmap,
> >> so a reference count or map is unnecessary.  We can also assume that we
> >> can always regenerate the pfn with get_user_pages() from the vaddr, so
> >> we don't need to track that.
> >
> > Hi Alex,
> >
> > Thanks for pointing this out, we will not track those in our next rev and
> > get_user_pages will be used from the vaddr as you suggested to handle the
> > single VM with both passthru + mediated device case.
> >
>
> Just a gut feeling:
>
> Calling GUP every time for a particular vaddr, means locking mm->mmap_sem
> every time for a particular process. If the VM has dozens of VCPU, which
> is not rare, the semaphore is likely to be the bottleneck.

Hi Jike,

We do need to hold mm->mmap_sem for the VMM/QEMU process, but I don't quite
follow the reasoning about "dozens of vcpus". One situation I can think of is
other threads within the KVM kernel code competing for the mmap_sem of the
VMM/QEMU process, for example in hva_to_pfn. After a quick search, it seems
to be used mostly by the ioctl "KVM_ASSIGN_PCI_DEVICE".

We will definitely conduct performance analysis with large configurations on
servers with E5-2697 v4. :-)

Thanks,
Neo

>
>
> --
> Thanks,
> Jike
>
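
For reference, the trade-off discussed above (track pfn plus a reference
count per iova, or regenerate the pfn from the vaddr on every request) can be
sketched with a toy userspace model. This is Python, not kernel code, and
every name in it is hypothetical: `translate` stands in for get_user_pages(),
`_mm_lock` stands in for mm->mmap_sem, and `PinTracker` is not part of any
posted patch. It only shows that a tracker with reference counts takes the
shared lock on a miss, while a design without tracking would take it on every
pin.

```python
import threading


class PinTracker:
    """Toy model of per-iova pin tracking (hypothetical, userspace only)."""

    def __init__(self, translate):
        self._translate = translate         # stand-in for get_user_pages()
        self._mm_lock = threading.Lock()    # stand-in for mm->mmap_sem
        self._map_lock = threading.Lock()   # protects the tracker's own map
        self._pinned = {}                   # iova -> [pfn, refcount]
        self.translate_calls = 0            # how often the "mm lock" was taken

    def pin(self, iova, vaddr):
        """Return the pfn for iova, translating only on the first pin."""
        with self._map_lock:
            entry = self._pinned.get(iova)
            if entry is None:
                # Miss: this is the only path that takes the shared lock.
                with self._mm_lock:
                    self.translate_calls += 1
                    entry = [self._translate(vaddr), 0]
                self._pinned[iova] = entry
            entry[1] += 1
            return entry[0]

    def unpin(self, iova):
        """Drop one reference; forget the entry when the count reaches zero."""
        with self._map_lock:
            entry = self._pinned[iova]
            entry[1] -= 1
            if entry[1] == 0:
                del self._pinned[iova]

    def is_pinned(self, iova):
        with self._map_lock:
            return iova in self._pinned
```

Whether the extra bookkeeping pays off depends on how often the same iova is
pinned repeatedly, which is what the planned performance analysis would have
to show.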