Message-ID: <57392FF4.3060501@intel.com>
Date: Mon, 16 May 2016 10:27:00 +0800
From: Jike Song
Subject: Re: [Qemu-devel] [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu
References: <572AEE72.90008@intel.com> <5731933B.90508@intel.com>
 <20160510160257.GA4125@nvidia.com> <5732F823.3090409@intel.com>
 <20160511160628.690876f9@t450s.home> <20160512130552.08974076@t450s.home>
 <20160512201258.GB24334@nvidia.com> <5735A269.5080909@intel.com>
 <20160513154853.GA11236@nvidia.com>
In-Reply-To: <20160513154853.GA11236@nvidia.com>
To: Neo Jia
Cc: Alex Williamson, "Tian, Kevin", Kirti Wankhede, "pbonzini@redhat.com",
 "kraxel@redhat.com", "qemu-devel@nongnu.org", "kvm@vger.kernel.org",
 "Ruan, Shuai", "Lv, Zhiyuan"

On 05/13/2016 11:48 PM, Neo Jia wrote:
> On Fri, May 13, 2016 at 05:46:17PM +0800, Jike Song wrote:
>> On 05/13/2016 04:12 AM, Neo Jia wrote:
>>> On Thu, May 12, 2016 at 01:05:52PM -0600, Alex Williamson wrote:
>>>>
>>>> If you're trying to equate the scale of what we need to track vs what
>>>> type1 currently tracks, they're significantly different.  Possible
>>>> things we need to track include the pfn, the iova, and possibly a
>>>> reference count or some sort of pinned page map.  In the pin-all model
>>>> we can assume that every page is pinned on map and unpinned on unmap,
>>>> so a reference count or map is unnecessary.  We can also assume that we
>>>> can always regenerate the pfn with get_user_pages() from the vaddr, so
>>>> we don't need to track that.
>>>
>>> Hi Alex,
>>>
>>> Thanks for pointing this out; we will not track those in our next rev, and
>>> get_user_pages() will be used from the vaddr as you suggested to handle the
>>> case of a single VM with both passthrough and mediated devices.
>>>
>>
>> Just a gut feeling:
>>
>> Calling GUP every time for a particular vaddr means taking mm->mmap_sem
>> every time for that process. If the VM has dozens of vCPUs, which is not
>> rare, the semaphore is likely to become a bottleneck.
>
> Hi Jike,
>
> We do need to hold mm->mmap_sem for the VMM/QEMU process, but I don't
> quite follow the reasoning about "dozens of vCPUs". One situation I can
> think of is another thread competing for the mmap_sem of the VMM/QEMU
> process within the KVM kernel, such as hva_to_pfn; after a quick search,
> it seems that is mostly used by the ioctl KVM_ASSIGN_PCI_DEVICE.
>

I meant that a guest write of a gfn into the GPU MMU can happen on any
vCPU, so a vmexit occurs and mmap_sem is required. But I now realize that
the same situation holds even if we store the pfn in the rbtree ..

> We will definitely conduct performance analysis with a large configuration
> on servers with E5-2697 v4. :-)

My homage :)

--
Thanks,
Jike
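
P.S. To make the mmap_sem concern concrete, here is a minimal sketch of
the kind of per-access translation path I have in mind, written against
the 4.6-era GUP API. vaddr_to_pfn() and the flag choices are my own
illustration, not code from Kirti's patch:

    /*
     * Hypothetical per-access path: every gfn -> pfn translation takes
     * mm->mmap_sem for read, so vCPUs that vmexit concurrently all
     * contend on the same semaphore of the QEMU process.
     */
    static int vaddr_to_pfn(struct mm_struct *mm, unsigned long vaddr,
                            unsigned long *pfn)
    {
            struct page *page;
            long ret;

            down_read(&mm->mmap_sem);
            /* tsk == NULL: no fault accounting needed */
            ret = get_user_pages_remote(NULL, mm, vaddr, 1,
                                        1 /* write */, 0 /* force */,
                                        &page, NULL);
            up_read(&mm->mmap_sem);

            if (ret != 1)
                    return ret < 0 ? (int)ret : -EFAULT;

            *pfn = page_to_pfn(page);
            /* page reference is held; caller must put_page() on unpin */
            return 0;
    }

With dozens of vCPUs hitting this path concurrently, even the read side
of the rwsem bounces its cacheline between CPUs, which is where I would
expect the bottleneck to show up.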