From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tvrtko Ursulin Subject: Re: [PATCH 3/3] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Date: Wed, 12 Mar 2014 12:48:11 +0000 Message-ID: <5320578B.1070005@linux.intel.com> References: <1393008347-441-1-git-send-email-chris@chris-wilson.co.uk> <1393008347-441-3-git-send-email-chris@chris-wilson.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTP id 5869AFAC59 for ; Wed, 12 Mar 2014 05:48:29 -0700 (PDT) In-Reply-To: <1393008347-441-3-git-send-email-chris@chris-wilson.co.uk> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org To: Chris Wilson , intel-gfx@lists.freedesktop.org Cc: Akash Goel List-Id: intel-gfx@lists.freedesktop.org On 02/21/2014 06:45 PM, Chris Wilson wrote: > By exporting the ability to map user address and inserting PTEs > representing their backing pages into the GTT, we can exploit UMA in order > to utilize normal application data as a texture source or even as a > render target (depending upon the capabilities of the chipset). This has > a number of uses, with zero-copy downloads to the GPU and efficient > readback making the intermixed streaming of CPU and GPU operations > fairly efficient. This ability has many widespread implications from > faster rendering of client-side software rasterisers (chromium), > mitigation of stalls due to read back (firefox) and to faster pipelining > of texture data (such as pixel buffer objects in GL or data blobs in CL). > > v2: Compile with CONFIG_MMU_NOTIFIER > v3: We can sleep while performing invalidate-range, which we can utilise > to drop our page references prior to the kernel manipulating the vma > (for either discard or cloning) and so protect normal users. > v4: Only run the invalidate notifier if the range intercepts the bo. > v5: Prevent userspace from attempting to GTT mmap non-page aligned buffers > v6: Recheck after reacquire mutex for lost mmu. > v7: Fix implicit padding of ioctl struct by rounding to next 64bit boundary. > v8: Fix rebasing error after forwarding porting the back port. > v9: Limit the userptr to page aligned entries. We now expect userspace > to handle all the offset-in-page adjustments itself. > v10: Prevent vma from being copied across fork to avoid issues with cow. > v11: Drop vma behaviour changes -- locking is nigh on impossible. > Use a worker to load user pages to avoid lock inversions. > v12: Use get_task_mm()/mmput() for correct refcounting of mm. > v13: Use a worker to release the mmu_notifier to avoid lock inversion > v14: Decouple mmu_notifier from struct_mutex using a custom mmu_notifer > with its own locking and tree of objects for each mm/mmu_notifier. > v15: Prevent overlapping userptr objects, and invalidate all objects > within the mmu_notifier range > v16: Fix a typo for iterating over multiple objects in the range and > rearrange error path to destroy the mmu_notifier locklessly. > Also close a race between invalidate_range and the get_pages_worker. > v17: Close a race between get_pages_worker/invalidate_range and fresh > allocations of the same userptr range - and notice that > struct_mutex was presumed to be held when during creation it wasn't. > v18: Sigh. Fix the refactor of st_set_pages() to allocate enough memory > for the struct sg_table and to clear it before reporting an error. > v19: Always error out on read-only userptr requests as we don't have the > hardware infrastructure to support them at the moment. > v20: Refuse to implement read-only support until we have the required > infrastructure - but reserve the bit in flags for future use. > > Signed-off-by: Chris Wilson > Cc: Tvrtko Ursulin > Cc: "Gong, Zhipeng" > Cc: Akash Goel > Cc: "Volkin, Bradley D" > --- > drivers/gpu/drm/i915/Kconfig | 1 + > drivers/gpu/drm/i915/Makefile | 1 + > drivers/gpu/drm/i915/i915_dma.c | 1 + > drivers/gpu/drm/i915/i915_drv.h | 24 +- > drivers/gpu/drm/i915/i915_gem.c | 4 + > drivers/gpu/drm/i915/i915_gem_dmabuf.c | 5 + > drivers/gpu/drm/i915/i915_gem_userptr.c | 681 ++++++++++++++++++++++++++++++++ > drivers/gpu/drm/i915/i915_gpu_error.c | 2 + > include/uapi/drm/i915_drm.h | 16 + > 9 files changed, 734 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/i915/i915_gem_userptr.c I went through all of it once again, I mean apart from looking at it extensively while we were debugging it, so can now say: Reviewed-by: Tvrtko Ursulin Regards, Tvrtko