public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 1/2] drm/i915: Prefault the entire object on first page fault
@ 2014-06-10 11:14 Chris Wilson
  2014-06-10 11:14 ` [PATCH 2/2] drm/i915: Use remap_pfn_range() to prefault all PTE in a single pass Chris Wilson
  2014-06-11 20:41 ` [PATCH 1/2] drm/i915: Prefault the entire object on first page fault Volkin, Bradley D
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Wilson @ 2014-06-10 11:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: Goel, Akash

Inserting additional PTEs has no side-effect for us as the pfn are fixed
for the entire time the object is resident in the global GTT. The
downside is that we pay the entire cost of faulting the object upon the
first hit, for which we in return receive the benefit of removing the
per-page faulting overhead.

On an Ivybridge i7-3720qm with 1600MHz DDR3, with 32 fences,
Upload rate for 2 linear surfaces:	8127MiB/s -> 8134MiB/s
Upload rate for 2 tiled surfaces:	8607MiB/s -> 8625MiB/s
Upload rate for 4 linear surfaces:	8127MiB/s -> 8127MiB/s
Upload rate for 4 tiled surfaces:	8611MiB/s -> 8602MiB/s
Upload rate for 8 linear surfaces:	8114MiB/s -> 8124MiB/s
Upload rate for 8 tiled surfaces:	8601MiB/s -> 8603MiB/s
Upload rate for 16 linear surfaces:	8110MiB/s -> 8123MiB/s
Upload rate for 16 tiled surfaces:	8595MiB/s -> 8606MiB/s
Upload rate for 32 linear surfaces:	8104MiB/s -> 8121MiB/s
Upload rate for 32 tiled surfaces:	8589MiB/s -> 8605MiB/s
Upload rate for 64 linear surfaces:	8107MiB/s -> 8121MiB/s
Upload rate for 64 tiled surfaces:	2013MiB/s -> 3017MiB/s

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Goel, Akash" <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3aaf7e01235e..e1f68f06c2ef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1704,14 +1704,26 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	if (ret)
 		goto unpin;
 
-	obj->fault_mappable = true;
-
+	/* Finally, remap it using the new GTT offset */
 	pfn = dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj);
 	pfn >>= PAGE_SHIFT;
-	pfn += page_offset;
 
-	/* Finally, remap it using the new GTT offset */
-	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
+	if (!obj->fault_mappable) {
+		int i;
+
+		for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) {
+			ret = vm_insert_pfn(vma,
+					    (unsigned long)vma->vm_start + i * PAGE_SIZE,
+					    pfn + i);
+			if (ret)
+				break;
+		}
+
+		obj->fault_mappable = true;
+	} else
+		ret = vm_insert_pfn(vma,
+				    (unsigned long)vmf->virtual_address,
+				    pfn + page_offset);
 unpin:
 	i915_gem_object_ggtt_unpin(obj);
 unlock:
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread
* [PATCH 1/2] mm: Report attempts to overwrite PTE from remap_pfn_range()
@ 2014-06-13 16:26 Chris Wilson
  2014-06-13 16:26 ` [PATCH 2/2] drm/i915: Use remap_pfn_range() to prefault all PTE in a single pass Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2014-06-13 16:26 UTC (permalink / raw)
  To: intel-gfx
  Cc: Rik van Riel, Peter Zijlstra, Cyrill Gorcunov, linux-mm,
	Mel Gorman, Johannes Weiner, Andrew Morton, Kirill A. Shutemov

When using remap_pfn_range() from a fault handler, we are exposed to
races between concurrent faults. Rather than hitting a BUG, report the
error back to the caller, like vm_insert_pfn().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
---
 mm/memory.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 037b812a9531..6603a9e6a731 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2306,19 +2306,23 @@ static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
 {
 	pte_t *pte;
 	spinlock_t *ptl;
+	int ret = 0;
 
 	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
 	if (!pte)
 		return -ENOMEM;
 	arch_enter_lazy_mmu_mode();
 	do {
-		BUG_ON(!pte_none(*pte));
+		if (!pte_none(*pte)) {
+			ret = -EBUSY;
+			break;
+		}
 		set_pte_at(mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
 		pfn++;
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
 	pte_unmap_unlock(pte - 1, ptl);
-	return 0;
+	return ret;
 }
 
 static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-06-13 16:26 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-10 11:14 [PATCH 1/2] drm/i915: Prefault the entire object on first page fault Chris Wilson
2014-06-10 11:14 ` [PATCH 2/2] drm/i915: Use remap_pfn_range() to prefault all PTE in a single pass Chris Wilson
2014-06-11 21:17   ` Volkin, Bradley D
2014-06-12  7:15     ` Daniel Vetter
2014-06-12  7:20     ` Chris Wilson
2014-06-12 16:15     ` Daniel Vetter
2014-06-11 20:41 ` [PATCH 1/2] drm/i915: Prefault the entire object on first page fault Volkin, Bradley D
2014-06-12  7:21   ` Chris Wilson
  -- strict thread matches above, loose matches on Subject: below --
2014-06-13 16:26 [PATCH 1/2] mm: Report attempts to overwrite PTE from remap_pfn_range() Chris Wilson
2014-06-13 16:26 ` [PATCH 2/2] drm/i915: Use remap_pfn_range() to prefault all PTE in a single pass Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox