* [Intel-gfx] [PATCH 00/23] drm/i915: Use ww locking in execbuf submission.
@ 2020-07-03 12:21 Maarten Lankhorst
From: Maarten Lankhorst @ 2020-07-03 12:21 UTC (permalink / raw)
  To: intel-gfx

Change the locking hierarchy to put the request inside the ww acquire
context, and fix the selftests so that they hopefully pass now.

Maarten Lankhorst (23):
  Revert "drm/i915/gem: Async GPU relocations only"
  drm/i915: Revert relocation chaining commits.
  Revert "drm/i915/gem: Drop relocation slowpath".
  drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.
  drm/i915: Remove locking from i915_gem_object_prepare_read/write
  drm/i915: Parse command buffer earlier in eb_relocate(slow)
  Revert "drm/i915/gem: Split eb_vma into its own allocation"
  drm/i915: Use per object locking in execbuf, v12.
  drm/i915: Use ww locking in intel_renderstate.
  drm/i915: Add ww context handling to context_barrier_task
  drm/i915: Nuke arguments to eb_pin_engine
  drm/i915: Pin engine before pinning all objects, v4.
  drm/i915: Rework intel_context pinning to do everything outside of
    pin_mutex
  drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.
  drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as
    well, v2.
  drm/i915: Kill last user of intel_context_create_request outside of
    selftests
  drm/i915: Convert i915_perf to ww locking as well
  drm/i915: Dirty hack to fix selftests locking inversion
  drm/i915/selftests: Fix locking inversion in lrc selftest.
  drm/i915: Use ww pinning for intel_context_create_request()
  drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion,
    v2.
  drm/i915: Add ww locking to vm_fault_gtt
  drm/i915: Add ww locking to pin_to_display_plane

 drivers/gpu/drm/i915/display/intel_display.c  |    6 +-
 .../gpu/drm/i915/gem/i915_gem_client_blt.c    |   78 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   55 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |    4 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    |   89 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 1569 +++++++++++------
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |   51 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   40 +-
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    |  152 +-
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |    3 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |    9 +
 drivers/gpu/drm/i915/gem/i915_gem_pm.c        |    2 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c    |    2 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |    7 +-
 .../i915/gem/selftests/i915_gem_client_blt.c  |    2 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |   50 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |   47 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   57 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |   45 +-
 .../drm/i915/gem/selftests/i915_gem_phys.c    |    2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |    4 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.h          |    4 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  309 ++--
 drivers/gpu/drm/i915/gt/intel_context.h       |   13 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |    5 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |    2 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            |   23 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   37 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |   73 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.h   |    9 +-
 drivers/gpu/drm/i915/gt/intel_ring.c          |   10 +-
 drivers/gpu/drm/i915/gt/intel_ring.h          |    3 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   20 +-
 drivers/gpu/drm/i915/gt/intel_timeline.c      |   12 +-
 drivers/gpu/drm/i915/gt/intel_timeline.h      |    3 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   |   43 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |   14 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |   17 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c        |   30 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |    4 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    |    2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |    2 +-
 drivers/gpu/drm/i915/gvt/cmd_parser.c         |    3 +-
 drivers/gpu/drm/i915/i915_drv.h               |   13 +-
 drivers/gpu/drm/i915/i915_gem.c               |   89 +-
 drivers/gpu/drm/i915/i915_gem.h               |   12 +
 drivers/gpu/drm/i915/i915_perf.c              |   57 +-
 drivers/gpu/drm/i915/i915_vma.c               |   13 +-
 drivers/gpu/drm/i915/i915_vma.h               |   13 +-
 drivers/gpu/drm/i915/selftests/i915_gem.c     |   41 +
 drivers/gpu/drm/i915/selftests/i915_request.c |   18 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c     |    2 +-
 .../drm/i915/selftests/intel_memory_region.c  |    2 +-
 53 files changed, 2183 insertions(+), 989 deletions(-)


base-commit: 7faedc4873dd257f4ed064ab4e0a28407690ea73
prerequisite-patch-id: e6315738715ac4ffccaeb4c4bf5a94651fb8da1d
-- 
2.27.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [Intel-gfx] [PATCH 15/23] drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.
@ 2020-07-05 12:45 kernel test robot
From: kernel test robot @ 2020-07-05 12:45 UTC (permalink / raw)
  To: kbuild


CC: kbuild-all(a)lists.01.org
In-Reply-To: <20200703122221.591656-16-maarten.lankhorst@linux.intel.com>
References: <20200703122221.591656-16-maarten.lankhorst@linux.intel.com>
TO: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Hi Maarten,

I love your patch! Perhaps something to improve:

[auto build test WARNING on 7faedc4873dd257f4ed064ab4e0a28407690ea73]

url:    https://github.com/0day-ci/linux/commits/Maarten-Lankhorst/drm-i915-Use-ww-locking-in-execbuf-submission/20200703-202504
base:    7faedc4873dd257f4ed064ab4e0a28407690ea73
:::::: branch date: 2 days ago
:::::: commit date: 2 days ago
config: i386-randconfig-m021-20200701 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-14) 9.3.0

If you fix the issue, kindly add the following tags as appropriate:
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
drivers/gpu/drm/i915/gem/i915_gem_object_blt.c:367 i915_gem_object_copy_blt() warn: passing a valid pointer to 'PTR_ERR'
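
The new warning is easiest to see in isolation. In the function quoted
further down, `vma` is an array of two pointers, so `PTR_ERR(vma)` converts
the address of the array itself (always a valid pointer) rather than the
error encoded in `vma[1]`. The following is only a userspace sketch: the
ERR_PTR/IS_ERR/PTR_ERR helpers are simplified stand-ins modelled on the
kernel's, and `fake_vma`, `fake_vma_instance()` and `lookup_pair()` are
hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_vma { int id; };

/* Hypothetical lookup: returns an encoded -ENOMEM when asked to fail. */
static struct fake_vma *fake_vma_instance(struct fake_vma *backing, int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : backing;
}

/* Returns 0 on success or a negative errno, mirroring the two checks. */
int lookup_pair(int fail_src, int fail_dst)
{
	static struct fake_vma src_backing, dst_backing;
	struct fake_vma *vma[2];

	vma[0] = fake_vma_instance(&src_backing, fail_src);
	if (IS_ERR(vma[0]))
		return (int)PTR_ERR(vma[0]);

	vma[1] = fake_vma_instance(&dst_backing, fail_dst);
	if (IS_ERR(vma[1]))
		/* Decode the element that was tested, not the array `vma`. */
		return (int)PTR_ERR(vma[1]);

	return 0;
}
```

With the element decoded, a failure of either lookup is reported as the
same negative errno that the lookup encoded.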

Old smatch warnings:
drivers/gpu/drm/i915/gem/i915_gem_object.h:130 __i915_gem_object_lock() error: we previously assumed 'ww' could be null (see line 122)
drivers/gpu/drm/i915/gem/i915_gem_object_blt.c:139 move_obj_to_gpu() warn: maybe use && instead of &
drivers/gpu/drm/i915/gem/i915_gem_context.h:205 i915_gem_context_get_engine() warn: inconsistent indenting
drivers/gpu/drm/i915/gem/i915_gem_context.h:207 i915_gem_context_get_engine() warn: inconsistent indenting

# https://github.com/0day-ci/linux/commit/db8aba4953b274f1a757ff6f92b15697fd5af58a
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout db8aba4953b274f1a757ff6f92b15697fd5af58a
vim +/PTR_ERR +367 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c

05f219d709ec57 Matthew Auld      2019-08-10  350  
05f219d709ec57 Matthew Auld      2019-08-10  351  int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
05f219d709ec57 Matthew Auld      2019-08-10  352  			     struct drm_i915_gem_object *dst,
05f219d709ec57 Matthew Auld      2019-08-10  353  			     struct intel_context *ce)
05f219d709ec57 Matthew Auld      2019-08-10  354  {
05f219d709ec57 Matthew Auld      2019-08-10  355  	struct i915_address_space *vm = ce->vm;
05f219d709ec57 Matthew Auld      2019-08-10  356  	struct i915_vma *vma[2], *batch;
db8aba4953b274 Maarten Lankhorst 2020-07-03  357  	struct i915_gem_ww_ctx ww;
05f219d709ec57 Matthew Auld      2019-08-10  358  	struct i915_request *rq;
05f219d709ec57 Matthew Auld      2019-08-10  359  	int err, i;
05f219d709ec57 Matthew Auld      2019-08-10  360  
05f219d709ec57 Matthew Auld      2019-08-10  361  	vma[0] = i915_vma_instance(src, vm, NULL);
05f219d709ec57 Matthew Auld      2019-08-10  362  	if (IS_ERR(vma[0]))
05f219d709ec57 Matthew Auld      2019-08-10  363  		return PTR_ERR(vma[0]);
05f219d709ec57 Matthew Auld      2019-08-10  364  
05f219d709ec57 Matthew Auld      2019-08-10  365  	vma[1] = i915_vma_instance(dst, vm, NULL);
05f219d709ec57 Matthew Auld      2019-08-10  366  	if (IS_ERR(vma[1]))
db8aba4953b274 Maarten Lankhorst 2020-07-03 @367  		return PTR_ERR(vma);
05f219d709ec57 Matthew Auld      2019-08-10  368  
db8aba4953b274 Maarten Lankhorst 2020-07-03  369  	i915_gem_ww_ctx_init(&ww, true);
db8aba4953b274 Maarten Lankhorst 2020-07-03  370  	intel_engine_pm_get(ce->engine);
db8aba4953b274 Maarten Lankhorst 2020-07-03  371  retry:
db8aba4953b274 Maarten Lankhorst 2020-07-03  372  	err = i915_gem_object_lock(src, &ww);
db8aba4953b274 Maarten Lankhorst 2020-07-03  373  	if (!err)
db8aba4953b274 Maarten Lankhorst 2020-07-03  374  		err = i915_gem_object_lock(dst, &ww);
db8aba4953b274 Maarten Lankhorst 2020-07-03  375  	if (!err)
db8aba4953b274 Maarten Lankhorst 2020-07-03  376  		err = intel_context_pin_ww(ce, &ww);
db8aba4953b274 Maarten Lankhorst 2020-07-03  377  	if (err)
db8aba4953b274 Maarten Lankhorst 2020-07-03  378  		goto out;
db8aba4953b274 Maarten Lankhorst 2020-07-03  379  
db8aba4953b274 Maarten Lankhorst 2020-07-03  380  	err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
db8aba4953b274 Maarten Lankhorst 2020-07-03  381  	if (err)
db8aba4953b274 Maarten Lankhorst 2020-07-03  382  		goto out_ctx;
db8aba4953b274 Maarten Lankhorst 2020-07-03  383  
db8aba4953b274 Maarten Lankhorst 2020-07-03  384  	err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
05f219d709ec57 Matthew Auld      2019-08-10  385  	if (unlikely(err))
05f219d709ec57 Matthew Auld      2019-08-10  386  		goto out_unpin_src;
05f219d709ec57 Matthew Auld      2019-08-10  387  
db8aba4953b274 Maarten Lankhorst 2020-07-03  388  	batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
05f219d709ec57 Matthew Auld      2019-08-10  389  	if (IS_ERR(batch)) {
05f219d709ec57 Matthew Auld      2019-08-10  390  		err = PTR_ERR(batch);
05f219d709ec57 Matthew Auld      2019-08-10  391  		goto out_unpin_dst;
05f219d709ec57 Matthew Auld      2019-08-10  392  	}
05f219d709ec57 Matthew Auld      2019-08-10  393  
db8aba4953b274 Maarten Lankhorst 2020-07-03  394  	rq = i915_request_create(ce);
05f219d709ec57 Matthew Auld      2019-08-10  395  	if (IS_ERR(rq)) {
05f219d709ec57 Matthew Auld      2019-08-10  396  		err = PTR_ERR(rq);
05f219d709ec57 Matthew Auld      2019-08-10  397  		goto out_batch;
05f219d709ec57 Matthew Auld      2019-08-10  398  	}
05f219d709ec57 Matthew Auld      2019-08-10  399  
05f219d709ec57 Matthew Auld      2019-08-10  400  	err = intel_emit_vma_mark_active(batch, rq);
05f219d709ec57 Matthew Auld      2019-08-10  401  	if (unlikely(err))
05f219d709ec57 Matthew Auld      2019-08-10  402  		goto out_request;
05f219d709ec57 Matthew Auld      2019-08-10  403  
05f219d709ec57 Matthew Auld      2019-08-10  404  	for (i = 0; i < ARRAY_SIZE(vma); i++) {
223128f7671021 Tvrtko Ursulin    2020-06-15  405  		err = move_obj_to_gpu(vma[i]->obj, rq, i);
05f219d709ec57 Matthew Auld      2019-08-10  406  		if (unlikely(err))
db8aba4953b274 Maarten Lankhorst 2020-07-03  407  			goto out_request;
05f219d709ec57 Matthew Auld      2019-08-10  408  	}
05f219d709ec57 Matthew Auld      2019-08-10  409  
05f219d709ec57 Matthew Auld      2019-08-10  410  	for (i = 0; i < ARRAY_SIZE(vma); i++) {
05f219d709ec57 Matthew Auld      2019-08-10  411  		unsigned int flags = i ? EXEC_OBJECT_WRITE : 0;
05f219d709ec57 Matthew Auld      2019-08-10  412  
05f219d709ec57 Matthew Auld      2019-08-10  413  		err = i915_vma_move_to_active(vma[i], rq, flags);
05f219d709ec57 Matthew Auld      2019-08-10  414  		if (unlikely(err))
db8aba4953b274 Maarten Lankhorst 2020-07-03  415  			goto out_request;
05f219d709ec57 Matthew Auld      2019-08-10  416  	}
05f219d709ec57 Matthew Auld      2019-08-10  417  
05f219d709ec57 Matthew Auld      2019-08-10  418  	if (rq->engine->emit_init_breadcrumb) {
05f219d709ec57 Matthew Auld      2019-08-10  419  		err = rq->engine->emit_init_breadcrumb(rq);
05f219d709ec57 Matthew Auld      2019-08-10  420  		if (unlikely(err))
db8aba4953b274 Maarten Lankhorst 2020-07-03  421  			goto out_request;
05f219d709ec57 Matthew Auld      2019-08-10  422  	}
05f219d709ec57 Matthew Auld      2019-08-10  423  
05f219d709ec57 Matthew Auld      2019-08-10  424  	err = rq->engine->emit_bb_start(rq,
05f219d709ec57 Matthew Auld      2019-08-10  425  					batch->node.start, batch->node.size,
05f219d709ec57 Matthew Auld      2019-08-10  426  					0);
db8aba4953b274 Maarten Lankhorst 2020-07-03  427  
05f219d709ec57 Matthew Auld      2019-08-10  428  out_request:
05f219d709ec57 Matthew Auld      2019-08-10  429  	if (unlikely(err))
36e191f0644b20 Chris Wilson      2020-03-04  430  		i915_request_set_error_once(rq, err);
05f219d709ec57 Matthew Auld      2019-08-10  431  
05f219d709ec57 Matthew Auld      2019-08-10  432  	i915_request_add(rq);
05f219d709ec57 Matthew Auld      2019-08-10  433  out_batch:
05f219d709ec57 Matthew Auld      2019-08-10  434  	intel_emit_vma_release(ce, batch);
05f219d709ec57 Matthew Auld      2019-08-10  435  out_unpin_dst:
05f219d709ec57 Matthew Auld      2019-08-10  436  	i915_vma_unpin(vma[1]);
05f219d709ec57 Matthew Auld      2019-08-10  437  out_unpin_src:
05f219d709ec57 Matthew Auld      2019-08-10  438  	i915_vma_unpin(vma[0]);
db8aba4953b274 Maarten Lankhorst 2020-07-03  439  out_ctx:
db8aba4953b274 Maarten Lankhorst 2020-07-03  440  	intel_context_unpin(ce);
db8aba4953b274 Maarten Lankhorst 2020-07-03  441  out:
db8aba4953b274 Maarten Lankhorst 2020-07-03  442  	if (err == -EDEADLK) {
db8aba4953b274 Maarten Lankhorst 2020-07-03  443  		err = i915_gem_ww_ctx_backoff(&ww);
db8aba4953b274 Maarten Lankhorst 2020-07-03  444  		if (!err)
db8aba4953b274 Maarten Lankhorst 2020-07-03  445  			goto retry;
db8aba4953b274 Maarten Lankhorst 2020-07-03  446  	}
db8aba4953b274 Maarten Lankhorst 2020-07-03  447  	i915_gem_ww_ctx_fini(&ww);
db8aba4953b274 Maarten Lankhorst 2020-07-03  448  	intel_engine_pm_put(ce->engine);
05f219d709ec57 Matthew Auld      2019-08-10  449  	return err;
05f219d709ec57 Matthew Auld      2019-08-10  450  }
05f219d709ec57 Matthew Auld      2019-08-10  451  
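
The control flow of the listing above (take every lock under one ww
context, and on -EDEADLK drop everything and start over from `retry:`) can
be sketched in plain C. This is a single-threaded toy under stated
assumptions: `toy_obj`, `toy_ww_ctx` and the `toy_*()` functions are
hypothetical and are not the i915 or ww_mutex API, contention is simulated
with a counter, and a real backoff would typically also sleep on the
contended lock before retrying, which is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical): not the i915 or ww_mutex API. */
struct toy_obj {
	bool locked;
	int contended;	/* how many lock attempts will report -EDEADLK */
};

struct toy_ww_ctx {
	struct toy_obj *held[4];
	size_t nheld;
	int backoffs;
};

static void toy_ctx_init(struct toy_ww_ctx *ww)
{
	ww->nheld = 0;
	ww->backoffs = 0;
}

static int toy_lock(struct toy_obj *obj, struct toy_ww_ctx *ww)
{
	if (obj->contended > 0) {
		obj->contended--;
		return -EDEADLK;	/* simulated: another context wins */
	}
	assert(ww->nheld < 4);
	obj->locked = true;
	ww->held[ww->nheld++] = obj;
	return 0;
}

/* Release every lock held by the context. */
static void toy_unlock_all(struct toy_ww_ctx *ww)
{
	while (ww->nheld)
		ww->held[--ww->nheld]->locked = false;
}

/* Drop all locks so the whole acquire sequence can be restarted. */
static int toy_backoff(struct toy_ww_ctx *ww)
{
	toy_unlock_all(ww);
	ww->backoffs++;
	return 0;
}

/* Returns the number of backoffs that were needed, or a negative errno. */
int toy_copy(struct toy_obj *src, struct toy_obj *dst)
{
	struct toy_ww_ctx ww;
	int err;

	toy_ctx_init(&ww);
retry:
	err = toy_lock(src, &ww);
	if (!err)
		err = toy_lock(dst, &ww);
	/* ... pinning and request submission would happen here ... */
	if (err == -EDEADLK) {
		err = toy_backoff(&ww);
		if (!err)
			goto retry;
	}
	toy_unlock_all(&ww);
	return err ? err : ww.backoffs;
}
```

The point of the shape is that no lock acquired before the -EDEADLK
survives into the retry, so the sequence always restarts from a clean
state.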

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 41957 bytes --]

* [Intel-gfx] [PATCH 01/23] Revert "drm/i915/gem: Async GPU relocations only"
@ 2020-07-14 11:44 Maarten Lankhorst
From: Maarten Lankhorst @ 2020-07-14 11:44 UTC (permalink / raw)
  To: intel-gfx

This reverts commit 9e0f9464e2ab36b864359a59b0e9058fdef0ce47,
and related commit 7ac2d2536dfa7 ("drm/i915/gem: Delete unused code").

These commits break the execbuf ww locking series.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 314 ++++++++++++++++--
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  21 +-
 2 files changed, 308 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 6b4ec66cb558..446e76e95c38 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -45,6 +45,13 @@ struct eb_vma_array {
 	struct eb_vma vma[];
 };
 
+enum {
+	FORCE_CPU_RELOC = 1,
+	FORCE_GTT_RELOC,
+	FORCE_GPU_RELOC,
+#define DBG_FORCE_RELOC 0 /* choose one of the above! */
+};
+
 #define __EXEC_OBJECT_HAS_PIN		BIT(31)
 #define __EXEC_OBJECT_HAS_FENCE		BIT(30)
 #define __EXEC_OBJECT_NEEDS_MAP		BIT(29)
@@ -253,6 +260,8 @@ struct i915_execbuffer {
 	 */
 	struct reloc_cache {
 		struct drm_mm_node node; /** temporary GTT binding */
+		unsigned long vaddr; /** Current kmap address */
+		unsigned long page; /** Currently mapped page index */
 		unsigned int gen; /** Cached value of INTEL_GEN */
 		bool use_64bit_reloc : 1;
 		bool has_llc : 1;
@@ -596,6 +605,23 @@ eb_add_vma(struct i915_execbuffer *eb,
 	}
 }
 
+static inline int use_cpu_reloc(const struct reloc_cache *cache,
+				const struct drm_i915_gem_object *obj)
+{
+	if (!i915_gem_object_has_struct_page(obj))
+		return false;
+
+	if (DBG_FORCE_RELOC == FORCE_CPU_RELOC)
+		return true;
+
+	if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
+		return false;
+
+	return (cache->has_llc ||
+		obj->cache_dirty ||
+		obj->cache_level != I915_CACHE_NONE);
+}
+
 static int eb_reserve_vma(const struct i915_execbuffer *eb,
 			  struct eb_vma *ev,
 			  u64 pin_flags)
@@ -926,6 +952,8 @@ relocation_target(const struct drm_i915_gem_relocation_entry *reloc,
 static void reloc_cache_init(struct reloc_cache *cache,
 			     struct drm_i915_private *i915)
 {
+	cache->page = -1;
+	cache->vaddr = 0;
 	/* Must be a variable in the struct to allow GCC to unroll. */
 	cache->gen = INTEL_GEN(i915);
 	cache->has_llc = HAS_LLC(i915);
@@ -937,6 +965,25 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->target = NULL;
 }
 
+static inline void *unmask_page(unsigned long p)
+{
+	return (void *)(uintptr_t)(p & PAGE_MASK);
+}
+
+static inline unsigned int unmask_flags(unsigned long p)
+{
+	return p & ~PAGE_MASK;
+}
+
+#define KMAP 0x4 /* after CLFLUSH_FLAGS */
+
+static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache)
+{
+	struct drm_i915_private *i915 =
+		container_of(cache, struct i915_execbuffer, reloc_cache)->i915;
+	return &i915->ggtt;
+}
+
 #define RELOC_TAIL 4
 
 static int reloc_gpu_chain(struct reloc_cache *cache)
@@ -1049,6 +1096,181 @@ static int reloc_gpu_flush(struct reloc_cache *cache)
 	return err;
 }
 
+static void reloc_cache_reset(struct reloc_cache *cache)
+{
+	void *vaddr;
+
+	if (!cache->vaddr)
+		return;
+
+	vaddr = unmask_page(cache->vaddr);
+	if (cache->vaddr & KMAP) {
+		if (cache->vaddr & CLFLUSH_AFTER)
+			mb();
+
+		kunmap_atomic(vaddr);
+		i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm);
+	} else {
+		struct i915_ggtt *ggtt = cache_to_ggtt(cache);
+
+		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
+		io_mapping_unmap_atomic((void __iomem *)vaddr);
+
+		if (drm_mm_node_allocated(&cache->node)) {
+			ggtt->vm.clear_range(&ggtt->vm,
+					     cache->node.start,
+					     cache->node.size);
+			mutex_lock(&ggtt->vm.mutex);
+			drm_mm_remove_node(&cache->node);
+			mutex_unlock(&ggtt->vm.mutex);
+		} else {
+			i915_vma_unpin((struct i915_vma *)cache->node.mm);
+		}
+	}
+
+	cache->vaddr = 0;
+	cache->page = -1;
+}
+
+static void *reloc_kmap(struct drm_i915_gem_object *obj,
+			struct reloc_cache *cache,
+			unsigned long page)
+{
+	void *vaddr;
+
+	if (cache->vaddr) {
+		kunmap_atomic(unmask_page(cache->vaddr));
+	} else {
+		unsigned int flushes;
+		int err;
+
+		err = i915_gem_object_prepare_write(obj, &flushes);
+		if (err)
+			return ERR_PTR(err);
+
+		BUILD_BUG_ON(KMAP & CLFLUSH_FLAGS);
+		BUILD_BUG_ON((KMAP | CLFLUSH_FLAGS) & PAGE_MASK);
+
+		cache->vaddr = flushes | KMAP;
+		cache->node.mm = (void *)obj;
+		if (flushes)
+			mb();
+	}
+
+	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, page));
+	cache->vaddr = unmask_flags(cache->vaddr) | (unsigned long)vaddr;
+	cache->page = page;
+
+	return vaddr;
+}
+
+static void *reloc_iomap(struct drm_i915_gem_object *obj,
+			 struct reloc_cache *cache,
+			 unsigned long page)
+{
+	struct i915_ggtt *ggtt = cache_to_ggtt(cache);
+	unsigned long offset;
+	void *vaddr;
+
+	if (cache->vaddr) {
+		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
+		io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));
+	} else {
+		struct i915_vma *vma;
+		int err;
+
+		if (i915_gem_object_is_tiled(obj))
+			return ERR_PTR(-EINVAL);
+
+		if (use_cpu_reloc(cache, obj))
+			return NULL;
+
+		i915_gem_object_lock(obj);
+		err = i915_gem_object_set_to_gtt_domain(obj, true);
+		i915_gem_object_unlock(obj);
+		if (err)
+			return ERR_PTR(err);
+
+		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
+					       PIN_MAPPABLE |
+					       PIN_NONBLOCK /* NOWARN */ |
+					       PIN_NOEVICT);
+		if (IS_ERR(vma)) {
+			memset(&cache->node, 0, sizeof(cache->node));
+			mutex_lock(&ggtt->vm.mutex);
+			err = drm_mm_insert_node_in_range
+				(&ggtt->vm.mm, &cache->node,
+				 PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,
+				 0, ggtt->mappable_end,
+				 DRM_MM_INSERT_LOW);
+			mutex_unlock(&ggtt->vm.mutex);
+			if (err) /* no inactive aperture space, use cpu reloc */
+				return NULL;
+		} else {
+			cache->node.start = vma->node.start;
+			cache->node.mm = (void *)vma;
+		}
+	}
+
+	offset = cache->node.start;
+	if (drm_mm_node_allocated(&cache->node)) {
+		ggtt->vm.insert_page(&ggtt->vm,
+				     i915_gem_object_get_dma_address(obj, page),
+				     offset, I915_CACHE_NONE, 0);
+	} else {
+		offset += page << PAGE_SHIFT;
+	}
+
+	vaddr = (void __force *)io_mapping_map_atomic_wc(&ggtt->iomap,
+							 offset);
+	cache->page = page;
+	cache->vaddr = (unsigned long)vaddr;
+
+	return vaddr;
+}
+
+static void *reloc_vaddr(struct drm_i915_gem_object *obj,
+			 struct reloc_cache *cache,
+			 unsigned long page)
+{
+	void *vaddr;
+
+	if (cache->page == page) {
+		vaddr = unmask_page(cache->vaddr);
+	} else {
+		vaddr = NULL;
+		if ((cache->vaddr & KMAP) == 0)
+			vaddr = reloc_iomap(obj, cache, page);
+		if (!vaddr)
+			vaddr = reloc_kmap(obj, cache, page);
+	}
+
+	return vaddr;
+}
+
+static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
+{
+	if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
+		if (flushes & CLFLUSH_BEFORE) {
+			clflushopt(addr);
+			mb();
+		}
+
+		*addr = value;
+
+		/*
+		 * Writes to the same cacheline are serialised by the CPU
+		 * (including clflush). On the write path, we only require
+		 * that it hits memory in an orderly fashion and place
+		 * mb barriers at the start and end of the relocation phase
+		 * to ensure ordering of clflush wrt to the system.
+		 */
+		if (flushes & CLFLUSH_AFTER)
+			clflushopt(addr);
+	} else
+		*addr = value;
+}
+
 static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -1214,6 +1436,17 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
 	return cmd;
 }
 
+static inline bool use_reloc_gpu(struct i915_vma *vma)
+{
+	if (DBG_FORCE_RELOC == FORCE_GPU_RELOC)
+		return true;
+
+	if (DBG_FORCE_RELOC)
+		return false;
+
+	return !dma_resv_test_signaled_rcu(vma->resv, true);
+}
+
 static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
 {
 	struct page *page;
@@ -1228,10 +1461,10 @@ static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
 	return addr + offset_in_page(offset);
 }
 
-static int __reloc_entry_gpu(struct i915_execbuffer *eb,
-			     struct i915_vma *vma,
-			     u64 offset,
-			     u64 target_addr)
+static bool __reloc_entry_gpu(struct i915_execbuffer *eb,
+			      struct i915_vma *vma,
+			      u64 offset,
+			      u64 target_addr)
 {
 	const unsigned int gen = eb->reloc_cache.gen;
 	unsigned int len;
@@ -1247,7 +1480,7 @@ static int __reloc_entry_gpu(struct i915_execbuffer *eb,
 
 	batch = reloc_gpu(eb, vma, len);
 	if (IS_ERR(batch))
-		return PTR_ERR(batch);
+		return false;
 
 	addr = gen8_canonical_addr(vma->node.start + offset);
 	if (gen >= 8) {
@@ -1296,21 +1529,55 @@ static int __reloc_entry_gpu(struct i915_execbuffer *eb,
 		*batch++ = target_addr;
 	}
 
-	return 0;
+	return true;
+}
+
+static bool reloc_entry_gpu(struct i915_execbuffer *eb,
+			    struct i915_vma *vma,
+			    u64 offset,
+			    u64 target_addr)
+{
+	if (eb->reloc_cache.vaddr)
+		return false;
+
+	if (!use_reloc_gpu(vma))
+		return false;
+
+	return __reloc_entry_gpu(eb, vma, offset, target_addr);
 }
 
 static u64
-relocate_entry(struct i915_execbuffer *eb,
-	       struct i915_vma *vma,
+relocate_entry(struct i915_vma *vma,
 	       const struct drm_i915_gem_relocation_entry *reloc,
+	       struct i915_execbuffer *eb,
 	       const struct i915_vma *target)
 {
 	u64 target_addr = relocation_target(reloc, target);
-	int err;
-
-	err = __reloc_entry_gpu(eb, vma, reloc->offset, target_addr);
-	if (err)
-		return err;
+	u64 offset = reloc->offset;
+
+	if (!reloc_entry_gpu(eb, vma, offset, target_addr)) {
+		bool wide = eb->reloc_cache.use_64bit_reloc;
+		void *vaddr;
+
+repeat:
+		vaddr = reloc_vaddr(vma->obj,
+				    &eb->reloc_cache,
+				    offset >> PAGE_SHIFT);
+		if (IS_ERR(vaddr))
+			return PTR_ERR(vaddr);
+
+		GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32)));
+		clflush_write32(vaddr + offset_in_page(offset),
+				lower_32_bits(target_addr),
+				eb->reloc_cache.vaddr);
+
+		if (wide) {
+			offset += sizeof(u32);
+			target_addr >>= 32;
+			wide = false;
+			goto repeat;
+		}
+	}
 
 	return target->node.start | UPDATE;
 }
@@ -1375,7 +1642,8 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	 * If the relocation already has the right value in it, no
 	 * more work needs to be done.
 	 */
-	if (gen8_canonical_addr(target->vma->node.start) == reloc->presumed_offset)
+	if (!DBG_FORCE_RELOC &&
+	    gen8_canonical_addr(target->vma->node.start) == reloc->presumed_offset)
 		return 0;
 
 	/* Check that the relocation address is valid... */
@@ -1407,7 +1675,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	ev->flags &= ~EXEC_OBJECT_ASYNC;
 
 	/* and update the user's relocation entry */
-	return relocate_entry(eb, ev->vma, reloc, target->vma);
+	return relocate_entry(ev->vma, reloc, eb, target->vma);
 }
 
 static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
@@ -1445,8 +1713,10 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 		 * this is bad and so lockdep complains vehemently.
 		 */
 		copied = __copy_from_user(r, urelocs, count * sizeof(r[0]));
-		if (unlikely(copied))
-			return -EFAULT;
+		if (unlikely(copied)) {
+			remain = -EFAULT;
+			goto out;
+		}
 
 		remain -= count;
 		do {
@@ -1454,7 +1724,8 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 
 			if (likely(offset == 0)) {
 			} else if ((s64)offset < 0) {
-				return (int)offset;
+				remain = (int)offset;
+				goto out;
 			} else {
 				/*
 				 * Note that reporting an error now
@@ -1484,8 +1755,9 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 		} while (r++, --count);
 		urelocs += ARRAY_SIZE(stack);
 	} while (remain);
-
-	return 0;
+out:
+	reloc_cache_reset(&eb->reloc_cache);
+	return remain;
 }
 
 static int eb_relocate(struct i915_execbuffer *eb)
@@ -2392,7 +2664,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.i915 = i915;
 	eb.file = file;
 	eb.args = args;
-	if (!(args->flags & I915_EXEC_NO_RELOC))
+	if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC))
 		args->flags |= __EXEC_HAS_RELOC;
 
 	eb.exec = exec;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
index 57c14d3340cd..a49016f8ee0d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
@@ -37,14 +37,20 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 		return err;
 
 	/* 8-Byte aligned */
-	err = __reloc_entry_gpu(eb, vma, offsets[0] * sizeof(u32), 0);
-	if (err)
+	if (!__reloc_entry_gpu(eb, vma,
+			       offsets[0] * sizeof(u32),
+			       0)) {
+		err = -EIO;
 		goto unpin_vma;
+	}
 
 	/* !8-Byte aligned */
-	err = __reloc_entry_gpu(eb, vma, offsets[1] * sizeof(u32), 1);
-	if (err)
+	if (!__reloc_entry_gpu(eb, vma,
+			       offsets[1] * sizeof(u32),
+			       1)) {
+		err = -EIO;
 		goto unpin_vma;
+	}
 
 	/* Skip to the end of the cmd page */
 	i = PAGE_SIZE / sizeof(u32) - RELOC_TAIL - 1;
@@ -54,9 +60,12 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	eb->reloc_cache.rq_size += i;
 
 	/* Force batch chaining */
-	err = __reloc_entry_gpu(eb, vma, offsets[2] * sizeof(u32), 2);
-	if (err)
+	if (!__reloc_entry_gpu(eb, vma,
+			       offsets[2] * sizeof(u32),
+			       2)) {
+		err = -EIO;
 		goto unpin_vma;
+	}
 
 	GEM_BUG_ON(!eb->reloc_cache.rq);
 	rq = i915_request_get(eb->reloc_cache.rq);

base-commit: beb1c0b42c5368a48e782e5556be95c8332d28c6
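
In the restored CPU path of relocate_entry() above, a 64-bit relocation is
written as two aligned 32-bit stores: the low half at `offset`, then the
high half at `offset + 4`. A minimal userspace sketch of that split
follows. `toy_write_reloc()` is a hypothetical name, a plain byte buffer
stands in for the mapped page, and the cache flushing done by
clflush_write32() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write `target` into `buf` at `offset` as one 32-bit store, or as two
 * consecutive 32-bit stores (low half first) when `wide` is set.
 */
void toy_write_reloc(uint8_t *buf, size_t offset, uint64_t target, bool wide)
{
	for (;;) {
		uint32_t lo = (uint32_t)target;	/* lower 32 bits */

		assert(offset % sizeof(uint32_t) == 0);
		memcpy(buf + offset, &lo, sizeof(lo));

		if (!wide)
			break;

		/* Second pass: store the upper half in the next dword. */
		offset += sizeof(uint32_t);
		target >>= 32;
		wide = false;
	}
}
```

Each half is stored in host byte order, so reading the two dwords back
individually yields the low and high words of the original value.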
-- 
2.27.0
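
The reloc cache in the patch above keeps its flags in the low bits of
`cache->vaddr`: a mapped page address is page aligned, so the bits below
the page size are free, and unmask_page() / unmask_flags() separate the two
again. A small userspace sketch of that packing is shown here. The `TOY_*`
constants and `toy_*()` functions are hypothetical, and a 4 KiB page size
is assumed for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for the sketch; the kernel uses PAGE_SIZE/PAGE_MASK. */
#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_MASK (~(TOY_PAGE_SIZE - 1))
#define TOY_KMAP      0x4UL	/* hypothetical flag bit below the page size */

/* Combine a page-aligned address with flags that fit below the page size. */
uintptr_t toy_pack(uintptr_t page_addr, unsigned int flags)
{
	assert((page_addr & ~TOY_PAGE_MASK) == 0);	/* address is aligned */
	assert((flags & TOY_PAGE_MASK) == 0);		/* flags fit in low bits */
	return page_addr | flags;
}

/* Recover the page address by clearing the low (flag) bits. */
uintptr_t toy_unmask_page(uintptr_t p)
{
	return p & TOY_PAGE_MASK;
}

/* Recover the flags by keeping only the low bits. */
unsigned int toy_unmask_flags(uintptr_t p)
{
	return (unsigned int)(p & ~TOY_PAGE_MASK);
}
```

The two asserts in `toy_pack()` play the role that the BUILD_BUG_ON()
checks play in reloc_kmap(): they reject flag values that would collide
with address bits.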


Thread overview: 44+ messages
2020-07-03 12:21 [Intel-gfx] [PATCH 00/23] drm/i915: Use ww locking in execbuf submission Maarten Lankhorst
2020-07-03 12:21 ` [Intel-gfx] [PATCH 01/23] Revert "drm/i915/gem: Async GPU relocations only" Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 02/23] drm/i915: Revert relocation chaining commits Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 03/23] Revert "drm/i915/gem: Drop relocation slowpath" Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 04/23] drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2 Maarten Lankhorst
2020-07-03 13:45   ` Tvrtko Ursulin
2020-07-03 12:22 ` [Intel-gfx] [PATCH 05/23] drm/i915: Remove locking from i915_gem_object_prepare_read/write Maarten Lankhorst
2020-07-03 13:43   ` Tvrtko Ursulin
2020-07-06 12:53     ` Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 06/23] drm/i915: Parse command buffer earlier in eb_relocate(slow) Maarten Lankhorst
2020-07-03 13:49   ` Tvrtko Ursulin
2020-07-06 12:53     ` Maarten Lankhorst
2020-07-06 15:47       ` Tvrtko Ursulin
2020-07-03 12:22 ` [Intel-gfx] [PATCH 07/23] Revert "drm/i915/gem: Split eb_vma into its own allocation" Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 08/23] drm/i915: Use per object locking in execbuf, v12 Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 09/23] drm/i915: Use ww locking in intel_renderstate Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 10/23] drm/i915: Add ww context handling to context_barrier_task Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 11/23] drm/i915: Nuke arguments to eb_pin_engine Maarten Lankhorst
2020-07-03 13:38   ` Tvrtko Ursulin
2020-07-03 12:22 ` [Intel-gfx] [PATCH 12/23] drm/i915: Pin engine before pinning all objects, v4 Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 13/23] drm/i915: Rework intel_context pinning to do everything outside of pin_mutex Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 14/23] drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 15/23] drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2 Maarten Lankhorst
2020-07-07 14:32   ` Dan Carpenter
2020-07-07 14:32     ` Dan Carpenter
2020-07-03 12:22 ` [Intel-gfx] [PATCH 16/23] drm/i915: Kill last user of intel_context_create_request outside of selftests Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 17/23] drm/i915: Convert i915_perf to ww locking as well Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 18/23] drm/i915: Dirty hack to fix selftests locking inversion Maarten Lankhorst
2020-07-03 13:48   ` Tvrtko Ursulin
2020-07-07 10:19     ` Maarten Lankhorst
2020-07-07 10:56       ` Tvrtko Ursulin
2020-07-07 11:00         ` Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 19/23] drm/i915/selftests: Fix locking inversion in lrc selftest Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 20/23] drm/i915: Use ww pinning for intel_context_create_request() Maarten Lankhorst
2020-07-07 14:35   ` Dan Carpenter
2020-07-07 14:35     ` Dan Carpenter
2020-07-03 12:22 ` [Intel-gfx] [PATCH 21/23] drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v2 Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 22/23] drm/i915: Add ww locking to vm_fault_gtt Maarten Lankhorst
2020-07-03 12:22 ` [Intel-gfx] [PATCH 23/23] drm/i915: Add ww locking to pin_to_display_plane Maarten Lankhorst
2020-07-03 14:12 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Use ww locking in execbuf submission Patchwork
2020-07-03 14:13 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-07-03 14:36 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2020-07-05 12:45 [Intel-gfx] [PATCH 15/23] drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2 kernel test robot
2020-07-14 11:44 [Intel-gfx] [PATCH 01/23] Revert "drm/i915/gem: Async GPU relocations only" Maarten Lankhorst
2020-07-14 11:45 ` [Intel-gfx] [PATCH 15/23] drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2 Maarten Lankhorst
