public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: use a separate context for gpu relocs
@ 2019-08-24  0:20 Daniele Ceraolo Spurio
  2019-08-24  0:32 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-08-24  0:20 UTC (permalink / raw)
  To: intel-gfx

The CS pre-parser can pre-fetch commands across memory sync points and
starting from gen12 it is able to pre-fetch across BB_START and BB_END
boundaries as well, so when we emit gpu relocs the pre-parser might
fetch the target location of the reloc before the memory write lands.

The parser can't pre-fetch across the ctx switch, so we use a separate
context to guarantee that the memory is syncronized before the parser
can get to it.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 27 ++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index b5f6937369ea..d27201c654e9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,6 +252,7 @@ struct i915_execbuffer {
 		bool has_fence : 1;
 		bool needs_unfenced : 1;
 
+		struct intel_context *ce;
 		struct i915_request *rq;
 		u32 *rq_cmd;
 		unsigned int rq_size;
@@ -903,6 +904,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->has_fence = cache->gen < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
 	cache->node.allocated = false;
+	cache->ce = NULL;
 	cache->rq = NULL;
 	cache->rq_size = 0;
 }
@@ -943,6 +945,7 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
 static void reloc_cache_reset(struct reloc_cache *cache)
 {
 	void *vaddr;
+	struct intel_context *ce;
 
 	if (cache->rq)
 		reloc_gpu_flush(cache);
@@ -973,6 +976,10 @@ static void reloc_cache_reset(struct reloc_cache *cache)
 		}
 	}
 
+	ce = fetch_and_zero(&cache->ce);
+	if (ce)
+		intel_context_put(ce);
+
 	cache->vaddr = 0;
 	cache->page = -1;
 }
@@ -1168,7 +1175,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	if (err)
 		goto err_unmap;
 
-	rq = i915_request_create(eb->context);
+	rq = intel_context_create_request(cache->ce);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -1239,6 +1246,24 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
 		if (!intel_engine_can_store_dword(eb->engine))
 			return ERR_PTR(-ENODEV);
 
+		/*
+		 * The CS pre-parser can pre-fetch commands across memory sync
+		 * points and starting gen12 it is able to pre-fetch across
+		 * BB_START and BB_END boundaries (within the same context). We
+		 * use a separate context to guarantee that the reloc writes
+		 * land before the parser gets to the target memory location.
+		 */
+		if (!cache->ce) {
+			struct intel_context *ce;
+
+			ce = intel_context_create(eb->context->gem_context,
+						  eb->engine);
+			if (IS_ERR(ce))
+				return ERR_CAST(ce);
+
+			cache->ce = ce; /* released in reloc_cache_reset */
+		}
+
 		err = __reloc_gpu_alloc(eb, vma, len);
 		if (unlikely(err))
 			return ERR_PTR(err);
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915: use a separate context for gpu relocs
  2019-08-24  0:20 [PATCH] drm/i915: use a separate context for gpu relocs Daniele Ceraolo Spurio
@ 2019-08-24  0:32 ` Patchwork
  2019-08-24  1:25 ` ✗ Fi.CI.BAT: failure " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2019-08-24  0:32 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: use a separate context for gpu relocs
URL   : https://patchwork.freedesktop.org/series/65729/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0deec428eaaf drm/i915: use a separate context for gpu relocs
-:12: WARNING:TYPO_SPELLING: 'syncronized' may be misspelled - perhaps 'synchronized'?
#12: 
context to guarantee that the memory is syncronized before the parser

total: 0 errors, 1 warnings, 0 checks, 63 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: use a separate context for gpu relocs
  2019-08-24  0:20 [PATCH] drm/i915: use a separate context for gpu relocs Daniele Ceraolo Spurio
  2019-08-24  0:32 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2019-08-24  1:25 ` Patchwork
  2019-08-24  8:54 ` [PATCH] " Chris Wilson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2019-08-24  1:25 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: use a separate context for gpu relocs
URL   : https://patchwork.freedesktop.org/series/65729/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6780 -> Patchwork_14180
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_14180 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14180, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_14180:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_module_load@reload:
    - fi-byt-j1900:       [PASS][1] -> [DMESG-WARN][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-byt-j1900/igt@i915_module_load@reload.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-byt-j1900/igt@i915_module_load@reload.html
    - fi-kbl-r:           [PASS][3] -> [INCOMPLETE][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-kbl-r/igt@i915_module_load@reload.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-kbl-r/igt@i915_module_load@reload.html
    - fi-whl-u:           [PASS][5] -> [INCOMPLETE][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-whl-u/igt@i915_module_load@reload.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-whl-u/igt@i915_module_load@reload.html
    - fi-skl-iommu:       [PASS][7] -> [INCOMPLETE][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-iommu/igt@i915_module_load@reload.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-iommu/igt@i915_module_load@reload.html
    - fi-hsw-4770r:       [PASS][9] -> [DMESG-WARN][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-hsw-4770r/igt@i915_module_load@reload.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-hsw-4770r/igt@i915_module_load@reload.html
    - fi-elk-e7500:       [PASS][11] -> [DMESG-WARN][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-elk-e7500/igt@i915_module_load@reload.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-elk-e7500/igt@i915_module_load@reload.html
    - fi-skl-6700k2:      [PASS][13] -> [INCOMPLETE][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-6700k2/igt@i915_module_load@reload.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-6700k2/igt@i915_module_load@reload.html
    - fi-bsw-kefka:       [PASS][15] -> [INCOMPLETE][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-bsw-kefka/igt@i915_module_load@reload.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-bsw-kefka/igt@i915_module_load@reload.html
    - fi-bdw-5557u:       [PASS][17] -> [INCOMPLETE][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-bdw-5557u/igt@i915_module_load@reload.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-bdw-5557u/igt@i915_module_load@reload.html
    - fi-skl-guc:         [PASS][19] -> [INCOMPLETE][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-guc/igt@i915_module_load@reload.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-guc/igt@i915_module_load@reload.html
    - fi-kbl-guc:         [PASS][21] -> [INCOMPLETE][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-kbl-guc/igt@i915_module_load@reload.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-kbl-guc/igt@i915_module_load@reload.html
    - fi-cfl-8109u:       [PASS][23] -> [INCOMPLETE][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-cfl-8109u/igt@i915_module_load@reload.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-cfl-8109u/igt@i915_module_load@reload.html
    - fi-bdw-gvtdvm:      [PASS][25] -> [INCOMPLETE][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-bdw-gvtdvm/igt@i915_module_load@reload.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-bdw-gvtdvm/igt@i915_module_load@reload.html
    - fi-kbl-7500u:       [PASS][27] -> [INCOMPLETE][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-kbl-7500u/igt@i915_module_load@reload.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-kbl-7500u/igt@i915_module_load@reload.html
    - fi-cfl-8700k:       [PASS][29] -> [INCOMPLETE][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-cfl-8700k/igt@i915_module_load@reload.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-cfl-8700k/igt@i915_module_load@reload.html
    - fi-snb-2520m:       [PASS][31] -> [DMESG-WARN][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-snb-2520m/igt@i915_module_load@reload.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-snb-2520m/igt@i915_module_load@reload.html
    - fi-hsw-4770:        [PASS][33] -> [DMESG-WARN][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-hsw-4770/igt@i915_module_load@reload.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-hsw-4770/igt@i915_module_load@reload.html
    - fi-cfl-guc:         [PASS][35] -> [INCOMPLETE][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-cfl-guc/igt@i915_module_load@reload.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-cfl-guc/igt@i915_module_load@reload.html
    - fi-skl-6770hq:      [PASS][37] -> [INCOMPLETE][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-6770hq/igt@i915_module_load@reload.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-6770hq/igt@i915_module_load@reload.html
    - fi-bsw-n3050:       [PASS][39] -> [INCOMPLETE][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-bsw-n3050/igt@i915_module_load@reload.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-bsw-n3050/igt@i915_module_load@reload.html
    - fi-ilk-650:         [PASS][41] -> [DMESG-WARN][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-ilk-650/igt@i915_module_load@reload.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-ilk-650/igt@i915_module_load@reload.html
    - fi-ivb-3770:        [PASS][43] -> [DMESG-WARN][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-ivb-3770/igt@i915_module_load@reload.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-ivb-3770/igt@i915_module_load@reload.html
    - fi-skl-lmem:        [PASS][45] -> [INCOMPLETE][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-lmem/igt@i915_module_load@reload.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-lmem/igt@i915_module_load@reload.html
    - fi-skl-gvtdvm:      [PASS][47] -> [INCOMPLETE][48]
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-gvtdvm/igt@i915_module_load@reload.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-gvtdvm/igt@i915_module_load@reload.html
    - fi-hsw-peppy:       [PASS][49] -> [DMESG-WARN][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-hsw-peppy/igt@i915_module_load@reload.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-hsw-peppy/igt@i915_module_load@reload.html
    - fi-skl-6260u:       [PASS][51] -> [INCOMPLETE][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-6260u/igt@i915_module_load@reload.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-6260u/igt@i915_module_load@reload.html
    - fi-byt-n2820:       [PASS][53] -> [DMESG-WARN][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-byt-n2820/igt@i915_module_load@reload.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-byt-n2820/igt@i915_module_load@reload.html
    - fi-kbl-7567u:       [PASS][55] -> [INCOMPLETE][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-kbl-7567u/igt@i915_module_load@reload.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-kbl-7567u/igt@i915_module_load@reload.html
    - fi-kbl-x1275:       [PASS][57] -> [INCOMPLETE][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-kbl-x1275/igt@i915_module_load@reload.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-kbl-x1275/igt@i915_module_load@reload.html
    - fi-bwr-2160:        [PASS][59] -> [DMESG-WARN][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-bwr-2160/igt@i915_module_load@reload.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-bwr-2160/igt@i915_module_load@reload.html

  
#### Warnings ####

  * igt@i915_module_load@reload:
    - fi-blb-e6850:       [INCOMPLETE][61] ([fdo#107718]) -> [DMESG-WARN][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-blb-e6850/igt@i915_module_load@reload.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-blb-e6850/igt@i915_module_load@reload.html

  
Known issues
------------

  Here are the changes found in Patchwork_14180 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_tiled_fence_blits@basic:
    - fi-icl-u2:          [PASS][63] -> [INCOMPLETE][64] ([fdo#107713])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-icl-u2/igt@gem_tiled_fence_blits@basic.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-icl-u2/igt@gem_tiled_fence_blits@basic.html

  * igt@i915_module_load@reload:
    - fi-kbl-8809g:       [PASS][65] -> [INCOMPLETE][66] ([fdo#103665])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-kbl-8809g/igt@i915_module_load@reload.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-kbl-8809g/igt@i915_module_load@reload.html
    - fi-apl-guc:         [PASS][67] -> [INCOMPLETE][68] ([fdo#103927])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-apl-guc/igt@i915_module_load@reload.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-apl-guc/igt@i915_module_load@reload.html
    - fi-skl-6600u:       [PASS][69] -> [INCOMPLETE][70] ([fdo#104108])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-skl-6600u/igt@i915_module_load@reload.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-skl-6600u/igt@i915_module_load@reload.html
    - fi-glk-dsi:         [PASS][71] -> [INCOMPLETE][72] ([fdo#103359] / [k.org#198133])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-glk-dsi/igt@i915_module_load@reload.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-glk-dsi/igt@i915_module_load@reload.html
    - fi-cml-u2:          [PASS][73] -> [INCOMPLETE][74] ([fdo#110566])
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-cml-u2/igt@i915_module_load@reload.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-cml-u2/igt@i915_module_load@reload.html
    - fi-bxt-dsi:         [PASS][75] -> [INCOMPLETE][76] ([fdo#103927])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-bxt-dsi/igt@i915_module_load@reload.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-bxt-dsi/igt@i915_module_load@reload.html
    - fi-snb-2600:        [PASS][77] -> [DMESG-WARN][78] ([fdo#110789])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-snb-2600/igt@i915_module_load@reload.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-snb-2600/igt@i915_module_load@reload.html

  
#### Possible fixes ####

  * igt@gem_ctx_create@basic-files:
    - {fi-icl-guc}:       [INCOMPLETE][79] ([fdo#107713] / [fdo#109100]) -> [PASS][80]
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-icl-guc/igt@gem_ctx_create@basic-files.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-icl-guc/igt@gem_ctx_create@basic-files.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       [DMESG-WARN][81] ([fdo#102614]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6780/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#110566]: https://bugs.freedesktop.org/show_bug.cgi?id=110566
  [fdo#110789]: https://bugs.freedesktop.org/show_bug.cgi?id=110789
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (54 -> 47)
------------------------------

  Additional (1): fi-icl-u3 
  Missing    (8): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6780 -> Patchwork_14180

  CI-20190529: 20190529
  CI_DRM_6780: 2c229f1097cb55fe89671cc5ef1ffdd07a68679d @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5149: 6756ede680ee12745393360d7cc87cc0eb733ff6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14180: 0deec428eaafa63524eae533534cf39668726679 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

0deec428eaaf drm/i915: use a separate context for gpu relocs

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14180/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: use a separate context for gpu relocs
  2019-08-24  0:20 [PATCH] drm/i915: use a separate context for gpu relocs Daniele Ceraolo Spurio
  2019-08-24  0:32 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
  2019-08-24  1:25 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2019-08-24  8:54 ` Chris Wilson
  2019-08-26 17:56   ` Daniele Ceraolo Spurio
  2019-08-24  9:01 ` Chris Wilson
  2019-08-24  9:03 ` Chris Wilson
  4 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2019-08-24  8:54 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-08-24 01:20:22)
> The CS pre-parser can pre-fetch commands across memory sync points and
> starting from gen12 it is able to pre-fetch across BB_START and BB_END
> boundaries as well, so when we emit gpu relocs the pre-parser might
> fetch the target location of the reloc before the memory write lands.
> 
> The parser can't pre-fetch across the ctx switch, so we use a separate
> context to guarantee that the memory is syncronized before the parser
> can get to it.
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 27 ++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index b5f6937369ea..d27201c654e9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -252,6 +252,7 @@ struct i915_execbuffer {
>                 bool has_fence : 1;
>                 bool needs_unfenced : 1;
>  
> +               struct intel_context *ce;
>                 struct i915_request *rq;
>                 u32 *rq_cmd;
>                 unsigned int rq_size;
> @@ -903,6 +904,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
>         cache->has_fence = cache->gen < 4;
>         cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
>         cache->node.allocated = false;
> +       cache->ce = NULL;
>         cache->rq = NULL;
>         cache->rq_size = 0;
>  }
> @@ -943,6 +945,7 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
>  static void reloc_cache_reset(struct reloc_cache *cache)
>  {
>         void *vaddr;
> +       struct intel_context *ce;
>  
>         if (cache->rq)
>                 reloc_gpu_flush(cache);
> @@ -973,6 +976,10 @@ static void reloc_cache_reset(struct reloc_cache *cache)
>                 }
>         }
>  
> +       ce = fetch_and_zero(&cache->ce);
> +       if (ce)
> +               intel_context_put(ce);

We don't need to create a new context between every buffer, it can be
released in eb_destroy.

> +
>         cache->vaddr = 0;
>         cache->page = -1;
>  }
> @@ -1168,7 +1175,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>         if (err)
>                 goto err_unmap;
>  
> -       rq = i915_request_create(eb->context);
> +       rq = intel_context_create_request(cache->ce);
>         if (IS_ERR(rq)) {
>                 err = PTR_ERR(rq);
>                 goto err_unpin;
> @@ -1239,6 +1246,24 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
>                 if (!intel_engine_can_store_dword(eb->engine))
>                         return ERR_PTR(-ENODEV);
>  
> +               /*
> +                * The CS pre-parser can pre-fetch commands across memory sync
> +                * points and starting gen12 it is able to pre-fetch across
> +                * BB_START and BB_END boundaries (within the same context). We
> +                * use a separate context to guarantee that the reloc writes
> +                * land before the parser gets to the target memory location.
> +                */
> +               if (!cache->ce) {
> +                       struct intel_context *ce;
> +

I was thinking of

	if (cache->gen >= 12)
> +                       ce = intel_context_create(eb->context->gem_context,
> +                                                 eb->engine);
else
	ce = intel_context_get(eb->context);

Note that this requires us to fix the tagging so that we don't perform a
lite-restore from the reloc instance to the user instance.

But yes, that's more or less the same as the sketch I did :)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: use a separate context for gpu relocs
  2019-08-24  0:20 [PATCH] drm/i915: use a separate context for gpu relocs Daniele Ceraolo Spurio
                   ` (2 preceding siblings ...)
  2019-08-24  8:54 ` [PATCH] " Chris Wilson
@ 2019-08-24  9:01 ` Chris Wilson
  2019-08-24  9:03 ` Chris Wilson
  4 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2019-08-24  9:01 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-08-24 01:20:22)
> The CS pre-parser can pre-fetch commands across memory sync points and
> starting from gen12 it is able to pre-fetch across BB_START and BB_END
> boundaries as well, so when we emit gpu relocs the pre-parser might
> fetch the target location of the reloc before the memory write lands.
> 
> The parser can't pre-fetch across the ctx switch, so we use a separate
> context to guarantee that the memory is syncronized before the parser
> can get to it.
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 27 ++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index b5f6937369ea..d27201c654e9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -252,6 +252,7 @@ struct i915_execbuffer {
>                 bool has_fence : 1;
>                 bool needs_unfenced : 1;
>  
> +               struct intel_context *ce;
>                 struct i915_request *rq;
>                 u32 *rq_cmd;
>                 unsigned int rq_size;
> @@ -903,6 +904,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
>         cache->has_fence = cache->gen < 4;
>         cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
>         cache->node.allocated = false;
> +       cache->ce = NULL;
>         cache->rq = NULL;
>         cache->rq_size = 0;
>  }
> @@ -943,6 +945,7 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
>  static void reloc_cache_reset(struct reloc_cache *cache)
>  {
>         void *vaddr;
> +       struct intel_context *ce;
>  
>         if (cache->rq)
>                 reloc_gpu_flush(cache);
> @@ -973,6 +976,10 @@ static void reloc_cache_reset(struct reloc_cache *cache)
>                 }
>         }
>  
> +       ce = fetch_and_zero(&cache->ce);
> +       if (ce)
> +               intel_context_put(ce);
> +
>         cache->vaddr = 0;
>         cache->page = -1;
>  }
> @@ -1168,7 +1175,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>         if (err)
>                 goto err_unmap;
>  
> -       rq = i915_request_create(eb->context);
> +       rq = intel_context_create_request(cache->ce);
>         if (IS_ERR(rq)) {
>                 err = PTR_ERR(rq);
>                 goto err_unpin;
> @@ -1239,6 +1246,24 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
>                 if (!intel_engine_can_store_dword(eb->engine))
>                         return ERR_PTR(-ENODEV);
>  
> +               /*
> +                * The CS pre-parser can pre-fetch commands across memory sync
> +                * points and starting gen12 it is able to pre-fetch across
> +                * BB_START and BB_END boundaries (within the same context). We
> +                * use a separate context to guarantee that the reloc writes
> +                * land before the parser gets to the target memory location.
> +                */

I think the comment about the different semantics for execution flow
would also be valuable in emit_fini_breadcrumb -- it's the first place
we will look if we ever have any doubt about the barriers around the
breadrumbs.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: use a separate context for gpu relocs
  2019-08-24  0:20 [PATCH] drm/i915: use a separate context for gpu relocs Daniele Ceraolo Spurio
                   ` (3 preceding siblings ...)
  2019-08-24  9:01 ` Chris Wilson
@ 2019-08-24  9:03 ` Chris Wilson
  4 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2019-08-24  9:03 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-08-24 01:20:22)
> @@ -943,6 +945,7 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
>  static void reloc_cache_reset(struct reloc_cache *cache)
>  {
>         void *vaddr;
> +       struct intel_context *ce;
>  
>         if (cache->rq)
>                 reloc_gpu_flush(cache);
> @@ -973,6 +976,10 @@ static void reloc_cache_reset(struct reloc_cache *cache)
>                 }
>         }
>  
> +       ce = fetch_and_zero(&cache->ce);
> +       if (ce)
> +               intel_context_put(ce);

For peace of mind, this is too late. For pure gpu relocs, cache->vaddr is 0
and so we took the short-circuit at the beginning of the function.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: use a separate context for gpu relocs
  2019-08-24  8:54 ` [PATCH] " Chris Wilson
@ 2019-08-26 17:56   ` Daniele Ceraolo Spurio
  2019-08-26 18:10     ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-08-26 17:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 8/24/2019 1:54 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-08-24 01:20:22)
>> The CS pre-parser can pre-fetch commands across memory sync points and
>> starting from gen12 it is able to pre-fetch across BB_START and BB_END
>> boundaries as well, so when we emit gpu relocs the pre-parser might
>> fetch the target location of the reloc before the memory write lands.
>>
>> The parser can't pre-fetch across the ctx switch, so we use a separate
>> context to guarantee that the memory is syncronized before the parser
>> can get to it.
>>
>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 27 ++++++++++++++++++-
>>   1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index b5f6937369ea..d27201c654e9 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -252,6 +252,7 @@ struct i915_execbuffer {
>>                  bool has_fence : 1;
>>                  bool needs_unfenced : 1;
>>   
>> +               struct intel_context *ce;
>>                  struct i915_request *rq;
>>                  u32 *rq_cmd;
>>                  unsigned int rq_size;
>> @@ -903,6 +904,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
>>          cache->has_fence = cache->gen < 4;
>>          cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
>>          cache->node.allocated = false;
>> +       cache->ce = NULL;
>>          cache->rq = NULL;
>>          cache->rq_size = 0;
>>   }
>> @@ -943,6 +945,7 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
>>   static void reloc_cache_reset(struct reloc_cache *cache)
>>   {
>>          void *vaddr;
>> +       struct intel_context *ce;
>>   
>>          if (cache->rq)
>>                  reloc_gpu_flush(cache);
>> @@ -973,6 +976,10 @@ static void reloc_cache_reset(struct reloc_cache *cache)
>>                  }
>>          }
>>   
>> +       ce = fetch_and_zero(&cache->ce);
>> +       if (ce)
>> +               intel_context_put(ce);
> We don't need to create a new context between every buffer, it can be
> released in eb_destroy.
>
>> +
>>          cache->vaddr = 0;
>>          cache->page = -1;
>>   }
>> @@ -1168,7 +1175,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>>          if (err)
>>                  goto err_unmap;
>>   
>> -       rq = i915_request_create(eb->context);
>> +       rq = intel_context_create_request(cache->ce);
>>          if (IS_ERR(rq)) {
>>                  err = PTR_ERR(rq);
>>                  goto err_unpin;
>> @@ -1239,6 +1246,24 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
>>                  if (!intel_engine_can_store_dword(eb->engine))
>>                          return ERR_PTR(-ENODEV);
>>   
>> +               /*
>> +                * The CS pre-parser can pre-fetch commands across memory sync
>> +                * points and starting gen12 it is able to pre-fetch across
>> +                * BB_START and BB_END boundaries (within the same context). We
>> +                * use a separate context to guarantee that the reloc writes
>> +                * land before the parser gets to the target memory location.
>> +                */
>> +               if (!cache->ce) {
>> +                       struct intel_context *ce;
>> +
> I was thinking of
>
> 	if (cache->gen >= 12)
>> +                       ce = intel_context_create(eb->context->gem_context,
>> +                                                 eb->engine);
> else
> 	ce = intel_context_get(eb->context);
>
> Note that this requires us to fix the tagging so that we don't perform a
> lite-restore from the reloc instance to the user instance.

What's wrong with lite-restoring in this case? It's not something we 
stop now AFAICS.

Daniele

>
> But yes, that's more or less the same as the sketch I did :)
> -Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: use a separate context for gpu relocs
  2019-08-26 17:56   ` Daniele Ceraolo Spurio
@ 2019-08-26 18:10     ` Chris Wilson
  2019-08-26 18:12       ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2019-08-26 18:10 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-08-26 18:56:53)
> 
> 
> On 8/24/2019 1:54 AM, Chris Wilson wrote:
> > Note that this requires us to fix the tagging so that we don't perform a
> > lite-restore from the reloc instance to the user instance.
> 
> What's wrong with lite-restoring in this case? It's not something we 
> stop now AFAICS.

I have patches to fix the regression. But the question is whether or not
the pre-parser is happy to cross the lite-restore boundary. If it is a
light lite-restore and there is no cache invalidation...
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/i915: use a separate context for gpu relocs
  2019-08-26 18:10     ` Chris Wilson
@ 2019-08-26 18:12       ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 9+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-08-26 18:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 8/26/19 11:10 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-08-26 18:56:53)
>>
>>
>> On 8/24/2019 1:54 AM, Chris Wilson wrote:
>>> Note that this requires us to fix the tagging so that we don't perform a
>>> lite-restore from the reloc instance to the user instance.
>>
>> What's wrong with lite-restoring in this case? It's not something we
>> stop now AFAICS.
> 
> I have patches to fix the regression. But the question is whether or not
> the pre-parser is happy to cross the lite-restore boundary. If it is a
> light lite-restore and there is no cache invalidation...
> -Chris
> 

On pre-gen12 the parser doesn't cross the BB_START, so we're safe from 
that POV.

Daniele
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-08-26 18:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-24  0:20 [PATCH] drm/i915: use a separate context for gpu relocs Daniele Ceraolo Spurio
2019-08-24  0:32 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2019-08-24  1:25 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-08-24  8:54 ` [PATCH] " Chris Wilson
2019-08-26 17:56   ` Daniele Ceraolo Spurio
2019-08-26 18:10     ` Chris Wilson
2019-08-26 18:12       ` Daniele Ceraolo Spurio
2019-08-24  9:01 ` Chris Wilson
2019-08-24  9:03 ` Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox