* [PATCH] drm/i915: Increase context alignment requirement for Sandybridge
@ 2016-03-22 14:07 Chris Wilson
  2016-03-22 15:05 ` ✗ Fi.CI.BAT: failure for " Patchwork
  2016-03-23 22:42 ` [PATCH] " Chris Wilson
  0 siblings, 2 replies; 3+ messages in thread
From: Chris Wilson @ 2016-03-22 14:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, stable

In bugzilla, there are some very weird bugs on SNB GT1 whereby the
seqno stop being written, but the GPU is otherwise functional, well the
command streamer at least! However, since the seqno were not being
updated, any waits upon rendering results hung, triggering the GPU hang
detector.

I found a very similar hang when running igt/gem_exec_whisper on a SNB
GT1 and after playing around came to the conclusion that:

(a) it depends on timing, enabling debug and other slowdowns masks the
bug;

(b) it was not context size, as increasing the allocation to 128KiB made
no difference;

(c) it depended upon placement as restricting the binding to the
mappable region works;

(d) it depended upon alignment of the context binding, though the bspec
still only lists the restriction as 4k

Changing the alignment constraints seems to be least intrusive, and
though I have not been able to reproduce this on snb-gt2 and all the
recent bugs to the best of my knowledge have been snb-gt1, it is safer
to apply the constraint to all snb. Though I am still a little wary that
this is merely a side-effect that is papering over the issue (for
example, it may be placement of the context next to another object that
is causing the issue, or it may be finding the new alignment slows down
context switches enough etc).

Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 394e525e55f1..f0883a968e11 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -94,7 +94,7 @@
  * I've seen in a spec to date, and that was a workaround for a non-shipping
  * part. It should be safe to decrease this, but it's more future proof as is.
  */
-#define GEN6_CONTEXT_ALIGN (64<<10)
+#define GEN6_CONTEXT_ALIGN (256<<10)
 #define GEN7_CONTEXT_ALIGN 4096
 
 static size_t get_context_alignment(struct drm_device *dev)
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 3+ messages in thread
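To make the effect of the one-line change concrete, here is a minimal sketch of the alignment arithmetic involved. It is not i915 code: `align_up`, `is_aligned`, `is_pow2`, and the `_OLD`/`_NEW` macro names are hypothetical helpers invented for illustration, and only the two values (64<<10 and 256<<10) come from the patch. It assumes alignments are powers of two, which holds for both values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two values of GEN6_CONTEXT_ALIGN from the patch (names are ours). */
#define GEN6_CONTEXT_ALIGN_OLD (64u << 10)   /* 64 KiB, before the patch */
#define GEN6_CONTEXT_ALIGN_NEW (256u << 10)  /* 256 KiB, after the patch */

/* Hypothetical helper: true if x is a non-zero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: round offset up to the next multiple of alignment. */
static uint64_t align_up(uint64_t offset, uint64_t alignment)
{
	assert(is_pow2(alignment));
	return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: true if offset is a multiple of alignment. */
static bool is_aligned(uint64_t offset, uint64_t alignment)
{
	assert(is_pow2(alignment));
	return (offset & (alignment - 1)) == 0;
}
```

Under these assumptions, an offset such as 0x30000 satisfies the old 64 KiB constraint but not the new 256 KiB one, so the patch removes three out of every four previously legal context placements. That is consistent with the author's caution that the fix may work only by changing where the context lands.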
* ✗ Fi.CI.BAT: failure for drm/i915: Increase context alignment requirement for Sandybridge
  2016-03-22 14:07 [PATCH] drm/i915: Increase context alignment requirement for Sandybridge Chris Wilson
@ 2016-03-22 15:05 ` Patchwork
  2016-03-23 22:42 ` [PATCH] " Chris Wilson
  1 sibling, 0 replies; 3+ messages in thread
From: Patchwork @ 2016-03-22 15:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Increase context alignment requirement for Sandybridge
URL   : https://patchwork.freedesktop.org/series/4752/
State : failure

== Summary ==

Series 4752v1 drm/i915: Increase context alignment requirement for Sandybridge
http://patchwork.freedesktop.org/api/1.0/series/4752/revisions/1/mbox/

Test drv_module_reload_basic:
                pass       -> INCOMPLETE (ilk-hp8440p)
Test gem_exec_suspend:
        Subgroup basic-s3:
                dmesg-warn -> PASS       (bsw-nuc-2)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                pass       -> DMESG-WARN (bsw-nuc-2)

bdw-ultra        total:192  pass:171  dwarn:0   dfail:0   fail:0   skip:21
bsw-nuc-2        total:192  pass:153  dwarn:2   dfail:0   fail:0   skip:37
byt-nuc          total:192  pass:156  dwarn:1   dfail:0   fail:0   skip:35
hsw-brixbox      total:192  pass:170  dwarn:0   dfail:0   fail:0   skip:22
hsw-gt2          total:192  pass:175  dwarn:0   dfail:0   fail:0   skip:17
ilk-hp8440p      total:65   pass:45   dwarn:0   dfail:0   fail:0   skip:19
ivb-t430s        total:192  pass:167  dwarn:0   dfail:0   fail:0   skip:25
skl-i5k-2        total:192  pass:169  dwarn:0   dfail:0   fail:0   skip:23
skl-i7k-2        total:192  pass:169  dwarn:0   dfail:0   fail:0   skip:23
skl-nuci5        total:192  pass:181  dwarn:0   dfail:0   fail:0   skip:11
snb-dellxps      total:192  pass:158  dwarn:0   dfail:0   fail:0   skip:34
snb-x220t        total:192  pass:158  dwarn:0   dfail:0   fail:1   skip:33

Results at /archive/results/CI_IGT_test/Patchwork_1676/

94c7a2ce8a4f659840625a318b1d3d8832f4ca46 drm-intel-nightly: 2016y-03m-22d-12h-51m-12s UTC integration manifest
d6a9942faa9647c387860fe3f17cda92d2579a10 drm/i915: Increase context alignment requirement for Sandybridge

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 3+ messages in thread
* Re: [PATCH] drm/i915: Increase context alignment requirement for Sandybridge
  2016-03-22 14:07 [PATCH] drm/i915: Increase context alignment requirement for Sandybridge Chris Wilson
  2016-03-22 15:05 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2016-03-23 22:42 ` Chris Wilson
  1 sibling, 0 replies; 3+ messages in thread
From: Chris Wilson @ 2016-03-23 22:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: stable

On Tue, Mar 22, 2016 at 02:07:24PM +0000, Chris Wilson wrote:
> In bugzilla, there are some very weird bugs on SNB GT1 whereby the
> seqno stop being written, but the GPU is otherwise functional, well the
> command streamer at least! However, since the seqno were not being
> updated, any waits upon rendering results hung, triggering the GPU hang
> detector.
>
> I found a very similar hang when running igt/gem_exec_whisper on a SNB
> GT1 and after playing around came to the conclusion that:
>
> (a) it depends on timing, enabling debug and other slowdowns masks the
> bug;
>
> (b) it was not context size, as increasing the allocation to 128KiB made
> no difference;
>
> (c) it depended upon placement as restricting the binding to the
> mappable region works;
>
> (d) it depended upon alignment of the context binding, though the bspec
> still only lists the restriction as 4k
>
> Changing the alignment constraints seems to be least intrusive, and
> though I have not been able to reproduce this on snb-gt2 and all the
> recent bugs to the best of my knowledge have been snb-gt1, it is safer
> to apply the constraint to all snb. Though I am still a little wary that
> this is merely a side-effect that is papering over the issue (for
> example, it may be placement of the context next to another object that
> is causing the issue, or it may be finding the new alignment slows down
> context switches enough etc).

So, the hang came back (after switching to using an iomap rather than
the vmap).

An alternative workaround is to avoid the first page of the GTT, either
by subtracting the first page from the drm_mm, or by applying a bias to
the ringbuffer.

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0715bb7..f75e9a8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2726,6 +2726,7 @@ static int i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	BUG_ON(mappable_end > end);
 
+	start += PAGE_SIZE;
 	ggtt_vm->start = start;
 
 	/* Subtract the guard page before address space initialization to

or

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ce59850..f789f92 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2109,10 +2109,11 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj = ringbuf->obj;
+	unsigned flags = PIN_OFFSET_BIAS | PAGE_SIZE;
 	int ret;
 
 	if (HAS_LLC(dev_priv) && !obj->stolen) {
-		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, 0);
+		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags);
 		if (ret)
 			return ret;
 
@@ -2128,7 +2129,7 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			return -ENOMEM;
 		}
 	} else {
-		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
+		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags | PIN_MAPPABLE);
 		if (ret)
 			return ret;
 

> References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262

Has the same RING_START==0 symptom. So still promising.

Ideas?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply related	[flat|nested] 3+ messages in thread
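The two workarounds in this reply have the same goal: nothing relevant should be placed at GTT offset 0. The sketch below is a toy model of that idea, not the i915 allocator. Every `TOY_`/`toy_` name is hypothetical, and the flag bit positions and the offset-in-upper-bits encoding are assumptions chosen only to mirror the `PIN_OFFSET_BIAS | PAGE_SIZE` expression in the second diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the bit positions are assumptions, not i915's. */
#define TOY_PAGE_SIZE        4096u
#define TOY_PIN_MAPPABLE     (1u << 0)
#define TOY_PIN_OFFSET_BIAS  (1u << 3)
/* Page-aligned upper bits of the flags word carry the minimum offset. */
#define TOY_PIN_OFFSET_MASK  (~(uint64_t)(TOY_PAGE_SIZE - 1))

/* Lowest offset the caller allows, decoded from the flags word. */
static uint64_t toy_min_offset(uint64_t flags)
{
	if (flags & TOY_PIN_OFFSET_BIAS)
		return flags & TOY_PIN_OFFSET_MASK;
	return 0;
}

/*
 * Lowest legal placement in an empty address space that begins at
 * range_start, honouring both the bias and a power-of-two alignment.
 */
static uint64_t toy_place(uint64_t range_start, uint64_t alignment,
			  uint64_t flags)
{
	uint64_t lowest = toy_min_offset(flags);

	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	if (lowest < range_start)
		lowest = range_start;
	return (lowest + alignment - 1) & ~(alignment - 1);
}
```

In this model, `toy_place(0, 4096, 0)` can return offset 0, which corresponds to the RING_START==0 symptom mentioned in the reply. Either raising `range_start` by one page (the first diff) or passing `TOY_PIN_OFFSET_BIAS | TOY_PAGE_SIZE` (the second diff) moves the lowest placement to 4096. The difference is scope: the first affects every object in the address space, the second only the objects pinned with the bias.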