From: Jani Nikula <jani.nikula@linux.intel.com>
To: Daniel Vetter <daniel@ffwll.ch>, Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
Date: Wed, 20 Apr 2016 16:26:59 +0300 [thread overview]
Message-ID: <874mawo1gc.fsf@intel.com> (raw)
In-Reply-To: <87inzfp7o4.fsf@intel.com>
On Mon, 18 Apr 2016, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>>> Two concurrent writes into the same register cacheline has the chance of
>>> killing the machine on Ivybridge and other gen7. This includes LRI
>>> emitted from the command parser. The MI_SET_CONTEXT itself serves as
>>> serialising barrier and prevents the pair of register writes in the first
>>> packet from triggering the fault. However, if a second switch-context
>>> immediately occurs then we may have two adjacent blocks of LRI to the
>>> same registers which may then trigger the hang. To counteract this we
>>> need to insert a delay after the second register write using SRM.
>>>
>>> This is easiest to reproduce with something like
>>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>>> switches (with no operations in between them in the command stream,
>>> which requires the execbuf operation to be interrupted after the
>>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>>> interruptible igt. No reports from the wild though, so it must be of low
>>> enough frequency that no one has correlated the random machine freezes
>>> with i915.ko
>>>
>>> The issue was introduced with
>>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>> Date: Tue Dec 16 10:02:27 2014 +0000
>>>
>>> drm/i915: Disable PSMI sleep messages on all rings around context switches
>>>
>>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> Cc: stable@vger.kernel.org
>>
>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.
By which I mean, regardless of cc: stable I'm not doing anything to
backport the commit to fixes. Same with the other patch in this
thread. If someone wants the fixes to 4.6 or stable, they need to do the
backporting.
BR,
Jani.
>
> BR,
> Jani.
>
> (*) Well, not exactly *this* but rather
> https://patchwork.freedesktop.org/patch/80952/ which was not posted on
> the list so I can't reply to it.
>
>
>>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index fe580cb9501a..e5ad7b21e356 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>
>>> len = 4;
>>> if (INTEL_INFO(engine->dev)->gen >= 7)
>>> - len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>>> + len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>>
>>> ret = intel_ring_begin(req, len);
>>> if (ret)
>>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>> if (INTEL_INFO(engine->dev)->gen >= 7) {
>>> if (num_rings) {
>>> struct intel_engine_cs *signaller;
>>> + i915_reg_t last_reg = {}; /* keep gcc quiet */
>>>
>>> intel_ring_emit(engine,
>>> MI_LOAD_REGISTER_IMM(num_rings));
>>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>> if (signaller == engine)
>>> continue;
>>>
>>> - intel_ring_emit_reg(engine,
>>> - RING_PSMI_CTL(signaller->mmio_base));
>>> + last_reg = RING_PSMI_CTL(signaller->mmio_base);
>>> + intel_ring_emit_reg(engine, last_reg);
>>> intel_ring_emit(engine,
>>> _MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>> }
>>> +
>>> + /* Insert a delay before the next switch! */
>>> + intel_ring_emit(engine,
>>> + MI_STORE_REGISTER_MEM |
>>> + MI_SRM_LRM_GLOBAL_GTT);
>>> + intel_ring_emit_reg(engine, last_reg);
>>> + intel_ring_emit(engine, engine->scratch.gtt_offset);
>>> + intel_ring_emit(engine, MI_NOOP);
>>> }
>>> intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>> }
>>> --
>>> 2.8.0.rc3
>>>
--
Jani Nikula, Intel Open Source Technology Center
next prev parent reply other threads:[~2016-04-20 13:26 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
2016-04-12 20:02 ` [CI-ping 02/15] drm/i915: Disentangle i915_drv.h includes Chris Wilson
2016-04-12 20:02 ` [CI-ping 03/15] drm/i915: Add GEM debugging Kconfig option Chris Wilson
2016-04-12 20:02 ` [CI-ping 04/15] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
2016-04-12 20:02 ` [CI-ping 05/15] drm/i915: Simplify checking of GPU reset_counter in display pageflips Chris Wilson
2016-04-12 20:03 ` [CI-ping 06/15] drm/i915: Tighten reset_counter for reset status Chris Wilson
2016-04-12 20:03 ` [CI-ping 07/15] drm/i915: Store the reset counter when constructing a request Chris Wilson
2016-04-12 20:03 ` [CI-ping 08/15] drm/i915: Simplify reset_counter handling during atomic modesetting Chris Wilson
2016-04-12 20:03 ` [CI-ping 09/15] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
2016-04-12 20:03 ` [CI-ping 10/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
2016-04-13 9:33 ` Daniel Vetter
2016-04-18 9:50 ` Jani Nikula
2016-04-20 13:26 ` Jani Nikula [this message]
2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
2016-04-13 9:34 ` [Intel-gfx] " Daniel Vetter
2016-04-12 20:03 ` [CI-ping 13/15] drm/i915: Move the mb() following release-mmap into release-mmap Chris Wilson
2016-04-12 20:03 ` [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
2016-04-13 9:59 ` Daniel Vetter
2016-04-13 10:05 ` Chris Wilson
2016-04-13 14:16 ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Chris Wilson
2016-04-13 14:16 ` [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
2016-04-13 15:05 ` Daniel Vetter
2016-04-13 15:18 ` Chris Wilson
2016-04-13 14:56 ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Daniel Vetter
2016-04-13 15:04 ` Chris Wilson
2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
2016-04-13 9:57 ` Daniel Vetter
2016-04-13 14:21 ` John Harrison
2016-04-19 12:35 ` Dave Gordon
2016-04-21 13:04 ` John Harrison
2016-04-22 22:57 ` John Harrison
2016-04-27 18:52 ` Dave Gordon
2016-04-18 9:46 ` Jani Nikula
2016-04-14 8:45 ` ✗ Fi.CI.BAT: failure for series starting with [CI-ping,01/15] drm/i915: Force clean compilation with -Werror (rev3) Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=874mawo1gc.fsf@intel.com \
--to=jani.nikula@linux.intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=daniel@ffwll.ch \
--cc=intel-gfx@lists.freedesktop.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).