From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jani Nikula
Subject: Re: [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
Date: Mon, 18 Apr 2016 12:50:35 +0300
Message-ID: <87inzfp7o4.fsf@intel.com>
References: <1460491389-8602-1-git-send-email-chris@chris-wilson.co.uk> <1460491389-8602-11-git-send-email-chris@chris-wilson.co.uk> <20160413093316.GP2510@phenom.ffwll.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Return-path:
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTP id AC7E66E569 for ; Mon, 18 Apr 2016 09:50:43 +0000 (UTC)
In-Reply-To: <20160413093316.GP2510@phenom.ffwll.local>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
To: Daniel Vetter, Chris Wilson
Cc: intel-gfx@lists.freedesktop.org, stable@vger.kernel.org
List-Id: intel-gfx@lists.freedesktop.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga11.intel.com ([192.55.52.93]:52747 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751497AbcDRJun convert rfc822-to-8bit (ORCPT ); Mon, 18 Apr 2016 05:50:43 -0400
From: Jani Nikula
To: Daniel Vetter, Chris Wilson
Cc: stable@vger.kernel.org, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
In-Reply-To: <20160413093316.GP2510@phenom.ffwll.local>
References: <1460491389-8602-1-git-send-email-chris@chris-wilson.co.uk> <1460491389-8602-11-git-send-email-chris@chris-wilson.co.uk> <20160413093316.GP2510@phenom.ffwll.local>
Date: Mon, 18 Apr 2016 12:50:35 +0300
Message-ID: <87inzfp7o4.fsf@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT
Sender: stable-owner@vger.kernel.org
List-ID:

On Wed, 13 Apr 2016, Daniel Vetter wrote:
> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>> Two concurrent writes into the same register cacheline has the chance of
>> killing the machine on Ivybridge and other gen7. This includes LRI
>> emitted from the command parser.
>> The MI_SET_CONTEXT itself serves as
>> serialising barrier and prevents the pair of register writes in the first
>> packet from triggering the fault.  However, if a second switch-context
>> immediately occurs then we may have two adjacent blocks of LRI to the
>> same registers which may then trigger the hang. To counteract this we
>> need to insert a delay after the second register write using SRM.
>>
>> This is easiest to reproduce with something like
>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>> switches (with no operations in between them in the command stream,
>> which requires the execbuf operation to be interrupted after the
>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>> interruptible igt. No reports from the wild though, so it must be of low
>> enough frequency that no one has correlated the random machine freezes
>> with i915.ko
>>
>> The issue was introduced with
>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>> Author: Chris Wilson
>> Date:   Tue Dec 16 10:02:27 2014 +0000
>>
>>     drm/i915: Disable PSMI sleep messages on all rings around context switches
>>
>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>> Signed-off-by: Chris Wilson
>> Cc: Daniel Vetter
>> Cc: Ville Syrjälä
>> Cc: stable@vger.kernel.org
>
> Reviewed-by: Daniel Vetter

FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

BR,
Jani.

(*) Well, not exactly *this* but rather
https://patchwork.freedesktop.org/patch/80952/ which was not posted on
the list so I can't reply to it.
>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index fe580cb9501a..e5ad7b21e356 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>
>>  	len = 4;
>>  	if (INTEL_INFO(engine->dev)->gen >= 7)
>> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>
>>  	ret = intel_ring_begin(req, len);
>>  	if (ret)
>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>>  		if (num_rings) {
>>  			struct intel_engine_cs *signaller;
>> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>>
>>  			intel_ring_emit(engine,
>>  					MI_LOAD_REGISTER_IMM(num_rings));
>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  				if (signaller == engine)
>>  					continue;
>>
>> -				intel_ring_emit_reg(engine,
>> -						    RING_PSMI_CTL(signaller->mmio_base));
>> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
>> +				intel_ring_emit_reg(engine, last_reg);
>>  				intel_ring_emit(engine,
>>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>  			}
>> +
>> +			/* Insert a delay before the next switch! */
>> +			intel_ring_emit(engine,
>> +					MI_STORE_REGISTER_MEM |
>> +					MI_SRM_LRM_GLOBAL_GTT);
>> +			intel_ring_emit_reg(engine, last_reg);
>> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
>> +			intel_ring_emit(engine, MI_NOOP);
>>  		}
>>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>  	}
>> --
>> 2.8.0.rc3
>>

-- 
Jani Nikula, Intel Open Source Technology Center