From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mika Kuoppala
To: Chris Wilson
Cc: intel-gfx@lists.freedesktop.org, stable@vger.kernel.org
Subject: Re: [PATCH] drm/i915: Fix system hang with EI UP masked on Haswell
Date: Thu, 13 Apr 2017 14:58:23 +0300
Message-ID: <87efww4izk.fsf@gaia.fi.intel.com>
In-Reply-To: <20170413113715.GT12532@nuc-i3427.alporthouse.com>
References: <1492082127-29007-1-git-send-email-mika.kuoppala@intel.com> <20170413113715.GT12532@nuc-i3427.alporthouse.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
List-Id: intel-gfx@lists.freedesktop.org

Chris Wilson writes:

> On Thu, Apr 13, 2017 at 02:15:27PM +0300, Mika Kuoppala wrote:
>> Previously with commit a9c1f90c8e17
>> ("drm/i915: Don't mask EI UP interrupt on IVB|SNB") certain,
>> seemingly unrelated bit (GEN6_PM_RP_UP_EI_EXPIRED) was needed
>> to be unmasked for IVB and SNB in order to prevent system hang
>> with chained batchbuffers.
>>
>> Our CI was seeing incomplete results with tests that used
>> chained batches and it was found out that HSW needs to have this
>> same bit unmasked to reliably survive chained batches.
>>
>> Always unmask GEN6_PM_RP_UP_EI_EXPIRED on Haswell to
>> prevent system hang with batch chaining.
>>
>> Testcase: igt/gem_exec_fence/nb-await-default
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100672
>> Cc: Chris Wilson
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Mika Kuoppala
>
> * facepalm.
>
> I am amazed that took so long for us to notice.

It could be that we don't chain that much in CI.
Also it seems to be more subtle than with IVB. With
spin batch it didn't surface, but with nb-await-default
the store/spin and possibly(?) the CPU-side sleep
lured it out.

> Acked-by: Chris Wilson
Thanks.

>
> Did we ever get a w/a identifier for this?

Not that I know of. And in retrospect, excluding
HSW was not wise in the original patch. It was v3
where it was excluded, but I didn't find the trail that
led there. Trusting it not to inherit the peculiarities...

I like to think that we tested and it never hung with
straight-up busy chaining. nb-await-default is
more sophisticated.

-Mika
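
For readers without the patch at hand, the change discussed above can be sketched roughly as below. This is a standalone illustration and not the actual i915 code: struct fake_gpu, both function names, the "gen <= 7" condition, and the bit position are assumptions made for the example. It also assumes the usual mask-register convention, where a set bit masks the interrupt and a cleared bit leaves it unmasked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only; check the driver's
 * register header for the real definition. */
#define GEN6_PM_RP_UP_EI_EXPIRED (1u << 2)

/* Hypothetical stand-in for the driver's device description. */
struct fake_gpu {
	int gen;          /* hardware generation: 6 = SNB, 7 = IVB or HSW */
	bool is_haswell;
};

/* Sketch of the behaviour before the fix: the EI UP bit is forced
 * unmasked (cleared in the mask) on SNB and IVB, but HSW is excluded. */
static uint32_t sanitize_pm_mask_old(const struct fake_gpu *gpu, uint32_t mask)
{
	if (gpu->gen <= 7 && !gpu->is_haswell)
		mask &= ~GEN6_PM_RP_UP_EI_EXPIRED;
	return mask;
}

/* Sketch of the behaviour after the fix: HSW is no longer excluded,
 * so the bit is always unmasked there as well. */
static uint32_t sanitize_pm_mask_new(const struct fake_gpu *gpu, uint32_t mask)
{
	if (gpu->gen <= 7)
		mask &= ~GEN6_PM_RP_UP_EI_EXPIRED;
	return mask;
}
```

With a fully set mask, the old helper leaves the EI UP bit masked on a device flagged as Haswell, while the new helper clears it; for a non-Haswell gen 7 device both helpers return the same value.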