From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mika Kuoppala Subject: Re: [PATCH] drm/i915: Try harder to reset the gen8+ engines Date: Tue, 06 Sep 2016 14:11:47 +0300 Message-ID: <87oa418eq4.fsf@gaia.fi.intel.com> References: <20160905093015.21820-1-chris@chris-wilson.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7C12D6E177 for ; Tue, 6 Sep 2016 11:12:44 +0000 (UTC) In-Reply-To: <20160905093015.21820-1-chris@chris-wilson.co.uk> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To: Chris Wilson , intel-gfx@lists.freedesktop.org Cc: Daniel Vetter , stable@vger.kernel.org List-Id: intel-gfx@lists.freedesktop.org ClJlc2VuZGluZyBteSByLWIuLi4KCkNocmlzIFdpbHNvbiA8Y2hyaXNAY2hyaXMtd2lsc29uLmNv LnVrPiB3cml0ZXM6Cgo+IElmIGF0IGZpcnN0IHdlIGRvbid0IHN1Y2NlZWQsIHRyeSBhZ2Fpbi4K Pgo+IFJ1bm5pbmcgdGhlIHJlc2V0IGFuZCByZWNvdmVyeSByb3V0aW5lcyBpbiBhIGxvb3AgZW5k cyBpbiBhICJyZXNldAo+IHJlcXVlc3QgdGltZW91dCIgd2l0aCBhIG10YmYgb2YgYW4gaG91ciBv biBCcmFzd2VsbC4gVGhpcyBpcyBlZXJpbHkKPiBzaW1pbGFyIHRvIHRoZSB1bnJlY292ZXJhYmxl IHJlc2V0IGNvbmRpdGlvbiB0aGF0IGZpcnN0IHByb21wdGVkIHVzIHRvCj4gdXNlIHRoZSByZXNl dC1yZXF1ZXN0IG1lY2hhbmlzbSBpbiBjb21taXQgN2ZkMmQyNjkyMWQxICgiZHJtL2k5MTU6IFJl c2V0Cj4gcmVxdWVzdCBoYW5kbGluZyBmb3IgZ2VuOCsiKS4gUmVwZWF0aW5nIHRoZSByZXNldCBy ZXF1ZXN0IG1ha2VzIHRoZQo+IGZhaWx1cmUgbXVjaCBoYXJkZXIgdG8gcmVwcm9kdWNlIChidXQg dGhlcmUgaXMgbm8gcmVhc29uIHRvIGJlbGlldmUgdGhhdAo+IGl0IGlzIG1vcmUgdGhhbiBtZXJl IHBhcGVyIG92ZXIgYSB0aW1pbmcgb3Igb3RoZXIgaXNzdWUpLgo+Cj4gU2lnbmVkLW9mZi1ieTog Q2hyaXMgV2lsc29uIDxjaHJpc0BjaHJpcy13aWxzb24uY28udWs+Cj4gQ2M6IE1pa2EgS3VvcHBh bGEgPG1pa2Eua3VvcHBhbGFAaW50ZWwuY29tPgo+IENjOiBBcnVuIFNpbHV2ZXJ5IDxhcnVuLnNp bHV2ZXJ5QGxpbnV4LmludGVsLmNvbT4KPiBDYzogRGFuaWVsIFZldHRlciA8ZGFuaWVsLnZldHRl ckBmZndsbC5jaD4KPiBDYzogc3RhYmxlQHZnZXIua2VybmVsLm9yZwo+IC0tLQo+ICBkcml2ZXJz L2dwdS9kcm0vaTkxNS9pbnRlbF91bmNvcmUuYyB8IDI0ICsrKysrKysrKysrKystLS0tLS0tLS0t LQo+ICAxIGZpbGUgY2hhbmdlZCwgMTMgaW5zZXJ0aW9ucygrKSwgMTEgZGVsZXRpb25zKC0pCj4K PiBkaWZmIC0tZ2l0IGEvZHJpdmVycy9ncHUvZHJtL2k5MTUvaW50ZWxfdW5jb3JlLmMgYi9kcml2 ZXJzL2dwdS9kcm0vaTkxNS9pbnRlbF91bmNvcmUuYwo+IGluZGV4IGU5ZjY4Y2Q1NmUzMi4uMWJl OGNlZDAzYmE1IDEwMDY0NAo+IC0tLSBhL2RyaXZlcnMvZ3B1L2RybS9pOTE1L2ludGVsX3VuY29y ZS5jCj4gKysrIGIvZHJpdmVycy9ncHUvZHJtL2k5MTUvaW50ZWxfdW5jb3JlLmMKPiBAQCAtMTY4 OCwyMCArMTY4OCwyMiBAQCBpbnQgaW50ZWxfd2FpdF9mb3JfcmVnaXN0ZXIoc3RydWN0IGRybV9p OTE1X3ByaXZhdGUgKmRldl9wcml2LAo+ICBzdGF0aWMgaW50IGdlbjhfcmVxdWVzdF9lbmdpbmVf cmVzZXQoc3RydWN0IGludGVsX2VuZ2luZV9jcyAqZW5naW5lKQo+ICB7Cj4gIAlzdHJ1Y3QgZHJt X2k5MTVfcHJpdmF0ZSAqZGV2X3ByaXYgPSBlbmdpbmUtPmk5MTU7Cj4gLQlpbnQgcmV0Owo+ICsJ aW50IGxvb3AgPSAzOwo+CgpyZXRyaWVzPwoKPiAtCUk5MTVfV1JJVEVfRlcoUklOR19SRVNFVF9D VEwoZW5naW5lLT5tbWlvX2Jhc2UpLAo+IC0JCSAgICAgIF9NQVNLRURfQklUX0VOQUJMRShSRVNF VF9DVExfUkVRVUVTVF9SRVNFVCkpOwo+ICsJZG8gewo+ICsJCUk5MTVfV1JJVEVfRlcoUklOR19S RVNFVF9DVEwoZW5naW5lLT5tbWlvX2Jhc2UpLAo+ICsJCQkgICAgICBfTUFTS0VEX0JJVF9FTkFC TEUoUkVTRVRfQ1RMX1JFUVVFU1RfUkVTRVQpKTsKPiAgCj4gLQlyZXQgPSBpbnRlbF93YWl0X2Zv cl9yZWdpc3Rlcl9mdyhkZXZfcHJpdiwKPiAtCQkJCQkgUklOR19SRVNFVF9DVEwoZW5naW5lLT5t bWlvX2Jhc2UpLAo+IC0JCQkJCSBSRVNFVF9DVExfUkVBRFlfVE9fUkVTRVQsCj4gLQkJCQkJIFJF U0VUX0NUTF9SRUFEWV9UT19SRVNFVCwKPiAtCQkJCQkgNzAwKTsKPiAtCWlmIChyZXQpCj4gLQkJ RFJNX0VSUk9SKCIlczogcmVzZXQgcmVxdWVzdCB0aW1lb3V0XG4iLCBlbmdpbmUtPm5hbWUpOwo+ ICsJCWlmICghaW50ZWxfd2FpdF9mb3JfcmVnaXN0ZXJfZncoZGV2X3ByaXYsCj4gKwkJCQkJCVJJ TkdfUkVTRVRfQ1RMKGVuZ2luZS0+bW1pb19iYXNlKSwKPiArCQkJCQkJUkVTRVRfQ1RMX1JFQURZ X1RPX1JFU0VULAo+ICsJCQkJCQlSRVNFVF9DVExfUkVBRFlfVE9fUkVTRVQsCj4gKwkJCQkJCTcw MCkpCgoKRGlkIHlvdSBjaGVjayBiZXR3ZWVuIHRoZSB3cml0ZSBkaWRuJ3Qgc3RpY2sgdnMgdGhl IHJlYWR5bmVzcyBkaWRuJ3QKc2lnbmFsPwoKV2l0aCBnZW44LCB3ZSBtaWdodCBnZXQgYXdheSB3 aXRoIGp1c3QgcmVzZXR0aW5nIHJlZ2FyZGxlc3Mgb2YgdGhlCnRoZSByZWFkeSBzdGF0ZS4gTmVl ZHMgc29tZSBleHBlcmltZW50aW5nL3Rlc3RpbmcgZmlyc3QgdGhvLgoKUmV2aWV3ZWQtYnk6IE1p a2EgS3VvcHBhbGEgPG1pa2Eua3VvcHBhbGFAaW50ZWwuY29tPgoKPiArCQkJcmV0dXJuIDA7Cj4g Kwl9IHdoaWxlICgtLWxvb3ApOwo+ICAKPiAtCXJldHVybiByZXQ7Cj4gKwlEUk1fRVJST1IoIiVz OiByZXNldCByZXF1ZXN0IHRpbWVvdXRcbiIsIGVuZ2luZS0+bmFtZSk7Cj4gKwlyZXR1cm4gLUVJ TzsKPiAgfQo+ICAKPiAgc3RhdGljIHZvaWQgZ2VuOF91bnJlcXVlc3RfZW5naW5lX3Jlc2V0KHN0 cnVjdCBpbnRlbF9lbmdpbmVfY3MgKmVuZ2luZSkKPiAtLSAKPiAyLjkuMwpfX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXwpJbnRlbC1nZnggbWFpbGluZyBsaXN0 CkludGVsLWdmeEBsaXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5mcmVlZGVza3Rv cC5vcmcvbWFpbG1hbi9saXN0aW5mby9pbnRlbC1nZngK From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga06.intel.com ([134.134.136.31]:2966 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752345AbcIFLMp (ORCPT ); Tue, 6 Sep 2016 07:12:45 -0400 From: Mika Kuoppala To: Chris Wilson , intel-gfx@lists.freedesktop.org Cc: Chris Wilson , Arun Siluvery , Daniel Vetter , stable@vger.kernel.org Subject: Re: [PATCH] drm/i915: Try harder to reset the gen8+ engines In-Reply-To: <20160905093015.21820-1-chris@chris-wilson.co.uk> References: <20160905093015.21820-1-chris@chris-wilson.co.uk> Date: Tue, 06 Sep 2016 14:11:47 +0300 Message-ID: <87oa418eq4.fsf@gaia.fi.intel.com> MIME-Version: 1.0 Content-Type: text/plain Sender: stable-owner@vger.kernel.org List-ID: Resending my r-b... Chris Wilson writes: > If at first we don't succeed, try again. > > Running the reset and recovery routines in a loop ends in a "reset > request timeout" with a mtbf of an hour on Braswell. This is eerily > similar to the unrecoverable reset condition that first prompted us to > use the reset-request mechanism in commit 7fd2d26921d1 ("drm/i915: Reset > request handling for gen8+"). Repeating the reset request makes the > failure much harder to reproduce (but there is no reason to believe that > it is more than mere paper over a timing or other issue). > > Signed-off-by: Chris Wilson > Cc: Mika Kuoppala > Cc: Arun Siluvery > Cc: Daniel Vetter > Cc: stable@vger.kernel.org > --- > drivers/gpu/drm/i915/intel_uncore.c | 24 +++++++++++++----------- > 1 file changed, 13 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > index e9f68cd56e32..1be8ced03ba5 100644 > --- a/drivers/gpu/drm/i915/intel_uncore.c > +++ b/drivers/gpu/drm/i915/intel_uncore.c > @@ -1688,20 +1688,22 @@ int intel_wait_for_register(struct drm_i915_private *dev_priv, > static int gen8_request_engine_reset(struct intel_engine_cs *engine) > { > struct drm_i915_private *dev_priv = engine->i915; > - int ret; > + int loop = 3; > retries? > - I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base), > - _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET)); > + do { > + I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base), > + _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET)); > > - ret = intel_wait_for_register_fw(dev_priv, > - RING_RESET_CTL(engine->mmio_base), > - RESET_CTL_READY_TO_RESET, > - RESET_CTL_READY_TO_RESET, > - 700); > - if (ret) > - DRM_ERROR("%s: reset request timeout\n", engine->name); > + if (!intel_wait_for_register_fw(dev_priv, > + RING_RESET_CTL(engine->mmio_base), > + RESET_CTL_READY_TO_RESET, > + RESET_CTL_READY_TO_RESET, > + 700)) Did you check between the write didn't stick vs the readyness didn't signal? With gen8, we might get away with just resetting regardless of the the ready state. Needs some experimenting/testing first tho. Reviewed-by: Mika Kuoppala > + return 0; > + } while (--loop); > > - return ret; > + DRM_ERROR("%s: reset request timeout\n", engine->name); > + return -EIO; > } > > static void gen8_unrequest_engine_reset(struct intel_engine_cs *engine) > -- > 2.9.3