From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934183AbdJXNYB (ORCPT ); Tue, 24 Oct 2017 09:24:01 -0400
Received: from mail-qt0-f196.google.com ([209.85.216.196]:47582 "EHLO mail-qt0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934066AbdJXNXf (ORCPT ); Tue, 24 Oct 2017 09:23:35 -0400
X-Google-Smtp-Source: ABhQp+TImM132Zan0XRU2I75dDCJxvhh+xsh7BktDFzGWCHKm+t2ZU6oawdZrAMnL9AiiJsUFsLm7Q==
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, Jordan Crouse, Rob Clark, David Airlie, linux-kernel@vger.kernel.org
Subject: [PATCH 6/6] drm/msm: dump submits which triggered gpu hang
Date: Tue, 24 Oct 2017 09:22:53 -0400
Message-Id: <20171024132256.20286-7-robdclark@gmail.com>
X-Mailer: git-send-email 2.13.6
In-Reply-To: <20171024132256.20286-1-robdclark@gmail.com>
References: <20171024132256.20286-1-robdclark@gmail.com>
Sender:
 linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Note we need to move update_fences() to after msm_rd_dump_submit(),
otherwise the bo's referenced by the submit may no longer be valid.

Signed-off-by: Rob Clark <robdclark@gmail.com>
---
 drivers/gpu/drm/msm/msm_gpu.c | 52 +++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 403baea19329..939ea98908b8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -255,34 +255,16 @@ static void recover_worker(struct work_struct *work)
 {
 	struct msm_gpu *gpu = container_of(work, struct msm_gpu, recover_work);
 	struct drm_device *dev = gpu->dev;
+	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_gem_submit *submit;
 	struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
-	uint64_t fence;
 	int i;
 
-	/* Update all the rings with the latest and greatest fence */
-	for (i = 0; i < ARRAY_SIZE(gpu->rb); i++) {
-		struct msm_ringbuffer *ring = gpu->rb[i];
-
-		fence = ring->memptrs->fence;
-
-		/*
-		 * For the current (faulting?) ring/submit advance the fence by
-		 * one more to clear the faulting submit
-		 */
-		if (ring == cur_ring)
-			fence = fence + 1;
-
-		update_fences(gpu, ring, fence);
-	}
-
 	mutex_lock(&dev->struct_mutex);
 
-
 	dev_err(dev->dev, "%s: hangcheck recover!\n", gpu->name);
-	fence = cur_ring->memptrs->fence + 1;
 
-	submit = find_submit(cur_ring, fence);
+	submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1);
 	if (submit) {
 		struct task_struct *task;
 
@@ -306,11 +288,37 @@ static void recover_worker(struct work_struct *work)
 			len = get_cmdline(task, buf, sizeof(buf));
 			mutex_lock(&dev->struct_mutex);
 
-			dev_err(dev->dev, "%s: offending task: %s (%-*s)\n",
-					gpu->name, task->comm, len, buf);
+			dev_err(dev->dev, "%s: offending task: %s (%.*s)\n",
+				gpu->name, task->comm, len, buf);
+
+			msm_rd_dump_submit(priv->hangrd, submit,
+				"offending task: %s (%.*s)", task->comm,
+				len, buf);
+		} else {
+			msm_rd_dump_submit(priv->hangrd, submit, NULL);
 		}
 		rcu_read_unlock();
+	}
+
+
+	/*
+	 * Update all the rings with the latest and greatest fence.. this
+	 * needs to happen after msm_rd_dump_submit() to ensure that the
+	 * bo's referenced by the offending submit are still around.
+	 */
+	for (i = 0; i < ARRAY_SIZE(gpu->rb); i++) {
+		struct msm_ringbuffer *ring = gpu->rb[i];
+
+		uint32_t fence = ring->memptrs->fence;
 
+		/*
+		 * For the current (faulting?) ring/submit advance the fence by
+		 * one more to clear the faulting submit
+		 */
+		if (ring == cur_ring)
+			fence++;
+
+		update_fences(gpu, ring, fence);
 	}
 
 	if (msm_gpu_active(gpu)) {
-- 
2.13.6
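
For readers who want to see the ordering constraint from the commit
message in isolation, here is a rough userspace sketch. It is not driver
code: every model_* name below is a hypothetical stand-in invented for
illustration, and the assumption that retiring a submit drops its buffer
references is a simplification of what the commit message describes
("the bo's referenced by the submit may no longer be valid"). The sketch
only shows why dumping has to come before the fence update.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, not the msm structures. */
struct model_bo {
	int refcount;		/* references held by in-flight submits */
	uint32_t data;		/* stand-in for the buffer contents */
};

struct model_submit {
	uint32_t fence;
	struct model_bo *bo;
	int retired;
};

#define MODEL_NR_SUBMITS 3

struct model_ring {
	uint32_t completed_fence;	/* plays the role of memptrs->fence */
	struct model_submit submits[MODEL_NR_SUBMITS];
};

/* Look up a still in-flight submit by fence value. */
static struct model_submit *model_find_submit(struct model_ring *ring,
					      uint32_t fence)
{
	for (size_t i = 0; i < MODEL_NR_SUBMITS; i++) {
		struct model_submit *s = &ring->submits[i];

		if (!s->retired && s->fence == fence)
			return s;
	}
	return NULL;
}

/* Retire everything up to 'fence'; assumed to drop the bo references. */
static void model_update_fences(struct model_ring *ring, uint32_t fence)
{
	for (size_t i = 0; i < MODEL_NR_SUBMITS; i++) {
		struct model_submit *s = &ring->submits[i];

		if (!s->retired && s->fence <= fence) {
			assert(s->bo->refcount > 0);
			s->bo->refcount--;
			s->retired = 1;
		}
	}
	ring->completed_fence = fence;
}

/* Returns 0 and fills *out if the bo is still referenced, else -1. */
static int model_dump_submit(const struct model_submit *s, uint32_t *out)
{
	if (!s || s->bo->refcount <= 0)
		return -1;
	*out = s->bo->data;
	return 0;
}

/* Order used by the patch: find the hung submit, dump it, then retire. */
int model_recover(struct model_ring *ring, uint32_t *dumped)
{
	uint32_t hung = ring->completed_fence + 1;
	int ret = model_dump_submit(model_find_submit(ring, hung), dumped);

	model_update_fences(ring, hung);
	return ret;
}

/* Reversed order: retiring first leaves nothing valid to dump. */
int model_recover_retire_first(struct model_ring *ring, uint32_t *dumped)
{
	uint32_t hung = ring->completed_fence + 1;

	model_update_fences(ring, hung);
	return model_dump_submit(model_find_submit(ring, hung), dumped);
}
```

In this model, model_recover() on a ring whose completed fence is 4 dumps
the contents of the submit with fence 5 and only then retires it, while
model_recover_retire_first() on the same starting state returns -1
because the submit has already been retired by the time the dump runs.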