public inbox for intel-gfx@lists.freedesktop.org
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: intel-gfx@lists.freedesktop.org
Subject: [RFC] drm/i915: reference count batch object on requests
Date: Wed,  4 Dec 2013 15:28:44 +0200
Message-ID: <1386163724-8378-1-git-send-email-mika.kuoppala@intel.com>
In-Reply-To: <20131203171005.GN27344@phenom.ffwll.local>

In i915_gem_reset_ring_lists() we reset requests and move objects to the
inactive list. If the active list holds the last reference to an object,
that object is freed at this point.

The problem is that we do this per ring, and not in the order in which the
objects would have been retired had the GPU not hung. E.g. if a batch is
active on both rings 1 and 2 but was last active on ring 1, then we free
it before we clean up the requests for ring 2, which still point to it.

Fixes regression (a possible OOPS following a GPU hang) from
commit aa60c664e6df502578454621c3a9b1f087ff8d25
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed Jun 12 15:13:20 2013 +0300

    drm/i915: find guilty batch buffer on ring resets

Oops:
BUG: unable to handle kernel paging request at 6b6b6ce3
IP: [<f86124cc>] i915_gem_obj_offset+0xc/0x60 [i915]
*pdpt = 0000000000000000 *pde = 0000000000000000
Oops: 0000 [#1] SMP
CPU: 0 PID: 21 Comm: kworker/0:1 Not tainted 3.12.0+ #1274
Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0070.2012.0416.2117 04/16/2012
Workqueue: events i915_error_work_func [i915]
task: f6fd8000 ti: f2e3a000 task.ti: f2e3a000
EIP: 0060:[<f86124cc>] EFLAGS: 00010282 CPU: 0
EIP is at i915_gem_obj_offset+0xc/0x60 [i915]
EAX: ea75d880 EBX: e413497c ECX: 6b6b6b6b EDX: f6069998
ESI: f6069298 EDI: e4134960 EBP: f2e3be2c ESP: f2e3be28
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 80050033 CR2: 6b6b6ce3 CR3: 0195b000 CR4: 001407f0
Stack:
 e413497c f2e3be84 f8614d23 f8671864 f867ca2c f8689a2c f8686a49 0093c000
 00000000 0093cb60 ea75d880 f6718e10 00fd8000 f6068000 00004208 00004208
 00000000 00000002 fffff714 f6069340 f6718e10 00000000 f6068000 f2e3beac
Call Trace:
 [<f8614d23>] i915_gem_reset+0xc3/0x2a0 [i915]
 [<f85f9ada>] i915_reset+0x4a/0x160 [i915]
 [<f86002d1>] i915_error_work_func+0xc1/0x110 [i915]
 [<c1058b62>] process_one_work+0x122/0x3e0
 [<c1059aa7>] worker_thread+0xf7/0x320
 [<c10599b0>] ? manage_workers.isra.18+0x290/0x290
 [<c105f3f4>] kthread+0x94/0xa0
 [<c157daf7>] ret_from_kernel_thread+0x1b/0x28
 [<c105f360>] ? flush_kthread_worker+0xb0/0xb0
Code: 2b 5c ca c8 31 c0 83 c4 10 5b 5e 5f 5d c3 90 b8 f4 ff ff ff eb f0 89 f6 8d bc 27 00 00 00 00 55 89 e5 53 3e 8d 74 26 00 8b 48 08 <8b> 89 78 01 00 00 39 91 c0 1a 00 00 8d 99 98 19 00 00 0f 44 d3
EIP: [<f86124cc>] i915_gem_obj_offset+0xc/0x60 [i915] SS:ESP 0068:f2e3be28
CR2: 000000006b6b6ce3

v2: Better commit message and backtrace (Daniel)

Testcase: igt/gem_reset_stats/close-pending-fork

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 92149bc..e677e56 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2145,13 +2145,12 @@ int __i915_add_request(struct intel_ring_buffer *ring,
 	request->head = request_start;
 	request->tail = request_ring_position;
 
-	/* Whilst this request exists, batch_obj will be on the
-	 * active_list, and so will hold the active reference. Only when this
-	 * request is retired will the the batch_obj be moved onto the
-	 * inactive_list and lose its active reference. Hence we do not need
-	 * to explicitly hold another reference here.
+	/* The active list holds one reference, but that is not enough as
+	 * the same batch_obj can be active on multiple rings.
 	 */
 	request->batch_obj = obj;
+	if (request->batch_obj)
+		drm_gem_object_reference(&request->batch_obj->base);
 
 	/* Hold a reference to the current context so that we can inspect
 	 * it later in case a hangcheck error event fires.
@@ -2340,6 +2339,9 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
 	if (request->ctx)
 		i915_gem_context_unreference(request->ctx);
 
+	if (request->batch_obj)
+		drm_gem_object_unreference(&request->batch_obj->base);
+
 	kfree(request);
 }
 
-- 
1.7.9.5
