From: Daniel Vetter <daniel@ffwll.ch>
To: Ben Widawsky <ben@bwidawsk.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 0/4] [RFC] use HW watchdog timer
Date: Mon, 16 Jul 2012 22:16:22 +0200
Message-ID: <20120716201622.GE5023@phenom.ffwll.local>
In-Reply-To: <1342464719-8790-1-git-send-email-ben@bwidawsk.net>

On Mon, Jul 16, 2012 at 11:51:55AM -0700, Ben Widawsky wrote:
> This was my pet project for the last few days, but I have to take a
> break from working on it for now to do some real work ;-). The patches
> compile, and pass a basic test, but that's about it. There is still
> quite a bit of work left to make this useful. The easiest thing would be
> to tie this into error state.
> 
> The idea is pretty simple. It uses the HW watchdog to set a timeout per
> batchbuffer, instead of a global software watchdog.
> 
> Pros:
> * Potential for per batch, or ring watchdog values. I believe when/if we
> get to GPGPU workloads, this is particularly interesting.
> * Batch granularity hang detection. This mostly just makes hang
> detection and recovery a bit easier IMO.
> 
> Cons:
> * Blit ring doesn't have an interrupt. This means we still need the
> software watchdog, and it makes hang detection more complex. I've been
> led to believe future HW *may* have this interrupt.
> * Semaphores 
> 
> I'm looking for feedback, mainly for Daniel, and Chris if this is worth
> pursuing further when I have more time. The idea would be to eventually
> use this to implement much of the ARB robustness requirements instead of
> doing a bunch of request list processing.
> 
> Ben Widawsky (4):
>   drm/i915: Use HW watchdog for each batch
>   drm/i915: Turn on watchdog interrupts
>   drm/i915: Add a breadcrumb
>   drm/i915: Display the failing seqno

High-level drive-by review:

I think it's very important to separate hangs in the batch itself from
hangs in the ringbuffers, e.g. semaphore deadlocks. We should only blame
the former on the userspace-submitted batchbuffer. So on that ground I
think the following pseudo-code check in the hangcheck code would be
simpler to implement and more robust:

/* Check whether we're outside of the ring. At worst this check presumes
 * that the hang is in the ring, never the other way round. */
if (ACTHEAD != RING_HEAD)
	/* Check whether ACTHEAD lies in any active buffer. We can't check
	 * the object's gpu_domain, that might have been changed again
	 * already. */
	for_each_active_buffer(buffer)
		if (buffer->gtt_offset <= ACTHEAD &&
		    ACTHEAD < buffer->gtt_offset + buffer->size)
			goto found;
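
To make the range test concrete, here is a minimal standalone sketch of
the same check in plain C. It is only an illustration: struct
active_buffer, buffer_contains() and find_hung_batch() are hypothetical
names, not existing i915 code, and the active list is modelled as a
simple array. Like the pseudo-code, it compares ACTHEAD and RING_HEAD
directly as raw values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an active object; not the real i915 struct. */
struct active_buffer {
	uint32_t gtt_offset;	/* start of the buffer in the GTT */
	uint32_t size;		/* length in bytes */
};

/* True if acthd lies in [gtt_offset, gtt_offset + size). */
static bool buffer_contains(const struct active_buffer *buf, uint32_t acthd)
{
	/* Subtract before comparing so the test cannot overflow when the
	 * buffer sits at the very top of the address space. */
	return acthd >= buf->gtt_offset && acthd - buf->gtt_offset < buf->size;
}

/*
 * Returns the active buffer the GPU was executing in, or NULL if the
 * hang should be attributed to the ring. When nothing matches we fall
 * back to blaming the ring, never the batch.
 */
static const struct active_buffer *
find_hung_batch(const struct active_buffer *bufs, size_t count,
		uint32_t acthd, uint32_t ring_head)
{
	size_t i;

	if (acthd == ring_head)
		return NULL;

	for (i = 0; i < count; i++)
		if (buffer_contains(&bufs[i], acthd))
			return &bufs[i];

	return NULL;
}
```

The half-open interval means an ACTHEAD equal to gtt_offset + size is
treated as outside the buffer, so two adjacent buffers never both match.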

Otoh I don't think your patches here are a completely lost cause. This
render watchdog could be used rather well for a per-batch short timeout to
kill one specific batchbuffer I guess. But right now I don't see an
immediate use-case ...

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48
