From mboxrd@z Thu Jan 1 00:00:00 1970 From: Chris Wilson Subject: Re: [PATCH] drm/i915: don't bail out of intel_wait_ring_buffer too early Date: Tue, 18 Oct 2011 16:24:40 +0100 Message-ID: References: <1318353957-2601-1-git-send-email-daniel.vetter@ffwll.ch> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTP id F23DE9E819 for ; Tue, 18 Oct 2011 08:24:43 -0700 (PDT) In-Reply-To: <1318353957-2601-1-git-send-email-daniel.vetter@ffwll.ch> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: intel-gfx Cc: Daniel Vetter List-Id: intel-gfx@lists.freedesktop.org On Tue, 11 Oct 2011 19:25:57 +0200, Daniel Vetter wrote: > In the pre-gem days with non-existing hangcheck and gpu reset code, > this timeout of 3 seconds was pretty important to avoid stuck > processes. > > But now we have the hangcheck code in gem that goes to great length > to ensure that the gpu is really dead before declaring it wedged. > > So there's no need for this timeout anymore. Actually it's even harmful > because we can bail out too early (e.g. with xscreensaver slip) > when running giant batchbuffers. And our code isn't robust enough > to properly unroll any state-changes, we pretty much rely on the gpu > reset code cleaning up the mess (like cache tracking, fencing state, > active list/request tracking, ...). > > With this change intel_begin_ring can only fail when the gpu is > wedged, and it will return -EAGAIN (like wait_request in case the > gpu reset is still outstanding). > > v2: Chris Wilson noted that on resume timers aren't running and hence > we won't ever get kicked out of this loop by the hangcheck code. Use > an insanely large timeout instead for the HAS_GEM case to prevent > resume bugs from totally hanging the machine. > > Signed-off-by: Daniel Vetter Reviewed-by: Chris Wilson Personally I would have done: end = jiffies; if (has_gem) end += infinity; else end += 3s; but that is one nitpick too far ;-) -Chris -- Chris Wilson, Intel Open Source Technology Centre