From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Vetter
Subject: Re: [PATCH 2/4] drm/i915: Only slightly increment hangcheck score if we succesfully kick a ring
Date: Tue, 11 Jun 2013 16:05:41 +0200
Message-ID: <20130611140541.GX22870@phenom.ffwll.local>
References: <1370859622-16674-1-git-send-email-chris@chris-wilson.co.uk>
 <1370859622-16674-2-git-send-email-chris@chris-wilson.co.uk>
 <20130611094500.GV22870@phenom.ffwll.local>
 <20130611134019.GA32097@cantiga.alporthouse.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail-ea0-f181.google.com (mail-ea0-f181.google.com [209.85.215.181])
 by gabe.freedesktop.org (Postfix) with ESMTP id E1438E602D
 for ; Tue, 11 Jun 2013 07:05:46 -0700 (PDT)
Received: by mail-ea0-f181.google.com with SMTP id a15so3898709eae.40
 for ; Tue, 11 Jun 2013 07:05:46 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <20130611134019.GA32097@cantiga.alporthouse.com>
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Chris Wilson, Daniel Vetter, intel-gfx@lists.freedesktop.org, Ben Widawsky
List-Id: intel-gfx@lists.freedesktop.org

On Tue, Jun 11, 2013 at 02:40:19PM +0100, Chris Wilson wrote:
> On Tue, Jun 11, 2013 at 11:45:00AM +0200, Daniel Vetter wrote:
> > On Mon, Jun 10, 2013 at 11:20:20AM +0100, Chris Wilson wrote:
> > > +	if (ring->hangcheck.seqno == seqno) {
> > > +		if (ring_idle(ring, seqno)) {
> > > +			if (waitqueue_active(&ring->irq_queue)) {
> > > +				/* Issue a wake-up to catch stuck h/w. */
> > > +				DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
> > > +					  ring->name);
> > > +				wake_up_all(&ring->irq_queue);
> > > +				ring->hangcheck.score += HUNG;
> >
> > Not sure whether we want to hit missed interrupts this badly, it was
> > rather common a while back ;-) But we can fine-tune this easily now, so
> > no reservations for merging from my side.
>
> Not sure what you mean here. The check is fairly easy and has gotten us
> out of many a hole before, and makes for a good defense. So how would
> you want to fine tune it?

Something like the MI_WAIT hangcheck score, but like I've said, as long as
we don't have a real-world bug report (some poor guy disabled semaphores
maybe due to the snb issue?) it's not worth bothering at all. I've just
thought that if we're unlucky and miss the interrupt a few times in a row
we don't want to accidentally declare the GPU dead.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
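
To make the scoring concern above concrete, here is a rough, standalone
sketch of how a per-ring hangcheck score could accumulate when the same
missed-interrupt wake-up fires on consecutive timer ticks. It is not the
driver code: the struct, the function names, the numeric values and the
"reset the score on progress" rule are all assumptions made up for
illustration. Only the idea of adding a penalty when an idle ring still
has waiters comes from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; they are NOT taken from the patch. */
enum {
	SCORE_KICK = 5,   /* hypothetical small penalty for a missed interrupt */
	SCORE_HUNG = 20,  /* hypothetical large penalty for a ring making no progress */
	SCORE_FIRE = 30   /* hypothetical threshold above which the GPU is declared hung */
};

/* Hypothetical per-ring state, loosely modelled on ring->hangcheck. */
struct hangcheck {
	unsigned int seqno;
	int score;
};

/*
 * One hangcheck timer tick for one ring.
 * Returns true once the accumulated score exceeds SCORE_FIRE.
 */
static bool hangcheck_tick(struct hangcheck *hc, unsigned int seqno,
			   bool idle, bool waiters, int missed_irq_penalty)
{
	assert(hc != NULL);

	if (hc->seqno != seqno) {
		/* The ring made progress: remember it and start over (assumed). */
		hc->seqno = seqno;
		hc->score = 0;
		return false;
	}

	if (idle) {
		/* Idle ring with sleepers: the real code would wake them here. */
		if (waiters)
			hc->score += missed_irq_penalty;
	} else {
		/* Busy but stuck on the same seqno. */
		hc->score += SCORE_HUNG;
	}

	return hc->score > SCORE_FIRE;
}

/* Count consecutive missed-interrupt ticks until the ring counts as hung. */
static int ticks_until_hung(int missed_irq_penalty, int max_ticks)
{
	struct hangcheck hc = { .seqno = 1, .score = 0 };

	for (int t = 1; t <= max_ticks; t++) {
		if (hangcheck_tick(&hc, 1, true, true, missed_irq_penalty))
			return t;
	}
	return -1; /* never declared hung within max_ticks */
}
```

With these made-up numbers, charging the full SCORE_HUNG for each missed
interrupt crosses the threshold on the second consecutive tick, while
charging SCORE_KICK takes seven. That gap is the kind of headroom a
smaller, MI_WAIT-style increment would buy.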