From mboxrd@z Thu Jan  1 00:00:00 1970
From: Egbert Eich
Subject: Re: i915 irq storm mitigation in 3.10
Date: Mon, 22 Jul 2013 10:04:09 +0200
Message-ID: <20972.59257.635588.597578@linux-qknr.site>
References: <2b828191fde71b2243f883e2dbb28d6d@hz6.de>
	<20130708200316.GG18285@phenom.ffwll.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15])
	by gabe.freedesktop.org (Postfix) with ESMTP id 0BB5EE5F0C
	for ; Mon, 22 Jul 2013 01:04:14 -0700 (PDT)
In-Reply-To: daniel@ffwll.ch wrote on Sunday, 21 July 2013 at 22:43:02 +0200
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Daniel Vetter
Cc: Egbert Eich , intel-gfx , Jan Niggemann
List-Id: intel-gfx@lists.freedesktop.org

Daniel Vetter writes:
 > On Sun, Jul 21, 2013 at 10:23 PM, Jan Niggemann wrote:
 > >> But every time this happens we only let through a few interrupts, so this
 > >> shouldn't affect you badly. Can you please check whether those slowdowns
 > >> line up with 2 minute intervals?
 > >
 > > I observed these slowdowns for a couple of weeks now. On my machine, they
 > > only happen once, some minutes after a cold boot.
 > > They last for a minute or two, and then they are gone.
 > > I'd have guessed that the storm detection kicks in pretty quickly after a
 > > storm is detected and that it would go unnoticed.
 >
 > Hm, that sounds like something doesn't quite work as expected. We
 > should kill things once we get 5 interrupts or so in 1 second. So if
 > it's bad enough that it slows your machine down it really should only
 > be barely noticeable.
 >

The logs show that the disable mechanism got triggered, so there was a
storm that got detected.
The respective message is generated by the worker, so everything up to
there (detection and marking disabled) seems to be fine.
I bet we are still getting interrupts but the respective bit in
hpd_event_bits doesn't get set any more. Since we unconditionally queue
the worker on every interrupt, it is no surprise that it is so busy.

This points to the call to hpd_irq_setup() in intel_hpd_irq_handler()
not doing what is expected, i.e. masking out the stormy interrupt.
Could it be that we can't mask/disable an interrupt before ACKing it?

@Jan, could you also specify what hardware you are using (i.e. give us
the output of lspci -n)?

Cheers,
	Egbert.
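
PS: To make the suspicion above easier to discuss, here is a minimal
user-space sketch in C of the behaviour as I understand it from this
thread. It is a model, not the driver code: the struct layout, the pin
count, the field names (event_bits, hw_enable_mask, worker_queued) and
the exact threshold comparison are assumptions for illustration. The
threshold and period simply encode the "5 interrupts or so in 1 second"
figure quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, taken from "5 interrupts or so in 1 second". */
#define STORM_THRESHOLD 5
#define STORM_PERIOD_MS 1000
#define NUM_PINS        8    /* hypothetical pin count */

enum hpd_state { HPD_ENABLED, HPD_MARK_DISABLED, HPD_DISABLED };

struct hpd_pin {
    enum hpd_state state;
    uint64_t window_start_ms;  /* start of the current counting window */
    unsigned count;            /* interrupts seen in that window */
};

/* Hypothetical device model; only the idea mirrors the discussion. */
struct hpd_dev {
    struct hpd_pin pins[NUM_PINS];
    uint32_t event_bits;       /* pins the worker should look at */
    uint32_t hw_enable_mask;   /* model of the hardware enable bits */
    unsigned worker_queued;    /* how often the worker was queued */
};

/* Reprogram the (modelled) hardware mask from the per-pin state. */
static void hpd_irq_setup(struct hpd_dev *dev)
{
    uint32_t mask = 0;

    for (int i = 0; i < NUM_PINS; i++)
        if (dev->pins[i].state == HPD_ENABLED)
            mask |= 1u << i;
    dev->hw_enable_mask = mask;
}

/* Model of the handler: count per pin, mark stormy pins as disabled. */
static void hpd_irq_handler(struct hpd_dev *dev, uint32_t trigger,
                            uint64_t now_ms)
{
    bool storm_detected = false;

    assert(dev != NULL);

    for (int i = 0; i < NUM_PINS; i++) {
        struct hpd_pin *p = &dev->pins[i];

        /* Pins that did not fire or are no longer enabled set no bit. */
        if (!(trigger & (1u << i)) || p->state != HPD_ENABLED)
            continue;

        dev->event_bits |= 1u << i;
        if (now_ms - p->window_start_ms >= STORM_PERIOD_MS) {
            /* Window expired: start counting afresh. */
            p->window_start_ms = now_ms;
            p->count = 0;
        } else if (p->count > STORM_THRESHOLD) {
            /* Storm: mark the pin and withdraw its event bit. */
            p->state = HPD_MARK_DISABLED;
            dev->event_bits &= ~(1u << i);
            storm_detected = true;
        } else {
            p->count++;
        }
    }

    if (storm_detected)
        hpd_irq_setup(dev);

    /* The worker is queued unconditionally, even with no event bits. */
    dev->worker_queued++;
}
```

In this model, seven interrupts on one pin inside the window mark the
pin as disabled and drop it from hw_enable_mask. If the hardware keeps
delivering that interrupt anyway (the case I suspect), every further
call still bumps worker_queued while event_bits stays 0, which would
match a busy worker that has nothing to do.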