From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Niggemann
Subject: Re: i915 irq storm mitigation in 3.10
Date: Mon, 22 Jul 2013 21:28:33 +0200
References: <2b828191fde71b2243f883e2dbb28d6d@hz6.de> <20130708200316.GG18285@phenom.ffwll.local> <20972.59257.635588.597578@linux-qknr.site>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from d119a3.x-mailer.de (d119a3.x-mailer.de [212.162.53.143])
 by gabe.freedesktop.org (Postfix) with ESMTP id 65411E5CE1 for ;
 Mon, 22 Jul 2013 12:28:35 -0700 (PDT)
In-Reply-To: <20972.59257.635588.597578@linux-qknr.site>
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Egbert Eich
Cc: intel-gfx
List-Id: intel-gfx@lists.freedesktop.org

Egbert, Daniel, others,

On 22.07.2013 10:04, Egbert Eich wrote:
> Daniel Vetter writes:
> > On Sun, Jul 21, 2013 at 10:23 PM, Jan Niggemann wrote:
> > >> But every time this happens we only let through a few interrupts, so this
> > >> shouldn't affect you badly. Can you please check whether those slowdowns
> > >> line up with 2 minute intervals?
> > >
> > > I observed these slowdowns for a couple of weeks now. On my machine, they
> > > only happen once, some minutes after a cold boot.
> > > They last for a minute or two, and then they are gone.
> > > I'd have guessed that the storm detection kicks in pretty quickly after a
> > > storm is detected and that it would go unnoticed.
> >
> > Hm, that sounds like something doesn't quite work as expected. We
> > should kill things once we get 5 interrupts or so in 1 second. So if
> > it's bad enough that it slows your machine down it really should only
> > be barely noticeable.
> >
>
> The logs show that the disable mechanism got triggered, so there was
> a storm that got detected.
> The respective message is generated by the worker, everything up to
> there (detection and marking disabled) seems to be fine.
> I bet we are still getting interrupts but the respective bit in
> hpd_event_bits doesn't get set any more. Since we unconditionally
> queue the worker on interrupt there is no surprise it is so busy.
>
> Then this points to the call to hpd_irq_setup() in intel_hpd_irq_handler()
> not doing what is expected, i.e. masking out the stormy interrupt.
> Could it be that we can't mask/disable an interrupt before ACKing it?
>
> @Jan, could you also specify what hardware you are using (i.e. give us
> an output of lspci -n)?

It's a Lenovo ThinkPad T400, the model is 7434-AG2.

root@muretop:~# lspci -n
00:00.0 0600: 8086:2a40 (rev 07)
00:02.0 0300: 8086:2a42 (rev 07)
00:02.1 0380: 8086:2a43 (rev 07)
00:03.0 0780: 8086:2a44 (rev 07)
00:19.0 0200: 8086:10f5 (rev 03)
00:1a.0 0c03: 8086:2937 (rev 03)
00:1a.1 0c03: 8086:2938 (rev 03)
00:1a.2 0c03: 8086:2939 (rev 03)
00:1a.7 0c03: 8086:293c (rev 03)
00:1b.0 0403: 8086:293e (rev 03)
00:1c.0 0604: 8086:2940 (rev 03)
00:1c.1 0604: 8086:2942 (rev 03)
00:1c.3 0604: 8086:2946 (rev 03)
00:1c.4 0604: 8086:2948 (rev 03)
00:1d.0 0c03: 8086:2934 (rev 03)
00:1d.1 0c03: 8086:2935 (rev 03)
00:1d.2 0c03: 8086:2936 (rev 03)
00:1d.7 0c03: 8086:293a (rev 03)
00:1e.0 0604: 8086:2448 (rev 93)
00:1f.0 0601: 8086:2917 (rev 03)
00:1f.2 0106: 8086:2929 (rev 03)
00:1f.3 0c05: 8086:2930 (rev 03)
03:00.0 0280: 8086:4237
15:00.0 0607: 1180:0476 (rev ba)

As to the log: I messed up the kernel parameters this morning... I was
out of coffee and my 1.5-year-old daughter was playing around me :-)

Here's my kernel log with drm.debug and printk.time enabled:
Uncompressed (22M): http://files.hz6.de/kern_20130722.log
bzip2'd (some 600 KB): http://files.hz6.de/kern_20130722.log.bz2

Regards
jan
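
For readers following the thread, the storm mitigation discussed above can be sketched as a toy model. This is not the i915 code: `HotplugPin`, `handle_irq` and both constants are hypothetical names, and the threshold is only taken from the "5 interrupts or so in 1 second" remark, so the driver's real values may differ. The sketch also illustrates the point raised in the quoted analysis: once a pin is marked disabled, the handler has to stop queueing the worker, otherwise the worker stays busy for as long as interrupts keep arriving.

```python
from dataclasses import dataclass

# Assumed values, taken from "5 interrupts or so in 1 second" in the thread;
# the real driver's constants may differ.
STORM_THRESHOLD = 5
STORM_WINDOW_S = 1.0


@dataclass
class HotplugPin:
    """Toy model of one hotplug pin (hypothetical, not the driver's struct)."""
    window_start: float = 0.0
    count: int = 0
    disabled: bool = False


def handle_irq(pin: HotplugPin, now: float) -> bool:
    """Account for one interrupt at time `now` (in seconds).

    Returns True if the deferred worker should be queued for this interrupt.
    """
    if pin.disabled:
        # A masked pin should not interrupt at all. If it still does,
        # do not queue more work for it.
        return False
    if now - pin.window_start > STORM_WINDOW_S:
        # The counting window has expired: start a new one.
        pin.window_start = now
        pin.count = 0
    pin.count += 1
    if pin.count > STORM_THRESHOLD:
        # Storm detected: mark the pin disabled. The worker is queued one
        # last time so it can report the storm and do the actual masking.
        pin.disabled = True
    return True


if __name__ == "__main__":
    pin = HotplugPin()
    # Ten interrupts, 0.1 s apart, all within one window.
    queued = [handle_irq(pin, 0.1 * i) for i in range(10)]
    print(queued, pin.disabled)
```

In this model, ten interrupts arriving 0.1 s apart queue the worker for the first six and drop the remaining four, while interrupts arriving two seconds apart never trip the detection. If the early return for a disabled pin were removed, every later interrupt would queue the worker again, which matches the busy worker described in the thread.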