From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Haigh
Subject: Re: Kernel 3.7.[12] - irq 16: nobody cared
Date: Wed, 16 Jan 2013 04:15:38 +1100
Message-ID: <50F58EBA.708@crc.id.au>
References: <50F4CC98.7090303@crc.id.au> <50F5828302000078000B5E06@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <50F5828302000078000B5E06@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

Hi Jan,

On 16/01/2013 2:23 AM, Jan Beulich wrote:
>>>> On 15.01.13 at 04:27, Steven Haigh wrote:
>> irq 16: nobody cared (try booting with the "irqpoll" option)
>> Pid: 0, comm: swapper/0 Not tainted 3.7.2-1.el6xen.x86_64 #1
>> Call Trace:
>> [] __report_bad_irq+0x3a/0xc6
>> [] note_interrupt+0x169/0x1e5
>> [] handle_irq_event_percpu+0x16e/0x1b6
>> [] handle_irq_event+0x38/0x54
>> [] handle_fasteoi_irq+0x88/0xd5
>> [] __xen_evtchn_do_upcall+0x15a/0x1f7
>> [] xen_evtchn_do_upcall+0x2f/0x42
>> [] xen_do_hypervisor_callback+0x1e/0x30
>> [] ? xen_hypercall_sched_op+0xa/0x20
>> [] ? xen_hypercall_sched_op+0xa/0x20
>> [] ? xen_hypercall_sched_op+0xa/0x20
>> [] ? xen_safe_halt+0x10/0x1a
>> [] ? default_idle+0x50/0x8a
>> [] ? cpu_idle+0xc0/0xff
>> [] ? rest_init+0x72/0x74
>> [] ? start_kernel+0x3b0/0x3bd
>> [] ? repair_env_string+0x58/0x58
>> [] ? x86_64_start_reservations+0xb8/0xbd
>> [] ? xen_start_kernel+0x4f2/0x4f4
>> handlers:
>> [] mv_interrupt [sata_mv]
>> Disabling IRQ #16
>>
>> I have tried booting with the irqpoll option on the kernel boot line,
>> but the same problem occurs.
>>
>> It seems disk throughput almost drops dead when this happens - as the
>> SATA controller seems to go into some different mode of operation.
>> It also seems like this has only happened recently - I was using builds of
>> 3.6.x as my Xen Dom0 kernel with no signs of this problem.
>>
>> Has anyone else seen this in recent kernel releases? I'm not quite sure
>> how to try and track this down.
>
> First of all, you'll want to clarify whether this problem is present
> _only_ when running under Xen, or also when running the same
> kernel without Xen underneath. This is primarily because the
> output you provided shows that IRQ 16 actually has a handler,
> just that it apparently ignores the interrupts (and that's nothing
> that Xen controls).

I'm not 100% sure how to do this. I haven't been able to find a method
to cause the problem to happen... It just does - and it seems random
when it does happen. Part of the problem with running the system without
the hypervisor in place is that I can't replicate any kind of workload
that would normally trigger the problem.

> Then, if this is a Xen-only problem, you will want to provide full
> hypervisor and kernel (boot) logs, the hypervisor one including
> debug key 'i' output, and the kernel one once with and once
> without Xen.
>
> Finally you'll want to clarify whether, when updating the kernel,
> you also updated the hypervisor (and if so, try the known good
> and known bad kernels on identical hypervisors).

I have been running Xen 4.2.1 for a while - and used multiple kernel
versions with it. Sadly, I don't have an archive of the RPMs that I used
(even though I built them!). I've only really noticed this happening in
the last month - when I've been running kernel 3.7.1+.

On the off chance today, I have moved the card from one 16x PCIe slot to
the second one on the mainboard. This has moved the card from IRQ16 to
IRQ19. As of yet, I haven't had the problem occur - however as it is a
seemingly random occurrence, there is no guarantee that the problem is
solved.
I've tried loading up the i/o by doing a resync of the RAID6 (of which,
2 drives are on the sata_mv card) as well as hammering i/o in the DomUs
(rather random stuff), but still no reliable way to force the problem to
occur :(

I'm open to any suggestions :)

--
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299