From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Hypervisor memory leak/corruption because of guest irqs
Date: Fri, 7 Sep 2012 19:04:15 +0100
Message-ID: <504A371F.7000808@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------040106020207030903020401"
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "xen-devel@lists.xen.org", Keir Fraser, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

--------------040106020207030903020401
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit

Hello,

I appear to have opened a can of worms here. This was discovered when
turning on ASSERT()s, looking for another crash (and adding in 98 new
ASSERTs, along with some 'type checking' magic constants).

The issue has been discovered against Xen 4.1.3, but a cursory
inspection of unstable shows no signs of it being fixed. The relevant
assertion (which I added) is attached. (In the process of debugging, I
also developed the ASSERT_PRINTK(bool, fmt, ...) macro which will be
upstreamed in due course.)

The root cause of the problem is the complete abuse of the
irq_desc->action pointer, which is cast to an irq_guest_action_t* when
it is declared as an irqaction*, but the (correct) solution is not
easy.

destroy_irq() calls dynamic_irq_cleanup(), which xfree()s desc->action.
This would be all well and fine if it were only an irqaction pointer.
However, in this case, it is actually an irq_guest_action_t pointer,
meaning that we have freed an inactive timer which is still on a pcpu's
inactive timer linked list. This means that as soon as the freed memory
is reused for something new, the linked list gets trashed, at which
point all bets are off with regard to the validity of hypervisor
memory.

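To make the hazard described above concrete, here is a minimal sketch in
plain C. It is a toy model under stated assumptions, not Xen code: every
name in it (toy_irq_desc, toy_guest_action, toy_free_would_corrupt and so
on) is hypothetical, and the embedded list node merely stands in for the
timer inside a guest action. It shows a descriptor whose action pointer
really refers to a larger guest structure that is linked onto an
"inactive" list, a check for whether freeing it would leave a dangling
list entry, and one possible way of unlinking before the free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: every name below is hypothetical, not a Xen definition. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *n) { n->prev = n->next = n; }

static void list_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);                  /* leave the node self-linked */
}

static size_t list_len(const struct list_node *head)
{
    size_t len = 0;
    for (const struct list_node *p = head->next; p != head; p = p->next)
        len++;
    return len;
}

/* What generic code believes desc->action points to. */
struct toy_irqaction { const char *name; };

/* What the guest path actually stores there in this model. */
struct toy_guest_action {
    int nr_guests;
    struct list_node timer_link;   /* stands in for the embedded timer */
};

struct toy_irq_desc {
    struct toy_irqaction *action;  /* may really be a toy_guest_action */
    bool is_guest;
};

/* Bind a guest action: allocate it, link its "timer", store it via a cast. */
static void toy_bind_guest(struct toy_irq_desc *desc, struct list_node *inactive)
{
    struct toy_guest_action *ga = calloc(1, sizeof(*ga));
    assert(ga != NULL);
    ga->nr_guests = 1;
    list_init(&ga->timer_link);
    list_add(&ga->timer_link, inactive);
    desc->action = (struct toy_irqaction *)ga;
    desc->is_guest = true;
}

/* True if freeing desc->action now would leave a dangling list entry. */
static bool toy_free_would_corrupt(const struct toy_irq_desc *desc)
{
    if (!desc->is_guest || desc->action == NULL)
        return false;
    const struct toy_guest_action *ga =
        (const struct toy_guest_action *)desc->action;
    return ga->timer_link.next != &ga->timer_link;
}

/* Unlink the embedded node first, then free; a bare free() would not. */
static void toy_destroy_irq(struct toy_irq_desc *desc)
{
    if (toy_free_would_corrupt(desc))
        list_del(&((struct toy_guest_action *)desc->action)->timer_link);
    free(desc->action);
    desc->action = NULL;
    desc->is_guest = false;
}
```

In this model, calling free(desc->action) directly after toy_bind_guest()
would leave the list head pointing into freed memory, which is the
analogue of the corruption described above. toy_destroy_irq() avoids that
only because it knows the real type behind the pointer, which is exactly
the knowledge the generic cleanup path lacks.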
As far as I can tell, this bug only manifests in combination with PCI
Passthrough, as we only perform cleanup of guest irqs when a domain
with passthrough is shut down. The issue was first found by the
ASSERT()s in __list_del(), when something tried to use the pcpu
inactive timer list after the freed memory was reused.

In this specific case, a quick and dirty hack would be to check every
time we free an action, and kill the timer if it is a guest irq.

Having said that, it is not a correct fix; the utter abuse of
irq_desc->action has been a ticking timebomb for a long time.
irq_guest_action_t is private to the second half of the x86 irq.c,
whereas irqaction is common and architecture independent. The only
acceptable solution I see is to re-architect a substantial proportion
of the irq code.

Am I missing something obvious, or is this the best way to continue (in
which case I have my work cut out, as it is currently affecting
XenServer customers)?

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

--------------040106020207030903020401
Content-Type: text/x-log; name="hypervisor-panic.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="hypervisor-panic.log"

(XEN) [2012-09-07 11:35:18] Assertion '! (t->status == 1 || t->status == 3 || t->status == 4)' failed, line 229, file irq.c
(XEN) [2012-09-07 11:35:18] will free action with timer in status 1
(XEN) [2012-09-07 11:35:18] Xen BUG at irq.c:229
(XEN) [2012-09-07 11:35:18] ----[ Xen-4.1.3 x86_64 debug=n Tainted: C ]----
(XEN) [2012-09-07 11:35:18] CPU: 0
(XEN) [2012-09-07 11:35:18] RIP: e008:[] destroy_irq+0x1ae/0x210
(XEN) [2012-09-07 11:35:18] RFLAGS: 0000000000010082 CONTEXT: hypervisor
(XEN) [2012-09-07 11:35:18] rax: 0000000000000000 rbx: ffff83083fe04800 rcx: 0000000000000000
(XEN) [2012-09-07 11:35:19] rdx: 000000000000000a rsi: 000000000000000a rdi: ffff82c4802662c4
(XEN) [2012-09-07 11:35:19] rbp: ffff830993cd77f0 rsp: ffff82c4802b7d88 r8: 0000000000000000
(XEN) [2012-09-07 11:35:19] r9: 0000000000000000 r10: 00000000ffffffff r11: ffff82c480141560
(XEN) [2012-09-07 11:35:19] r12: 000000000000008f r13: ffff83083fe04834 r14: 0000000000000286
(XEN) [2012-09-07 11:35:19] r15: ffff830993cd7810 cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) [2012-09-07 11:35:19] cr3: 00000008369ae000 cr2: 00000000e9925e68
(XEN) [2012-09-07 11:35:19] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0000 cs: e008
(XEN) [2012-09-07 11:35:19] Xen stack trace from rsp=ffff82c4802b7d88:
(XEN) [2012-09-07 11:35:19] 0000000000000000 ffff8303c40757d0 ffff830895660000 000000000000004f
(XEN) [2012-09-07 11:35:19] ffff8303c40757d0 000000000000008f ffff83083fe04834 ffff82c480169647
(XEN) [2012-09-07 11:35:20] ffff8303c40757d0 ffff830895660000 000000000000004f ffff83083fe04800
(XEN) [2012-09-07 11:35:20] 000000000000008f ffff82c48016e491 000000000000004f 000000000000013c
(XEN) [2012-09-07 11:35:20] 000000000000008f ffff830895660000 000000000000004f 000000000000013c
(XEN) [2012-09-07 11:35:20] ffff830895660188 ffff82c4802d4600 0000000000000000 ffff82c48016e71a
(XEN) [2012-09-07 11:35:20] ffff830895660000 ffff8300be6ac000 ffff830895660000 00000000ffffffff
(XEN) [2012-09-07 11:35:20] ffff830895660000 ffff82c48015efb2 00000000ffffffff ffff8300be6ac000
(XEN) [2012-09-07 11:35:20] fffffffffffffff8 ffff82c480104a80 ffff82c4802f4060 0000000000000000
(XEN) [2012-09-07 11:35:20] 0000000000000001 ffff82c4802f4320 ffff82c4802d0600 ffff82c480130191
(XEN) [2012-09-07 11:35:20] 0000000000000000 ffffffffffffffff ffff82c4802b7f18 ffff82c480126f85
(XEN) [2012-09-07 11:35:20] ffff8300bf2d8000 00000000e98dde14 0000000000000000 0000000000000000
(XEN) [2012-09-07 11:35:21] 0000000000000000 ffff82c480223b26 0000000000000000 0000000000000000
(XEN) [2012-09-07 11:35:21] 0000000000000000 0000000000000000 00000000e98dde14 00000000e913584c
(XEN) [2012-09-07 11:35:21] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2012-09-07 11:35:21] 0000000000000008 00000000e9135878 0000000000000000 00000000e9135854
(XEN) [2012-09-07 11:35:21] 00000000e98ddec4 000000f900000000 00000000c01d68f9 0000000000000061
(XEN) [2012-09-07 11:35:21] 0000000000000202 00000000e98dde0c 0000000000000069 0000000000000000
(XEN) [2012-09-07 11:35:21] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2012-09-07 11:35:21] ffff8300bf2d8000 0000000000000000 0000000000000000
(XEN) [2012-09-07 11:35:21] Xen call trace:
(XEN) [2012-09-07 11:35:21] [] destroy_irq+0x1ae/0x210
(XEN) [2012-09-07 11:35:22] [] msi_free_irq+0x27/0x210
(XEN) [2012-09-07 11:35:22] [] unmap_domain_pirq+0x181/0x3b0
(XEN) [2012-09-07 11:35:22] [] free_domain_pirqs+0x5a/0x90
(XEN) [2012-09-07 11:35:22] [] arch_domain_destroy+0x32/0x330
(XEN) [2012-09-07 11:35:22] [] complete_domain_destroy+0x80/0x150
(XEN) [2012-09-07 11:35:22] [] rcu_process_callbacks+0xa1/0x200
(XEN) [2012-09-07 11:35:22] [] __do_softirq+0x65/0x90
(XEN) [2012-09-07 11:35:22] [] compat_process_softirqs+0x6/0x10
(XEN) [2012-09-07 11:35:22]
(XEN) [2012-09-07 11:35:22]
(XEN) [2012-09-07 11:35:22] ****************************************
(XEN) [2012-09-07 11:35:22] Panic on CPU 0:
(XEN) [2012-09-07 11:35:22] Xen BUG at irq.c:229
(XEN) [2012-09-07 11:35:22] ****************************************
(XEN) [2012-09-07 11:35:23]
(XEN) [2012-09-07 11:35:23] Manual reset required ('noreboot' specified)

--------------040106020207030903020401
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--------------040106020207030903020401--