From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wen Congyang
Subject: Re: question about ioreq server
Date: Fri, 11 Jul 2014 17:04:05 +0800
Message-ID: <53BFA885.3050002@cn.fujitsu.com>
References: <53BF7501.5090606@cn.fujitsu.com> <9AAE0902D5BC7E449B7C8E4E778ABCD03D4544@AMSPEX01CL01.citrite.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD03D4544@AMSPEX01CL01.citrite.net>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Paul Durrant, Jan Beulich, xen-devl
List-Id: xen-devel@lists.xenproject.org

At 07/11/2014 04:36 PM, Paul Durrant Wrote:
>> -----Original Message-----
>> From: Wen Congyang [mailto:wency@cn.fujitsu.com]
>> Sent: 11 July 2014 06:24
>> To: Paul Durrant; Jan Beulich; xen-devl
>> Subject: question about ioreq server
>>
>> Hi, all
>>
>> I am trying to rebase our colo codes to upstream recently.
>> I meet the following error in the test:
>> (XEN) Assertion 'consumer_is_xen(lchn)' failed at event_channel.c:1202
>> (XEN) ----[ Xen-4.5-unstable x86_64 debug=y Not tainted ]----
>> (XEN) CPU: 3
>> (XEN) RIP: e008:[] notify_via_xen_event_channel+0xac/0x11a
>> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
>> (XEN) rax: ffff830200d18000 rbx: ffff830200d19000 rcx: ffff830200d190b8
>> (XEN) rdx: ffff830200d18000 rsi: 0000000000000000 rdi: ffff830200d190bc
>> (XEN) rbp: ffff8302175ef398 rsp: ffff8302175ef378 r8: 00000000000c0000
>> (XEN) r9: 0000000000000004 r10: 0000000000020000 r11: 00000000f3044014
>> (XEN) r12: ffff830200d190b8 r13: 0000000000000000 r14: ffff830200d19000
>> (XEN) r15: 00000000f3044010 cr0: 0000000080050033 cr4: 00000000001526f0
>> (XEN) cr3: 0000000208e8b000 cr2: 0000000001b50004
>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>> (XEN) Xen stack trace from rsp=ffff8302175ef378:
>> (XEN)    ffff8302175ef3e8 ffff8302175ef438 ffff82c00020f000 ffff830208e397f0
>> (XEN)    ffff8302175ef3c8 ffff82d0801b5c43 ffff8302175ef438 0000000000000001
>> (XEN)    ffff82e00413dde0 0000000000000004 ffff8302175ef3e8 ffff82d0801b5cfe
>> (XEN)    0000000000000004 ffff83008a2bf000 ffff8302175ef488 ffff82d0801af231
>> (XEN)    0000000000000004 0000000000000010 ffff8302175ef4f8 00ff830000000004
>> (XEN)    0000000100000001 0000000000000000 0000000000000004 0000000200000010
>> (XEN)    00000000f3044010 0000000000000000 0000000400000001 0120000000000000
>> (XEN)    00000000000f3044 0000000000000004 0000000000000004 0000000000000010
>> (XEN)    ffff8302175efb88 ffff8302175ef980 ffff8302175ef4a8 ffff82d0801af3b1
>> (XEN)    ffff830200000000 ffff8302175ef980 ffff8302175ef538 ffff82d0801afe4e
>> (XEN)    ffff8302175ef980 ffff8302175ef550 ffff8302175ef4f0 ffff8302175ef4f8
>> (XEN)    ffff830200000002 00000001175ef508 ffff8302175ef510 00000000f3044010
>> (XEN)    0000000000000001 ffffd000211bf010 ffff8302175ef558 ffff8302175efb88
>> (XEN)    000000000000008b 0000000000000000 0000000000000000 ffff82d08027a320
>> (XEN)    ffff8302175ef548 ffff82d0801aff79 ffff8302175ef558 ffff82d08018d1db
>> (XEN)    ffff8302175efab8 ffff82d08018f2a6 ffff82d08018d1db ffff8302175efad0
>> (XEN)    ffff82d08018f700 ffffffffffd0d210 ffff8302175ef5d8 0000000000000018
>> (XEN)    0000000000000001 0000000000000000 0000000000000018 000000048027a300
>> (XEN)    ffff8302175ef588 ffff82d0801aff79 00000004175ef5d8 000000d08018d180
>> (XEN)    ffffd00000000001 000000080018f72a 00000002175e0000 ffffd00000000001
>> (XEN) Xen call trace:
>> (XEN)    [] notify_via_xen_event_channel+0xac/0x11a
>> (XEN)    [] hvm_send_assist_req_to_ioreq_server+0x132/0x14c
>> (XEN)    [] hvm_send_assist_req+0x3e/0x45
>> (XEN)    [] hvmemul_do_io+0x4dd/0x630
>> (XEN)    [] hvmemul_do_mmio+0x2d/0x2f
>> (XEN)    [] __hvmemul_read+0x227/0x29c
>> (XEN)    [] hvmemul_read+0x12/0x19
>> (XEN)    [] read_ulong+0xe/0x10
>> (XEN)    [] x86_emulate+0x1745/0xf8ef
>> (XEN)    [] hvm_emulate_one+0x15e/0x25e
>> (XEN)    [] handle_mmio+0x69/0x1f9
>> (XEN)    [] hvm_hap_nested_page_fault+0x28a/0x489
>> (XEN)    [] vmx_vmexit_handler+0x1446/0x17c7
>> (XEN)    [] vmx_asm_vmexit_handler+0x41/0xc0
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 3:
>> (XEN) Assertion 'consumer_is_xen(lchn)' failed at event_channel.c:1202
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>>
>> COLO is very similar as remus, except that secondary vm is running.
>> This problem happens in the slave side. The things I do in slave side:
>> 1. call libxl__xc_domain_restore_done() after migration to restore the
>>    secondary vm, and unpause it.
>> 2. do a new checkpoint: suspend secondary vm and update secondary vm's
>>    state.
>> 3. resume the secondary vm. The hypervisor crash now.
>>
>
> This suggests that emulator at the receiving end might have corrupted the event channel value stored in the shared ioreq page.
> The fact this triggered an assert is concerning though.

I don't know anything about the ioreq server, so I have some basic questions:
How does the event channel get corrupted? Does xc_hvm_param_set() affect it?
Or could some other hypercall corrupt it?

Thanks
Wen Congyang

>
>   Paul
>
>> I study ioreq server related codes now. I am also happy if anyone can help
>> me.
>>
>> Thanks
>> Wen Congyang
> .
>
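
Paul's suggestion, that the event channel value stored in the shared ioreq page was overwritten, can be illustrated with a small standalone C model. This is only a sketch under stated assumptions, not Xen source: every `toy_` name is hypothetical, the field name `vp_eport` is used purely for illustration, and the real check is assumed to behave roughly like the `consumer_is_xen(lchn)` assertion seen in the crash log. The model returns -1 where the hypervisor would hit the assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_PORTS 8

/* Toy event channel: only the property the assertion cares about. */
struct toy_evtchn {
    bool in_use;
    bool xen_consumer;   /* true if the hypervisor itself consumes it */
};

/* Toy stand-in for the ioreq page, which the emulator can also write. */
struct toy_shared_ioreq {
    uint32_t vp_eport;   /* illustrative field name (assumption) */
};

struct toy_domain {
    struct toy_evtchn chn[TOY_MAX_PORTS];
};

/* Bind a port whose consumer is the hypervisor. */
static void toy_bind_xen_port(struct toy_domain *d, uint32_t port)
{
    assert(port < TOY_MAX_PORTS);
    d->chn[port].in_use = true;
    d->chn[port].xen_consumer = true;
}

/* Bind an ordinary port whose consumer is not the hypervisor. */
static void toy_bind_guest_port(struct toy_domain *d, uint32_t port)
{
    assert(port < TOY_MAX_PORTS);
    d->chn[port].in_use = true;
    d->chn[port].xen_consumer = false;
}

static bool toy_consumer_is_xen(const struct toy_evtchn *chn)
{
    return chn->in_use && chn->xen_consumer;
}

/* Returns 0 on success, -1 where the real code would assert. */
static int toy_notify(const struct toy_domain *d, uint32_t port)
{
    if (port >= TOY_MAX_PORTS)
        return -1;
    if (!toy_consumer_is_xen(&d->chn[port]))
        return -1;
    return 0;
}

/* Variant A: trust whatever port number sits in the shared page. */
static int toy_send_trusting_shared_page(const struct toy_domain *d,
                                         const struct toy_shared_ioreq *p)
{
    return toy_notify(d, p->vp_eport);
}

/* Variant B: use a copy of the port that the emulator cannot modify. */
static int toy_send_using_private_copy(const struct toy_domain *d,
                                       uint32_t private_port)
{
    return toy_notify(d, private_port);
}
```

In this model, binding port 3 with `toy_bind_xen_port()` and leaving `vp_eport` at 3 makes variant A succeed. Once something rewrites `vp_eport` to a port bound with `toy_bind_guest_port()`, variant A fails the consumer check, while variant B is unaffected because it never reads the shared page. Whether the restore path or a hypercall such as xc_hvm_param_set() can cause such a rewrite in the real system is exactly the open question in this thread.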