From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5225EAE3.8060202@siemens.com>
Date: Tue, 03 Sep 2013 15:57:55 +0200
From: Jan Kiszka
References: <5225D839.8000700@siemens.com>
Subject: Re: [Qemu-devel] [KVM] segmentation fault when rebooting VM after hot-unplug of virtio NIC
To: "Zhanghaoyu (A)"
Cc: "Huangweidong (C)", Gleb Natapov, "Michael S. Tsirkin", "paolo.bonzini@gmail.com", Luonengjun, qemu-devel

On 2013-09-03 15:22, Zhanghaoyu (A) wrote:
>>> Hi, all
>>>
>>> A segmentation fault happens when rebooting a VM after hot-unplugging a virtio NIC; it can be reproduced 100% of the time.
>>> See the similar bug report at
>>> https://bugzilla.redhat.com/show_bug.cgi?id=988256
>>>
>>> Test environment:
>>> host: SLES11SP2 (kernel version: 3.0.58)
>>> qemu: 1.5.1, upstream qemu (commit 545825d4cda03ea292b7788b3401b99860efe8bc)
>>> libvirt: 1.1.0
>>> guest OS: win2k8 R2 x64, sles11sp2 x64, or win2k3 32bit
>>>
>>> You can reproduce the problem with the following steps:
>>> 1. start a VM with virtio NIC(s)
>>> 2. hot-unplug a virtio NIC from the VM
>>> 3. reboot the VM; the segmentation fault then happens during startup
>>>
>>> The qemu backtrace is shown below:
>>> #0  0x00007ff4be3288d0 in __memcmp_sse4_1 () from /lib64/libc.so.6
>>> #1  0x00007ff4c07f82c0 in patch_hypercalls (s=0x7ff4c15dd610) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:549
>>> #2  0x00007ff4c07f84f0 in vapic_prepare (s=0x7ff4c15dd610) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:614
>>> #3  0x00007ff4c07f85e7 in vapic_write (opaque=0x7ff4c15dd610, addr=0, data=32, size=2) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:651
>>> #4  0x00007ff4c082a917 in memory_region_write_accessor (opaque=0x7ff4c15df938, addr=0, value=0x7ff4bbfe3d00, size=2, shift=0, mask=65535) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:334
>>> #5  0x00007ff4c082a9ee in access_with_adjusted_size (addr=0, value=0x7ff4bbfe3d00, size=2, access_size_min=1, access_size_max=4, access=0x7ff4c082a89a, opaque=0x7ff4c15df938) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:364
>>> #6  0x00007ff4c082ae49 in memory_region_iorange_write (iorange=0x7ff4c15dfca0, offset=0, width=2, data=32) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:439
>>> #7  0x00007ff4c08236f7 in ioport_writew_thunk (opaque=0x7ff4c15dfca0, addr=126, data=32) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:219
>>> #8  0x00007ff4c0823078 in ioport_write (index=1, address=126, data=32) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:83
>>> #9  0x00007ff4c0823ca9 in cpu_outw (addr=126, val=32) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:296
>>> #10 0x00007ff4c0827485 in kvm_handle_io (port=126, data=0x7ff4c0510000, direction=1, size=2, count=1) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/kvm-all.c:1485
>>> #11 0x00007ff4c0827e14 in kvm_cpu_exec (env=0x7ff4c15bf270) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/kvm-all.c:1634
>>> #12 0x00007ff4c07b6f27 in qemu_kvm_cpu_thread_fn (arg=0x7ff4c15bf270) at /mnt/zhanghaoyu/qemu/qemu-1.5.1/cpus.c:759
>>> #13 0x00007ff4be58af05 in start_thread () from /lib64/libpthread.so.0
>>> #14 0x00007ff4be2cd53d in clone () from /lib64/libc.so.6
>>>
>>> If I apply the patch below to upstream qemu, the problem disappears:
>>> ---
>>>  hw/i386/kvmvapic.c | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
>>> index 15beb80..6fff299 100644
>>> --- a/hw/i386/kvmvapic.c
>>> +++ b/hw/i386/kvmvapic.c
>>> @@ -652,11 +652,11 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t data,
>>>      switch (size) {
>>>      case 2:
>>>          if (s->state == VAPIC_INACTIVE) {
>>> -            rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
>>> -            s->rom_state_paddr = rom_paddr + data;
>>> -
>>>              s->state = VAPIC_STANDBY;
>>>          }
>>> +        rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
>>> +        s->rom_state_paddr = rom_paddr + data;
>>> +
>>>          if (vapic_prepare(s) < 0) {
>>>              s->state = VAPIC_INACTIVE;
>>>              break;
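
For readability, with the quoted hunk applied the 16-bit case of vapic_write() ends up roughly like this (a sketch based only on the context visible in the diff; the rest of the function is abbreviated):

    switch (size) {
    case 2:
        if (s->state == VAPIC_INACTIVE) {
            s->state = VAPIC_STANDBY;
        }
        /* Now done unconditionally: re-derive the ROM physical address from
         * the CS:IP of the current activation write, so a value left over
         * from before a reboot is refreshed before vapic_prepare() and
         * patch_hypercalls() dereference it. */
        rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
        s->rom_state_paddr = rom_paddr + data;

        if (vapic_prepare(s) < 0) {
            s->state = VAPIC_INACTIVE;
            break;
        }
        /* ... remainder of the 2-byte case unchanged ... */
        break;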
>> Yes, we need to update the ROM's physical address after the BIOS has reshuffled the layout.
>>
>> But I'm not happy with simply updating the address unconditionally. We need to understand the crash first, then make QEMU robust against the guest not issuing this initial write after a ROM region layout change.
>> And finally make it work properly in the normal case.
>>
> The direct cause of the crash is an access to an invalid address, which results from the ROM's physical address not being updated.
> In my opinion, since hot-plug/unplug is involved, we need to recalculate the ROM physical address for all devices that have a ROM during startup when the VM is rebooted or reset.
> Is it reasonable to set the VAPIC's state to VAPIC_INACTIVE during the VAPIC's reset?

I checked in the meantime, and it should be sufficient to get the VAPIC state working again under sane conditions. Still, this is not yet satisfying w.r.t. what *precisely* went wrong when the physical address became incorrect, and whether a malicious guest could exploit this (e.g. by issuing the 16-bit write from an address outside the VAPIC ROM).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
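
A minimal sketch of the reset idea discussed above, using the generic qemu_register_reset() hook rather than the device's actual reset path; the type and field names (VAPICROMState, state, rom_state_paddr) are taken from kvmvapic.c as seen in the quoted diff and backtrace, while the handler name and the zeroing of rom_state_paddr are illustrative only:

    /* Registered once at device init time, e.g.:
     *     qemu_register_reset(vapic_system_reset, s);
     */
    static void vapic_system_reset(void *opaque)
    {
        VAPICROMState *s = opaque;

        /* Force the guest to re-activate the VAPIC after a reboot: the next
         * 16-bit activation write then re-derives rom_state_paddr from the
         * current CS:IP, so vapic_prepare() no longer dereferences a stale
         * ROM address. */
        s->state = VAPIC_INACTIVE;
        s->rom_state_paddr = 0;
    }

One could additionally verify that the CS:IP issuing the activation write actually lies inside the VAPIC option ROM before trusting it, which is the kind of hardening the exploit concern above asks about; that is a separate question from the reset handling.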