From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56023)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1VGraB-0007NE-4z
	for qemu-devel@nongnu.org; Tue, 03 Sep 2013 10:27:58 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1VGra3-0000rF-P7
	for qemu-devel@nongnu.org; Tue, 03 Sep 2013 10:27:51 -0400
Received: from david.siemens.de ([192.35.17.14]:23752)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1VGra3-0000r3-Ew
	for qemu-devel@nongnu.org; Tue, 03 Sep 2013 10:27:43 -0400
Message-ID: <5225F1D4.7020006@siemens.com>
Date: Tue, 03 Sep 2013 16:27:32 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <5225D839.8000700@siemens.com> <5225EAE3.8060202@siemens.com>
In-Reply-To: <5225EAE3.8060202@siemens.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [KVM] segmentation fault happened when reboot VM after hot-uplug virtio NIC
To: "Zhanghaoyu (A)"
Cc: "Huangweidong (C)", Gleb Natapov, "Michael S. Tsirkin", "paolo.bonzini@gmail.com", Luonengjun, qemu-devel

On 2013-09-03 15:57, Jan Kiszka wrote:
> On 2013-09-03 15:22, Zhanghaoyu (A) wrote:
>>>> Hi, all
>>>>
>>>> Segmentation fault happened when reboot VM after hot-unplug virtio NIC, which can be reproduced 100%.
>>>> See similar bug report to
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=988256
>>>>
>>>> test environment:
>>>> host: SLES11SP2 (kernel version: 3.0.58)
>>>> qemu: 1.5.1, upstream-qemu (commit
>>>> 545825d4cda03ea292b7788b3401b99860efe8bc)
>>>> libvirt: 1.1.0
>>>> guest os: win2k8 R2 x64bit or sles11sp2 x64 or win2k3 32bit
>>>>
>>>> You can reproduce this problem by following steps:
>>>> 1. start a VM with virtio NIC(s)
>>>> 2. hot-unplug a virtio NIC from the VM
>>>> 3. reboot the VM, then segmentation fault happened during starting period
>>>>
>>>> the qemu backtrace shown as below:
>>>> #0  0x00007ff4be3288d0 in __memcmp_sse4_1 () from /lib64/libc.so.6
>>>> #1  0x00007ff4c07f82c0 in patch_hypercalls (s=0x7ff4c15dd610)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:549
>>>> #2  0x00007ff4c07f84f0 in vapic_prepare (s=0x7ff4c15dd610)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:614
>>>> #3  0x00007ff4c07f85e7 in vapic_write (opaque=0x7ff4c15dd610, addr=0, data=32, size=2)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:651
>>>> #4  0x00007ff4c082a917 in memory_region_write_accessor (opaque=0x7ff4c15df938, addr=0, value=0x7ff4bbfe3d00, size=2,
>>>>     shift=0, mask=65535)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:334
>>>> #5  0x00007ff4c082a9ee in access_with_adjusted_size (addr=0, value=0x7ff4bbfe3d00, size=2, access_size_min=1,
>>>>     access_size_max=4, access=0x7ff4c082a89a, opaque=0x7ff4c15df938)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:364
>>>> #6  0x00007ff4c082ae49 in memory_region_iorange_write (iorange=0x7ff4c15dfca0, offset=0, width=2, data=32)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:439
>>>> #7  0x00007ff4c08236f7 in ioport_writew_thunk (opaque=0x7ff4c15dfca0, addr=126, data=32)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:219
>>>> #8  0x00007ff4c0823078 in ioport_write (index=1, address=126, data=32)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:83
>>>> #9  0x00007ff4c0823ca9 in cpu_outw (addr=126, val=32)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:296
>>>> #10 0x00007ff4c0827485 in kvm_handle_io (port=126, data=0x7ff4c0510000, direction=1, size=2, count=1)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/kvm-all.c:1485
>>>> #11 0x00007ff4c0827e14 in kvm_cpu_exec (env=0x7ff4c15bf270)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/kvm-all.c:1634
>>>> #12 0x00007ff4c07b6f27 in qemu_kvm_cpu_thread_fn (arg=0x7ff4c15bf270)
>>>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/cpus.c:759
>>>> #13 0x00007ff4be58af05 in start_thread () from /lib64/libpthread.so.0
>>>> #14 0x00007ff4be2cd53d in clone () from /lib64/libc.so.6
>>>>
>>>> If I apply below patch to the upstream qemu, this problem will
>>>> disappear,
>>>> ---
>>>>  hw/i386/kvmvapic.c | 6 +++---
>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
>>>> index 15beb80..6fff299 100644
>>>> --- a/hw/i386/kvmvapic.c
>>>> +++ b/hw/i386/kvmvapic.c
>>>> @@ -652,11 +652,11 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t data,
>>>>      switch (size) {
>>>>      case 2:
>>>>          if (s->state == VAPIC_INACTIVE) {
>>>> -            rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
>>>> -            s->rom_state_paddr = rom_paddr + data;
>>>> -
>>>>              s->state = VAPIC_STANDBY;
>>>>          }
>>>> +        rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
>>>> +        s->rom_state_paddr = rom_paddr + data;
>>>> +
>>>>          if (vapic_prepare(s) < 0) {
>>>>              s->state = VAPIC_INACTIVE;
>>>>              break;
>>>
>>> Yes, we need to update the ROM's physical address after the BIOS reshuffled the layout.
>>>
>>> But I'm not happy with simply updating the address unconditionally. We need to understand the crash first, then make QEMU robust against the guest not issuing this initial write after a ROM region layout change.
>>> And finally make it work properly in the normal case.
>>>
>> The direct cause of crash is trying to access invalid address, which is due to not updating the rom's physical address.
>> In my opinion, since hot-plug/unplug involved in, we need to re-calculate rom's physical address for all devices which have rom during starting period when reboot/reset vm,
>> is it reasonable to set vapic's state to VAPIC_INACTIVE during vapic's reset?
>
> I checked meanwhile, and it should be sufficient to make the VAPIC state
> working again under sane conditions.
>
> Still, this is not yet satisfying /wrt what *precisely* went wrong when
> the physical address became incorrect, and if a malicious guest could
> exploit this (e.g. by invoking the 16-bit write from an address outside
> the VAPIC ROM).

OK, this is the unfortunate chain of events: rom_state_paddr becomes
invalid, so vapic_map_rom_writable, called by vapic_prepare during the
16-bit write, accesses an invalid region to grab the ROM size from the
ROM image. This size becomes 0, and the g_malloc in patch_hypercalls
returns NULL as we try to allocate 0 bytes. So memcmp uses a NULL
pointer as base for walking the ROM image. I'll send a separate patch to
catch this case.

Other invalid ROM size values cannot cause harm. At most, 128K is
allocated, and random guest memory is read.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
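
For illustration only, the chain of events described above (a ROM size of 0, an allocator that returns NULL for 0 bytes, then memcmp walking from that NULL base) can be sketched in plain C. This is not the QEMU code and not the announced patch: alloc_like_g_malloc, find_pattern_in_rom and ROM_SIZE_MAX are hypothetical names, and the 128K bound is simply taken from the remark above. The sketch only shows where a size check would have to sit so that the NULL buffer is never scanned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound, borrowed from the "at most 128K" remark. */
#define ROM_SIZE_MAX (128 * 1024)

/* Hypothetical helper mimicking g_malloc(): NULL when asked for 0 bytes. */
static void *alloc_like_g_malloc(size_t n)
{
    return n == 0 ? NULL : malloc(n);
}

/*
 * Hypothetical stand-in for a ROM scan: copy the ROM image and search the
 * copy for `pattern`.
 * Returns the offset of the first match, -1 if there is no match, and -2 if
 * rom_size is rejected or the allocation fails.
 */
static long find_pattern_in_rom(const unsigned char *rom, size_t rom_size,
                                const unsigned char *pattern,
                                size_t pattern_len)
{
    assert(pattern != NULL && pattern_len > 0);

    /*
     * The guard: without it, rom_size == 0 would give a NULL copy below,
     * and the memcmp() loop would start walking from a NULL base.
     */
    if (rom_size == 0 || rom_size > ROM_SIZE_MAX || rom_size < pattern_len) {
        return -2;
    }

    unsigned char *copy = alloc_like_g_malloc(rom_size);
    if (copy == NULL) {
        return -2;
    }
    memcpy(copy, rom, rom_size);

    long found = -1;
    for (size_t pos = 0; pos + pattern_len <= rom_size; pos++) {
        if (memcmp(copy + pos, pattern, pattern_len) == 0) {
            found = (long)pos;
            break;
        }
    }

    free(copy);
    return found;
}
```

Writing the loop bound as pos + pattern_len <= rom_size rather than pos < rom_size - pattern_len also avoids an unsigned wrap-around when the size is smaller than the pattern, which is a second reason to reject such sizes up front.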