From: Alex Williamson
To: Avi Kivity
Cc: qemu-devel, KVM list, Juan Quintela
Subject: [Qemu-devel] Re: Migration issues in qemu.git
Date: Mon, 02 Aug 2010 07:12:00 -0600
Message-ID: <1280754720.6598.10.camel@x201>
In-Reply-To: <4C5692FD.80808@redhat.com>
References: <4C568A85.9040500@redhat.com> <4C5692FD.80808@redhat.com>

On Mon, 2010-08-02 at 12:42 +0300, Avi Kivity wrote:
> On 08/02/2010 12:06 PM, Avi Kivity wrote:
> > I'm hitting some migration issues merging qemu.git into qemu-kvm.git:
> >
> > 1. Crash in mig_cancel test:
> >
> > (gdb) bt
> > #0  0x0000003a91c83dbb in memcpy () from /lib64/libc.so.6
> > #1  0x000000000049c2ff in qemu_get_buffer (f=0x302d870,
> >     buf=<value optimized out>, size1=4096)
> >     at /usr/include/bits/string3.h:52
> > #2  0x0000000000409464 in ram_load (f=0x302d870,
> >     opaque=<value optimized out>, version_id=4)
> >     at /build/home/tlv/akivity/qemu-kvm/arch_init.c:407
> > #3  0x000000000049cb4c in qemu_loadvm_state (f=0x302d870)
> >     at savevm.c:1708
> > #4  0x0000000000494169 in process_incoming_migration
> >     (f=<value optimized out>) at migration.c:63
> > #5  0x0000000000494517 in tcp_accept_incoming_migration
> >     (opaque=<value optimized out>) at migration-tcp.c:163
> > #6  0x000000000041b67e in main_loop_wait
> >     (nonblocking=<value optimized out>)
> >     at /build/home/tlv/akivity/qemu-kvm/vl.c:1300
> > #7  0x00000000004314e7 in kvm_main_loop ()
> >     at /build/home/tlv/akivity/qemu-kvm/qemu-kvm.c:1710
> > #8  0x000000000041c67f in main_loop (argc=<value optimized out>,
> >     argv=<value optimized out>, envp=<value optimized out>)
> >     at /build/home/tlv/akivity/qemu-kvm/vl.c:1340
> > #9  main (argc=<value optimized out>, argv=<value optimized out>,
> >     envp=<value optimized out>)
> >     at /build/home/tlv/akivity/qemu-kvm/vl.c:3069
> >
> > This is on the incoming side so the test completes successfully, only
> > leaving a core dump to fill my disks.
> >
> This appears to be
>
> > static inline void *host_from_stream_offset(QEMUFile *f,
> >                                             ram_addr_t offset,
> >                                             int flags)
> > {
> >     static RAMBlock *block = NULL;
> >     char id[256];
> >     uint8_t len;
> >
> >     if (flags & RAM_SAVE_FLAG_CONTINUE) {
> >         if (!block) {
> >             fprintf(stderr, "Ack, bad migration stream!\n");
> >             return NULL;
> >         }
> >
> >         return block->host + offset;
> >     }
>
> with block == NULL, if my gdb-fu got a static variable in an inlined
> function examined correctly.

If block == NULL, are you getting the fprintf?

> I don't see any special reason for block to be NULL on a cancelled
> migration.  Though perhaps the incoming stream was terminated without us
> noticing, and we're migrating from some random buffer and confusing the
> code?

Yeah, I don't understand that either. block == NULL should only be the
initial state; once we've seen a block it shouldn't happen. Does this
patch solve anything?

http://lists.nongnu.org/archive/html/qemu-devel/2010-07/msg01114.html

I could see this fixing it if the migration was re-attempted after the
cancel.

Alex
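
To make the state being discussed concrete, here is a minimal,
self-contained sketch of a "continue with the last block" lookup. It is
not the QEMU code: Block, host_from_record, load_record,
reset_stream_state and the FLAG_CONTINUE value are all hypothetical
stand-ins. It only illustrates two points from the thread: a stream whose
first record carries the CONTINUE flag hits the NULL case, and a caller
that copies into the returned pointer needs to check it first. Whether the
real ram_load checks the result is not shown in the thread, and the reset
helper is an assumption about how cached state might be cleared between
attempts, not a description of the linked patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag value; stands in for RAM_SAVE_FLAG_CONTINUE. */
#define FLAG_CONTINUE 0x20

/* Hypothetical, tiny stand-in for a RAM block with host memory. */
typedef struct Block {
    uint8_t host[16];
} Block;

/* Block named by the most recent non-CONTINUE record; NULL initially. */
static Block *last_block = NULL;

/* Resolve a record to a host address, or NULL if the stream is invalid. */
static uint8_t *host_from_record(Block *named, size_t offset, int flags)
{
    if (flags & FLAG_CONTINUE) {
        if (!last_block) {
            /* CONTINUE seen before any block was named. */
            fprintf(stderr, "bad stream: CONTINUE before any block\n");
            return NULL;
        }
        return last_block->host + offset;
    }
    if (!named) {
        return NULL;
    }
    last_block = named;   /* remember for later CONTINUE records */
    return named->host + offset;
}

/* Copy one payload into the resolved address, checking the lookup first. */
static int load_record(Block *named, size_t offset, int flags,
                       const uint8_t *payload, size_t len)
{
    uint8_t *host;

    /* Reject copies that would run past the end of the block. */
    if (offset > sizeof(((Block *)0)->host) ||
        len > sizeof(((Block *)0)->host) - offset) {
        return -1;
    }
    host = host_from_record(named, offset, flags);
    if (!host) {
        return -1;        /* without this check, memcpy would fault */
    }
    memcpy(host, payload, len);
    return 0;
}

/* Forget the cached block, e.g. before starting a new incoming stream. */
static void reset_stream_state(void)
{
    last_block = NULL;
}
```

With this sketch, a CONTINUE record at the very start of a stream makes
load_record fail cleanly instead of copying through a NULL-derived
pointer, and the same happens again after reset_stream_state() is called.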