From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:44050)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1RJoE0-0008JG-3p
    for qemu-devel@nongnu.org; Fri, 28 Oct 2011 11:20:05 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1RJoDv-000074-Ju
    for qemu-devel@nongnu.org; Fri, 28 Oct 2011 11:20:04 -0400
Received: from mx1.redhat.com ([209.132.183.28]:9867)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1RJoDv-00006r-Ba
    for qemu-devel@nongnu.org; Fri, 28 Oct 2011 11:19:59 -0400
Message-ID: <4EAAC81A.8040900@redhat.com>
Date: Fri, 28 Oct 2011 17:19:54 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <20111028125804.751c99de@doriath>
In-Reply-To: <20111028125804.751c99de@doriath>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH] Fix segfault after migration completes
To: Luiz Capitulino
Cc: Juan Jose Quintela Carreira, qemu-devel, Eduardo Habkost

On 10/28/2011 04:58 PM, Luiz Capitulino wrote:
> To reproduce:
>
> 1. Start the source VM with:
>
>    # qemu [...] -S
>
> 2. Start the destination VM with:
>
>    # qemu -incoming tcp:0:4444
>
> 3. In the source VM:
>
>    (qemu) migrate -d tcp:0:4444
>
> 3. The source VM will segfault as soon as migration completes (might not
>    happen in the first try)
>
> Here's the backtrace:
>
> #0 0x0000000000516f39 in qemu_file_get_error (f=0x0) at /home/lcapitulino/src/qmp-unstable/savevm.c:431
> 431 return f->last_error;
>
> #0 0x0000000000516f39 in qemu_file_get_error (f=0x0) at /home/lcapitulino/src/qmp-unstable/savevm.c:431
> #1 0x00000000004e7a9a in migrate_fd_put_notify (opaque=0x987640) at /home/lcapitulino/src/qmp-unstable/migration.c:255
> #2 0x000000000046d59a in qemu_iohandler_poll (readfds=0x7fff45ccfe50, writefds=0x7fff45ccfdd0, xfds=0x7fff45ccfd50, ret=1)
>    at /home/lcapitulino/src/qmp-unstable/iohandler.c:124
> #3 0x00000000004e6033 in main_loop_wait (nonblocking=0) at /home/lcapitulino/src/qmp-unstable/main-loop.c:463
> #4 0x00000000004db5b0 in main_loop () at /home/lcapitulino/src/qmp-unstable/vl.c:1478
> #5 0x00000000004dffed in main (argc=16, argv=0x7fff45cd0318, envp=0x7fff45cd03a0) at /home/lcapitulino/src/qmp-unstable/vl.c:3449
>
> So, 's->file' is NULL in migrate_fd_put_notify(). The interesting thing
> is that it's valid in the qemu_file_put_notify() call, which makes me
> think that either: there's a race somewhere or qemu_file_put_notify() is
> itself clearing 's->file'. In both cases the fix below could just be hiding
> the real issue, but let's get started...
>
> Signed-off-by: Luiz Capitulino
> ---
>  migration.c | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/migration.c b/migration.c
> index bdca72e..f6e6208 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -252,7 +252,7 @@ static void migrate_fd_put_notify(void *opaque)
>
>      qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
>      qemu_file_put_notify(s->file);
> -    if (qemu_file_get_error(s->file)) {
> +    if (s->file && qemu_file_get_error(s->file)) {
>          migrate_fd_error(s);
>      }
>  }

Just one comment, it would be good to mention in the commit message the
call chain.
The one that Eduardo had tracked offlist looks indeed correct to me:

    select loop
      -> migrate_fd_put_notify()
      -> qemu_file_put_notify()
      -> buffered_put_buffer()
      -> migrate_fd_put_ready()
      -> migrate_fd_completed()
      -> migrate_fd_cleanup()

Anyway, code-wise:

Reviewed-by: Paolo Bonzini

Paolo