From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paolo Bonzini
Date: Wed, 9 Mar 2011 18:21:08 +0100
Message-Id: <1299691270-16328-1-git-send-email-pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH 0/2] avoid races on exec migration
To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org

QEMU has a SIGCHLD handler that reaps any child process. -smb is the
only user of it and, in fact, QEMU inherited it from slirp. However,
this handler causes 'exec' based migration to randomly return
'status: failed' in the monitor.

This happens when the signal handler for SIGCHLD runs before the
pclose() of exec migration: the handler has already reaped the child,
so pclose() has nothing left to wait for and fails. That return status
is passed back as the return status of qemu_fclose(). If qemu_fclose()
fails, exec_close() in migration-exec.c returns an error code. This
causes migrate_fd_cleanup() to return an error, which is finally why
'status: failed' shows up:

    if (migrate_fd_cleanup(s) < 0) {
        if (old_vm_running) {
            vm_start();
        }
        state = MIG_STATE_ERROR;
    }

To avoid this, register the pids in a list and, on SIGCHLD, schedule a
bottom half that goes through the registered pids and reaps them.

While I'm at it, I'm moving the iohandler code out of vl.c.
The new file isn't a perfect place to add the child watcher, but it's
arguably better than vl.c.

This should be applied to both master and stable.

Paolo Bonzini (2):
  extract I/O handler lists to iohandler.c
  add a service to reap zombies

 Makefile.objs |    2 +-
 iohandler.c   |  193 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 os-posix.c    |    9 ---
 qemu-common.h |    4 +
 slirp/misc.c  |    5 +-
 vl.c          |  106 ++------------------------------
 6 files changed, 207 insertions(+), 112 deletions(-)
 create mode 100644 iohandler.c

-- 
1.7.4
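
For readers who want to see the shape of the approach described in this
cover letter (a list of registered pids, a SIGCHLD handler that only
flags pending work, and a deferred pass that reaps just those pids), here
is a minimal sketch in C. It is not the code from the patches: every name
in it (ChildWatch, watch_init, watch_child, reap_watched, reap_pending) is
hypothetical, and a plain flag checked by the caller stands in for QEMU's
bottom-half mechanism.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; not the identifiers used by the patches. */
typedef struct ChildWatch {
    pid_t pid;
    struct ChildWatch *next;
} ChildWatch;

static ChildWatch *watched;                 /* pids we are allowed to reap */
static volatile sig_atomic_t reap_pending;  /* set by the SIGCHLD handler  */

/* Signal handler: only note that work is pending, never call waitpid(-1). */
static void on_sigchld(int sig)
{
    (void)sig;
    reap_pending = 1;
}

/* Install the SIGCHLD handler. */
static void watch_init(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    sigaction(SIGCHLD, &sa, NULL);
}

/* Register a child so that the deferred pass will reap it. */
static int watch_child(pid_t pid)
{
    ChildWatch *w;

    assert(pid > 0);
    w = malloc(sizeof(*w));
    if (!w) {
        return -1;
    }
    w->pid = pid;
    w->next = watched;
    watched = w;
    return 0;
}

/*
 * Deferred pass, meant to run from the main loop rather than from the
 * signal handler. It polls only the registered pids, so a child owned
 * by someone else (for example one started with popen()) is left alone
 * for its owner to collect. Returns the number of children reaped.
 */
static int reap_watched(void)
{
    ChildWatch **p = &watched;
    int reaped = 0;

    reap_pending = 0;   /* clear first: a later SIGCHLD sets it again */
    while (*p) {
        ChildWatch *w = *p;

        if (waitpid(w->pid, NULL, WNOHANG) == w->pid) {
            *p = w->next;   /* unlink and free the finished entry */
            free(w);
            reaped++;
        } else {
            p = &w->next;   /* still running (or not ours): keep it */
        }
    }
    return reaped;
}
```

A caller would invoke watch_init() once, watch_child() after each fork()
whose child it wants cleaned up automatically, and reap_watched() from
its main loop whenever reap_pending is set. The point of the design is
that nothing ever waits on "any child", so an unregistered child stays
waitable and a later pclose() on it can still succeed.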