From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42660) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uu1y1-00020E-WD for qemu-devel@nongnu.org; Tue, 02 Jul 2013 10:54:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Uu1xy-0005JF-HH for qemu-devel@nongnu.org; Tue, 02 Jul 2013 10:54:05 -0400
Received: from mail-gh0-f176.google.com ([209.85.160.176]:55229) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uu1xy-0005J4-DG for qemu-devel@nongnu.org; Tue, 02 Jul 2013 10:54:02 -0400
Received: by mail-gh0-f176.google.com with SMTP id z17so2459737ghb.7 for ; Tue, 02 Jul 2013 07:54:01 -0700 (PDT)
From: Anthony Liguori
In-Reply-To:
References:
Date: Tue, 02 Jul 2013 09:53:58 -0500
Message-ID: <8761wtqagp.fsf@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Re: [Qemu-devel] [PATCH] migration: add timeout option for tcp migration send/receive socket
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Zhanghaoyu (A)", qemu-devel, Eric Blake
Cc: Luonengjun, "Wangrui (K)", "Huangweidong (C)", Paolo Bonzini, Zengjunliang

"Zhanghaoyu (A)" writes:

> When network disconnection occurs during live migration, the migration thread will be stuck in the function sendmsg(), as the migration socket is in ~O_NONBLOCK mode now.
>
> Signed-off-by: Zeng Junliang

Unconditionally setting an arbitrary timeout is a bad idea. Should we send a signal on cancel to break the migration thread out of sendmsg?

Regards,

Anthony Liguori

> ---
>  include/migration/migration.h |    4 ++++
>  migration-tcp.c               |   23 ++++++++++++++++++++++-
>  2 files changed, 26 insertions(+), 1 deletions(-)
>
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index f0640e0..1a56248 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -23,6 +23,8 @@
>  #include "qapi-types.h"
>  #include "exec/cpu-common.h"
>
> +#define QEMU_MIGRATE_SOCKET_OP_TIMEOUT 60
> +
>  struct MigrationParams {
>      bool blk;
>      bool shared;
> @@ -109,6 +111,8 @@ uint64_t xbzrle_mig_pages_transferred(void);
>  uint64_t xbzrle_mig_pages_overflow(void);
>  uint64_t xbzrle_mig_pages_cache_miss(void);
>
> +int tcp_migration_set_socket_timeout(int fd, int optname, int timeout_in_sec);
> +
>  /**
>   * @migrate_add_blocker - prevent migration from proceeding
>   *
> diff --git a/migration-tcp.c b/migration-tcp.c
> index b20ee58..860238b 100644
> --- a/migration-tcp.c
> +++ b/migration-tcp.c
> @@ -29,11 +29,28 @@
>      do { } while (0)
>  #endif
>
> +int tcp_migration_set_socket_timeout(int fd, int optname, int timeout_in_sec)
> +{
> +    struct timeval timeout;
> +    int ret = 0;
> +
> +    if (fd < 0 || timeout_in_sec < 0 ||
> +        (optname != SO_RCVTIMEO && optname != SO_SNDTIMEO))
> +        return -1;
> +
> +    timeout.tv_sec = timeout_in_sec;
> +    timeout.tv_usec = 0;
> +
> +    ret = qemu_setsockopt(fd, SOL_SOCKET, optname, &timeout, sizeof(timeout));
> +
> +    return ret;
> +}
> +
>  static void tcp_wait_for_connect(int fd, void *opaque)
>  {
>      MigrationState *s = opaque;
>
> -    if (fd < 0) {
> +    if (tcp_migration_set_socket_timeout(fd, SO_SNDTIMEO, QEMU_MIGRATE_SOCKET_OP_TIMEOUT) < 0) {
>          DPRINTF("migrate connect error\n");
>          s->file = NULL;
>          migrate_fd_error(s);
> @@ -76,6 +93,10 @@ static void tcp_accept_incoming_migration(void *opaque)
>          goto out;
>      }
>
> +    if (tcp_migration_set_socket_timeout(c, SO_RCVTIMEO, QEMU_MIGRATE_SOCKET_OP_TIMEOUT) < 0) {
> +        fprintf(stderr, "set tcp migration socket receive timeout error\n");
> +        goto out;
> +    }
>      process_incoming_migration(f);
>      return;
>
> --
> 1.7.3.1.msysgit.0
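
The question raised in the reply above, whether a signal sent on cancel could break the migration thread out of sendmsg(), can be illustrated outside QEMU. What follows is a minimal standalone sketch under stated assumptions, not QEMU code and not something proposed in this thread. It assumes a POSIX system with pthreads, uses a local socketpair() in place of the migration TCP socket, and picks SIGUSR1 arbitrarily. Every name in it (struct sender, wakeup_handler, sender_thread, cancel_blocked_sender_demo) is hypothetical. The point it shows is that a handler installed without SA_RESTART makes a blocked send() return with EINTR, after which the thread can check a cancel flag and exit.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* All names here are hypothetical; none of this is QEMU code. */
struct sender {
    int fd;               /* blocking socket the thread writes to */
    atomic_int cancel;    /* set by the canceller before it signals */
    atomic_int done;      /* set by the thread just before it exits */
    atomic_int saw_eintr; /* set if send() failed with EINTR */
};

/* Empty on purpose: the handler only has to exist so that delivering
 * the signal makes a blocked send() fail with EINTR. */
static void wakeup_handler(int sig)
{
    (void)sig;
}

static void *sender_thread(void *opaque)
{
    struct sender *s = opaque;
    char buf[4096];

    memset(buf, 0, sizeof(buf));
    while (!atomic_load(&s->cancel)) {
        /* Blocks once the socket buffer is full; the peer never reads. */
        ssize_t n = send(s->fd, buf, sizeof(buf), 0);
        if (n < 0) {
            if (errno != EINTR) {
                break;
            }
            atomic_store(&s->saw_eintr, 1);
        }
    }
    atomic_store(&s->done, 1);
    return NULL;
}

/* Returns 0 once the blocked sender has been cancelled and joined,
 * -1 on setup failure. *saw_eintr (optional) reports whether EINTR
 * was observed by the sender. */
int cancel_blocked_sender_demo(int *saw_eintr)
{
    struct sender s;
    struct sigaction sa, old_sa;
    struct timespec tick = { 0, 10L * 1000L * 1000L }; /* 10 ms */
    pthread_t tid;
    int sv[2], i, rc;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART, so send() is not restarted */
    if (sigaction(SIGUSR1, &sa, &old_sa) < 0) {
        return -1;
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        sigaction(SIGUSR1, &old_sa, NULL);
        return -1;
    }

    s.fd = sv[0];
    atomic_init(&s.cancel, 0);
    atomic_init(&s.done, 0);
    atomic_init(&s.saw_eintr, 0);
    if (pthread_create(&tid, NULL, sender_thread, &s) != 0) {
        close(sv[0]);
        close(sv[1]);
        sigaction(SIGUSR1, &old_sa, NULL);
        return -1;
    }

    /* Give the sender time to fill the buffer and block in send(). */
    for (i = 0; i < 10; i++) {
        nanosleep(&tick, NULL);
    }

    /* Cancel: set the flag first, then signal until the thread has
     * noticed. Repeating the signal covers the window between the
     * flag check and the next send() call. */
    atomic_store(&s.cancel, 1);
    while (!atomic_load(&s.done)) {
        pthread_kill(tid, SIGUSR1);
        nanosleep(&tick, NULL);
    }
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;

    if (saw_eintr) {
        *saw_eintr = atomic_load(&s.saw_eintr);
    }
    close(sv[0]);
    close(sv[1]);
    sigaction(SIGUSR1, &old_sa, NULL);
    return 0;
}
```

Calling cancel_blocked_sender_demo(NULL) from a main() and linking with -pthread is enough to try it. Unlike a fixed SO_SNDTIMEO value, this approach wakes the thread only when a cancel is actually requested. The cost is that every blocking call on that thread then has to cope with EINTR.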