From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <49A18702.8080200@redhat.com>
Date: Sun, 22 Feb 2009 19:10:26 +0200
From: Uri Lublin
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------040802080805000100050908"
Subject: [Qemu-devel] [PATCH FOR REVIEW] migration to/from file (v3 aio)
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org
Cc: Uri Lublin

This is a multi-part message in MIME format.
--------------040802080805000100050908
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hello,

Attached for review is an implementation of migration to/from a file.

  migrate-to-file:   uses posix-aio-compat and FdMigration
  migrate-from-file: uses qemu-fopen directly

This patch is incomplete: it currently polls for aio completion (we fake
EAGAIN, select finds the fd writeable, and so file_write is called again).

As Anthony suggested, I plan to implement s->set_fd in the FdMigration code
and use signal-based aio-completion notifications (SIGUSR1).

Anthony mentioned we need only a single aio operation at a time.
Although I'm not sure of the reasoning behind that, I modified the code to
support a single operation (my previous implementation used multiple aio
operations).

Differences from v2:
  - use aio writes (posix-aio-compat) instead of select and/or write.
  - compile migration-to-file only if AIO is configured.

An alternative (v4) would be to use a single thread and a pipe, which is
also asynchronous, much easier to implement, and possibly faster (this needs
to be measured).

Regards,
    Uri.

--------------040802080805000100050908
Content-Type: text/x-patch; name="0001-Adding-migration-to-from-file-v3-aio.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="0001-Adding-migration-to-from-file-v3-aio.patch"

>>From 34e539ca774d14aa9950fbc4404148d7a5e49101 Mon Sep 17 00:00:00 2001
From: Uri Lublin
Date: Thu, 12 Feb 2009 11:04:59 +0200
Subject: [PATCH] Adding migration to/from file (v3 - aio)

Migration to file reuses migration-to-fd.
Migration from file uses qemu-fopen directly.

The saved state file should be used only once and then removed (or used
with -snapshot, or the disk image should be copied), because only the VM
state is saved, not the disk image.

I recommend stopping the VM before migrating its state to a file.
This version supports live migration to file.
It can be used later for VM checkpointing.

An advantage of migration-to-file over savevm/loadvm is that the latter
requires a qcow2 image, while the former works with any image format.

This version uses posix-aio-compat (which uses worker threads).
Additional memory is allocated to hold a copy of the data, so that it does
not change while it is being written to the file.
Signed-off-by: Uri Lublin
---
 Makefile         |    3 +
 migration-file.c |  245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 migration.c      |    8 ++
 migration.h      |    6 ++
 4 files changed, 262 insertions(+), 0 deletions(-)
 create mode 100644 migration-file.c

diff --git a/Makefile b/Makefile
index 4f7a55a..60170d1 100644
--- a/Makefile
+++ b/Makefile
@@ -95,6 +95,9 @@ ifdef CONFIG_WIN32
 OBJS+=tap-win32.o
 else
 OBJS+=migration-exec.o
+ifdef CONFIG_AIO
+OBJS+=migration-file.o
+endif
 endif
 
 AUDIO_OBJS = audio.o noaudio.o wavaudio.o mixeng.o
diff --git a/migration-file.c b/migration-file.c
new file mode 100644
index 0000000..6d3dbf5
--- /dev/null
+++ b/migration-file.c
@@ -0,0 +1,245 @@
+/*
+ * QEMU live migration
+ *
+ * Copyright IBM, Corp. 2008
+ * Red Hat, Inc. 2009
+ *
+ * Authors:
+ *  Anthony Liguori
+ *  Uri Lublin
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "migration.h"
+#include "sysemu.h"
+#include "console.h"
+#include "block.h"
+#include "posix-aio-compat.h"
+#include "buffered_file.h"
+
+//#define DEBUG_MIGRATION_FILE
+
+#ifdef DEBUG_MIGRATION_FILE
+#define dprintf(fmt, ...) \
+    do { printf("migration-file: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define dprintf(fmt, ...) \
+    do { } while (0)
+#endif
+
+#define MF_AIO_BUF_SIZE 32768
+
+typedef struct FileMigrationState_s {
+    FdMigrationState fds;
+    off_t write_offset;
+    struct qemu_paiocb aiocb;
+    int aio_in_use;
+} FileMigrationState;
+
+
+static int file_errno(FdMigrationState *fds)
+{
+    return errno;
+}
+
+static void mig_file_check_aiocb(FileMigrationState *s)
+{
+    ssize_t ret;
+
+    if (s->aio_in_use) {
+        ret = qemu_paio_return(&s->aiocb);
+        dprintf("mig_file_check_aiocb: aiocb=%p size=%ld ret=%ld\n",
+                &s->aiocb, s->aiocb.aio_nbytes, ret);
+
+        if (ret == -EINPROGRESS)
+            return;
+        if (ret == s->aiocb.aio_nbytes) {
+            s->aio_in_use = 0;
+        }
+        else {
+            dprintf("posix aio write failed, returned %ld expected %ld\n",
+                    ret, s->aiocb.aio_nbytes);
+            if (s->fds.state == MIG_STATE_ACTIVE)
+                s->fds.state = MIG_STATE_ERROR;
+        }
+    }
+}
+
+static struct qemu_paiocb* mig_file_get_aiocb(FileMigrationState *s)
+{
+    struct qemu_paiocb *paiocb = NULL;
+
+    if (s->aio_in_use == 0) {
+        s->aio_in_use = 1;
+        paiocb = &s->aiocb;
+        if (paiocb->aio_buf == NULL)
+            paiocb->aio_buf = qemu_malloc(MF_AIO_BUF_SIZE);
+    }
+
+    return paiocb;
+}
+
+static void free_aio_buf(FileMigrationState *s)
+{
+    struct qemu_paiocb *paiocb = &s->aiocb;
+
+    if (paiocb->aio_buf) {
+        qemu_free(paiocb->aio_buf);
+        paiocb->aio_buf = NULL;
+    }
+}
+
+static int write_async(FileMigrationState *s, const void *buf, size_t size)
+{
+    struct qemu_paiocb *paiocb;
+
+    mig_file_check_aiocb(s);
+
+    paiocb = mig_file_get_aiocb(s);
+    if (paiocb == NULL) {
+        dprintf("write_async: returning EAGAIN\n");
+        return EAGAIN;
+    }
+
+    if (size > MF_AIO_BUF_SIZE) {
+        fprintf(stderr, "mig2file: write_async: size=%ld > aio_max=%d\n",
+                size, MF_AIO_BUF_SIZE);
+        return -EINVAL;
+    }
+
+    memcpy(paiocb->aio_buf, buf, size);
+
+    paiocb->aio_fildes = s->fds.fd;
+    paiocb->aio_nbytes = size;
+    paiocb->ev_signo = 0; /* No notification is needed */
+    paiocb->aio_offset = s->write_offset;
+
+    s->write_offset += size;
+
+    return qemu_paio_write(paiocb);
+}
+
+static int file_write(FdMigrationState *fds, const void * buf, size_t size)
+{
+    FileMigrationState *s = (FileMigrationState*)fds;
+    int ret = 0;
+    ssize_t offset = 0, len = MF_AIO_BUF_SIZE;
+
+    while (offset < size) {
+        if (size - offset < len)
+            len = size - offset;
+        ret = write_async(s, buf + offset, len);
+        if (ret == EAGAIN) {
+            if (offset == 0) {
+                errno = EAGAIN;
+                return -1;
+            }
+            break;
+        }
+        if (ret < 0)
+            return ret;
+        offset += len;
+    }
+
+    dprintf("file_write: size=%ld ret=%d returning %ld\n", size, ret, offset);
+
+    return offset;
+}
+
+static int file_close(FdMigrationState *fds)
+{
+    FileMigrationState *s = (FileMigrationState*)fds;
+    int i;
+
+    for (i=0; i<30; i++) {
+        dprintf("file_close: still in_use %d/%d\n", i, 30);
+        mig_file_check_aiocb(s);
+        if (s->aio_in_use == 0)
+            break;
+        sleep(1);
+    }
+    close(fds->fd);
+
+    free_aio_buf(s);
+
+    return 0;
+}
+
+MigrationState *file_start_outgoing_migration(const char *filename,
+                                              int64_t bandwidth_limit,
+                                              int async)
+{
+    FdMigrationState *fds;
+    FileMigrationState *s;
+    int fd;
+
+    s = qemu_mallocz(sizeof(*s));
+
+    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0600);
+    if (fd < 0) {
+        perror("file_migration: failed to open filename");
+        term_printf("file_migration: failed to open filename %s\n", filename);
+        goto err;
+    }
+
+    fds = &s->fds;
+
+    fds->fd = fd;
+    fds->close = file_close;
+    fds->get_error = file_errno;
+    fds->write = file_write;
+    fds->mig_state.cancel = migrate_fd_cancel;
+    fds->mig_state.get_status = migrate_fd_get_status;
+    fds->mig_state.release = migrate_fd_release;
+
+    fds->state = MIG_STATE_ACTIVE;
+    fds->detach = !async;
+    fds->bandwidth_limit = bandwidth_limit;
+
+
+    if (fds->detach == 1) {
+        monitor_suspend();
+        fds->detach = 2;
+    }
+
+    migrate_fd_connect(fds);
+    return &fds->mig_state;
+
+err:
+    qemu_free(s);
+    return NULL;
+}
+
+int file_start_incoming_migration(const char *filename)
+{
+    int ret;
+    QEMUFile *f;
+
+    dprintf("Starting incoming file migration from '%s'\n", filename);
+    f = qemu_fopen(filename, "rb");
+    if(f == NULL) {
+        perror("failed to open file");
+        term_printf("failed to open file %s\n", filename);
+        return -errno;
+    }
+
+    vm_stop(0); /* just in case */
+    ret = qemu_loadvm_state(f);
+    if (ret < 0) {
+        fprintf(stderr, "in_file_mig: load of migration failed\n");
+        goto err;
+    }
+    qemu_announce_self();
+    dprintf("successfully loaded vm state\n");
+    vm_start();
+    qemu_fclose(f);
+    return 0;
+
+err:
+    qemu_fclose(f);
+    return -errno;
+}
diff --git a/migration.c b/migration.c
index 0ef777a..234dcf6 100644
--- a/migration.c
+++ b/migration.c
@@ -43,6 +43,10 @@ void qemu_start_incoming_migration(const char *uri)
 #if !defined(WIN32)
     else if (strstart(uri, "exec:", &p))
         exec_start_incoming_migration(p);
+#ifdef CONFIG_AIO
+    else if (strstart(uri, "file:", &p))
+        file_start_incoming_migration(p);
+#endif
 #endif
     else
         fprintf(stderr, "unknown migration protocol: %s\n", uri);
@@ -58,6 +62,10 @@ void do_migrate(int detach, const char *uri)
 #if !defined(WIN32)
     else if (strstart(uri, "exec:", &p))
         s = exec_start_outgoing_migration(p, max_throttle, detach);
+#ifdef CONFIG_AIO
+    else if (strstart(uri, "file:", &p))
+        s = file_start_outgoing_migration(p, max_throttle, detach);
+#endif
 #endif
     else
         term_printf("unknown migration protocol: %s\n", uri);
diff --git a/migration.h b/migration.h
index d9771ad..ae78792 100644
--- a/migration.h
+++ b/migration.h
@@ -67,6 +67,12 @@ MigrationState *tcp_start_outgoing_migration(const char *host_port,
                                              int64_t bandwidth_limit,
                                              int detach);
 
+
+int file_start_incoming_migration(const char *filename);
+MigrationState *file_start_outgoing_migration(const char *filename,
+                                              int64_t bandwidth_limit,
+                                              int detach);
+
 void migrate_fd_error(FdMigrationState *s);
 
 void migrate_fd_cleanup(FdMigrationState *s);
-- 
1.6.0.6

--------------040802080805000100050908--
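
Review note: the cover letter describes a polling scheme in which file_write
fakes EAGAIN while the single aio request is still pending, so that select
calls it again later. The contract between the chunking loop and the single
request slot can be sketched in isolation, outside QEMU. Everything named
below (Slot, slot_submit, slot_complete, chunked_write, CHUNK_MAX) is
hypothetical and only simulates the aio slot; none of it is a QEMU or
posix-aio-compat API. The sketch also assumes one convention that differs
from the patch: the busy case is reported as -EAGAIN, so that all failures
from the submit step are negative errno values, whereas write_async in the
patch returns a positive EAGAIN next to negative error codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define CHUNK_MAX 32768  /* same value as MF_AIO_BUF_SIZE in the patch */

/* Hypothetical stand-in for the single aio request slot. */
typedef struct {
    int    in_flight;  /* 1 while a simulated request is pending */
    size_t accepted;   /* total bytes handed to the slot so far */
} Slot;

/* Queue one chunk: 0 on success, -EAGAIN while busy, -EINVAL if too big. */
static int slot_submit(Slot *slot, const void *buf, size_t len)
{
    (void)buf;  /* a real implementation would copy buf and start the write */
    if (len > CHUNK_MAX)
        return -EINVAL;
    if (slot->in_flight)
        return -EAGAIN;
    slot->in_flight = 1;
    slot->accepted += len;
    return 0;
}

/* Simulate completion of the pending request. */
static void slot_complete(Slot *slot)
{
    slot->in_flight = 0;
}

/*
 * Submit as much of buf as the slot accepts, in CHUNK_MAX pieces.
 * Returns the number of bytes accepted. If nothing could be accepted
 * because the slot is busy, returns -1 with errno set to EAGAIN, which is
 * what a select-driven caller expects from a non-blocking write.
 */
static ssize_t chunked_write(Slot *slot, const void *buf, size_t size)
{
    const char *p = buf;  /* char pointer: no arithmetic on void * */
    size_t done = 0;

    assert(slot != NULL);

    while (done < size) {
        size_t len = size - done;
        int ret;

        if (len > CHUNK_MAX)
            len = CHUNK_MAX;

        ret = slot_submit(slot, p + done, len);
        if (ret == -EAGAIN) {
            if (done == 0) {
                errno = EAGAIN;
                return -1;
            }
            break;  /* partial progress: report what was accepted */
        }
        if (ret < 0) {
            errno = -ret;
            return -1;
        }
        done += len;
    }
    return (ssize_t)done;
}
```

With a 40000-byte buffer and an idle slot, the first call accepts one full
chunk and stops, a second call made while the request is still pending fails
with EAGAIN, and the remainder goes through once slot_complete has been
called. Keeping the byte counters in size_t also avoids the signed/unsigned
comparison between offset and size that the loop in the patch relies on.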