From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=37277 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PrNf5-0005zn-Se for qemu-devel@nongnu.org; Sun, 20 Feb 2011 23:46:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PrNf2-0002Ga-9K for qemu-devel@nongnu.org; Sun, 20 Feb 2011 23:46:15 -0500
Received: from mail-gy0-f173.google.com ([209.85.160.173]:64858) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PrNf1-0002GB-Uj for qemu-devel@nongnu.org; Sun, 20 Feb 2011 23:46:12 -0500
Received: by gyd8 with SMTP id 8so297572gyd.4 for ; Sun, 20 Feb 2011 20:46:11 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <1297330258-20494-8-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
References: <1297330258-20494-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <1297330258-20494-8-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
Date: Mon, 21 Feb 2011 12:46:09 +0800
Message-ID:
From: ya su
Content-Type: multipart/alternative; boundary=000e0cd25a1edd4a7d049cc38a93
Subject: [Qemu-devel] Re: [PATCH 07/18] Introduce fault tolerant VM transaction QEMUFile and ft_mode.
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Yoshiaki Tamura
Cc: kwolf@redhat.com, aliguori@us.ibm.com, dlaor@redhat.com, ananth@in.ibm.com, kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com, qemu-devel@nongnu.org, vatsa@linux.vnet.ibm.com, blauwirbel@gmail.com, ohmura.kei@lab.ntt.co.jp, avi@redhat.com, pbonzini@redhat.com, psuriset@linux.vnet.ibm.com, stefanha@linux.vnet.ibm.com

--000e0cd25a1edd4a7d049cc38a93
Content-Type: text/plain; charset=ISO-8859-1

Yoshiaki:

I have one question about ram_save_live. During stage 3 of migration (the completion stage), it calls cpu_physical_memory_set_dirty_tracking(0) to stop recording dirty RAM pages.
At the end of migrate_ft_trans_connect(), vm_start() is invoked, but at that point cpu_physical_memory_set_dirty_tracking(1) has not been called yet, so some RAM pages dirtied by the guest may go unrecorded by the time qemu_savevm_trans_begin is called. I think you need to call cpu_physical_memory_set_dirty_tracking(1) in migrate_ft_trans_connect(). Am I right?

BR

Green.

2011/2/10 Yoshiaki Tamura
> This code implements VM transaction protocol. Like buffered_file, it
> sits between savevm and migration layer. With this architecture, VM
> transaction protocol is implemented mostly independent from other
> existing code.
>
> Signed-off-by: Yoshiaki Tamura
> Signed-off-by: OHMURA Kei
> ---
> Makefile.objs | 1 +
> ft_trans_file.c | 624
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> ft_trans_file.h | 72 +++++++
> migration.c | 3 +
> trace-events | 15 ++
> 5 files changed, 715 insertions(+), 0 deletions(-)
> create mode 100644 ft_trans_file.c
> create mode 100644 ft_trans_file.h
>
> diff --git a/Makefile.objs b/Makefile.objs
> index 353b1a8..04148b5 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -100,6 +100,7 @@ common-obj-y += msmouse.o ps2.o
> common-obj-y += qdev.o qdev-properties.o
> common-obj-y += block-migration.o
> common-obj-y += pflib.o
> +common-obj-y += ft_trans_file.o
>
> common-obj-$(CONFIG_BRLAPI) += baum.o
> common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o
> migration-fd.o
> diff --git a/ft_trans_file.c b/ft_trans_file.c
> new file mode 100644
> index 0000000..2b42b95
> --- /dev/null
> +++ b/ft_trans_file.c
> @@ -0,0 +1,624 @@
> +/*
> + * Fault tolerant VM transaction QEMUFile
> + *
> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + *
> + * This source code is based on buffered_file.c.
> + * Copyright IBM, Corp.
2008 > + * Authors: > + * Anthony Liguori > + */ > + > +#include "qemu-common.h" > +#include "qemu-error.h" > +#include "hw/hw.h" > +#include "qemu-timer.h" > +#include "sysemu.h" > +#include "qemu-char.h" > +#include "trace.h" > +#include "ft_trans_file.h" > + > +typedef struct FtTransHdr > +{ > + uint16_t cmd; > + uint16_t id; > + uint32_t seq; > + uint32_t payload_len; > +} FtTransHdr; > + > +typedef struct QEMUFileFtTrans > +{ > + FtTransPutBufferFunc *put_buffer; > + FtTransGetBufferFunc *get_buffer; > + FtTransPutReadyFunc *put_ready; > + FtTransGetReadyFunc *get_ready; > + FtTransWaitForUnfreezeFunc *wait_for_unfreeze; > + FtTransCloseFunc *close; > + void *opaque; > + QEMUFile *file; > + > + enum QEMU_VM_TRANSACTION_STATE state; > + uint32_t seq; > + uint16_t id; > + > + int has_error; > + > + bool freeze_output; > + bool freeze_input; > + bool rate_limit; > + bool is_sender; > + bool is_payload; > + > + uint8_t *buf; > + size_t buf_max_size; > + size_t put_offset; > + size_t get_offset; > + > + FtTransHdr header; > + size_t header_offset; > +} QEMUFileFtTrans; > + > +#define IO_BUF_SIZE 32768 > + > +static void ft_trans_append(QEMUFileFtTrans *s, > + const uint8_t *buf, size_t size) > +{ > + if (size > (s->buf_max_size - s->put_offset)) { > + trace_ft_trans_realloc(s->buf_max_size, size + 1024); > + s->buf_max_size += size + 1024; > + s->buf = qemu_realloc(s->buf, s->buf_max_size); > + } > + > + trace_ft_trans_append(size); > + memcpy(s->buf + s->put_offset, buf, size); > + s->put_offset += size; > +} > + > +static void ft_trans_flush(QEMUFileFtTrans *s) > +{ > + size_t offset = 0; > + > + if (s->has_error) { > + error_report("flush when error %d, bailing", s->has_error); > + return; > + } > + > + while (offset < s->put_offset) { > + ssize_t ret; > + > + ret = s->put_buffer(s->opaque, s->buf + offset, s->put_offset - > offset); > + if (ret == -EAGAIN) { > + break; > + } > + > + if (ret <= 0) { > + error_report("error flushing data, %s", strerror(errno)); > 
+ s->has_error = FT_TRANS_ERR_FLUSH; > + break; > + } else { > + offset += ret; > + } > + } > + > + trace_ft_trans_flush(offset, s->put_offset); > + memmove(s->buf, s->buf + offset, s->put_offset - offset); > + s->put_offset -= offset; > + s->freeze_output = !!s->put_offset; > +} > + > +static ssize_t ft_trans_put(void *opaque, void *buf, int size) > +{ > + QEMUFileFtTrans *s = opaque; > + size_t offset = 0; > + ssize_t len; > + > + /* flush buffered data before putting next */ > + if (s->put_offset) { > + ft_trans_flush(s); > + } > + > + while (!s->freeze_output && offset < size) { > + len = s->put_buffer(s->opaque, (uint8_t *)buf + offset, size - > offset); > + > + if (len == -EAGAIN) { > + trace_ft_trans_freeze_output(); > + s->freeze_output = 1; > + break; > + } > + > + if (len <= 0) { > + error_report("putting data failed, %s", strerror(errno)); > + s->has_error = 1; > + offset = -EINVAL; > + break; > + } > + > + offset += len; > + } > + > + if (s->freeze_output) { > + ft_trans_append(s, buf + offset, size - offset); > + offset = size; > + } > + > + return offset; > +} > + > +static int ft_trans_send_header(QEMUFileFtTrans *s, > + enum QEMU_VM_TRANSACTION_STATE state, > + uint32_t payload_len) > +{ > + int ret; > + FtTransHdr *hdr = &s->header; > + > + trace_ft_trans_send_header(state); > + > + hdr->cmd = s->state = state; > + hdr->id = s->id; > + hdr->seq = s->seq; > + hdr->payload_len = payload_len; > + > + ret = ft_trans_put(s, hdr, sizeof(*hdr)); > + if (ret < 0) { > + error_report("send header failed"); > + s->has_error = FT_TRANS_ERR_SEND_HDR; > + } > + > + return ret; > +} > + > +static int ft_trans_put_buffer(void *opaque, const uint8_t *buf, int64_t > pos, int size) > +{ > + QEMUFileFtTrans *s = opaque; > + ssize_t ret; > + > + trace_ft_trans_put_buffer(size, pos); > + > + if (s->has_error) { > + error_report("put_buffer when error %d, bailing", s->has_error); > + return -EINVAL; > + } > + > + /* assuming qemu_file_put_notify() is calling */ > + if 
(pos == 0 && size == 0) { > + trace_ft_trans_put_ready(); > + ft_trans_flush(s); > + > + if (!s->freeze_output) { > + trace_ft_trans_cb(s->put_ready); > + ret = s->put_ready(); > + } > + > + ret = 0; > + goto out; > + } > + > + ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_CONTINUE, size); > + if (ret < 0) { > + goto out; > + } > + > + ret = ft_trans_put(s, (uint8_t *)buf, size); > + if (ret < 0) { > + error_report("send palyload failed"); > + s->has_error = FT_TRANS_ERR_SEND_PAYLOAD; > + goto out; > + } > + > + s->seq++; > + > +out: > + return ret; > +} > + > +static int ft_trans_fill_buffer(void *opaque, void *buf, int size) > +{ > + QEMUFileFtTrans *s = opaque; > + size_t offset = 0; > + ssize_t len; > + > + s->freeze_input = 0; > + > + while (offset < size) { > + len = s->get_buffer(s->opaque, (uint8_t *)buf + offset, > + 0, size - offset); > + if (len == -EAGAIN) { > + trace_ft_trans_freeze_input(); > + s->freeze_input = 1; > + break; > + } > + > + if (len <= 0) { > + error_report("fill buffer failed, %s", strerror(errno)); > + s->has_error = 1; > + return -EINVAL; > + } > + > + offset += len; > + } > + > + return offset; > +} > + > +static int ft_trans_recv_header(QEMUFileFtTrans *s) > +{ > + int ret; > + char *buf = (char *)&s->header + s->header_offset; > + > + ret = ft_trans_fill_buffer(s, buf, sizeof(FtTransHdr) - > s->header_offset); > + if (ret < 0) { > + error_report("recv header failed"); > + s->has_error = FT_TRANS_ERR_RECV_HDR; > + goto out; > + } > + > + s->header_offset += ret; > + if (s->header_offset == sizeof(FtTransHdr)) { > + trace_ft_trans_recv_header(s->header.cmd); > + s->state = s->header.cmd; > + s->header_offset = 0; > + > + if (!s->is_sender) { > + s->id = s->header.id; > + s->seq = s->header.seq; > + } > + } > + > +out: > + return ret; > +} > + > +static int ft_trans_recv_payload(QEMUFileFtTrans *s) > +{ > + QEMUFile *f = s->file; > + int ret = -1; > + > + /* extend QEMUFile buf if there weren't enough space */ > + if 
(s->header.payload_len > (s->buf_max_size - s->get_offset)) { > + s->buf_max_size += (s->header.payload_len - > + (s->buf_max_size - s->get_offset)); > + s->buf = qemu_realloc_buffer(f, s->buf_max_size); > + } > + > + ret = ft_trans_fill_buffer(s, s->buf + s->get_offset, > + s->header.payload_len); > + if (ret < 0) { > + error_report("recv payload failed"); > + s->has_error = FT_TRANS_ERR_RECV_PAYLOAD; > + goto out; > + } > + > + trace_ft_trans_recv_payload(ret, s->header.payload_len, > s->get_offset); > + > + s->header.payload_len -= ret; > + s->get_offset += ret; > + s->is_payload = !!s->header.payload_len; > + > +out: > + return ret; > +} > + > +static int ft_trans_recv(QEMUFileFtTrans *s) > +{ > + int ret; > + > + /* get payload and return */ > + if (s->is_payload) { > + ret = ft_trans_recv_payload(s); > + goto out; > + } > + > + ret = ft_trans_recv_header(s); > + if (ret < 0 || s->freeze_input) { > + goto out; > + } > + > + switch (s->state) { > + case QEMU_VM_TRANSACTION_BEGIN: > + /* CONTINUE or COMMIT should come shortly */ > + s->is_payload = 0; > + break; > + > + case QEMU_VM_TRANSACTION_CONTINUE: > + /* get payload */ > + s->is_payload = 1; > + break; > + > + case QEMU_VM_TRANSACTION_COMMIT: > + ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_ACK, 0); > + if (ret < 0) { > + goto out; > + } > + > + trace_ft_trans_cb(s->get_ready); > + ret = s->get_ready(s->opaque); > + if (ret < 0) { > + goto out; > + } > + > + qemu_clear_buffer(s->file); > + s->get_offset = 0; > + s->is_payload = 0; > + > + break; > + > + case QEMU_VM_TRANSACTION_ATOMIC: > + /* not implemented yet */ > + error_report("QEMU_VM_TRANSACTION_ATOMIC not implemented. 
%d", > + ret); > + break; > + > + case QEMU_VM_TRANSACTION_CANCEL: > + /* return -EINVAL until migrate cancel on recevier side is > supported */ > + ret = -EINVAL; > + break; > + > + default: > + error_report("unknown QEMU_VM_TRANSACTION_STATE %d", ret); > + s->has_error = FT_TRANS_ERR_STATE_INVALID; > + ret = -EINVAL; > + } > + > +out: > + return ret; > +} > + > +static int ft_trans_get_buffer(void *opaque, uint8_t *buf, > + int64_t pos, int size) > +{ > + QEMUFileFtTrans *s = opaque; > + int ret; > + > + if (s->has_error) { > + error_report("get_buffer when error %d, bailing", s->has_error); > + return -EINVAL; > + } > + > + /* assuming qemu_file_get_notify() is calling */ > + if (pos == 0 && size == 0) { > + trace_ft_trans_get_ready(); > + s->freeze_input = 0; > + > + /* sender should be waiting for ACK */ > + if (s->is_sender) { > + ret = ft_trans_recv_header(s); > + if (s->freeze_input) { > + ret = 0; > + goto out; > + } > + if (ret < 0) { > + error_report("recv ack failed"); > + goto out; > + } > + > + if (s->state != QEMU_VM_TRANSACTION_ACK) { > + error_report("recv invalid state %d", s->state); > + s->has_error = FT_TRANS_ERR_STATE_INVALID; > + ret = -EINVAL; > + goto out; > + } > + > + trace_ft_trans_cb(s->get_ready); > + ret = s->get_ready(s->opaque); > + if (ret < 0) { > + goto out; > + } > + > + /* proceed trans id */ > + s->id++; > + > + return 0; > + } > + > + /* set QEMUFile buf at beginning */ > + if (!s->buf) { > + s->buf = buf; > + } > + > + ret = ft_trans_recv(s); > + goto out; > + } > + > + ret = s->get_offset; > + > +out: > + return ret; > +} > + > +static int ft_trans_close(void *opaque) > +{ > + QEMUFileFtTrans *s = opaque; > + int ret; > + > + trace_ft_trans_close(); > + ret = s->close(s->opaque); > + if (s->is_sender) { > + qemu_free(s->buf); > + } > + qemu_free(s); > + > + return ret; > +} > + > +static int ft_trans_rate_limit(void *opaque) > +{ > + QEMUFileFtTrans *s = opaque; > + > + if (s->has_error) { > + return 0; > + } > + > + if 
(s->rate_limit && s->freeze_output) { > + return 1; > + } > + > + return 0; > +} > + > +static int64_t ft_trans_set_rate_limit(void *opaque, int64_t new_rate) > +{ > + QEMUFileFtTrans *s = opaque; > + > + if (s->has_error) { > + goto out; > + } > + > + s->rate_limit = !!new_rate; > + > +out: > + return s->rate_limit; > +} > + > +int ft_trans_begin(void *opaque) > +{ > + QEMUFileFtTrans *s = opaque; > + int ret; > + s->seq = 0; > + > + /* receiver sends QEMU_VM_TRANSACTION_ACK to start transaction */ > + if (!s->is_sender) { > + if (s->state != QEMU_VM_TRANSACTION_INIT) { > + error_report("invalid state %d", s->state); > + s->has_error = FT_TRANS_ERR_STATE_INVALID; > + ret = -EINVAL; > + } > + > + ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_ACK, 0); > + goto out; > + } > + > + /* sender waits for QEMU_VM_TRANSACTION_ACK to start transaction */ > + if (s->state == QEMU_VM_TRANSACTION_INIT) { > +retry: > + ret = ft_trans_recv_header(s); > + if (s->freeze_input) { > + goto retry; > + } > + if (ret < 0) { > + error_report("recv ack failed"); > + goto out; > + } > + > + if (s->state != QEMU_VM_TRANSACTION_ACK) { > + error_report("recv invalid state %d", s->state); > + s->has_error = FT_TRANS_ERR_STATE_INVALID; > + ret = -EINVAL; > + goto out; > + } > + } > + > + ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_BEGIN, 0); > + if (ret < 0) { > + goto out; > + } > + > + s->state = QEMU_VM_TRANSACTION_CONTINUE; > + > +out: > + return ret; > +} > + > +int ft_trans_commit(void *opaque) > +{ > + QEMUFileFtTrans *s = opaque; > + int ret; > + > + if (!s->is_sender) { > + ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_ACK, 0); > + goto out; > + } > + > + /* sender should flush buf before sending COMMIT */ > + qemu_fflush(s->file); > + > + ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_COMMIT, 0); > + if (ret < 0) { > + goto out; > + } > + > + while (!s->has_error && s->put_offset) { > + ft_trans_flush(s); > + if (s->freeze_output) { > + s->wait_for_unfreeze(s); > + } 
> + } > + > + if (s->has_error) { > + ret = -EINVAL; > + goto out; > + } > + > + ret = ft_trans_recv_header(s); > + if (s->freeze_input) { > + ret = -EAGAIN; > + goto out; > + } > + if (ret < 0) { > + error_report("recv ack failed"); > + goto out; > + } > + > + if (s->state != QEMU_VM_TRANSACTION_ACK) { > + error_report("recv invalid state %d", s->state); > + s->has_error = FT_TRANS_ERR_STATE_INVALID; > + ret = -EINVAL; > + goto out; > + } > + > + s->id++; > + ret = 0; > + > +out: > + return ret; > +} > + > +int ft_trans_cancel(void *opaque) > +{ > + QEMUFileFtTrans *s = opaque; > + > + /* invalid until migrate cancel on recevier side is supported */ > + if (!s->is_sender) { > + return -EINVAL; > + } > + > + return ft_trans_send_header(s, QEMU_VM_TRANSACTION_CANCEL, 0); > +} > + > +QEMUFile *qemu_fopen_ops_ft_trans(void *opaque, > + FtTransPutBufferFunc *put_buffer, > + FtTransGetBufferFunc *get_buffer, > + FtTransPutReadyFunc *put_ready, > + FtTransGetReadyFunc *get_ready, > + FtTransWaitForUnfreezeFunc > *wait_for_unfreeze, > + FtTransCloseFunc *close, > + bool is_sender) > +{ > + QEMUFileFtTrans *s; > + > + s = qemu_mallocz(sizeof(*s)); > + > + s->opaque = opaque; > + s->put_buffer = put_buffer; > + s->get_buffer = get_buffer; > + s->put_ready = put_ready; > + s->get_ready = get_ready; > + s->wait_for_unfreeze = wait_for_unfreeze; > + s->close = close; > + s->is_sender = is_sender; > + s->id = 0; > + s->seq = 0; > + s->rate_limit = 1; > + > + if (!s->is_sender) { > + s->buf_max_size = IO_BUF_SIZE; > + } > + > + s->file = qemu_fopen_ops(s, ft_trans_put_buffer, ft_trans_get_buffer, > + ft_trans_close, ft_trans_rate_limit, > + ft_trans_set_rate_limit, NULL); > + > + return s->file; > +} > diff --git a/ft_trans_file.h b/ft_trans_file.h > new file mode 100644 > index 0000000..5ca6b53 > --- /dev/null > +++ b/ft_trans_file.h > @@ -0,0 +1,72 @@ > +/* > + * Fault tolerant VM transaction QEMUFile > + * > + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation. 
> + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the COPYING file in the top-level directory. > + * > + * This source code is based on buffered_file.h. > + * Copyright IBM, Corp. 2008 > + * Authors: > + * Anthony Liguori > + */ > + > +#ifndef QEMU_FT_TRANSACTION_FILE_H > +#define QEMU_FT_TRANSACTION_FILE_H > + > +#include "hw/hw.h" > + > +enum QEMU_VM_TRANSACTION_STATE { > + QEMU_VM_TRANSACTION_NACK = -1, > + QEMU_VM_TRANSACTION_INIT, > + QEMU_VM_TRANSACTION_BEGIN, > + QEMU_VM_TRANSACTION_CONTINUE, > + QEMU_VM_TRANSACTION_COMMIT, > + QEMU_VM_TRANSACTION_CANCEL, > + QEMU_VM_TRANSACTION_ATOMIC, > + QEMU_VM_TRANSACTION_ACK, > +}; > + > +enum FT_MODE { > + FT_ERROR = -1, > + FT_OFF, > + FT_INIT, > + FT_TRANSACTION_BEGIN, > + FT_TRANSACTION_ITER, > + FT_TRANSACTION_COMMIT, > + FT_TRANSACTION_ATOMIC, > + FT_TRANSACTION_RECV, > +}; > +extern enum FT_MODE ft_mode; > + > +#define FT_TRANS_ERR_UNKNOWN 0x01 /* Unknown error */ > +#define FT_TRANS_ERR_SEND_HDR 0x02 /* Send header failed */ > +#define FT_TRANS_ERR_RECV_HDR 0x03 /* Recv header failed */ > +#define FT_TRANS_ERR_SEND_PAYLOAD 0x04 /* Send payload failed */ > +#define FT_TRANS_ERR_RECV_PAYLOAD 0x05 /* Recv payload failed */ > +#define FT_TRANS_ERR_FLUSH 0x06 /* Flush buffered data failed */ > +#define FT_TRANS_ERR_STATE_INVALID 0x07 /* Invalid state */ > + > +typedef ssize_t (FtTransPutBufferFunc)(void *opaque, const void *data, > size_t size); > +typedef int (FtTransGetBufferFunc)(void *opaque, uint8_t *buf, int64_t > pos, size_t size); > +typedef ssize_t (FtTransPutVectorFunc)(void *opaque, const struct iovec > *iov, int iovcnt); > +typedef int (FtTransPutReadyFunc)(void); > +typedef int (FtTransGetReadyFunc)(void *opaque); > +typedef void (FtTransWaitForUnfreezeFunc)(void *opaque); > +typedef int (FtTransCloseFunc)(void *opaque); > + > +int ft_trans_begin(void *opaque); > +int ft_trans_commit(void *opaque); > +int ft_trans_cancel(void *opaque); > + > +QEMUFile 
*qemu_fopen_ops_ft_trans(void *opaque, > + FtTransPutBufferFunc *put_buffer, > + FtTransGetBufferFunc *get_buffer, > + FtTransPutReadyFunc *put_ready, > + FtTransGetReadyFunc *get_ready, > + FtTransWaitForUnfreezeFunc > *wait_for_unfreeze, > + FtTransCloseFunc *close, > + bool is_sender); > + > +#endif > diff --git a/migration.c b/migration.c > index dd3bf94..c5e0146 100644 > --- a/migration.c > +++ b/migration.c > @@ -15,6 +15,7 @@ > #include "migration.h" > #include "monitor.h" > #include "buffered_file.h" > +#include "ft_trans_file.h" > #include "sysemu.h" > #include "block.h" > #include "qemu_socket.h" > @@ -31,6 +32,8 @@ > do { } while (0) > #endif > > +enum FT_MODE ft_mode = FT_OFF; > + > /* Migration speed throttling */ > static int64_t max_throttle = (32 << 20); > > diff --git a/trace-events b/trace-events > index e6138ea..50ac840 100644 > --- a/trace-events > +++ b/trace-events > @@ -254,3 +254,18 @@ disable spice_vmc_write(ssize_t out, int len) "spice > wrottn %lu of requested %zd > disable spice_vmc_read(int bytes, int len) "spice read %lu of requested > %zd" > disable spice_vmc_register_interface(void *scd) "spice vmc registered > interface %p" > disable spice_vmc_unregister_interface(void *scd) "spice vmc unregistered > interface %p" > + > +# ft_trans_file.c > +disable ft_trans_realloc(size_t old_size, size_t new_size) "increasing > buffer from %zu by %zu" > +disable ft_trans_append(size_t size) "buffering %zu bytes" > +disable ft_trans_flush(size_t size, size_t req) "flushed %zu of %zu bytes" > +disable ft_trans_send_header(uint16_t cmd) "send header %d" > +disable ft_trans_recv_header(uint16_t cmd) "recv header %d" > +disable ft_trans_put_buffer(size_t size, int64_t pos) "putting %d bytes at > %"PRId64"" > +disable ft_trans_recv_payload(size_t len, uint32_t hdr, size_t total) > "recv %d of %d total %d" > +disable ft_trans_close(void) "closing" > +disable ft_trans_freeze_output(void) "backend not ready, freezing output" > +disable 
ft_trans_freeze_input(void) "backend not ready, freezing input" > +disable ft_trans_put_ready(void) "file is ready to put" > +disable ft_trans_get_ready(void) "file is ready to get" > +disable ft_trans_cb(void *cb) "callback %p" > -- > 1.7.1.2 > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > --000e0cd25a1edd4a7d049cc38a93 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Yoshiaki:

    I have one question about ram_save_live. During stage 3 of migration (the completion stage), it calls cpu_physical_memory_set_dirty_tracking(0) to stop recording dirty RAM pages. At the end of migrate_ft_trans_connect(), vm_start() is invoked, but at that point cpu_physical_memory_set_dirty_tracking(1) has not been called yet, so some RAM pages dirtied by the guest may go unrecorded by the time qemu_savevm_trans_begin is called. I think you need to call cpu_physical_memory_set_dirty_tracking(1) in migrate_ft_trans_connect(). Am I right?

BR

Green.


2011/2/10 Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
This code implements VM transaction protocol.  Like buffered_file, it
sits between savevm and migration layer.  With this architecture, VM
transaction protocol is implemented mostly independent from other
existing code.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 Makefile.objs   |    1 +
 ft_trans_file.c |  624 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ft_trans_file.h |   72 +++++++
 migration.c     |    3 +
 trace-events    |   15 ++
 5 files changed, 715 insertions(+), 0 deletions(-)
 create mode 100644 ft_trans_file.c
 create mode 100644 ft_trans_file.h

diff --git a/Makefile.objs b/Makefile.objs
index 353b1a8..04148b5 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -100,6 +100,7 @@ common-obj-y += msmouse.o ps2.o
 common-obj-y += qdev.o qdev-properties.o
 common-obj-y += block-migration.o
 common-obj-y += pflib.o
+common-obj-y += ft_trans_file.o

 common-obj-$(CONFIG_BRLAPI) += baum.o
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/ft_trans_file.c b/ft_trans_file.c
new file mode 100644
index 0000000..2b42b95
--- /dev/null
+++ b/ft_trans_file.c
@@ -0,0 +1,624 @@
+/*
+ * Fault tolerant VM transaction QEMUFile
+ *
+ * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * This source code is based on buffered_file.c.
+ * Copyright IBM, Corp. 2008
+ * Authors:
+ *  Anthony Liguori        <aliguori@us.ibm.com>
+ */
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "hw/hw.h"
+#include "qemu-timer.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "trace.h"
+#include "ft_trans_file.h"
+
+typedef struct FtTransHdr
+{
+    uint16_t cmd;
+    uint16_t id;
+    uint32_t seq;
+    uint32_t payload_len;
+} FtTransHdr;
+
+typedef struct QEMUFileFtTrans
+{
+    FtTransPutBufferFunc *put_buffer;
+    FtTransGetBufferFunc *get_buffer;
+    FtTransPutReadyFunc *put_ready;
+    FtTransGetReadyFunc *get_ready;
+    FtTransWaitForUnfreezeFunc *wait_for_unfreeze;
+    FtTransCloseFunc *close;
+    void *opaque;
+    QEMUFile *file;
+
+    enum QEMU_VM_TRANSACTION_STATE state;
+    uint32_t seq;
+    uint16_t id;
+
+    int has_error;
+
+    bool freeze_output;
+    bool freeze_input;
+    bool rate_limit;
+    bool is_sender;
+    bool is_payload;
+
+    uint8_t *buf;
+    size_t buf_max_size;
+    size_t put_offset;
+    size_t get_offset;
+
+    FtTransHdr header;
+    size_t header_offset;
+} QEMUFileFtTrans;
+
+#define IO_BUF_SIZE 32768
+
+static void ft_trans_append(QEMUFileFtTrans *s,
+                            const uint8_t *buf, size_t size)
+{
+    if (size > (s->buf_max_size - s->put_offset)) {
+        trace_ft_trans_realloc(s->buf_max_size, size + 1024);
+        s->buf_max_size += size + 1024;
+        s->buf = qemu_realloc(s->buf, s->buf_max_size);
+    }
+
+    trace_ft_trans_append(size);
+    memcpy(s->buf + s->put_offset, buf, size);
+    s->put_offset += size;
+}
+
+static void ft_trans_flush(QEMUFileFtTrans *s)
+{
+    size_t offset = 0;
+
+    if (s->has_error) {
+        error_report("flush when error %d, bailing", s->has_error);
+        return;
+    }
+
+    while (offset < s->put_offset) {
+        ssize_t ret;
+
+        ret = s->put_buffer(s->opaque, s->buf + offset, s->put_offset - offset);
+        if (ret == -EAGAIN) {
+            break;
+        }
+
+        if (ret <= 0) {
+            error_report("error flushing data, %s", strerror(errno));
+            s->has_error = FT_TRANS_ERR_FLUSH;
+            break;
+        } else {
+            offset += ret;
+        }
+    }
+
+    trace_ft_trans_flush(offset, s->put_offset);
+    memmove(s->buf, s->buf + offset, s->put_offset - offset);
+    s->put_offset -= offset;
+    s->freeze_output = !!s->put_offset;
+}
+
+static ssize_t ft_trans_put(void *opaque, void *buf, int size)
+{
+    QEMUFileFtTrans *s = opaque;
+    size_t offset = 0;
+    ssize_t len;
+
+    /* flush buffered data before putting next */
+    if (s->put_offset) {
+        ft_trans_flush(s);
+    }
+
+    while (!s->freeze_output && offset < size) {
+        len = s->put_buffer(s->opaque, (uint8_t *)buf + offset, size - offset);
+
+        if (len == -EAGAIN) {
+            trace_ft_trans_freeze_output();
+            s->freeze_output = 1;
+            break;
+        }
+
+        if (len <= 0) {
+            error_report("putting data failed, %s", strerror(errno));
+            s->has_error = 1;
+            offset = -EINVAL;
+            break;
+        }
+
+        offset += len;
+    }
+
+    if (s->freeze_output) {
+        ft_trans_append(s, buf + offset, size - offset);
+        offset = size;
+    }
+
+    return offset;
+}
+
+static int ft_trans_send_header(QEMUFileFtTrans *s,
+                                enum QEMU_VM_TRANSACTION_STATE state,
+                                uint32_t payload_len)
+{
+    int ret;
+    FtTransHdr *hdr = &s->header;
+
+    trace_ft_trans_send_header(state);
+
+    hdr->cmd = s->state = state;
+    hdr->id = s->id;
+    hdr->seq = s->seq;
+    hdr->payload_len = payload_len;
+
+    ret = ft_trans_put(s, hdr, sizeof(*hdr));
+    if (ret < 0) {
+        error_report("send header failed");
+        s->has_error = FT_TRANS_ERR_SEND_HDR;
+    }
+
+    return ret;
+}
+
+static int ft_trans_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
+{
+    QEMUFileFtTrans *s = opaque;
+    ssize_t ret;
+
+    trace_ft_trans_put_buffer(size, pos);
+
+    if (s->has_error) {
+        error_report("put_buffer when error %d, bailing", s->has_error);
+        return -EINVAL;
+    }
+
+    /* assuming qemu_file_put_notify() is calling */
+    if (pos == 0 && size == 0) {
+        trace_ft_trans_put_ready();
+        ft_trans_flush(s);
+
+        if (!s->freeze_output) {
+            trace_ft_trans_cb(s->put_ready);
+            ret = s->put_ready();
+        }
+
+        ret = 0;
+        goto out;
+    }
+
+    ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_CONTINUE, size);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = ft_trans_put(s, (uint8_t *)buf, size);
+    if (ret < 0) {
+        error_report("send palyload failed");
+        s->has_error = FT_TRANS_ERR_SEND_PAYLOAD;
+        goto out;
+    }
+
+    s->seq++;
+
+out:
+    return ret;
+}
+
+static int ft_trans_fill_buffer(void *opaque, void *buf, int size)
+{
+    QEMUFileFtTrans *s = opaque;
+    size_t offset = 0;
+    ssize_t len;
+
+    s->freeze_input = 0;
+
+    while (offset < size) {
+        len = s->get_buffer(s->opaque, (uint8_t *)buf + offset,
+                            0, size - offset);
+        if (len == -EAGAIN) {
+            trace_ft_trans_freeze_input();
+            s->freeze_input = 1;
+            break;
+        }
+
+        if (len <= 0) {
+            error_report("fill buffer failed, %s", strerror(errno));
+            s->has_error = 1;
+            return -EINVAL;
+        }
+
+        offset += len;
+    }
+
+    return offset;
+}
+
+static int ft_trans_recv_header(QEMUFileFtTrans *s)
+{
+    int ret;
+    char *buf = (char *)&s->header + s->header_offset;
+
+    ret = ft_trans_fill_buffer(s, buf, sizeof(FtTransHdr) - s->header_offset);
+    if (ret < 0) {
+        error_report("recv header failed");
+        s->has_error = FT_TRANS_ERR_RECV_HDR;
+        goto out;
+    }
+
+    s->header_offset += ret;
+    if (s->header_offset == sizeof(FtTransHdr)) {
+        trace_ft_trans_recv_header(s->header.cmd);
+        s->state = s->header.cmd;
+        s->header_offset = 0;
+
+        if (!s->is_sender) {
+            s->id = s->header.id;
+            s->seq = s->header.seq;
+        }
+    }
+
+out:
+    return ret;
+}
+
+static int ft_trans_recv_payload(QEMUFileFtTrans *s)
+{
+    QEMUFile *f = s->file;
+    int ret = -1;
+
+    /* extend QEMUFile buf if there weren't enough space */
+    if (s->header.payload_len > (s->buf_max_size - s->get_offset)) {
+        s->buf_max_size += (s->header.payload_len -
+                            (s->buf_max_size - s->get_offset));
+        s->buf = qemu_realloc_buffer(f, s->buf_max_size);
+    }
+
+    ret = ft_trans_fill_buffer(s, s->buf + s->get_offset,
+                               s->header.payload_len);
+    if (ret < 0) {
+        error_report("recv payload failed");
+        s->has_error = FT_TRANS_ERR_RECV_PAYLOAD;
+        goto out;
+    }
+
+    trace_ft_trans_recv_payload(ret, s->header.payload_len, s->get_offset);
+
+    s->header.payload_len -= ret;
+    s->get_offset += ret;
+    s->is_payload = !!s->header.payload_len;
+
+out:
+    return ret;
+}
+
+static int ft_trans_recv(QEMUFileFtTrans *s)
+{
+    int ret;
+
+    /* get payload and return */
+    if (s->is_payload) {
+        ret = ft_trans_recv_payload(s);
+        goto out;
+    }
+
+    ret = ft_trans_recv_header(s);
+    if (ret < 0 || s->freeze_input) {
+        goto out;
+    }
+
+    switch (s->state) {
+    case QEMU_VM_TRANSACTION_BEGIN:
+        /* CONTINUE or COMMIT should come shortly */
+        s->is_payload = 0;
+        break;
+
+    case QEMU_VM_TRANSACTION_CONTINUE:
+        /* get payload */
+        s->is_payload = 1;
+        break;
+
+    case QEMU_VM_TRANSACTION_COMMIT:
+        ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_ACK, 0);
+        if (ret < 0) {
+            goto out;
+        }
+
+        trace_ft_trans_cb(s->get_ready);
+        ret = s->get_ready(s->opaque);
+        if (ret < 0) {
+            goto out;
+        }
+
+        qemu_clear_buffer(s->file);
+        s->get_offset = 0;
+        s->is_payload = 0;
+
+        break;
+
+    case QEMU_VM_TRANSACTION_ATOMIC:
+        /* not implemented yet */
+        error_report("QEMU_VM_TRANSACTION_ATOMIC not implemented. %d",
+                ret);
+        break;
+
+    case QEMU_VM_TRANSACTION_CANCEL:
+        /* return -EINVAL until migrate cancel on receiver side is supported */
+        ret = -EINVAL;
+        break;
+
+    default:
+        error_report("unknown QEMU_VM_TRANSACTION_STATE %d", s->state);
+        s->has_error = FT_TRANS_ERR_STATE_INVALID;
+        ret = -EINVAL;
+    }
+
+out:
+    return ret;
+}
+
+static int ft_trans_get_buffer(void *opaque, uint8_t *buf,
+                               int64_t pos, int size)
+{
+    QEMUFileFtTrans *s = opaque;
+    int ret;
+
+    if (s->has_error) {
+        error_report("get_buffer when error %d, bailing", s->has_error);
+        return -EINVAL;
+    }
+
+    /* assume we were called via qemu_file_get_notify() */
+    if (pos == 0 && size == 0) {
+        trace_ft_trans_get_ready();
+        s->freeze_input = 0;
+
+        /* sender should be waiting for ACK */
+        if (s->is_sender) {
+            ret = ft_trans_recv_header(s);
+            if (s->freeze_input) {
+                ret = 0;
+                goto out;
+            }
+            if (ret < 0) {
+                error_report("recv ack failed");
+                goto out;
+            }
+
+            if (s->state != QEMU_VM_TRANSACTION_ACK) {
+                error_report("recv invalid state %d", s->state);
+                s->has_error = FT_TRANS_ERR_STATE_INVALID;
+                ret = -EINVAL;
+                goto out;
+            }
+
+            trace_ft_trans_cb(s->get_ready);
+            ret = s->get_ready(s->opaque);
+            if (ret < 0) {
+                goto out;
+            }
+
+            /* advance the transaction id */
+            s->id++;
+
+            return 0;
+        }
+
+        /* set QEMUFile buf at beginning */
+        if (!s->buf) {
+            s->buf = buf;
+        }
+
+        ret = ft_trans_recv(s);
+        goto out;
+    }
+
+    ret = s->get_offset;
+
+out:
+    return ret;
+}
+
+static int ft_trans_close(void *opaque)
+{
+    QEMUFileFtTrans *s = opaque;
+    int ret;
+
+    trace_ft_trans_close();
+    ret = s->close(s->opaque);
+    if (s->is_sender) {
+        qemu_free(s->buf);
+    }
+    qemu_free(s);
+
+    return ret;
+}
+
+static int ft_trans_rate_limit(void *opaque)
+{
+    QEMUFileFtTrans *s = opaque;
+
+    if (s->has_error) {
+        return 0;
+    }
+
+    if (s->rate_limit && s->freeze_output) {
+        return 1;
+    }
+
+    return 0;
+}
+
+static int64_t ft_trans_set_rate_limit(void *opaque, int64_t new_rate)
+{
+    QEMUFileFtTrans *s = opaque;
+
+    if (s->has_error) {
+        goto out;
+    }
+
+    s->rate_limit = !!new_rate;
+
+out:
+    return s->rate_limit;
+}
+
+int ft_trans_begin(void *opaque)
+{
+    QEMUFileFtTrans *s = opaque;
+    int ret;
+    s->seq = 0;
+
+    /* receiver sends QEMU_VM_TRANSACTION_ACK to start transaction */
+    if (!s->is_sender) {
+        if (s->state != QEMU_VM_TRANSACTION_INIT) {
+            error_report("invalid state %d", s->state);
+            s->has_error = FT_TRANS_ERR_STATE_INVALID;
+            ret = -EINVAL;
+        }
+
+        ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_ACK, 0);
+        goto out;
+    }
+
+    /* sender waits for QEMU_VM_TRANSACTION_ACK to start transaction */
+    if (s->state == QEMU_VM_TRANSACTION_INIT) {
+retry:
+        ret = ft_trans_recv_header(s);
+        if (s->freeze_input) {
+            goto retry;
+        }
+        if (ret < 0) {
+            error_report("recv ack failed");
+            goto out;
+        }
+
+        if (s->state != QEMU_VM_TRANSACTION_ACK) {
+            error_report("recv invalid state %d", s->state);
+            s->has_error = FT_TRANS_ERR_STATE_INVALID;
+            ret = -EINVAL;
+            goto out;
+        }
+    }
+
+    ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_BEGIN, 0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    s->state = QEMU_VM_TRANSACTION_CONTINUE;
+
+out:
+    return ret;
+}
+
+int ft_trans_commit(void *opaque)
+{
+    QEMUFileFtTrans *s = opaque;
+    int ret;
+
+    if (!s->is_sender) {
+        ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_ACK, 0);
+        goto out;
+    }
+
+    /* sender should flush buf before sending COMMIT */
+    qemu_fflush(s->file);
+
+    ret = ft_trans_send_header(s, QEMU_VM_TRANSACTION_COMMIT, 0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    while (!s->has_error && s->put_offset) {
+        ft_trans_flush(s);
+        if (s->freeze_output) {
+            s->wait_for_unfreeze(s);
+        }
+    }
+
+    if (s->has_error) {
+        ret = -EINVAL;
+        goto out;
+    }
+
+    ret = ft_trans_recv_header(s);
+    if (s->freeze_input) {
+        ret = -EAGAIN;
+        goto out;
+    }
+    if (ret < 0) {
+        error_report("recv ack failed");
+        goto out;
+    }
+
+    if (s->state != QEMU_VM_TRANSACTION_ACK) {
+        error_report("recv invalid state %d", s->state);
+        s->has_error = FT_TRANS_ERR_STATE_INVALID;
+        ret = -EINVAL;
+        goto out;
+    }
+
+    s->id++;
+    ret = 0;
+
+out:
+    return ret;
+}
+
+int ft_trans_cancel(void *opaque)
+{
+    QEMUFileFtTrans *s = opaque;
+
+    /* invalid until migrate cancel on receiver side is supported */
+    if (!s->is_sender) {
+        return -EINVAL;
+    }
+
+    return ft_trans_send_header(s, QEMU_VM_TRANSACTION_CANCEL, 0);
+}
+
+QEMUFile *qemu_fopen_ops_ft_trans(void *opaque,
+                                  FtTransPutBufferFunc *put_buffer,
+                                  FtTransGetBufferFunc *get_buffer,
+                                  FtTransPutReadyFunc *put_ready,
+                                  FtTransGetReadyFunc *get_ready,
+                                  FtTransWaitForUnfreezeFunc *wait_for_unfreeze,
+                                  FtTransCloseFunc *close,
+                                  bool is_sender)
+{
+    QEMUFileFtTrans *s;
+
+    s = qemu_mallocz(sizeof(*s));
+
+    s->opaque = opaque;
+    s->put_buffer = put_buffer;
+    s->get_buffer = get_buffer;
+    s->put_ready = put_ready;
+    s->get_ready = get_ready;
+    s->wait_for_unfreeze = wait_for_unfreeze;
+    s->close = close;
+    s->is_sender = is_sender;
+    s->id = 0;
+    s->seq = 0;
+    s->rate_limit = 1;
+
+    if (!s->is_sender) {
+        s->buf_max_size = IO_BUF_SIZE;
+    }
+
+    s->file = qemu_fopen_ops(s, ft_trans_put_buffer, ft_trans_get_buffer,
+                             ft_trans_close, ft_trans_rate_limit,
+                             ft_trans_set_rate_limit, NULL);
+
+    return s->file;
+}
diff --git a/ft_trans_file.h b/ft_trans_file.h
new file mode 100644
index 0000000..5ca6b53
--- /dev/null
+++ b/ft_trans_file.h
@@ -0,0 +1,72 @@
+/*
+ * Fault tolerant VM transaction QEMUFile
+ *
+ * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * This source code is based on buffered_file.h.
+ * Copyright IBM, Corp. 2008
+ * Authors:
+ *  Anthony Liguori        <aliguori@us.ibm.com>
+ */
+
+#ifndef QEMU_FT_TRANSACTION_FILE_H
+#define QEMU_FT_TRANSACTION_FILE_H
+
+#include "hw/hw.h"
+
+enum QEMU_VM_TRANSACTION_STATE {
+    QEMU_VM_TRANSACTION_NACK = -1,
+    QEMU_VM_TRANSACTION_INIT,
+    QEMU_VM_TRANSACTION_BEGIN,
+    QEMU_VM_TRANSACTION_CONTINUE,
+    QEMU_VM_TRANSACTION_COMMIT,
+    QEMU_VM_TRANSACTION_CANCEL,
+    QEMU_VM_TRANSACTION_ATOMIC,
+    QEMU_VM_TRANSACTION_ACK,
+};
+
+enum FT_MODE {
+    FT_ERROR = -1,
+    FT_OFF,
+    FT_INIT,
+    FT_TRANSACTION_BEGIN,
+    FT_TRANSACTION_ITER,
+    FT_TRANSACTION_COMMIT,
+    FT_TRANSACTION_ATOMIC,
+    FT_TRANSACTION_RECV,
+};
+extern enum FT_MODE ft_mode;
+
+#define FT_TRANS_ERR_UNKNOWN       0x01 /* Unknown error */
+#define FT_TRANS_ERR_SEND_HDR      0x02 /* Send header failed */
+#define FT_TRANS_ERR_RECV_HDR      0x03 /* Recv header failed */
+#define FT_TRANS_ERR_SEND_PAYLOAD  0x04 /* Send payload failed */
+#define FT_TRANS_ERR_RECV_PAYLOAD  0x05 /* Recv payload failed */
+#define FT_TRANS_ERR_FLUSH         0x06 /* Flush buffered data failed */
+#define FT_TRANS_ERR_STATE_INVALID 0x07 /* Invalid state */
+
+typedef ssize_t (FtTransPutBufferFunc)(void *opaque, const void *data, size_t size);
+typedef int (FtTransGetBufferFunc)(void *opaque, uint8_t *buf, int64_t pos, size_t size);
+typedef ssize_t (FtTransPutVectorFunc)(void *opaque, const struct iovec *iov, int iovcnt);
+typedef int (FtTransPutReadyFunc)(void);
+typedef int (FtTransGetReadyFunc)(void *opaque);
+typedef void (FtTransWaitForUnfreezeFunc)(void *opaque);
+typedef int (FtTransCloseFunc)(void *opaque);
+
+int ft_trans_begin(void *opaque);
+int ft_trans_commit(void *opaque);
+int ft_trans_cancel(void *opaque);
+
+QEMUFile *qemu_fopen_ops_ft_trans(void *opaque,
+                                  FtTransPutBufferFunc *put_buffer,
+                                  FtTransGetBufferFunc *get_buffer,
+                                  FtTransPutReadyFunc *put_ready,
+                                  FtTransGetReadyFunc *get_ready,
+                                  FtTransWaitForUnfreezeFunc *wait_for_unfreeze,
+                                  FtTransCloseFunc *close,
+                                  bool is_sender);
+
+#endif
diff --git a/migration.c b/migration.c
index dd3bf94..c5e0146 100644
--- a/migration.c
+++ b/migration.c
@@ -15,6 +15,7 @@
 #include "migration.h"
 #include "monitor.h"
 #include "buffered_file.h"
+#include "ft_trans_file.h"
 #include "sysemu.h"
 #include "block.h"
 #include "qemu_socket.h"
@@ -31,6 +32,8 @@
     do { } while (0)
 #endif

+enum FT_MODE ft_mode = FT_OFF;
+
 /* Migration speed throttling */
 static int64_t max_throttle = (32 << 20);

diff --git a/trace-events b/trace-events
index e6138ea..50ac840 100644
--- a/trace-events
+++ b/trace-events
@@ -254,3 +254,18 @@ disable spice_vmc_write(ssize_t out, int len) "spice wrottn %lu of requested %zd
 disable spice_vmc_read(int bytes, int len) "spice read %lu of requested %zd"
 disable spice_vmc_register_interface(void *scd) "spice vmc registered interface %p"
 disable spice_vmc_unregister_interface(void *scd) "spice vmc unregistered interface %p"
+
+# ft_trans_file.c
+disable ft_trans_realloc(size_t old_size, size_t new_size) "increasing buffer from %zu by %zu"
+disable ft_trans_append(size_t size) "buffering %zu bytes"
+disable ft_trans_flush(size_t size, size_t req) "flushed %zu of %zu bytes"
+disable ft_trans_send_header(uint16_t cmd) "send header %d"
+disable ft_trans_recv_header(uint16_t cmd) "recv header %d"
+disable ft_trans_put_buffer(size_t size, int64_t pos) "putting %d bytes at %"PRId64""
+disable ft_trans_recv_payload(size_t len, uint32_t hdr, size_t total) "recv %d of %d total %d"
+disable ft_trans_close(void) "closing"
+disable ft_trans_freeze_output(void) "backend not ready, freezing output"
+disable ft_trans_freeze_input(void) "backend not ready, freezing input"
+disable ft_trans_put_ready(void) "file is ready to put"
+disable ft_trans_get_ready(void) "file is ready to get"
+disable ft_trans_cb(void *cb) "callback %p"
--
1.7.1.2
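
For readers following the receive path: ft_trans_recv_header() can be called repeatedly because it keeps header_offset across calls and resumes where the previous partial read stopped, while ft_trans_fill_buffer() sets freeze_input when the backend returns -EAGAIN. The following is only a minimal standalone sketch of that pattern, not code from the patch. DemoHdr, DemoSrc, DemoTrans and the demo_* functions are hypothetical names, the header layout is assumed (the real FtTransHdr is not shown in this part of the mail), and the mock transport stands in for the get_buffer callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header; the real FtTransHdr layout is an assumption. */
typedef struct {
    uint16_t cmd;
    uint16_t id;
    uint32_t seq;
    uint32_t payload_len;
} DemoHdr;

/* Mock transport standing in for the get_buffer callback. */
typedef struct {
    const uint8_t *data;
    size_t avail;   /* bytes that have "arrived" so far */
    size_t pos;     /* bytes already consumed */
    size_t chunk;   /* maximum bytes handed out per read */
} DemoSrc;

typedef struct {
    DemoSrc *src;
    DemoHdr header;
    size_t header_offset;   /* bytes of the header received so far */
    int freeze_input;       /* set when the transport would block */
    int state;              /* stays -1 until a full header has arrived */
} DemoTrans;

/* Returns bytes read, or -EAGAIN when nothing is available yet. */
static long demo_read(DemoSrc *src, uint8_t *buf, size_t size)
{
    size_t n = src->avail - src->pos;

    if (n == 0) {
        return -EAGAIN;
    }
    if (n > size) {
        n = size;
    }
    if (n > src->chunk) {
        n = src->chunk;
    }
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return (long)n;
}

/* Read until `size` bytes arrived or the transport would block. */
static long demo_fill(DemoTrans *s, uint8_t *buf, size_t size)
{
    size_t offset = 0;

    s->freeze_input = 0;
    while (offset < size) {
        long len = demo_read(s->src, buf + offset, size - offset);

        if (len == -EAGAIN) {
            s->freeze_input = 1;   /* caller retries when input is ready */
            break;
        }
        if (len <= 0) {
            return -EINVAL;
        }
        offset += (size_t)len;
    }
    return (long)offset;
}

/* Resume the header read at header_offset; publish state once complete. */
static long demo_recv_header(DemoTrans *s)
{
    uint8_t *buf;
    long ret;

    assert(s->header_offset < sizeof(DemoHdr));
    buf = (uint8_t *)&s->header + s->header_offset;
    ret = demo_fill(s, buf, sizeof(DemoHdr) - s->header_offset);
    if (ret < 0) {
        return ret;
    }
    s->header_offset += (size_t)ret;
    if (s->header_offset == sizeof(DemoHdr)) {
        s->state = s->header.cmd;
        s->header_offset = 0;
    }
    return ret;
}
```

A caller would invoke demo_recv_header() again each time the transport signals readiness, and act on state only once freeze_input is clear and the header is complete. That mirrors how the sender side in the patch waits for the ACK.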


--000e0cd25a1edd4a7d049cc38a93--