From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52219) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VCm41-0000Mk-Kg for qemu-devel@nongnu.org; Fri, 23 Aug 2013 03:45:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VCm3s-0003Gg-FN for qemu-devel@nongnu.org; Fri, 23 Aug 2013 03:45:45 -0400
Received: from e23smtp02.au.ibm.com ([202.81.31.144]:57470) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VCm3r-0003GF-KG for qemu-devel@nongnu.org; Fri, 23 Aug 2013 03:45:36 -0400
Received: from /spool/local by e23smtp02.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 23 Aug 2013 17:34:14 +1000
Received: from d23relay04.au.ibm.com (d23relay04.au.ibm.com [9.190.234.120]) by d23dlp02.au.ibm.com (Postfix) with ESMTP id 20F8F2BB0051 for ; Fri, 23 Aug 2013 17:45:28 +1000 (EST)
Received: from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97]) by d23relay04.au.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r7N7TR4o262650 for ; Fri, 23 Aug 2013 17:29:28 +1000
Received: from d23av03.au.ibm.com (localhost [127.0.0.1]) by d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id r7N7jQX8025805 for ; Fri, 23 Aug 2013 17:45:27 +1000
Message-ID: <521712CD.2000106@linux.vnet.ibm.com>
Date: Fri, 23 Aug 2013 15:44:13 +0800
From: Lei Li
MIME-Version: 1.0
References: <1377069536-12658-1-git-send-email-lilei@linux.vnet.ibm.com> <1377069536-12658-9-git-send-email-lilei@linux.vnet.ibm.com> <521677CD.70801@linux.vnet.ibm.com>
In-Reply-To: <521677CD.70801@linux.vnet.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 08/18] migration-local: introduce qemu_fopen_local()
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Michael R.
Hines"
Cc: aarcange@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org, Anthony Liguori, lagarcia@br.ibm.com, pbonzini@redhat.com, rcj@linux.vnet.ibm.com

On 08/23/2013 04:42 AM, Michael R. Hines wrote:
> On 08/21/2013 03:18 AM, Lei Li wrote:
>> Introduce read/write backend of QEMUFileLocal used by localhost
>> migration. The unix domain socket will be replaced by PIPE with
>> vmsplice mechanism.
>>
>> Signed-off-by: Lei Li
>> ---
>> Makefile.objs | 1 +
>> migration-local.c | 211
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 212 insertions(+), 0 deletions(-)
>> create mode 100644 migration-local.c
>>
>> diff --git a/Makefile.objs b/Makefile.objs
>> index f46a4cd..30670cc 100644
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -54,6 +54,7 @@ common-obj-y += migration.o migration-tcp.o
>> common-obj-$(CONFIG_RDMA) += migration-rdma.o
>> common-obj-y += qemu-char.o #aio.o
>> common-obj-y += block-migration.o
>> +common-obj-y += migration-local.o
>> common-obj-y += page_cache.o xbzrle.o
>>
>> common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o
>> migration-fd.o
>> diff --git a/migration-local.c b/migration-local.c
>> new file mode 100644
>> index 0000000..93190fd
>> --- /dev/null
>> +++ b/migration-local.c
>> @@ -0,0 +1,211 @@
>> +/*
>> + * QEMU localhost migration
>> + *
>> + * Copyright IBM, Corp. 2013
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.
>> + *
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include "config-host.h"
>> +#include "qemu-common.h"
>> +#include "migration/migration.h"
>> +#include "exec/cpu-common.h"
>> +#include "config.h"
>> +#include "exec/cpu-all.h"
>> +#include "monitor/monitor.h"
>> +#include "migration/qemu-file.h"
>> +#include "qemu/iov.h"
>> +#include "sysemu/arch_init.h"
>> +#include "sysemu/sysemu.h"
>> +#include "block/block.h"
>> +#include "qemu/sockets.h"
>> +#include "migration/block.h"
>> +#include "qemu/thread.h"
>> +#include "qmp-commands.h"
>> +#include "trace.h"
>> +#include "qemu/osdep.h"
>> +
>> +//#define DEBUG_MIGRATION_LOCAL
>> +
>> +#ifdef DEBUG_MIGRATION_LOCAL
>> +#define DPRINTF(fmt, ...) \
>> +    do { printf("migration-local: " fmt, ## __VA_ARGS__); } while (0)
>> +#else
>> +#define DPRINTF(fmt, ...) \
>> +    do { } while (0)
>> +#endif
>> +
>> +/*
>> + * Interface for the local migration.
>> + */
>> +typedef struct QEMUFileLocal {
>> +    QEMUFile *file;
>> +    int fd;
>> +    int state;
>> +
>> +    /*
>> +     * This is the last block from where we have sent data
>> +     * for local migration
>> +     */
>> +    RAMBlock *last_block_sent;
>> +} QEMUFileLocal;
>> +
>> +
>> +static int qemu_local_get_buffer(void *opaque, uint8_t *buf,
>> +                                 int64_t pos, int size)
>> +{
>> +    QEMUFileLocal *s = opaque;
>> +    ssize_t len;
>> +
>> +    for (;;) {
>> +        len = qemu_recv(s->fd, buf, size, 0);
>> +        if (len != -1) {
>> +            break;
>> +        }
>> +        if (socket_error() == EAGAIN) {
>> +            yield_until_fd_readable(s->fd);
>> +        } else if (socket_error() != EINTR) {
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (len == -1) {
>> +        len = -socket_error();
>> +    }
>> +    return len;
>> +}
>> +
>
> This looks like a line-for-line copy of socket_get_buffer()......
>
> Since you're just going to end up replacing this with vmsplice(),
> could you just call socket_get_buffer() temporarily until
> your next patch is ready?
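
For reference, the retry pattern in the quoted qemu_local_get_buffer()
can be reduced to plain POSIX. This is only a sketch under stated
assumptions: recv_retry() and wait_readable() are hypothetical names,
poll() stands in for QEMU's yield_until_fd_readable(), and errno stands
in for socket_error(). It is not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical stand-in for yield_until_fd_readable(): block in poll(). */
static void wait_readable(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    (void)poll(&pfd, 1, -1);
}

/*
 * Receive up to 'size' bytes from 'fd'.
 * Returns the byte count (0 on EOF) or a negative errno value.
 * EINTR retries immediately; EAGAIN/EWOULDBLOCK waits for readability
 * before retrying; any other error is reported to the caller.
 */
static ssize_t recv_retry(int fd, void *buf, size_t size)
{
    assert(buf != NULL);

    for (;;) {
        ssize_t len = recv(fd, buf, size, 0);

        if (len != -1) {
            return len;
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            wait_readable(fd);
        } else if (errno != EINTR) {
            return -errno;
        }
    }
}
```

Keeping this loop in one shared helper is what makes the "just call
socket_get_buffer()" suggestion attractive: the error handling lives in
one place until the vmsplice() path replaces it.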
>
>> +static int qemu_local_get_fd(void *opaque)
>> +{
>> +    QEMUFileLocal *s = opaque;
>> +
>> +    return s->fd;
>> +}
>> +
>> +static int qemu_local_close(void *opaque)
>> +{
>> +    QEMUFileLocal *s = opaque;
>> +
>> +    closesocket(s->fd);
>> +    g_free(s);
>> +
>> +    return 0;
>> +}
>> +
>> +static size_t qemu_local_put_buffer(void *opaque, struct iovec *iov,
>> +                                    int iovcnt, int64_t pos)
>> +{
>> +    QEMUFileLocal *s = opaque;
>> +    ssize_t len;
>> +    ssize_t size = iov_size(iov, iovcnt);
>> +
>> +    len = iov_send(s->fd, iov, iovcnt, 0, size);
>> +    if (len < size) {
>> +        len = -socket_error();
>> +    }
>> +
>> +    return len;
>> +}
>> +
>> +static size_t local_save_page(QEMUFile *f, RAMBlock *block,
>> +                              ram_addr_t offset, int flags)
>> +{
>> +    MemoryRegion *mr = block->mr;
>> +    uint8_t *p;
>> +
>> +    p = memory_region_get_ram_ptr(mr) + offset;
>> +
>> +    if (buffer_find_nonzero_offset(p, TARGET_PAGE_SIZE)) {
>> +        qemu_put_be64(f, offset | flags | RAM_SAVE_FLAG_COMPRESS);
>> +        if (!flags) {
>> +            qemu_put_byte(f, strlen(block->idstr));
>> +            qemu_put_buffer(f, (uint8_t *)block->idstr,
>> +                            strlen(block->idstr));
>> +        }
>> +        qemu_put_byte(f, *p);
>> +        return 0;
>> +    }
>> +
>> +    qemu_put_be64(f, offset | flags | RAM_SAVE_FLAG_PAGE);
>> +    if (!flags) {
>> +        qemu_put_byte(f, strlen(block->idstr));
>> +        qemu_put_buffer(f, (uint8_t *)block->idstr,
>> +                        strlen(block->idstr));
>> +    }
>> +    qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
>> +
>> +    return TARGET_PAGE_SIZE;
>> +}
>> +
>> +static size_t qemu_local_ram_save(QEMUFile *f, void *opaque,
>> +                                  ram_addr_t block_offset,
>> ram_addr_t offset,
>> +                                  size_t size, int *bytes_sent)
>> +{
>> +    QEMUFileLocal *s = opaque;
>> +    uint64_t current_addr = block_offset + offset;
>> +    RAMBlock *block = qemu_get_ram_block(current_addr);
>> +    MemoryRegion *mr = block->mr;
>
> RAMBlock structs are not visible outside of exec.c and arch_init.c,
> how did you do this?

Hi Michael,

Good catch!
Actually this is the 'Known issue' that I listed in the cover letter,
and I planned to ask for suggestions on it.

Currently, overriding the RDMA hooks such as save_page for localhost
migration requires knowledge of MemoryRegion and RAMBlock. For example,
when saving a RAM page, the code needs to know which RAMBlock the RAM
address passed to the save_page hook belongs to, and which RAM block was
sent last. But it seems that such structs cannot be exported to the
private code in migration-local.c.

My guess for now is that there are two possible ways to handle this:

1) Take another route, such as a representation of RAMBlock and
   MemoryRegion from the localhost migration perspective. This might
   need more work to handle the descriptions.

2) Export them directly into private code like migration-local.c
   through some mechanism that I don't know of yet. I remember that the
   postcopy migration implementation did this.

So I'd like to post this and ask for your suggestions, to make sure it
is handled in the right and best way.

>> +    void *ram;
>> +    int ret;
>> +    int cont;
>> +
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
>> +    qemu_fflush(f);
>> +
>> +    cont = (block == s->last_block_sent) ? RAM_SAVE_FLAG_CONTINUE : 0;
>> +
>> +    ram = memory_region_get_ram_ptr(mr) + offset;
>> +    s->last_block_sent = block;
>> +
>> +    *bytes_sent = local_save_page(f, block, offset, cont);
>> +    if (!bytes_sent || *bytes_sent < 0) {
>> +        return RAM_SAVE_CONTROL_DELAYED;
>> +    }
>> +
> RAM_SAVE_CONTROL_DELAYED is only if you have *not* finished moving the
> bytes.
>
> If you've finished moving the bytes, then you should return zero.

Acknowledged, thanks.

>
>> +    /* DONTNEED the RAM page that has already been copied. */
>> +    qemu_madvise(ram, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
>> +
>
> This should be ram_handle_compressed().
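
The return convention Michael describes for the save_page hook (zero
once the bytes have been moved, the "delayed" code only while the move
is still pending) can be illustrated in isolation. This is a minimal
sketch with hypothetical names: SketchSink, sketch_save_page() and
SKETCH_SAVE_DELAYED are stand-ins invented here, and the constant's
value is chosen for the example rather than taken from QEMU's headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for RAM_SAVE_CONTROL_DELAYED; value picked for this sketch. */
#define SKETCH_SAVE_DELAYED (-2000)

/* Hypothetical transport: a flat buffer plus a "completes later" flag. */
typedef struct SketchSink {
    unsigned char *buf;   /* destination for synchronous copies */
    size_t cap;           /* capacity of buf in bytes */
    size_t used;          /* bytes already written */
    int deferred;         /* nonzero: transport finishes the copy later */
} SketchSink;

/*
 * Hypothetical save_page-style hook.
 * Returns 0 and sets *bytes_sent when the page was fully moved,
 * SKETCH_SAVE_DELAYED when the move is still pending (no byte count
 * is reported yet), or a negative errno value on failure.
 */
static int sketch_save_page(SketchSink *s, const unsigned char *page,
                            size_t size, int *bytes_sent)
{
    assert(s != NULL && page != NULL);

    if (s->deferred) {
        return SKETCH_SAVE_DELAYED;
    }
    if (size > s->cap - s->used) {
        return -ENOSPC;
    }
    memcpy(s->buf + s->used, page, size);
    s->used += size;
    if (bytes_sent) {          /* check the pointer before dereferencing */
        *bytes_sent = (int)size;
    }
    return 0;
}
```

The point of the split is that the caller can account for the bytes
immediately in the synchronous case and defer accounting otherwise.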
>
>> +    return 0;
>> +}
>> +
>> +const QEMUFileOps local_read_ops = {
>> +    .get_fd = qemu_local_get_fd,
>> +    .get_buffer = qemu_local_get_buffer,
>> +    .close = qemu_local_close,
>> +};
>> +
>> +const QEMUFileOps local_write_ops = {
>> +    .get_fd = qemu_local_get_fd,
>> +    .writev_buffer = qemu_local_put_buffer,
>> +    .close = qemu_local_close,
>> +    .save_page = qemu_local_ram_save,
>> +};
>> +
>> +static void *qemu_fopen_local(int fd, const char *mode)
>> +{
>> +    QEMUFileLocal *s;
>> +
>> +    if (qemu_file_mode_is_not_valid(mode)) {
>> +        return NULL;
>> +    }
>> +
>> +    s = g_malloc0(sizeof(QEMUFileLocal));
>> +    s->fd = fd;
>> +
>> +    if (mode[0] == 'w') {
>> +        qemu_set_block(s->fd);
>> +        s->file = qemu_fopen_ops(s, &local_write_ops);
>> +    } else {
>> +        s->file = qemu_fopen_ops(s, &local_read_ops);
>> +    }
>> +
>> +    return s->file;
>> +}
>
>

-- 
Lei
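
As a rough illustration of option 1) from the reply above, the private
code could keep its own description of RAM blocks instead of touching
RAMBlock directly. Everything below is an assumption made for the
sketch: LocalBlockDesc, LocalBlockTable and the local_table_*()
helpers are hypothetical names, not QEMU types or APIs, and how the
table would be populated is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical private view of one RAM block; not a QEMU type. */
typedef struct LocalBlockDesc {
    char idstr[256];
    uint64_t offset;   /* start of the block in the RAM address space */
    uint64_t length;   /* size of the block in bytes */
} LocalBlockDesc;

typedef struct LocalBlockTable {
    LocalBlockDesc blocks[16];
    size_t nr;
    const LocalBlockDesc *last_sent;   /* block used by the previous page */
} LocalBlockTable;

/* Register a block; returns 0 on success, -1 if it does not fit. */
static int local_table_add(LocalBlockTable *t, const char *idstr,
                           uint64_t offset, uint64_t length)
{
    assert(t != NULL && idstr != NULL);

    if (t->nr == sizeof(t->blocks) / sizeof(t->blocks[0]) ||
        strlen(idstr) >= sizeof(t->blocks[0].idstr)) {
        return -1;
    }
    LocalBlockDesc *d = &t->blocks[t->nr++];
    strcpy(d->idstr, idstr);
    d->offset = offset;
    d->length = length;
    return 0;
}

/* Find the block containing 'addr', or NULL if no block covers it. */
static const LocalBlockDesc *local_table_find(const LocalBlockTable *t,
                                              uint64_t addr)
{
    for (size_t i = 0; i < t->nr; i++) {
        const LocalBlockDesc *d = &t->blocks[i];

        if (addr >= d->offset && addr - d->offset < d->length) {
            return d;
        }
    }
    return NULL;
}

/*
 * Record that the page at 'addr' is being sent.
 * Returns 1 if it lies in the same block as the previous page (the case
 * where the quoted code sets RAM_SAVE_FLAG_CONTINUE), 0 if the block
 * changed, and -1 if 'addr' is not covered by any block.
 */
static int local_table_mark_sent(LocalBlockTable *t, uint64_t addr)
{
    const LocalBlockDesc *d = local_table_find(t, addr);

    if (!d) {
        return -1;
    }
    int cont = (d == t->last_sent);
    t->last_sent = d;
    return cont;
}
```

This covers the two questions raised in the reply (which block an
address belongs to, and which block was sent last) without exposing
RAMBlock itself; the cost is keeping the table in sync with the real
block list.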