From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:48318) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SfI9H-0002mO-JN for qemu-devel@nongnu.org; Thu, 14 Jun 2012 18:04:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SfI9F-0005Mf-Hn for qemu-devel@nongnu.org; Thu, 14 Jun 2012 18:04:15 -0400 Received: from mx1.redhat.com ([209.132.183.28]:8289) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SfI9F-0005MS-8X for qemu-devel@nongnu.org; Thu, 14 Jun 2012 18:04:13 -0400 From: Juan Quintela In-Reply-To: <77d2fc4b1c99e1520a7fbe873e73119ce9463592.1338802192.git.yamahata@valinux.co.jp> (Isaku Yamahata's message of "Mon, 4 Jun 2012 18:57:37 +0900") References: <77d2fc4b1c99e1520a7fbe873e73119ce9463592.1338802192.git.yamahata@valinux.co.jp> Date: Thu, 14 Jun 2012 23:34:09 +0200 Message-ID: <87vcitshm6.fsf@elfo.mitica> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [Qemu-devel] [PATCH v2 35/41] postcopy: introduce helper functions for postcopy Reply-To: quintela@redhat.com List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Isaku Yamahata Cc: benoit.hudzia@gmail.com, aarcange@redhat.com, aliguori@us.ibm.com, kvm@vger.kernel.org, satoshi.itoh@aist.go.jp, stefanha@gmail.com, t.hirofuchi@aist.go.jp, dlaor@redhat.com, qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, yoshikawa.takuya@oss.ntt.co.jp, owasserm@redhat.com, avi@redhat.com, pbonzini@redhat.com Isaku Yamahata wrote: > +//#define DEBUG_UMEM > +#ifdef DEBUG_UMEM > +#include > +#define DPRINTF(format, ...) \ > + do { \ > + printf("%d:%ld %s:%d "format, getpid(), syscall(SYS_gettid), \ > + __func__, __LINE__, ## __VA_ARGS__); \ > + } while (0) This should be in a header file that is linux specific? And (at least on my systems) gettid is already defined on glibc. > +#else > +#define DPRINTF(format, ...) do { } while (0) > +#endif > + > +#define DEV_UMEM "/dev/umem" > + > +UMem *umem_new(void *hostp, size_t size) > +{ > + struct umem_init uinit = { > + .size = size, > + }; > + UMem *umem; > + > + assert((size % getpagesize()) == 0); > + umem = g_new(UMem, 1); > + umem->fd = open(DEV_UMEM, O_RDWR); > + if (umem->fd < 0) { > + perror("can't open "DEV_UMEM); > + abort(); Can we return one error insntead of abort? the same for the rest of the file aborts. > +size_t umem_pages_size(uint64_t nr) > +{ > + return sizeof(struct umem_pages) + nr * sizeof(uint64_t); Can we make sure that the pgoffs field is aligned? I know that as it is now it is aligned, but better to be sure? > +} > + > +static void umem_write_cmd(int fd, uint8_t cmd) > +{ > + DPRINTF("write cmd %c\n", cmd); > + > + for (;;) { > + ssize_t ret = write(fd, &cmd, 1); > + if (ret == -1) { > + if (errno == EINTR) { > + continue; > + } else if (errno == EPIPE) { > + perror("pipe"); > + DPRINTF("write cmd %c %zd %d: pipe is closed\n", > + cmd, ret, errno); > + break; > + } Grr, we don't have a function that writes does a "safe_write". The most similar thing in qemu looks to be send_all(). > + > + perror("pipe"); Can we make a different perror() message than previous error? > + DPRINTF("write cmd %c %zd %d\n", cmd, ret, errno); > + abort(); > + } > + > + break; > + } > +} > + > +static void umem_read_cmd(int fd, uint8_t expect) > +{ > + uint8_t cmd; > + for (;;) { > + ssize_t ret = read(fd, &cmd, 1); > + if (ret == -1) { > + if (errno == EINTR) { > + continue; > + } > + perror("pipe"); > + DPRINTF("read error cmd %c %zd %d\n", cmd, ret, errno); > + abort(); > + } > + > + if (ret == 0) { > + DPRINTF("read cmd %c %zd: pipe is closed\n", cmd, ret); > + abort(); > + } > + > + break; > + } > + > + DPRINTF("read cmd %c\n", cmd); > + if (cmd != expect) { > + DPRINTF("cmd %c expect %d\n", cmd, expect); > + abort(); Ouch. If we receive garbage, we just exit? I really think that we should implement error handling. > + } > +} > + > +struct umem_pages *umem_recv_pages(QEMUFile *f, int *offset) > +{ > + int ret; > + uint64_t nr; > + size_t size; > + struct umem_pages *pages; > + > + ret = qemu_peek_buffer(f, (uint8_t*)&nr, sizeof(nr), *offset); > + *offset += sizeof(nr); > + DPRINTF("ret %d nr %ld\n", ret, nr); > + if (ret != sizeof(nr) || nr == 0) { > + return NULL; > + } > + > + size = umem_pages_size(nr); > + pages = g_malloc(size); Just thinking about this. Couldn't we just decide on a "big enough" buffer, and never send anything bigger than that? That would remove the need to have to malloc()/free() a buffer for each reception? > +/* qemu side handler */ > +struct umem_pages *umem_qemu_trigger_page_fault(QEMUFile *from_umemd, > + int *offset) > +{ > + uint64_t i; > + int page_shift = ffs(getpagesize()) - 1; > + struct umem_pages *pages = umem_recv_pages(from_umemd, offset); > + if (pages == NULL) { > + return NULL; > + } > + > + for (i = 0; i < pages->nr; i++) { > + ram_addr_t addr = pages->pgoffs[i] << page_shift; > + > + /* make pages present by forcibly triggering page fault. */ > + volatile uint8_t *ram = qemu_get_ram_ptr(addr); > + uint8_t dummy_read = ram[0]; > + (void)dummy_read; /* suppress unused variable warning */ > + } > + > + /* > + * Very Linux implementation specific. > + * Make it sure that other thread doesn't fault on the above virtual > + * address. (More exactly other thread doesn't call fault handler with > + * the offset.) > + * the fault handler is called with mmap_sem read locked. > + * madvise() does down/up_write(mmap_sem) > + */ > + qemu_madvise(NULL, 0, MADV_NORMAL); If it is linux specific, should be inside CONFIG_LINUX ifdef, or a function hided on some header. Talking about looking, what protects that no other thread enters this function before this one calls madvise? Or I am losing something obvious? > + > +struct umem_pages { > + uint64_t nr; > + uint64_t pgoffs[0]; > +}; > + QEMU really likes typedefs for structs. Later, Juan.