From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=36829 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OBawn-0006JX-31 for qemu-devel@nongnu.org; Mon, 10 May 2010 17:55:34 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OBawk-0006Gr-4Q for qemu-devel@nongnu.org; Mon, 10 May 2010 17:55:32 -0400 Received: from mail-vw0-f45.google.com ([209.85.212.45]:64218) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OBawj-0006Ge-SQ for qemu-devel@nongnu.org; Mon, 10 May 2010 17:55:30 -0400 Received: by vws10 with SMTP id 10so556929vws.4 for ; Mon, 10 May 2010 14:55:29 -0700 (PDT) Message-ID: <4BE880CE.1090604@codemonkey.ws> Date: Mon, 10 May 2010 16:55:26 -0500 From: Anthony Liguori MIME-Version: 1.0 Subject: Re: [Qemu-devel] [PATCH] fix migration with large mem References: <20100413123318.0f2cd334@redhat.com> <4BE86B74.4020103@codemonkey.ws> <20100511004523.4bc5a369@redhat.com> In-Reply-To: <20100511004523.4bc5a369@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Izik Eidus Cc: qemu-devel@nongnu.org On 05/10/2010 04:45 PM, Izik Eidus wrote: > On Mon, 10 May 2010 15:24:20 -0500 > Anthony Liguori wrote: > > >> On 04/13/2010 04:33 AM, Izik Eidus wrote: >> >>> From f881b371e08760a67bf1f5b992a586c3de600f7a Mon Sep 17 00:00:00 >>> 2001 From: Izik Eidus >>> Date: Tue, 13 Apr 2010 12:24:57 +0300 >>> Subject: [PATCH] fix migration with large mem >>> >>> In cases of guests with large mem that have pages >>> that all their bytes content are the same, we will >>> spend alot of time reading the memory from the guest >>> (is_dup_page()) >>> >>> It is happening beacuse ram_save_live() function have >>> limit of how much we can send to the dest but not how >>> much we read from it, and in cases we have many is_dup_page() >>> hits, we might read huge amount of data without updating important >>> stuff like the timers... >>> >>> The guest lose all its repsonsibility and have many softlock ups >>> inside itself. >>> >>> this patch add limit on the size we can read from the guest each >>> iteration. >>> >>> Thanks. >>> >>> Signed-off-by: Izik Eidus >>> --- >>> arch_init.c | 6 +++++- >>> 1 files changed, 5 insertions(+), 1 deletions(-) >>> >>> diff --git a/arch_init.c b/arch_init.c >>> index cfc03ea..e27b1a0 100644 >>> --- a/arch_init.c >>> +++ b/arch_init.c >>> @@ -88,6 +88,8 @@ const uint32_t arch_type = QEMU_ARCH; >>> #define RAM_SAVE_FLAG_PAGE 0x08 >>> #define RAM_SAVE_FLAG_EOS 0x10 >>> >>> +#define MAX_SAVE_BLOCK_READ 10 * 1024 * 1024 >>> + >>> static int is_dup_page(uint8_t *page, uint8_t ch) >>> { >>> uint32_t val = ch<< 24 | ch<< 16 | ch<< 8 | ch; >>> @@ -175,6 +177,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, >>> int stage, void *opaque) uint64_t bytes_transferred_last; >>> double bwidth = 0; >>> uint64_t expected_time = 0; >>> + int data_read = 0; >>> >>> if (stage< 0) { >>> cpu_physical_memory_set_dirty_tracking(0); >>> @@ -205,10 +208,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, >>> int stage, void *opaque) bytes_transferred_last = bytes_transferred; >>> bwidth = qemu_get_clock_ns(rt_clock); >>> >>> - while (!qemu_file_rate_limit(f)) { >>> + while (!qemu_file_rate_limit(f)&& data_read< >>> MAX_SAVE_BLOCK_READ) { >>> >> The effect of this patch is that we'll never send more than 10mb/s >> during live migration? If so, it's totally wrong as a fix to the >> problem. >> > It is 100mb/s... (if I remember correct the migration code will run > this thing 10 times for each iteration) > No, it only runs it once. > My feeling is that limit it with the network 32mb/s limit is too low, > reading memory for 100mb/s is not such a problem as long as we don`t > read gigas out of memory every sec... > You've limited bandwidth to 10 mb/sec. Even if it was 100 mb/sec a fixed limit is wrong. On a 10gbit (or 40gbit) link, 100 mb/sec is not enough. > (Still we want to optimize the billion of zeros cases of windows guests) > > Anyway if the above does not make sense to you, I will just change it > into what you suggested > > So ? > That would work for me. Regards, Anthony Liguori