From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=43913 helo=eggs.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.43) id 1PNdus-0006c4-H0
 for qemu-devel@nongnu.org; Tue, 30 Nov 2010 23:03:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1PNVLm-0002eh-AF
 for qemu-devel@nongnu.org; Tue, 30 Nov 2010 13:54:56 -0500
Received: from mail-qw0-f45.google.com ([209.85.216.45]:46964)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from ) id 1PNVLm-0002Pg-7w
 for qemu-devel@nongnu.org; Tue, 30 Nov 2010 13:54:50 -0500
Received: by qwd7 with SMTP id 7so655431qwd.4
 for ; Tue, 30 Nov 2010 10:54:28 -0800 (PST)
Message-ID: <4CF5485F.605@codemonkey.ws>
Date: Tue, 30 Nov 2010 12:54:23 -0600
From: Anthony Liguori
MIME-Version: 1.0
References: <20101124104035.GB23493@redhat.com>
 <4CF46012.2060804@codemonkey.ws>
 <4CF50410.3080305@codemonkey.ws>
 <20101130161032.GF20536@redhat.com>
 <4CF52A09.5080201@codemonkey.ws>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 02/10] Add buffered_file_internal constant
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Juan Quintela
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin"

On 11/30/2010 12:04 PM, Juan Quintela wrote:
> Anthony Liguori wrote:
>
>> On 11/30/2010 10:32 AM, Juan Quintela wrote:
>>
>>> "Michael S. Tsirkin" wrote:
>>>
>>>
>>>> On Tue, Nov 30, 2010 at 04:40:41PM +0100, Juan Quintela wrote:
>>>>
>>>>
>>>>> Basically our bitmap handling code is "exponential" on memory size,
>>>>>
>>>>>
>>>> I didn't realize this.  What makes it exponential?
>>>>
>>>>
>>> Well, 1st of all, it is "exponential" as you measure it.
>>>
>>> stalls by default are:
>>>
>>> 1-2GB: milliseconds
>>> 2-4GB: 100-200ms
>>> 4-8GB: 1s
>>> 64GB: 59s
>>> 400GB: 24m (yes, minutes)
>>>
>>> That sounds really exponential.
>>>
>>>
>> How are you measuring stalls btw?
>>
> At the end of the ram_save_live().  This was the reason that I put the
> information there.
>
> for the 24mins stall (I don't have that machine anymore) I had less
> "exact" measurements.  It was the amount that it "decided" to send in
> the last non-live part of memory migration.  With the stalls & zero page
> account, we just got to the point where we had basically infinity speed.
>

That's not quite guest visible.  It is only a "stall" if the guest is
trying to access device emulation and acquiring the qemu_mutex.

A more accurate measurement would be something that measured guest
availability.  For instance, a tight loop of:

    while (1) {
        usleep(100);
        gettimeofday();
    }

that then recorded periods of unavailability > X.

Of course, it's critically important that a working version of pvclock
be available in the guest for this to be accurate.

Regards,

Anthony Liguori

> Later, Juan.
>
>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>>> Now the other thing is the cache size.
>>>
>>> with 64GB of RAM, we basically have a 16MB bitmap size, i.e. we blow the
>>> cache each time that we run ram_save_live().
>>>
>>> This is one of the reasons why I don't want to walk the bitmap on
>>> ram_save_remaining().
>>>
>>> ram_save_remaining() is linear with memory size (before my changes).
>>> ram_save_live is also linear on the memory size, so we are in a worse
>>> case of n^n (notice that this is the worst case, we normally do much
>>> better n^2, n^3 or so).
>>>
>>> Add to this that we are blowing the caches with big amounts of memory
>>> (we don't do with smaller sizes), and you can see that our behaviour is
>>> clearly exponential.
>>>
>>> As I need to fix them, I will work on them today/tomorrow.
>>>
>>> Later, Juan.
>>>
>>>
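
For reference, the guest-availability measurement suggested in the reply (sleep briefly, sample the clock, record gaps longer than X) could be sketched roughly as below. This is only an illustration to run inside the guest, not code from the thread: the names elapsed_us and count_stalls, the threshold parameter, and the iteration bound (the original loop is while (1)) are all assumptions made for the sketch. As noted in the reply, the numbers are only meaningful if the guest clock (e.g. pvclock) is working.

```c
/* Expose usleep() under strict -std= modes on glibc. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif

#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

/* Microseconds elapsed between two gettimeofday() samples. */
int64_t elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000
         + (int64_t)(end->tv_usec - start->tv_usec);
}

/*
 * Hypothetical helper: sleep for sleep_us between clock samples and
 * count every gap longer than threshold_us (the "X" in the email).
 * The loop is bounded by 'iterations' so that the sketch terminates;
 * a real probe would run for the whole migration.
 */
long count_stalls(unsigned int sleep_us, int64_t threshold_us, long iterations)
{
    struct timeval prev, now;
    long stalls = 0;

    gettimeofday(&prev, NULL);
    for (long i = 0; i < iterations; i++) {
        usleep(sleep_us);
        gettimeofday(&now, NULL);

        int64_t gap = elapsed_us(&prev, &now);
        if (gap > threshold_us) {
            /* The guest did not get to run for 'gap' microseconds. */
            stalls++;
            printf("unavailable for %lld us\n", (long long)gap);
        }
        prev = now;
    }
    return stalls;
}
```

Calling count_stalls(100, 10000, n) from a small main() during a migration would report every period in which the guest failed to run for more than 10 ms; the 10 ms threshold is an arbitrary choice for illustration.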