From: Juan Quintela
To: Anthony Liguori
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin"
Date: Tue, 30 Nov 2010 20:15:54 +0100
Subject: [Qemu-devel] Re: [PATCH 02/10] Add buffered_file_internal constant
In-Reply-To: <4CF5485F.605@codemonkey.ws> (Anthony Liguori's message of "Tue, 30 Nov 2010 12:54:23 -0600")
References: <20101124104035.GB23493@redhat.com> <4CF46012.2060804@codemonkey.ws> <4CF50410.3080305@codemonkey.ws> <20101130161032.GF20536@redhat.com> <4CF52A09.5080201@codemonkey.ws> <4CF5485F.605@codemonkey.ws>

Anthony Liguori wrote:
> On 11/30/2010 12:04 PM, Juan Quintela wrote:
>> Anthony Liguori wrote:
>>
>>> On 11/30/2010 10:32 AM, Juan Quintela wrote:
>>>
>>>> "Michael S. Tsirkin" wrote:
>>>>
>>>>> On Tue, Nov 30, 2010 at 04:40:41PM +0100, Juan Quintela wrote:
>>>>>
>>>>>> Basically our bitmap handling code is "exponential" on memory size,
>>>>>
>>>>> I didn't realize this. What makes it exponential?
>>>>
>>>> Well, 1st of all, it is "exponential" as you measure it.
>>>>
>>>> stalls by default are:
>>>>
>>>> 1-2GB: milliseconds
>>>> 2-4GB: 100-200ms
>>>> 4-8GB: 1s
>>>> 64GB: 59s
>>>> 400GB: 24m (yes, minutes)
>>>>
>>>> That sounds really exponential.
>>>>
>>> How are you measuring stalls btw?
>>>
>> At the end of the ram_save_live().  This was the reason that I put the
>> information there.
>>
>> For the 24mins stall (I don't have that machine anymore) I had less
>> "exact" measurements.  It was the amount that it "decided" to send in
>> the last non-live part of memory migration.  With the stalls & zero page
>> account, we just got to the point where we had basically infinite speed.
>>
>
> That's not quite guest visible.

Humm, guest doesn't answer in 24mins
monitor doesn't answer in 24mins
ping doesn't answer in 24mins

Are you sure that this is not visible?  The bug report said that the guest
had just died; it was me who waited to see that it took 24mins to end.

> It only is a "stall" if the guest is trying to access device emulation
> and acquiring the qemu_mutex.  A more accurate measurement would be
> something that measured guest availability.  For instance, a tight
> loop of while (1) { usleep(100); gettimeofday(); } that then recorded
> periods of unavailability > X.

This is better, and this is what the qemu_mutex change should fix.

> Of course, it's critically important that a working version of pvclock
> be available in the guest for this to be accurate.

If the problem is 24mins, we don't need such an "exact" version O:-)

Later, Juan.
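
The availability probe suggested in the quoted text (a tight loop of sleep plus timestamp that records gaps longer than X) could be sketched in C roughly as follows.  This is only an illustration, not code from QEMU or from this thread: the names now_us, record_gap, probe_stalls and struct stall_stats are made up here, nanosleep() stands in for usleep(), and clock_gettime(CLOCK_MONOTONIC) stands in for gettimeofday() so that wall-clock adjustments are not counted as stalls.  It still depends on the guest having a reliable clock source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical summary of one probe run. */
struct stall_stats {
    long iterations;      /* number of wakeups observed */
    long stalls;          /* wakeups whose gap exceeded the threshold */
    int64_t worst_gap_us; /* longest gap between two wakeups */
};

/* Monotonic time in microseconds. */
static int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Fold one observed gap between wakeups into the statistics. */
static void record_gap(struct stall_stats *st, int64_t gap_us,
                       int64_t threshold_us)
{
    st->iterations++;
    if (gap_us > threshold_us) {
        st->stalls++;
    }
    if (gap_us > st->worst_gap_us) {
        st->worst_gap_us = gap_us;
    }
}

/*
 * Sleep sleep_us per iteration for roughly duration_us in total and
 * report every gap between wakeups that is longer than threshold_us
 * (the "X" in the quoted suggestion).
 */
static struct stall_stats probe_stalls(int64_t sleep_us, int64_t threshold_us,
                                       int64_t duration_us)
{
    struct stall_stats st = {0, 0, 0};
    struct timespec req;
    int64_t start, prev;

    assert(sleep_us > 0 && threshold_us > sleep_us);
    req.tv_sec = sleep_us / 1000000;
    req.tv_nsec = (sleep_us % 1000000) * 1000;

    start = now_us();
    prev = start;
    while (prev - start < duration_us) {
        int64_t cur, gap;

        nanosleep(&req, NULL);
        cur = now_us();
        gap = cur - prev;
        if (gap > threshold_us) {
            printf("unavailable for %lld us\n", (long long)gap);
        }
        record_gap(&st, gap, threshold_us);
        prev = cur;
    }
    return st;
}
```

Run inside the guest during a migration, a call such as probe_stalls(100, 100000, 60000000) would sleep 100us per iteration for about a minute and print every gap longer than 100ms.  Those numbers are arbitrary; for stalls in the range of seconds or minutes a much coarser sleep interval would do just as well.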