From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46168)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1XI6Go-0003CP-KQ for qemu-devel@nongnu.org;
 Thu, 14 Aug 2014 21:25:39 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1XI6Gf-00045G-GO for qemu-devel@nongnu.org;
 Thu, 14 Aug 2014 21:25:30 -0400
Received: from e35.co.us.ibm.com ([32.97.110.153]:57555)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1XI6Gf-000457-8g for qemu-devel@nongnu.org;
 Thu, 14 Aug 2014 21:25:21 -0400
Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway:
 Authorized Use Only! Violators will be prosecuted for from ;
 Thu, 14 Aug 2014 19:25:20 -0600
Received: from b03cxnp07029.gho.boulder.ibm.com
 (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16])
 by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id 435C61FF003D for ;
 Thu, 14 Aug 2014 19:25:16 -0600 (MDT)
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170])
 by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0)
 with ESMTP id s7ENLP5r11010404 for ; Fri, 15 Aug 2014 01:21:25 +0200
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1])
 by d03av04.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout)
 with ESMTP id s7F1PGss017880 for ; Thu, 14 Aug 2014 19:25:16 -0600
Message-ID: <53ECF082.6090307@linux.vnet.ibm.com>
Date: Fri, 15 Aug 2014 01:23:14 +0800
From: "Michael R. Hines"
MIME-Version: 1.0
References: <53DBE726.4050102@gmail.com> <1406947532.2680.11.camel@usa>
 <53E0AA60.9030404@gmail.com> <1407376929.21497.2.camel@usa>
 <53E60F34.1070607@gmail.com> <1407587152.24027.5.camel@usa>
 <53E8FBBD.7050703@gmail.com> <53E9247F.4030909@linux.vnet.ibm.com>
 <53EB7026.805@gmail.com> <53EBE672.7050903@linux.vnet.ibm.com>
 <20140814105802.GD2503@work-vm>
In-Reply-To: <20140814105802.GD2503@work-vm>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Dr. David Alan Gilbert"
Cc: Walid Nouri , hinesmr@cn.ibm.com, qemu-devel@nongnu.org, michael@hinespot.com

On 08/14/2014 06:58 PM, Dr. David Alan Gilbert wrote:
> cc'ing in a couple of the COLOers.

Thanks, David. Glad to see their patches in last month - I need to take
a look at them.

> The 2013 paper says: 'COLO modifies the guest OS’s TCP/IP stack in
> order to make the behavior more deterministic. ' but does say that an
> alternative might be to have a ' comparison function that operates
> transparently over re-assembled TCP streams'

Ouch - I didn't realize that. It may or may not be a problem - but if it
gets us further towards fault-tolerance, I'm open-minded. =)

The Xen paper did the same thing for databases - they also modified the
guest TCP stack.

>> My hope in the future was that the two approaches could be used in a
>> "Hybrid" manner - actually MC has much more of a performance hit for I/O
>> than COLO does because of its buffering requirements.
>>
>> On the other hand, MC would perform better in a memory-intensive or
>> CPU-intensive situation - so maybe QEMU could "switch" between the two
>> mechanisms at different points in time when the resource bottleneck changes.
> If the primary were to rate-limit the number of resynchronisations
> (and send the secondary a message as soon as it knew a resync was needed) that
> would get some of the way, but then the only difference from microcheckpointing
> at that point is the secondary doing a wasteful copy and sending the packets across;
> it seems it should be easy to disable those if it knew that a resync was going to
> happen.
>
> Dave
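
For illustration only, the rate-limiting idea in the quoted text could be sketched roughly as below. This is Python rather than QEMU C, and it is not taken from the COLO or MC patches: the names `ResyncLimiter`, `Secondary`, `on_mismatch`, `notify_resync` and `on_output_packet` are hypothetical, the sliding-window policy is an assumption, and what the primary should do when the budget is exhausted is left open because the thread does not say.

```python
from collections import deque


class ResyncLimiter:
    """Assumed policy: at most `max_resyncs` resyncs per `window` seconds."""

    def __init__(self, max_resyncs, window):
        self.max_resyncs = max_resyncs
        self.window = window
        self._stamps = deque()  # start times of recent resyncs

    def try_resync(self, now):
        # Forget resyncs that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_resyncs:
            self._stamps.append(now)
            return True
        return False


class Secondary:
    """Hypothetical secondary that stops forwarding once a resync is pending."""

    def __init__(self):
        self.resync_pending = False
        self.forwarded = []

    def notify_resync(self):
        self.resync_pending = True

    def resync_done(self):
        self.resync_pending = False

    def on_output_packet(self, pkt):
        # Skip the copy-and-send if the output will be discarded anyway.
        if not self.resync_pending:
            self.forwarded.append(pkt)


def on_mismatch(limiter, secondary, now):
    """Primary-side hook: tell the secondary as soon as a resync is granted."""
    if limiter.try_resync(now):
        secondary.notify_resync()
        return True
    return False  # over budget; the fallback policy is not specified here
```

With a budget of two resyncs per second, a third mismatch inside the same second would be refused, and the secondary would drop its outgoing copies from the moment it is notified until `resync_done()` is called.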