From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43115)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X1wvm-00025x-Me
	for qemu-devel@nongnu.org; Tue, 01 Jul 2014 08:13:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1X1wvh-0002E8-T9
	for qemu-devel@nongnu.org; Tue, 01 Jul 2014 08:13:02 -0400
Received: from mx1.redhat.com ([209.132.183.28]:19169)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X1wvh-0002CD-LQ
	for qemu-devel@nongnu.org; Tue, 01 Jul 2014 08:12:57 -0400
Date: Tue, 1 Jul 2014 13:12:48 +0100
From: "Dr. David Alan Gilbert"
Message-ID: <20140701121248.GH2394@work-vm>
References: <53A8DD80.7070905@cn.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <53A8DD80.7070905@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [RFC] COLO HA Project proposal
To: Hongyang Yang
Cc: FNST-Gui Jianfeng , Dong Eddie , qemu-devel@nongnu.org, kvm@vger.kernel.org

* Hongyang Yang (yanghy@cn.fujitsu.com) wrote:

Hi Yang,

> Background:
> COLO HA project is a high availability solution. Both primary
> VM (PVM) and secondary VM (SVM) run in parallel. They receive the
> same request from client, and generate response in parallel too.
> If the response packets from PVM and SVM are identical, they are
> released immediately. Otherwise, a VM checkpoint (on demand) is
> conducted. The idea is presented in Xen summit 2012, and 2013,
> and academia paper in SOCC 2013. It's also presented in KVM forum
> 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to above document for detailed information.

Yes, I remember that talk - very interesting.
I didn't quite understand a couple of things though; perhaps you can explain:

  1) If we ignore the TCP sequence number problem, in an SMP machine
     don't we get other sources of randomness - e.g. which core completes
     something first, or who wins a lock contention - so the output
     stream might not be identical? Do those normal bits of randomness
     cause the machines to be flagged as out-of-sync?

  2) If the PVM has decided that the SVM is out of sync (due to 1) and
     the PVM fails at about the same point - can we switch over to the SVM?

I'm worried that due to (1) there are periods where the system is
out-of-sync and a failure of the PVM is not protected. Does that happen?
If so, how often?

> The attached was the architecture of kvm-COLO we proposed.
> - COLO Manager: Requires modifications of qemu
> - COLO Controller
> COLO Controller includes modifications of save/restore
> flow just like MC(macrocheckpoint), a memory cache on
> secondary VM which cache the dirty pages of primary VM
> and a failover module which provides APIs to communicate
> with external heartbead module.
> - COLO Disk Manager
> When pvm writes data into image, the colo disk manger
> captures this data and send it to the colo disk manger
> which makes sure the context of svm's image is consentient
> with the context of pvm's image.

I wonder if there is any way to coordinate this between COLO, Michael
Hines' microcheckpointing and the two separate reverse-execution projects
that also need to do some similar things.
Are there any standard APIs for the heartbeat thing we can already
tie into?

> - COLO Agent("Proxy module" in the arch picture)
> We need an agent to compare the packets returned by
> Primary VM and Secondary VM, and decide whether to start a
> checkpoint according to some rules. It is a linux kernel
> module for host.

Why is that a kernel module, and how does it communicate the state
to the QEMU instance?

> - Other minor modifications
> We may need other modifications for better performance.

Dave

P.S.
I'm starting to look at fault-tolerance stuff, but haven't got very far
yet, so I'm starting to try to understand the details of COLO,
microcheckpointing, etc.

> --
> Thanks,
> Yang.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK