Date: Thu, 17 Oct 2013 13:50:59 +0200
From: Stefan Hajnoczi
Message-ID: <20131017115059.GF10774@stefanha-thinkpad.redhat.com>
References: <1381821983-13932-1-git-send-email-junqing.wang@cs2c.com.cn>
In-Reply-To: <1381821983-13932-1-git-send-email-junqing.wang@cs2c.com.cn>
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance
To: Jules Wang
Cc: pbonzini@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org, owasserm@redhat.com

On Tue, Oct 15, 2013 at 03:26:19PM +0800, Jules Wang wrote:
> v2 -> v3:
> * add documentation of new option in qapi-schema.
>
> * long option name: ft -> fault-tolerant
>
> v1 -> v2:
> * cmdline: migrate curling:tcp::
>   -> migrate -f tcp::
>
> * sender: use QEMU_VM_FILE_MAGIC_FT as the header of the migration
>   to indicate this is a ft migration.
>
> * receiver: look for the signature:
>   QEMU_VM_EOF_MAGIC + QEMU_VM_FILE_MAGIC_FT(64bit total)
>   which indicates the end of one migration.
> --
> Jules Wang (4):
>   Curling: add doc
>   Curling: cmdline interface.
>   Curling: the sender
>   Curling: the receiver

It would be helpful to clarify the status of Curling in the cover letter
email so reviewers know what to expect.

This series does not address I/O or failover.  I guess you are aware of
the missing topics that I mentioned; here are my thoughts on them:

I/O needs to be held back until the destination host has acknowledged
receiving the last full migration state.  The outside world cannot
witness state changes in the guest until the migration state has been
successfully transferred to the destination host.  Otherwise the guest
may appear to act incorrectly when resuming execution from the last
snapshot.

The time period used by the FT sender thread determines how much latency
is added to I/O requests.

Failover functionality is missing from these patches.  We cannot simply
start executing on the destination host when the migration connection
ends.  If the guest disk image is located on shared storage then
split-brain occurs when a network error terminates the migration
connection - will both hosts begin accessing the shared disk?

What is your plan to address these issues?

Stefan
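
Illustrative note on the "hold back I/O" requirement above: the following is a minimal sketch in Python, not QEMU code and not part of the Curling series. The class and method names (OutputCommitBuffer, submit, checkpoint_sent, checkpoint_acked) are hypothetical and chosen only for this example. It assumes output is tagged with the checkpoint epoch in which it was produced and is released only once the destination has acknowledged that epoch.

```python
from collections import deque


class OutputCommitBuffer:
    """Hold guest-visible output until the checkpoint covering it is acknowledged.

    Hypothetical illustration only; none of these names exist in QEMU.
    """

    def __init__(self, release):
        self._release = release   # callback that actually emits the I/O
        self._epoch = 0           # checkpoint epoch currently being built
        self._pending = deque()   # (epoch, request) pairs still held back

    def submit(self, request):
        # Output produced in the current epoch is queued, not emitted.
        self._pending.append((self._epoch, request))

    def checkpoint_sent(self):
        # State for the current epoch has been transmitted; any later
        # output belongs to the next epoch.
        sent = self._epoch
        self._epoch += 1
        return sent

    def checkpoint_acked(self, epoch):
        # The destination holds state up to and including `epoch`, so
        # output from those epochs may now become externally visible.
        released = 0
        while self._pending and self._pending[0][0] <= epoch:
            _, request = self._pending.popleft()
            self._release(request)
            released += 1
        return released


if __name__ == "__main__":
    visible = []
    buf = OutputCommitBuffer(visible.append)
    buf.submit("net: packet A")
    epoch = buf.checkpoint_sent()
    buf.submit("net: packet B")      # produced after the checkpoint was sent
    buf.checkpoint_acked(epoch)      # releases A only; B waits for the next ack
    print(visible)
```

In this model a request waits for the remainder of the current checkpoint period plus the transfer and acknowledgement time, which is why the sender's period bounds the added I/O latency.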