From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 18 Mar 2016 10:48:08 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20160318104807.GC2246@work-vm>
References: <56EA06E0.7000409@cn.fujitsu.com>
 <56EA7C62.3090000@cn.fujitsu.com>
 <20160317094831.GA2504@work-vm>
 <56EA7F39.9060504@cn.fujitsu.com>
 <56EA858B.9070408@cn.fujitsu.com>
 <20160317112508.GF5966@work-vm>
 <56EB6E5D.9040207@cn.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <56EB6E5D.9040207@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH v12 2/3] quorum: implement bdrv_add_child() and bdrv_del_child()
To: Wen Congyang
Cc: Kevin Wolf, Changlong Xie, Alberto Garcia, zhanghailiang, qemu block,
 Markus Armbruster, Jiang Yunhong, Dong Eddie, qemu devel, Max Reitz,
 Gonglei, Stefan Hajnoczi

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 03/17/2016 07:25 PM, Dr. David Alan Gilbert wrote:
> > * Wen Congyang (wency@cn.fujitsu.com) wrote:
> >> On 03/17/2016 06:07 PM, Alberto Garcia wrote:
> >>> On Thu 17 Mar 2016 10:56:09 AM CET, Wen Congyang wrote:
> >>>>> We should have the failure modes documented, and how you'll use it
> >>>>> after failover etc. Without that it's really difficult to tell if
> >>>>> this naming is right.
> >>>>
> >>>> For COLO, children.0 is the real disk and children.1 is the
> >>>> replication driver. After a failure, children.1 will be removed by
> >>>> the user. If we want to continue doing COLO, we need to add a new
> >>>> children.1 again.
> >>>
> >>> What if children.0 fails?
> >>
> >> For COLO, reading from children.1 always fails. If children.0 fails,
> >> it means that reading from the disk fails, and the guest VM will see
> >> the I/O error.
> >
> > How do we get that to cause a failover before the guest detects it?
> > If the primary's local disk (children.0) fails and we can fail over
> > at that point, then the guest carries on running on the secondary
> > without ever knowing about the failure.
>
> COLO is not designed for that case. children.0 can also be a quorum, so
> you can add more than one real disk and get more reliability. Another
> choice is to put the real disk on external storage that has its own
> replication solution.
>
> COLO is designed for this case: the host crashes, the guest is still
> alive after failover, and the client never notices the event.

That seems an odd limitation; the only thing needed for COLO to survive
a disk failure on the primary would be to ensure that the primary
fails/triggers failover if access to the local disk fails, and to do it
before the I/O result is returned to the guest.

Dave

> Thanks
> Wen Congyang

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
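
For concreteness, the detach/re-attach cycle described in the thread
(drop the failed children.1 after failover, later add a fresh one to
resume COLO) would look roughly like this at the QMP level. This is a
sketch, not the series' definitive interface: the command name follows
the x-blockdev-change interface associated with this work, and the node
names ("colo-disk0", "replication1") are made up for illustration.

```
# After failover: detach the failed replication driver (children.1)
# from the quorum node.
{ "execute": "x-blockdev-change",
  "arguments": { "parent": "colo-disk0", "child": "children.1" } }

# To resume COLO later: attach a freshly created replication node,
# which becomes the new children.1.
{ "execute": "x-blockdev-change",
  "arguments": { "parent": "colo-disk0", "node": "replication1" } }
```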