From: Li Zhijian
Message-ID: <56A6C8DE.8020606@cn.fujitsu.com>
Date: Tue, 26 Jan 2016 09:16:14 +0800
In-Reply-To: <20160125202001.GL2464@work-vm>
Subject: Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
To: "Dr. David Alan Gilbert"
Cc: Changlong Xie, zhanghailiang, qemu block, qemu devel, Luis Tomas, Simon Kollberg, Abel Souza

On 01/26/2016 04:20 AM, Dr. David Alan Gilbert wrote:
> * Li Zhijian (lizhijian@cn.fujitsu.com) wrote:
>>
>> On 01/25/2016 09:32 AM, Wen Congyang wrote:
>>>>> f) I've not thought about the colo-proxy that much yet - I guess that
>>>>> existing connections need to keep their sequence number offset but
>>
>> Strictly speaking, after failover we only need to keep servicing the tcp connections that were
>> established after the last checkpoint, not all existing connections. After a checkpoint
>> (with the primary and secondary nodes both working well), the primary vm and secondary vm are the same,
>> which means the existing tcp connections have the same sequence numbers.
>>
>>>>> new connections made by what is now the primary don't need to do anything
>>>>> special.
>> Yes, you are right.
>
> I wonder whether we need to do something special to the new-secondary;
> consider this:
>
>    1 primary (P1) & secondary (S1) run together
>    2 New connection opened
>    3 secondary records an offset
>    4
>    5 primary (P1) fails; do failover to secondary
>    6 secondary (S1) still rewrites sequence for connection opened at (2)
>    7 Start new-secondary (S2), send checkpoint from S1->S2
>    8 S2 has same guest contents as S1; so the
>      sequence numbers are still offset compared to the outside world.
>
> So S2 needs to be sent the offsets for existing connections, otherwise
> if S1 was then to fail, S2 would send the wrong output on the existing
> connection?

Thanks for the example.
Sure, if we support continuous FT, the colo proxy needs to implement migration_save and migration_load.
At the beginning of (7), we need to save the colo_proxy info (including connection info and sequence offsets) at S1
and load colo_proxy at S2.
S1/S2 need to keep doing the tcp sequence rewriting for the connections opened at (2) until they are closed.

Thanks
Li Zhijian

>
> Dave
>
>>
>>> Hailiang or Zhijian can answer this question.
>>
>> Thanks
>> Li Zhijian
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>

--
Best regards.
Li Zhijian (8555)
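
The save/load step discussed for (7) could be sketched roughly as below. This is only an illustration, not colo-proxy code: the names ProxyState, record, rewrite_seq, save_state and load_state are hypothetical, JSON is a stand-in for whatever the migration stream would actually carry, and the offset is assumed to be (guest sequence number - sequence number the outside world expects) modulo 2^32.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Tuple

SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around

# Hypothetical connection key: (src_ip, src_port, dst_ip, dst_port)
ConnKey = Tuple[str, int, str, int]


@dataclass
class ProxyState:
    """Hypothetical table of per-connection sequence offsets."""
    offsets: Dict[ConnKey, int] = field(default_factory=dict)

    def record(self, key: ConnKey, guest_seq: int, wire_seq: int) -> None:
        # Step (3): remember how far the guest's numbering is from
        # what the outside world expects for this connection.
        self.offsets[key] = (guest_seq - wire_seq) % SEQ_MOD

    def rewrite_seq(self, key: ConnKey, guest_seq: int) -> int:
        # Step (6): translate an outgoing guest sequence number.
        # Connections without a recorded offset pass through unchanged.
        return (guest_seq - self.offsets.get(key, 0)) % SEQ_MOD

    def close(self, key: ConnKey) -> None:
        # Rewriting is only needed until the connection is closed.
        self.offsets.pop(key, None)


def save_state(state: ProxyState) -> str:
    # Step (7) on S1: serialise connection info and offsets.
    return json.dumps([[list(k), v] for k, v in state.offsets.items()])


def load_state(blob: str) -> ProxyState:
    # Step (7) on S2: rebuild the same table from the saved data.
    return ProxyState({tuple(k): v for k, v in json.loads(blob)})


# Example: S1 hands its table to S2, and both rewrite identically.
s1 = ProxyState()
conn = ("10.0.0.2", 40000, "10.0.0.9", 80)
s1.record(conn, guest_seq=5000, wire_seq=1000)
s2 = load_state(save_state(s1))
print(s1.rewrite_seq(conn, 5100), s2.rewrite_seq(conn, 5100))
```

With the table transferred, S2 would produce the same rewritten sequence numbers as S1 if S1 later failed. A real implementation would also need to handle the acknowledgement numbers on incoming packets, which this sketch leaves out.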