From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wen Congyang
Subject: [PATCH v4 1/5] remus: don't call stream_continue() when doing failover
Date: Mon, 18 Jan 2016 13:40:18 +0800
Message-ID: <1453095622-14859-2-git-send-email-wency@cn.fujitsu.com>
References: <1453095622-14859-1-git-send-email-wency@cn.fujitsu.com>
In-Reply-To: <1453095622-14859-1-git-send-email-wency@cn.fujitsu.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: xen devel, Andrew Cooper
Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
 Shriram Rajagopalan, Yang Hongyang
List-Id: xen-devel@lists.xenproject.org

stream_continue() is used during migration to read the emulator
xenstore data and the emulator context. For Remus failover, both have
already been read in the checkpoint cycle, so we only need to complete
the stream.

Signed-off-by: Wen Congyang
Reviewed-by: Andrew Cooper
---
 tools/libxl/libxl_stream_read.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 258dec4..24305f4 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -101,6 +101,19 @@
  *    - stream_write_emulator_done()
  *    - stream_continue()
  *
+ * 4) Failover for remus
+ *    - we buffer all records until a CHECKPOINT_END record is received
+ *    - we will use the records when a CHECKPOINT_END record is received
+ *    - if we find some internal error, the rc or retval is not 0 in
+ *      libxl__xc_domain_restore_done(). In this case, we don't resume the
+ *      guest
+ *    - if we need to do failover from primary, the rc and retval are 0
+ *      in libxl__xc_domain_restore_done(). In this case, the buffered state
+ *      will be dropped, because we don't receive a CHECKPOINT_END record,
+ *      and it is an inconsistent state. In libxl__xc_domain_restore_done(),
+ *      we just complete the stream and stream->completion_callback() will
+ *      be called to resume the guest
+ *
  * Depending on the contents of the stream, there are likely to be several
  * parallel tasks being managed. check_all_finished() is used to join all
  * tasks in both success and error cases.
@@ -758,6 +771,9 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
     libxl__stream_read_state *stream = &dcs->srs;
     STATE_AO_GC(dcs->ao);
 
+    /* convenience aliases */
+    const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
+
     if (rc)
         goto err;
 
@@ -777,11 +793,20 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
      * If the stream is not still alive, we must not continue any work.
      */
     if (libxl__stream_read_inuse(stream)) {
-        /*
-         * Libxc has indicated that it is done with the stream. Resume reading
-         * libxl records from it.
-         */
-        stream_continue(egc, stream);
+        if (checkpointed_stream) {
+            /*
+             * Failover from primary. Domain state is currently at a
+             * consistent checkpoint, complete the stream, and call
+             * stream->completion_callback() to resume the guest.
+             */
+            stream_complete(egc, stream, 0);
+        } else {
+            /*
+             * Libxc has indicated that it is done with the stream.
+             * Resume reading libxl records from it.
+             */
+            stream_continue(egc, stream);
+        }
     }
 }
 
-- 
2.5.0
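
For reviewers who want the new control flow in isolation, here is a minimal standalone sketch of the decision libxl__xc_domain_restore_done() makes after this patch. It is a simplified model, not libxl code: `model_stream`, `model_action` and `model_restore_done()` are hypothetical names invented for illustration, and the error path is reduced to a single return value.

```c
#include <assert.h>

/* Hypothetical model of the outcomes; none of these names exist in libxl. */
enum model_action {
    MODEL_ERROR,     /* rc or retval non-zero: the guest is not resumed */
    MODEL_NOTHING,   /* stream no longer in use: do no further work */
    MODEL_COMPLETE,  /* Remus failover: corresponds to stream_complete(egc, stream, 0) */
    MODEL_CONTINUE   /* plain migration: corresponds to stream_continue(egc, stream) */
};

struct model_stream {
    int in_use;               /* stands in for libxl__stream_read_inuse(stream) */
    int checkpointed_stream;  /* stands in for dcs->restore_params.checkpointed_stream */
};

/* Simplified decision: assumes both rc and retval lead to the error path. */
static enum model_action model_restore_done(const struct model_stream *s,
                                            int rc, int retval)
{
    if (rc || retval)
        return MODEL_ERROR;
    if (!s->in_use)
        return MODEL_NOTHING;
    /* Checkpointed stream: state is at a consistent checkpoint, so complete. */
    return s->checkpointed_stream ? MODEL_COMPLETE : MODEL_CONTINUE;
}
```

With rc and retval both 0 and the stream still in use, a checkpointed stream yields MODEL_COMPLETE and a plain migration stream yields MODEL_CONTINUE, which mirrors the if/else added in the last hunk.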