From: Shriram Rajagopalan
Subject: Re: [PATCH v2] fix Remus failover regression
Date: Mon, 28 Jul 2014 00:05:55 -0400
References: <1406520207-10769-1-git-send-email-yanghy@cn.fujitsu.com>
In-Reply-To: <1406520207-10769-1-git-send-email-yanghy@cn.fujitsu.com>
Reply-To: rshriram@cs.ubc.ca
To: FNST-Yang Hongyang
Cc: Andrew Cooper, Ian Jackson, Ian Campbell, xen-devel@lists.xen.org


On Jul 28, 2014 12:03 AM, "Yang Hongyang" <yanghy@cn.fujitsu.com> wrote:
>
> Commit c2ba706c ("tools/libxc: goto correct label on error paths",
> by Andrew Cooper) broke Remus in Xen 4.4, or in earlier versions
> that have this commit backported.
>
> With Remus, jumping to "finish" on these error paths discards the
> current incomplete checkpoint received by the backup and restores
> the backup from the last complete checkpoint.
> This is required for Remus to work, and it does not break live
> migration.
> This behaviour has been present since Xen 4.0.
>
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Ian Campbell <ian.campbell@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  tools/libxc/xc_domain_restore.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
> index e73e0a2..b9a56d5 100644
> --- a/tools/libxc/xc_domain_restore.c
> +++ b/tools/libxc/xc_domain_restore.c
> @@ -1783,20 +1783,29 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>
>      if ( pagebuf_get(xch, ctx, &pagebuf, io_fd, dom) ) {
>          PERROR("error when buffering batch, finishing");
> -        goto out;
> +        /*
> +         * Remus: discard the current incomplete checkpoint and restore
> +         * backup from the last complete checkpoint.
> +         */
> +        goto finish;
>      }
>      memset(&tmptail, 0, sizeof(tmptail));
>      tmptail.ishvm = hvm;
>      if ( buffer_tail(xch, ctx, &tmptail, io_fd, max_vcpu_id, vcpumap,
>                       ext_vcpucontext, vcpuextstate_size) < 0 ) {
>          ERROR ("error buffering image tail, finishing");
> -        goto out;
> +        /*
> +         * Remus: discard the current incomplete checkpoint and restore
> +         * backup from the last complete checkpoint.
> +         */
> +        goto finish;
>      }
>      tailbuf_free(&tailbuf);
>      memcpy(&tailbuf, &tmptail, sizeof(tailbuf));
>
>      goto loadpages;
>
> +  /* With Remus: restore from last complete checkpoint */
>    finish:
>      if ( hvm )
>          goto finish_hvm;
> --
> 1.9.1
>

Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
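
For anyone following the thread, the control flow this patch restores can be
sketched roughly as below. This is a simplified, standalone illustration and
not the libxc code: struct checkpoint, buffer_checkpoint() and
restore_with_failover() are hypothetical names made up for the example, and
the "stream" array merely stands in for data arriving on io_fd. The point it
tries to show is that a failure while buffering the next checkpoint is not
treated as fatal; the partially received data is dropped and the backup
resumes from the last checkpoint that arrived in full.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one Remus checkpoint; not a libxc type. */
struct checkpoint {
    int id;        /* sequence number of the checkpoint */
    int complete;  /* 1 if the whole checkpoint arrived, 0 if cut short */
};

/*
 * Simulated receive step (assumption for this sketch): copies stream[pos]
 * into *tmp and returns 0, or returns -1 if that checkpoint was truncated,
 * for example because the primary failed mid-transfer.
 */
static int buffer_checkpoint(const struct checkpoint *stream, size_t pos,
                             struct checkpoint *tmp)
{
    if (!stream[pos].complete)
        return -1;
    *tmp = stream[pos];
    return 0;
}

/*
 * Returns the id of the checkpoint the backup resumes from, or -1 if no
 * complete checkpoint was ever received.
 */
int restore_with_failover(const struct checkpoint *stream, size_t n)
{
    struct checkpoint last = { -1, 0 };  /* last complete checkpoint */
    struct checkpoint tmp;               /* checkpoint being buffered */
    size_t pos;

    assert(stream != NULL || n == 0);

    for (pos = 0; pos < n; pos++) {
        memset(&tmp, 0, sizeof(tmp));
        if (buffer_checkpoint(stream, pos, &tmp) < 0)
            goto finish;  /* discard tmp, keep the last complete one */
        last = tmp;       /* checkpoint arrived in full: commit it */
    }

finish:
    return last.complete ? last.id : -1;
}
```

With a stream of two complete checkpoints followed by a truncated third,
restore_with_failover() resumes from the second. Jumping to an error exit at
that point instead, as "goto out" did, would throw away a perfectly usable
backup, which is the regression being fixed here.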
