From mboxrd@z Thu Jan 1 00:00:00 1970 From: Shriram Rajagopalan Subject: Re: [PATCH v5 00/21] libxl: domain save/restore: run in a separate process Date: Wed, 27 Jun 2012 12:09:02 -0400 Message-ID: References: <1340733318-21099-1-git-send-email-ian.jackson@eu.citrix.com> <20457.63648.864838.199205@mariner.uk.xensource.com> <20459.3767.440030.931642@mariner.uk.xensource.com> <20459.11730.332905.175948@mariner.uk.xensource.com> Reply-To: rshriram@cs.ubc.ca Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1492871240793936303==" Return-path: In-Reply-To: <20459.11730.332905.175948@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Jackson Cc: "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org --===============1492871240793936303== Content-Type: multipart/alternative; boundary=000e0cdfd88069be1304c3767121 --000e0cdfd88069be1304c3767121 Content-Type: text/plain; charset=ISO-8859-1 On Wed, Jun 27, 2012 at 11:59 AM, Ian Jackson wrote: > Ian Jackson writes ("Re: [PATCH v5 00/21] libxl: domain save/restore: run > in a separate process"): > > However, when I apply my series, I can indeed produce an assertion > > failure: > ... > > So I have indeed made matters worse. > > I found two bugs: > > 1. The void* passed to the callback was being treated as a > libxl__domain_suspend_state* by the remus callbacks; this is a > holdover from a much earlier version of the series. It should be > converted to a libxl__save_helper_state and then the dss extracted > with CONTAINER_OF. > > 2. The way remus works means that the toolstack save callback is > invoked more than once, which the helper's implementation was not > prepared to deal with. Fix this by moving the rewind of the fd into > the helper. > > Fixes for these are below. With this, on top of my series, seem to I > get the same behaviour as with the baseline. Would you like to try it ? > > Sure, I ll give it a shot. Btw, my earlier mail was in response to remus not working on the baseline setup on your dev environment. > Thanks, > Ian. > > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c > index abc5932..069aca1 100644 > --- a/tools/libxl/libxl_dom.c > +++ b/tools/libxl/libxl_dom.c > @@ -984,7 +984,8 @@ static int libxl__remus_domain_suspend_callback(void > *data) > > static int libxl__remus_domain_resume_callback(void *data) > { > - libxl__domain_suspend_state *dss = data; > + libxl__save_helper_state *shs = data; > + libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); > STATE_AO_GC(dss->ao); > > /* Resumes the domain and the device model */ > @@ -1002,7 +1003,8 @@ static void remus_checkpoint_dm_saved(libxl__egc > *egc, > > static void libxl__remus_domain_checkpoint_callback(void *data) > { > - libxl__domain_suspend_state *dss = data; > + libxl__save_helper_state *shs = data; > + libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); > libxl__egc *egc = dss->shs.egc; > STATE_AO_GC(dss->ao); > > diff --git a/tools/libxl/libxl_save_callout.c > b/tools/libxl/libxl_save_callout.c > index 6332beb..078b7ee 100644 > --- a/tools/libxl/libxl_save_callout.c > +++ b/tools/libxl/libxl_save_callout.c > @@ -105,13 +105,6 @@ void libxl__xc_domain_save(libxl__egc *egc, > libxl__domain_suspend_state *dss, > toolstack_data_buf, toolstack_data_len, > "toolstack data tmpfile", 0); > if (r) { rc = ERROR_FAIL; goto out; } > - > - r = lseek(toolstack_data_fd, 0, SEEK_SET); > - if (r) { > - LOGE(ERROR, "rewind toolstack data tmpfile"); > - rc = ERROR_FAIL; > - goto out; > - } > } > > const unsigned long argnums[] = { > diff --git a/tools/libxl/libxl_save_helper.c > b/tools/libxl/libxl_save_helper.c > index 3bdfa28..772251a 100644 > --- a/tools/libxl/libxl_save_helper.c > +++ b/tools/libxl/libxl_save_helper.c > @@ -171,12 +171,14 @@ static int toolstack_save_cb(uint32_t domid, uint8_t > **buf, > { > assert(toolstack_save_fd > 0); > > + int r = lseek(toolstack_save_fd, 0, SEEK_SET); > + if (r) fail(errno,"rewind toolstack data tmpfile"); > + > *buf = xmalloc(toolstack_save_len); > - int r = read_exactly(toolstack_save_fd, *buf, toolstack_save_len); > + r = read_exactly(toolstack_save_fd, *buf, toolstack_save_len); > if (r<0) fail(errno,"read toolstack data"); > if (r==0) fail(0,"read toolstack data eof"); > > - toolstack_save_fd = -1; > *len = toolstack_save_len; > return 0; > } > > --000e0cdfd88069be1304c3767121 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On Wed, Jun 27, 2012 at 11:59 AM, Ian Jackson <Ian.Jackson@eu.citr= ix.com> wrote:
Ian Jackson writes ("Re: [PATCH v5 00/21] libxl: domain save/restore: = run in a separate process"):
> However, when I apply my series, I can indeed produc= e an assertion
> failure:
...
> So I have indeed made matters worse.

I found two bugs:

1. The void* passed to the callback was being treated as a
libxl__domain_suspend_state* by the remus callbacks; this is a
holdover from a much earlier version of the series. =A0It should be
converted to a libxl__save_helper_state and then the dss extracted
with CONTAINER_OF.

2. The way remus works means that the toolstack save callback is
invoked more than once, which the helper's implementation was not
prepared to deal with. =A0Fix this by moving the rewind of the fd into
the helper.

Fixes for these are below. =A0With this, on top of my series, seem to I
get the same behaviour as with the baseline. =A0Would you like to try it ?<= br>

Sure, I ll give it a shot.
B= tw, my earlier mail was in response to remus not
working on the b= aseline setup on your dev environment.

=A0
Thanks,
Ian.

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index abc5932..069aca1 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -984,7 +984,8 @@ static int libxl__remus_domain_suspend_callback(void *d= ata)

=A0static int libxl__remus_domain_resume_callback(void *data)
=A0{
- =A0 =A0libxl__domain_suspend_state *dss =3D data;
+ =A0 =A0libxl__save_helper_state *shs =3D data;
+ =A0 =A0libxl__domain_suspend_state *dss =3D CONTAINER_O= F(shs, *dss, shs);
=A0 =A0 STATE_AO_GC(dss->ao);

=A0 =A0 /* Resumes the domain and the device model */
@@ -1002,7 +1003,8 @@ static void remus_checkpoint_dm_saved(libxl__egc *egc= ,

=A0static void libxl__remus_domain_checkpoint_callback(void *data)
=A0{
- =A0 =A0libxl__domain_suspend_state *dss =3D data;
+ =A0 =A0libxl__save_helper_state *shs =3D data;
+ =A0 =A0libxl__domain_suspend_state *dss =3D CONTAINER_OF(shs, *dss, shs);=
=A0 =A0 libxl__egc *egc =3D dss->shs.egc;
=A0 =A0 STATE_AO_GC(dss->ao);

diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_call= out.c
index 6332beb..078b7ee 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -105,13 +105,6 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__dom= ain_suspend_state *dss,
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 toolstack_= data_buf, toolstack_data_len,
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 "toolstack data tmpfile", 0);
=A0 =A0 =A0 =A0 if (r) { rc =3D ERROR_FAIL; goto o= ut; }
-
- =A0 =A0 =A0 =A0r =3D lseek(toolstack_data_fd, 0, SEEK_SET);
- =A0 =A0 =A0 =A0if (r) {
- =A0 =A0 =A0 =A0 =A0 =A0LOGE(ERROR, "rewind toolstack data tmpfile&qu= ot;);
- =A0 =A0 =A0 =A0 =A0 =A0rc =3D ERROR_FAIL;
- =A0 =A0 =A0 =A0 =A0 =A0goto out;
- =A0 =A0 =A0 =A0}
=A0 =A0 }

=A0 =A0 const unsigned long argnums[] =3D {
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save= _helper.c
index 3bdfa28..772251a 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -171,12 +171,14 @@ static int toolstack_save_cb(uint32_t domid, uint8_t = **buf,
=A0{
=A0 =A0 assert(toolstack_save_fd > 0);

+ =A0 =A0int r =3D lseek(toolstack_save_fd, 0, SEEK_SET);
+ =A0 =A0if (r) fail(errno,"rewind toolstack data tmpfile");
+
=A0 =A0 *buf =3D xmalloc(toolstack_save_len);
- =A0 =A0int r =3D read_exactly(toolstack_save_fd, *buf, toolstack_save_len= );
+ =A0 =A0r =3D read_exactly(toolstack_save_fd, *buf, toolstack_save_len); =A0 =A0 if (r<0) fail(errno,"read toolstack data");
=A0 =A0 if (r=3D=3D0) fail(0,"read toolstack = data eof");

- =A0 =A0toolstack_save_fd =3D -1;
=A0 =A0 *len =3D toolstack_save_len;
=A0 =A0 return 0;
=A0}


--000e0cdfd88069be1304c3767121-- --===============1492871240793936303== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============1492871240793936303==--