xen-devel.lists.xenproject.org archive mirror
* Possible error restoring machine
@ 2012-05-23  9:39 Frediano Ziglio
  2012-05-23 10:25 ` Ian Campbell
  2012-05-23 13:30 ` Shriram Rajagopalan
  0 siblings, 2 replies; 5+ messages in thread
From: Frediano Ziglio @ 2012-05-23  9:39 UTC
  To: xen-devel@lists.xensource.com

I noted a possible problem restoring a machine.

In xc_domain_restore (xc_domain_restore.c), if it's not the last
checkpoint we set the O_NONBLOCK flag (search for fcntl) so that we can call
pagebuf_get or just load other pages (see the following "goto loadpages;"
line).
Now we could end up calling xc_tmem_restore/xc_tmem_restore_extra
(xc_tmem.c), which call read_exact (xc_private.c) on the same
non-blocking socket/file, but read_exact does not handle EAGAIN/EWOULDBLOCK
(either can be returned on a non-blocking socket, depending on file type and
Unix/Linux version), leading to a failure.
Does this make sense, or is it impossible?
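
To make the concern concrete, here is a minimal sketch of the failure mode.
It is not the libxc code: naive_read_exact and set_nonblocking are
hypothetical names, and the loop is only assumed to resemble a blocking-style
"read exactly N bytes" helper that retries on EINTR alone.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a blocking-style "read exactly size bytes" loop.
 * It retries on EINTR only, so EAGAIN/EWOULDBLOCK is reported as a failure. */
static int naive_read_exact(int fd, void *data, size_t size)
{
    size_t offset = 0;
    ssize_t len;

    assert(data != NULL);
    while ( offset < size )
    {
        len = read(fd, (char *)data + offset, size - offset);
        if ( (len == -1) && (errno == EINTR) )
            continue;
        if ( len == 0 )
            errno = 0;              /* EOF before size bytes arrived */
        if ( len <= 0 )
            return -1;              /* includes EAGAIN on a non-blocking fd */
        offset += (size_t)len;
    }
    return 0;
}

/* Put fd into non-blocking mode, preserving its other status flags. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if ( flags == -1 )
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

With an empty pipe whose read end has been passed to set_nonblocking,
naive_read_exact returns -1 with errno set to EAGAIN instead of waiting for
data, which is the failure described above.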

Also note that rdexact (xc_domain_restore.c) handles data timeouts, but we
can still block in read_exact called by
xc_tmem_restore/xc_tmem_restore_extra.

Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
are network problems?

Frediano


* Re: Possible error restoring machine
  2012-05-23  9:39 Possible error restoring machine Frediano Ziglio
@ 2012-05-23 10:25 ` Ian Campbell
  2012-05-23 11:37   ` Frediano Ziglio
  2012-05-23 13:30 ` Shriram Rajagopalan
  1 sibling, 1 reply; 5+ messages in thread
From: Ian Campbell @ 2012-05-23 10:25 UTC
  To: Frediano Ziglio; +Cc: Shriram Rajagopalan, xen-devel@lists.xensource.com

CCing the Remus maintainer, since all this non-blocking stuff is for
Remus/checkpointing.

On Wed, 2012-05-23 at 10:39 +0100, Frediano Ziglio wrote:
> I noted a possible problem restoring a machine.
> 
> In xc_domain_restore (xc_domain_restore.c), if it's not the last
> checkpoint we set the O_NONBLOCK flag (search for fcntl) so that we can call
> pagebuf_get or just load other pages (see the following "goto loadpages;"
> line).
> Now we could end up calling xc_tmem_restore/xc_tmem_restore_extra
> (xc_tmem.c), which call read_exact (xc_private.c) on the same
> non-blocking socket/file

There are a bunch of such places in that function; the RDEXACT macro is
also == rdexact except on Mini-OS.

>  but read_exact does not handle EAGAIN/EWOULDBLOCK
> (either can be returned on a non-blocking socket, depending on file type and
> Unix/Linux version), leading to a failure.
> Does this make sense, or is it impossible?

Isn't this what the if line:
        len = read(fd, buf + offset, size - offset);
        if ( (len == -1) && ((errno == EINTR) || (errno == EAGAIN)) )
            continue;

is doing?

> Also note that rdexact (xc_domain_restore.c) handles data timeouts, but we
> can still block in read_exact called by
> xc_tmem_restore/xc_tmem_restore_extra.

Oh, wait! read_exact != rdexact -- ouch! Those are confusingly similar!

I suspect we need either to pull xc_tmem_{save,restore} into the
appropriate file and use the non-blocking-capable versions, or to export
the non-blocking function, with an improved name, so it can be used from
xc_tmem.c.
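
As a rough idea of what an exported, non-blocking-capable helper could look
like, here is a sketch under stated assumptions. The name read_exact_timed
and its signature are hypothetical, not an existing libxc interface, and it
uses poll() to wait for data, which may differ from how rdexact waits.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly size bytes from fd, which may be
 * blocking or non-blocking. timeout_ms bounds each wait for more data
 * (not the whole transfer); a negative value waits indefinitely.
 * Returns 0 on success, -1 on error (errno is ETIMEDOUT on timeout). */
static int read_exact_timed(int fd, void *data, size_t size, int timeout_ms)
{
    size_t offset = 0;

    assert(data != NULL);
    while ( offset < size )
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        ssize_t len;
        int rc = poll(&pfd, 1, timeout_ms);

        if ( (rc == -1) && (errno == EINTR) )
            continue;
        if ( rc == -1 )
            return -1;
        if ( rc == 0 )
        {
            errno = ETIMEDOUT;      /* no data arrived within timeout_ms */
            return -1;
        }

        len = read(fd, (char *)data + offset, size - offset);
        if ( (len == -1) &&
             ((errno == EINTR) || (errno == EAGAIN) || (errno == EWOULDBLOCK)) )
            continue;               /* spurious wakeup: wait again */
        if ( len == 0 )
            errno = 0;              /* EOF before size bytes arrived */
        if ( len <= 0 )
            return -1;
        offset += (size_t)len;
    }
    return 0;
}
```

Because the wait happens in poll(), the same function behaves sensibly
whether or not O_NONBLOCK is set on fd, so a caller such as the tmem restore
path would not need to know which mode the descriptor is in.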

Shriram, any thoughts?

> 
> Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> are network problems?
> 
> Frediano
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: Possible error restoring machine
  2012-05-23 10:25 ` Ian Campbell
@ 2012-05-23 11:37   ` Frediano Ziglio
  0 siblings, 0 replies; 5+ messages in thread
From: Frediano Ziglio @ 2012-05-23 11:37 UTC
  To: Ian Campbell; +Cc: rshriram@cs.ubc.ca, xen-devel@lists.xensource.com

On Wed, 2012-05-23 at 11:25 +0100, Ian Campbell wrote:
> CCing the Remus maintainer, since all this non-blocking stuff is for
> Remus/checkpointing.
> 
> On Wed, 2012-05-23 at 10:39 +0100, Frediano Ziglio wrote:
> > I noted a possible problem restoring a machine.
> > 
> > In xc_domain_restore (xc_domain_restore.c), if it's not the last
> > checkpoint we set the O_NONBLOCK flag (search for fcntl) so that we can call
> > pagebuf_get or just load other pages (see the following "goto loadpages;"
> > line).
> > Now we could end up calling xc_tmem_restore/xc_tmem_restore_extra
> > (xc_tmem.c), which call read_exact (xc_private.c) on the same
> > non-blocking socket/file
> 
> There are a bunch of such places in that function; the RDEXACT macro is
> also == rdexact except on Mini-OS.
> 
> >  but read_exact does not handle EAGAIN/EWOULDBLOCK
> > (either can be returned on a non-blocking socket, depending on file type and
> > Unix/Linux version), leading to a failure.
> > Does this make sense, or is it impossible?
> 
> Isn't this what the if line:
>         len = read(fd, buf + offset, size - offset);
>         if ( (len == -1) && ((errno == EINTR) || (errno == EAGAIN)) )
>             continue;
> 
> is doing?
> 
> > Also note that rdexact (xc_domain_restore.c) handles data timeouts, but we
> > can still block in read_exact called by
> > xc_tmem_restore/xc_tmem_restore_extra.
> 
> Oh, wait! read_exact != rdexact -- ouch! Those are confusingly similar!
> 
> I suspect we need either to pull xc_tmem_{save,restore} into the
> appropriate file and use the non-blocking-capable versions, or to export
> the non-blocking function, with an improved name, so it can be used from
> xc_tmem.c.
> 

I was working on a patch that tries to reduce CPU usage and the number of
read calls by buffering io_fd.

It currently works, but it is not yet good enough to post.

> Shriram, any thoughts?
> 
> > 
> > Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> > are network problems?
> > 

Frediano


* Re: Possible error restoring machine
  2012-05-23  9:39 Possible error restoring machine Frediano Ziglio
  2012-05-23 10:25 ` Ian Campbell
@ 2012-05-23 13:30 ` Shriram Rajagopalan
  2012-05-23 14:15   ` Dan Magenheimer
  1 sibling, 1 reply; 5+ messages in thread
From: Shriram Rajagopalan @ 2012-05-23 13:30 UTC
  To: Frediano Ziglio; +Cc: xen-devel@lists.xensource.com, Ian Campbell



On Wed, May 23, 2012 at 5:39 AM, Frediano Ziglio <frediano.ziglio@citrix.com>
wrote:
> I noted a possible problem restoring a machine.
>
> In xc_domain_restore (xc_domain_restore.c), if it's not the last
> checkpoint we set the O_NONBLOCK flag (search for fcntl) so that we can call
> pagebuf_get or just load other pages (see the following "goto loadpages;"
> line).
> Now we could end up calling xc_tmem_restore/xc_tmem_restore_extra
> (xc_tmem.c), which call read_exact (xc_private.c) on the same
> non-blocking socket/file, but read_exact does not handle EAGAIN/EWOULDBLOCK
> (either can be returned on a non-blocking socket, depending on file type and
> Unix/Linux version), leading to a failure.
> Does this make sense, or is it impossible?
>


It certainly is possible. But again, I have never seen anyone use tmem with
Remus. I don't even know if it would work properly, even if we fix the
read_exact code to handle non-blocking fds.

For the normal live-migration scenario, the O_NONBLOCK change does not
happen. So RDEXACT == rdexact == read_exact, output-wise.

> Also note that rdexact (xc_domain_restore.c) handles data timeouts, but we
> can still block in read_exact called by
> xc_tmem_restore/xc_tmem_restore_extra.
>

Yep. Only in the Remus case. As stated above, I haven't come across anyone
using Remus + tmem and don't know if it would work properly. I don't
know the semantics of tmem well enough to comment on whether Remus + tmem
makes sense.


> Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> are network problems?
>

This won't be a problem for live migration, because that timeout code
is within the if (ctx->completed) { } block. It only becomes active when
Remus is enabled, i.e. ctx->last_checkpoint = 0. Otherwise, the read call is
still blocking.
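
The gating described above can be sketched roughly as follows. This is an
illustration only: struct restore_ctx_sketch and read_timeout_ms are
hypothetical, the two fields are a simplified assumption about how
completed and last_checkpoint relate, and HEARTBEAT_MS is taken to be the
1 second mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define HEARTBEAT_MS 1000   /* 1 second, the value discussed in the thread */

/* Hypothetical, trimmed-down stand-in for the restore context. */
struct restore_ctx_sketch {
    int completed;        /* non-zero once a full checkpoint has been received */
    int last_checkpoint;  /* 0 when Remus keeps sending further checkpoints */
};

/* Timeout in milliseconds for the next read, or -1 to block indefinitely
 * (the convention used by poll()). Only the Remus case gets a timeout. */
static int read_timeout_ms(const struct restore_ctx_sketch *ctx)
{
    assert(ctx != NULL);
    if ( ctx->completed && !ctx->last_checkpoint )
        return HEARTBEAT_MS;
    return -1;
}
```

For a plain live migration (completed == 0, last_checkpoint == 1) this
yields -1, so reads block as before; only after the first complete Remus
checkpoint does the heartbeat timeout apply.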


> Frediano
>


* Re: Possible error restoring machine
  2012-05-23 13:30 ` Shriram Rajagopalan
@ 2012-05-23 14:15   ` Dan Magenheimer
  0 siblings, 0 replies; 5+ messages in thread
From: Dan Magenheimer @ 2012-05-23 14:15 UTC
  To: rshriram, Frediano Ziglio; +Cc: xen-devel, Ian Campbell

> From: Shriram Rajagopalan [mailto:rshriram@cs.ubc.ca]
> Subject: Re: [Xen-devel] Possible error restoring machine
> 
> Yep. Only in the Remus case. As stated above, I haven't come across anyone
> using Remus + tmem and don't know if it would work properly. I don't
> know the semantics of tmem well enough to comment on whether Remus + tmem
> makes sense.

An interesting question... from what I remember about Remus
(it's been a few years now since I looked at it), they
can't co-exist I think.  To Remus, tmem is like a hidden
hypervisor-private local disk and the writes to it don't
get captured/replicated by Remus.  I think this is fixable
but I don't think the fix would be easy.

But this is just a few seconds of thought, so I may be
all wrong.

Dan
