xen-devel.lists.xenproject.org archive mirror
From: Shriram Rajagopalan <rshriram@cs.ubc.ca>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks [and 1 more messages]
Date: Mon, 4 Nov 2013 10:47:07 -0600
Message-ID: <CAP8mzPN5oTPLfAq+7pB38WTRAEUxkdG6pE_QWPKLNU++2LL4Ow@mail.gmail.com>
In-Reply-To: <21111.50707.178276.553159@mariner.uk.xensource.com>


On Mon, Nov 4, 2013 at 10:06 AM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:

> Ian Campbell writes ("Re: [PATCH 4 of 5 V3] tools/libxl: Control network
> buffering in remus callbacks [and 1 more messages]"):
> > Regardless of the answer here, would it make sense to do some/all of the
> > checkpoint processing in the helper subprocess anyway and only signal
> > the eventual failover up to the libxl process?
>
> It might do.  Mind you, the code in libxc is tangled enough as it is
> and is due for a rewrite.  Perhaps this could be done in the helper
> executable, although there isn't currently any way to easily
> intersperse code in there.
>
>
> > This async op is potentially quite long running I think compared to a
> > normal one i.e. if the guest doesn't die it is expected that the ao
> > lives "forever". Since the associated gc's persist until the ao ends
> > this might end up accumulating lots of allocations? Ian had a similar
> > concern about Roger's hotplug daemon series and suggested creating a per
> > iteration gc or something.
>
> Yes, this is indeed a problem.  Well spotted.
>
> Which of the xc_domain_save (and _restore) callbacks are called each
> remus iteration ?
>

Almost all of them on the xc_domain_save side (suspend, resume,
save_qemu state, checkpoint). xc_domain_restore doesn't have any
callbacks AFAIK, and Remus as of now does not have a component on the
restore side. It piggybacks on live migration's restore framework.


> I think introducing a per-iteration gc here is going to involve taking
> some care, since we need to be sure not to put
> per-iteraton-gc-allocated objects into data structures which are used
> by subsequent iterations.
>
>
FWIW, the Remus-related code that executes per iteration does not allocate
anything. All allocations happen only during setup, and I was under the
impression that no other allocations take place every time
xc_domain_save calls back into libxl.

However, it is possible that other parts of the AO machinery (and there
are a lot of them) allocate per iteration. If that is the case, it could
easily lead to OOMs, since Remus technically runs as long as the domain
lives.
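
To make the per-iteration gc idea concrete, here is a minimal sketch in
plain C. It is not libxl's gc and uses none of its API: struct toy_gc,
toy_gc_alloc, toy_gc_free_all and run_checkpoints are all hypothetical
names made up for illustration. The only point is that scratch
allocations made during one checkpoint are released before the next one
starts, so live memory does not grow with the number of iterations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy allocator, not libxl's gc. Every allocation is
 * remembered so that a single call can release all of them together. */
struct toy_gc {
    void **ptrs;
    size_t count;
    size_t cap;
};

void toy_gc_init(struct toy_gc *gc)
{
    assert(gc != NULL);
    gc->ptrs = NULL;
    gc->count = 0;
    gc->cap = 0;
}

/* Allocate size bytes owned by gc. Returns NULL on failure. */
void *toy_gc_alloc(struct toy_gc *gc, size_t size)
{
    assert(gc != NULL);
    if (gc->count == gc->cap) {
        size_t ncap = gc->cap ? gc->cap * 2 : 8;
        void **n = realloc(gc->ptrs, ncap * sizeof(*n));
        if (!n)
            return NULL;
        gc->ptrs = n;
        gc->cap = ncap;
    }
    void *p = malloc(size);
    if (!p)
        return NULL;
    gc->ptrs[gc->count++] = p;
    return p;
}

/* Free everything owned by gc and leave it empty and reusable. */
void toy_gc_free_all(struct toy_gc *gc)
{
    assert(gc != NULL);
    for (size_t i = 0; i < gc->count; i++)
        free(gc->ptrs[i]);
    free(gc->ptrs);
    toy_gc_init(gc);
}

/* Run `iterations` simulated checkpoints. Scratch data goes into a
 * per-iteration gc that is emptied at the end of every pass. Returns the
 * peak number of live scratch allocations observed. */
size_t run_checkpoints(size_t iterations)
{
    struct toy_gc iter_gc;
    size_t peak = 0;

    toy_gc_init(&iter_gc);
    for (size_t i = 0; i < iterations; i++) {
        (void)toy_gc_alloc(&iter_gc, 64);   /* per-iteration scratch */
        (void)toy_gc_alloc(&iter_gc, 128);
        if (iter_gc.count > peak)
            peak = iter_gc.count;
        toy_gc_free_all(&iter_gc);          /* nothing survives the pass */
    }
    return peak;
}
```

The care Ian mentions still applies to a scheme like this: a pointer
obtained from the per-iteration gc must not be stored in any structure
that outlives the iteration, because it dangles after toy_gc_free_all().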


> Shriram writes:
> > Fair enough. My question is what is the overhead of setting up, firing
> > and tearing down a timeout event using the event gen framework, if I
> > wish to checkpoint the VM, say every 20ms ?
>
> The ultimate cost of going back into the event loop to wait for a
> timeout will depend on what else the process is doing.  If the process
> is doing nothing else, it's about two calls to gettimeofday and one to
> poll.  Plus a bit of in-process computation, but that's going to be
> swamped by system call overhead.
>
> Having said that, libxl is not performance-optimised.  Indeed the
> callback mechanism involves context switching, and IPC, between the
> save/restore helper and libxl proper.  Probably not too much to be
> doing every 20ms for a single domain, but if you have a lot of these
> it's going to end up taking a lot of dom0 cpu etc.
>
>
Yes, and that is a problem. Xend+Remus avoided this by linking against
the libcheckpoint library, which interfaced with both the Python and
libxc code.
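
For reference, the "two calls to gettimeofday and one to poll" cost
described in the quoted text can be reproduced outside libxl with a
sketch like the following. This is not libxl's event loop; elapsed_ms
and wait_one_interval are hypothetical helper names, and the 20ms value
is simply the interval from my question. It uses only POSIX
gettimeofday() and poll().

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/time.h>

/* Whole milliseconds elapsed from a to b. */
long elapsed_ms(const struct timeval *a, const struct timeval *b)
{
    assert(a != NULL && b != NULL);
    long long us = (long long)(b->tv_sec - a->tv_sec) * 1000000LL
                 + (long long)(b->tv_usec - a->tv_usec);
    return (long)(us / 1000);
}

/* Wait for one checkpoint interval: one gettimeofday() before, one
 * poll() with no file descriptors as the sleep, and one gettimeofday()
 * after. Returns the elapsed milliseconds, or -1 on error (including
 * poll() being interrupted by a signal). */
long wait_one_interval(int interval_ms)
{
    struct timeval start, end;

    if (gettimeofday(&start, NULL) != 0)
        return -1;
    if (poll(NULL, 0, interval_ms) < 0)
        return -1;
    if (gettimeofday(&end, NULL) != 0)
        return -1;
    return elapsed_ms(&start, &end);
}
```

Calling wait_one_interval(20) in a loop gives a rough lower bound on the
per-checkpoint timer cost in an otherwise idle process. It does not
include the context switches and IPC between the save helper and libxl.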


> I assume you're not doing this for HVM domains, which involve saving
> the qemu state each time too.
>
>
It includes HVM domains too, although in that case the xenstore-based
suspend takes about 5ms, so the checkpoint interval is typically 50ms or
so.

If there is a latency-sensitive task running inside the VM, a lower
checkpoint interval leads to better performance.
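
As a rough illustration of why the interval is kept near 50ms for HVM,
the sketch here computes what share of each interval goes to the
suspend, assuming (my simplification) that the suspend time is the only
per-checkpoint cost. suspend_overhead_pct is a hypothetical helper, not
part of any Xen library.

```c
#include <assert.h>

/* Percentage of each checkpoint interval spent in the suspend step,
 * assuming the suspend is the only per-checkpoint cost. */
double suspend_overhead_pct(double suspend_ms, double interval_ms)
{
    assert(interval_ms > 0.0);
    return 100.0 * suspend_ms / interval_ms;
}
```

With a 5ms suspend, a 50ms interval gives 10 percent, while a 20ms
interval would give 25 percent.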



Thread overview: 37+ messages
2013-10-21  5:58 [PATCH 0 of 5 V3] Remus/Libxl: Network buffering support Shriram Rajagopalan
2013-10-21  5:58 ` [PATCH 1 of 5 V3] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
2013-10-31 20:13   ` Ian Campbell
2013-10-21  5:58 ` [PATCH 2 of 5 V3] tools/hotplug: Remus network buffering setup scripts Shriram Rajagopalan
2013-10-31 20:21   ` Ian Campbell
2013-10-31 21:06     ` Shriram Rajagopalan
2013-10-31 22:25       ` Ian Campbell
2013-11-14  3:55         ` Shriram Rajagopalan
2013-10-21  5:58 ` [PATCH 3 of 5 V3] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
2013-10-31 20:28   ` Ian Campbell
2013-10-21  5:58 ` [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
2013-10-31 20:31   ` Ian Campbell
2013-11-01 18:28   ` Ian Jackson
2013-11-01 19:57     ` Shriram Rajagopalan
2013-11-04 12:12       ` [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks [and 1 more messages] Ian Jackson
2013-11-04 15:17         ` Shriram Rajagopalan
2013-11-04 15:32           ` Ian Campbell
2013-11-04 16:06             ` Ian Jackson
2013-11-04 16:40               ` [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks [and 1 more messages] " Ian Jackson
2013-11-11 17:56                 ` Shriram Rajagopalan
2013-11-12  9:48                   ` Ian Campbell
2013-11-12 15:38                   ` Ian Jackson
2013-11-12 16:24                     ` Shriram Rajagopalan
2013-11-12 16:38                       ` Ian Jackson
2013-11-12 16:43                         ` Shriram Rajagopalan
2013-11-12 17:00                           ` Ian Jackson
2013-11-04 16:45               ` [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks " Ian Campbell
2013-11-04 16:47               ` Shriram Rajagopalan [this message]
2013-11-04 17:01                 ` Ian Jackson
2013-11-04 17:23                   ` Shriram Rajagopalan
2013-11-04 17:33                     ` Ian Jackson
2013-11-01 20:04     ` [PATCH 4 of 5 V3] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
2013-10-21  5:58 ` [PATCH 5 of 5 V3] tools/xl: Remus - Network buffering cmdline switch Shriram Rajagopalan
2013-10-31 20:38   ` Ian Campbell
2013-10-31 21:47     ` Shriram Rajagopalan
2013-10-31 22:29       ` Ian Campbell
2013-10-30 23:05 ` [PATCH 0 of 5 V3] Remus/Libxl: Network buffering support Shriram Rajagopalan
