From: Olaf Hering <olaf@aepfle.de>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v4] tools: set migration constraints from cmdline
Date: Mon, 4 Feb 2013 14:41:02 +0100 [thread overview]
Message-ID: <20130204134102.GC16090@aepfle.de> (raw)
In-Reply-To: <1359983463.7743.23.camel@zakaz.uk.xensource.com>
On Mon, Feb 04, Ian Campbell wrote:
> On Fri, 2013-02-01 at 19:34 +0000, Olaf Hering wrote:
> > # HG changeset patch
> > # User Olaf Hering <olaf@aepfle.de>
> > # Date 1359746832 -3600
> > # Node ID 785c8f34e1f802106e53ca4d2c54dce55c8ee166
> > # Parent fd711ebdce9a58556d62c2daaf5d49ab06de4a3c
> > tools: set migration constraints from cmdline
> >
> > Add new options to xm/xl migrate to control the process of migration.
> > The intention is to optionally abort the migration if it takes too long
> > to migrate a busy guest due to the high number of dirty pages. Currently
> > the guest is suspended to transfer the remaining dirty pages. This
> > transfer can take too long, which can confuse the guest if its suspended
> > for too long.
> >
> > -M <number> Number of iterations before final suspend (default: 30)
> > --max_iters <number>
> >
> > -m <factor> Max amount of memory to transfer before final suspend (default: 3*RAM)
> > --max_factor <factor>
> >
> > -N Abort migration instead of doing final suspend.
> > --no_suspend
>
> Is this last one a debug option? (Having read the patch I think not, but
> the description here doesn't really explain it)
No, its not debug. And the naming of that -N option is not really good,
I agree.
What it is supposed to do is to avoid the final suspend+migrate+resume
step when either there were too many iterations or too many pages
transfered when the guest continues to dirty many pages. Its not
predictable how long the guest will be suspended to transfer the
remaining pages to the new host. I think even transfering 1GB RAM at
1000/GBs takes maybe 10 seconds and with large guests there may be
several GB dirty pages, so that a guest may be suspended for a minute.
This caused issues in a customer environment.
Instead the migration should be aborted and the guest should continue to
run on the old host.
> > =item B<remus> [I<OPTIONS>] I<domain-id> I<host>
> > diff -r fd711ebdce9a -r 785c8f34e1f8 tools/libxc/xc_domain_save.c
> > --- a/tools/libxc/xc_domain_save.c
> > +++ b/tools/libxc/xc_domain_save.c
>
> The changes to this file only seem to implement the -N and not the other
> two?
xc_domain_save already has max_iters and max_factor as arguments.
> > diff -r fd711ebdce9a -r 785c8f34e1f8 tools/libxl/libxl.h
> > --- a/tools/libxl/libxl.h
> > +++ b/tools/libxl/libxl.h
> > @@ -500,12 +500,25 @@ int libxl_domain_create_restore(libxl_ct
> > void libxl_domain_config_init(libxl_domain_config *d_config);
> > void libxl_domain_config_dispose(libxl_domain_config *d_config);
> >
> > -int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
> > +int libxl_domain_suspend_0x040200(libxl_ctx *ctx, uint32_t domid, int fd,
> > int flags, /* LIBXL_SUSPEND_* */
> > const libxl_asyncop_how *ao_how)
> > LIBXL_EXTERNAL_CALLERS_ONLY;
> > +#ifdef LIBXL_API_VERSION
> > +#if LIBXL_API_VERSION == 0x040200
> > +#define libxl_domain_suspend libxl_domain_suspend_0x040200
>
> int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
> int flags, /* LIBXL_SUSPEND_* */
> int max_iters, int max_factor,
> const libxl_asyncop_how *ao_how)
> LIBXL_EXTERNAL_CALLERS_ONLY;
> #ifdef LIBXL_API_VERSION
> #if LIBXL_API_VERSION == 0x040200
> #define libxl_domain_suspend(ctx, domid, fd, flags, ao_how) \
> libxl_domain_suspend(ctx, domid, fd, flags, 0, 0, ao_how)
> #endif /* LIBXL_API_VERSION == 0x040200 */
> #endif /* defined(LIBXL_API_VERSION) */
>
> Should work?
That may work, have to try it. In the end if we make such a change it
would serve as example for other upcoming API changes.
> > #define LIBXL_SUSPEND_DEBUG 1
> > #define LIBXL_SUSPEND_LIVE 2
> > +#define LIBXL_SUSPEND_NO_FINAL_SUSPEND 4
>
> (These should really be in the IDL, but this is a pre-existing
> condition)
>
> The name of this new option doesn't quite describe what it does, since
> it doesn't disable the final suspend always, only under certain
> condition.
>
> LIBXL_SUSPEND_ABORT_ON_<xxx>
>
> Where the <xxx> is tricky ;-)
>
> <xxx> == DIRTYING_TOO_FAST
> <xxx> == GUEST_TOO_BUSY
> <xxx> == YOUR_SUGGESTIONS_HERE
Thats alot more descriptive, thanks. DIRTYING_TOO_FAST describes it well
I think.
> > /* @param suspend_cancel [from xenctrl.h:xc_domain_resume( @param fast )]
> > * If this parameter is true, use co-operative resume. The guest
> > diff -r fd711ebdce9a -r 785c8f34e1f8 tools/libxl/libxl_dom.c
> > --- a/tools/libxl/libxl_dom.c
> > +++ b/tools/libxl/libxl_dom.c
> > @@ -1240,6 +1240,7 @@ void libxl__domain_suspend(libxl__egc *e
> >
> > dss->xcflags = (live ? XCFLAGS_LIVE : 0)
> > | (debug ? XCFLAGS_DEBUG : 0)
> > + | (dss->xlflags & LIBXL_SUSPEND_NO_FINAL_SUSPEND ? XCFLAGS_DOMSAVE_NOSUSPEND : 0)
> > | (dss->hvm ? XCFLAGS_HVM : 0);
> >
> > dss->suspend_eventchn = -1;
>
> > diff -r fd711ebdce9a -r 785c8f34e1f8 tools/libxl/xl_cmdimpl.c
> > --- a/tools/libxl/xl_cmdimpl.c
> > +++ b/tools/libxl/xl_cmdimpl.c
> > @@ -3172,7 +3172,7 @@ static int save_domain(uint32_t domid, c
> >
> > save_domain_core_writeconfig(fd, filename, config_data, config_len);
> >
> > - int rc = libxl_domain_suspend(ctx, domid, fd, 0, NULL);
> > + int rc = libxl_domain_suspend(ctx, domid, fd, 0, 0, 0, NULL);
>
> Doesn't this need to pass down the selected values?
save_domain is different from migrate_domain, I think xl save will
suspend the guest anyway? But I notice the -c option for "save", so
perhaps it would be useful to catch busy guests as well here?
Olaf
next prev parent reply other threads:[~2013-02-04 13:41 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-28 17:32 [PATCH] tools: set migration constraints from cmdline Olaf Hering
2013-01-30 14:30 ` Ian Campbell
2013-01-30 16:43 ` Olaf Hering
2013-02-04 9:57 ` Olaf Hering
2013-02-04 12:54 ` Ian Campbell
2013-02-04 13:09 ` Olaf Hering
2013-02-04 13:14 ` Ian Campbell
2013-01-30 16:41 ` [PATCH v2] " Olaf Hering
2013-01-31 16:17 ` [PATCH v3] " Olaf Hering
2013-02-01 19:34 ` [PATCH v4] " Olaf Hering
2013-02-04 13:11 ` Ian Campbell
2013-02-04 13:41 ` Olaf Hering [this message]
2013-02-04 13:46 ` Ian Campbell
2013-02-04 13:55 ` Olaf Hering
2013-02-04 13:59 ` Ian Campbell
2013-02-04 18:43 ` Olaf Hering
2013-02-05 9:22 ` Ian Campbell
2013-02-05 10:16 ` Olaf Hering
2013-02-19 11:42 ` Olaf Hering
2013-02-19 11:48 ` Ian Campbell
2013-02-19 14:12 ` Olaf Hering
2013-02-05 10:03 ` [PATCH v5] " Olaf Hering
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130204134102.GC16090@aepfle.de \
--to=olaf@aepfle.de \
--cc=Ian.Campbell@citrix.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).