From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jason Andryuk <andryuk@aero.org>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	xen-devel@lists.xen.org
Subject: Re: [PATCH RFC] libxc: Protect xc_domain_resume from clobbering domain registers
Date: Mon, 19 May 2014 10:37:11 +0100
Message-ID: <5379D0C7.6020309@citrix.com>
In-Reply-To: <1400342483-18476-1-git-send-email-andryuk@aero.org>

On 17/05/14 17:01, Jason Andryuk wrote:
> xc_domain_resume() expects the guest to be in state SHUTDOWN_suspend.
> However, nothing verifies the state before modify_returncode() modifies
> the domain's registers.  This will crash guest processes or the kernel
> itself.
>
> This can be demonstrated with `LIBXL_SAVE_HELPER=/bin/false xl migrate`.
>
> Signed-off-by: Jason Andryuk <andryuk@aero.org>

Hmm.

There is no possible way whatsoever that migration can work if a PV
guest is not in SHUTDOWN_suspend.  PV guests have to leave an MFN in edx
which the toolstack rewrites with a new MFN on resume.

By default, migrating an HVM guest needs no cooperation from the guest
itself.  XenServer is perfectly capable of migrating HVM VMs without PV
drivers, so I suspect that we never use cooperative resume.

This cooperative resume, which modifies guest register state, therefore
imposes the same SHUTDOWN_suspend restriction on HVM guests as it does
on PV guests.  As a result, your patch below is correct as a fallback
safety measure, and should be taken.
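
The restriction can be sketched in isolation.  This is a minimal
illustration rather than the libxc code: struct dominfo_sketch and
check_suspended() are hypothetical stand-ins for the xc_dominfo_t fields
and for the check inside modify_returncode(), and SHUTDOWN_suspend is
defined locally with the value it has in Xen's public sched.h.

```c
#include <stdbool.h>
#include <stdio.h>

/* Value of SHUTDOWN_suspend in Xen's public sched.h, defined locally
 * so the sketch builds without the Xen headers. */
#define SHUTDOWN_suspend 2

/* Hypothetical stand-in for the xc_dominfo_t fields the check reads. */
struct dominfo_sketch {
    bool shutdown;                /* domain has shut down */
    unsigned int shutdown_reason; /* meaningful only when shutdown is set */
};

/*
 * Return 0 if the guest's registers may safely be rewritten, -1 if the
 * domain is not sitting in SHUTDOWN_suspend (for example, still running).
 */
int check_suspended(const struct dominfo_sketch *info)
{
    if ( !info->shutdown || info->shutdown_reason != SHUTDOWN_suspend )
    {
        fprintf(stderr, "Domain not in suspended state\n");
        return -1;
    }

    return 0;
}
```

A running domain (shutdown clear) and a crashed one (shutdown set with a
different reason) are both rejected; only a domain that has shut down
with reason SHUTDOWN_suspend passes.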

However, the caller of modify_returncode() is also at fault for
attempting to resume an already-running domain.  I think there needs to
be a bugfix there as well.  I presume that some piece of code is
assuming that, despite libxl-save-helper failing, xc_domain_save()
paused the guest, which is clearly not true in this case.
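
One possible shape for such a caller-side fix, purely as a sketch: the
toolstack records whether the guest was actually suspended, and only
attempts a resume if so.  Every name below (struct migrate_state_sketch,
enum recovery_action, recovery_after_failed_migrate()) is hypothetical
and does not reflect how libxl is structured.

```c
#include <stdbool.h>

/* Hypothetical record of how far the save side got before failing. */
struct migrate_state_sketch {
    bool guest_suspended; /* set only once the guest has really suspended */
};

enum recovery_action {
    RECOVERY_NONE,   /* guest never stopped running; leave it alone */
    RECOVERY_RESUME  /* guest is suspended; resume it at the sender */
};

/*
 * Decide what to do after a failed migration.  If the save helper died
 * before suspending the guest, the guest is still running and must not
 * be "resumed", since that would rewrite live register state.
 */
enum recovery_action
recovery_after_failed_migrate(const struct migrate_state_sketch *st)
{
    return st->guest_suspended ? RECOVERY_RESUME : RECOVERY_NONE;
}
```

With LIBXL_SAVE_HELPER=/bin/false the helper exits before anything is
suspended, so a caller along these lines would pick RECOVERY_NONE and
never reach modify_returncode().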

~Andrew

> ---
>
> This change stops xc_domain_resume from killing my domUs on a failed
> migration.  I'm using a wrapper around libxl-save-helper which may fail
> before libxl-save-helper is invoked, so xc_domain_save has not been
> called.  The idle Linux domU kernels would BUG coming out of
> SCHEDOP_block in xen_safe_halt() since modify_returncode set EAX to 1.
> journald was also observed to segfault.
>
> As written, this code treats calling xc_domain_resume on a running
> domain as an error.  Do we want it silently ignored?  Output with this
> patch looks like:
>
> """
> Migration failed, resuming at sender.
> xc: error: Domain not in suspended state: Internal error
> libxl: error: libxl.c:402:libxl__domain_resume: xc_domain_resume failed for domain 92: Interrupted system call
> """
>
> libxl__domain_resume prints errno, but it is stale for this case.
> xc_domain_resume_cooperative could swallow modify_returncode's error,
> bypass issuing XEN_DOMCTL_resumedomain, and return success to avoid the
> libxl error message.
>
> ---
>  tools/libxc/xc_resume.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
> index 18b4818..9ec6a59 100644
> --- a/tools/libxc/xc_resume.c
> +++ b/tools/libxc/xc_resume.c
> @@ -39,6 +39,12 @@ static int modify_returncode(xc_interface *xch, uint32_t domid)
>          return -1;
>      }
>  
> +    if ( !info.shutdown || (info.shutdown_reason != SHUTDOWN_suspend) )
> +    {
> +        ERROR("Domain not in suspended state");
> +        return 1;
> +    }
> +
>      if ( info.hvm )
>      {
>          /* HVM guests without PV drivers have no return code to modify. */


Thread overview: 6+ messages
2014-05-17 16:01 [PATCH RFC] libxc: Protect xc_domain_resume from clobbering domain registers Jason Andryuk
2014-05-19  9:37 ` Andrew Cooper [this message]
2014-05-19  9:44   ` Andrew Cooper
2014-05-19 12:07   ` Jason Andryuk
2014-05-19 12:26     ` Andrew Cooper
2014-05-19 11:35 ` Ian Jackson
