From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Andryuk
Subject: Re: [PATCH RFC] libxc: Protect xc_domain_resume from clobbering domain registers
Date: Mon, 19 May 2014 08:07:10 -0400
Message-ID: <5379F3EE.4050809@aero.org>
References: <1400342483-18476-1-git-send-email-andryuk@aero.org> <5379D0C7.6020309@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <5379D0C7.6020309@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper
Cc: xen-devel@lists.xen.org, Ian Jackson, Ian Campbell, Stefano Stabellini
List-Id: xen-devel@lists.xenproject.org

On 5/19/2014 5:37 AM, Andrew Cooper wrote:
> On 17/05/14 17:01, Jason Andryuk wrote:
>> xc_domain_resume() expects the guest to be in state SHUTDOWN_suspend.
>> However, nothing verifies the state before modify_returncode() modifies
>> the domain's registers. This will crash guest processes or the kernel
>> itself.
>>
>> This can be demonstrated with `LIBXL_SAVE_HELPER=/bin/false xl migrate`.
>>
>> Signed-off-by: Jason Andryuk
>
> Hmm.
>
> There is no possible way whatsoever that migration can work if a PV
> guest is not in SHUTDOWN_suspend. PV guests have to leave an MFN in edx
> which the toolstack rewrites with a new MFN on resume.
>
> By default, there is no need for knowledge from the HVM guest for
> migrate. XenServer is perfectly capable of migrating HVM VMs without PV
> drivers. I suspect therefore that we never use cooperative resume.

I've only used 64-bit PV domUs, so I haven't really thought about HVM.
If info.shutdown_reason == SHUTDOWN_suspend is expected for all HVM
cases, then the hunk can stand. Otherwise it should be moved later,
after the HVM-without-PV-drivers case has returned.
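To make the "moved later" alternative concrete, here is a minimal standalone sketch of that ordering. It is not the libxc code: struct dominfo_sketch, its has_pv_drivers flag and modify_returncode_order_sketch() are hypothetical stand-ins, and the register update is reduced to setting a flag. It only matters if an HVM guest without PV drivers can legitimately get here while not suspended.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the domain info fields used by the check. */
struct dominfo_sketch {
    bool shutdown;        /* domain has shut down */
    bool suspended;       /* shutdown reason is "suspend" */
    bool hvm;             /* HVM guest */
    bool has_pv_drivers;  /* hypothetical flag, only meaningful for HVM */
};

/*
 * Returns 0 if there was nothing to do or the return code was (notionally)
 * modified, and 1 if the domain is not in the suspended state.
 */
static int modify_returncode_order_sketch(const struct dominfo_sketch *info,
                                          bool *regs_touched)
{
    *regs_touched = false;

    /* HVM without PV drivers: no return code to modify, so exit first. */
    if ( info->hvm && !info->has_pv_drivers )
        return 0;

    /* Everything past this point rewrites registers: require suspend. */
    if ( !info->shutdown || !info->suspended )
        return 1;

    *regs_touched = true; /* stand-in for the actual register update */
    return 0;
}
```

With this ordering a running HVM guest without PV drivers returns 0 untouched, while a running PV guest returns 1 before any register is written.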

> This cooperative resume which modifies guest register state therefore
> imposes the same SHUTDOWN_suspend restriction on HVM guests as it does
> for PV guests. As a result, your patch below is correct as a fallback
> safety measure, and should be taken.
>
> However the caller of modify_returncode is also at fault for attempting
> to resume an already-running domain. I think there needs to be a bugfix
> there as well. I presume that some piece of code is assuming that
> despite libxl-save-helper failing, xc_domain_save() paused the guest,
> which is clearly not true in this case.

Agreed. modify_returncode was already making the call to xc_domain_info
(and doing the damage), so adding a check there was easy. I posted the
patch as an RFC because I wanted guidance: is calling xc_domain_resume
on a running domain an error, or should it be treated as success?

The original modify_returncode returns 0 on success or -1 on error.
This patch returns 1 for the already-running case. The caller could
handle that value differently and bypass XEN_DOMCTL_resumedomain
without returning an error.

-Jason

> ~Andrew
>
>> ---
>>
>> This change stops xc_domain_resume from killing my domUs on a failed
>> migration. I'm using a wrapper around libxl-save-helper which may fail
>> before libxl-save-helper is invoked, so xc_domain_save has not been
>> called. The idle Linux domU kernels would BUG coming out of
>> SCHEDOP_block in xen_safe_halt() since modify_returncode set EAX to 1.
>> journald was also observed to segfault.
>>
>> As written, this code treats calling xc_domain_resume on a running
>> domain as an error. Do we want it silently ignored? Output with this
>> patch looks like:
>>
>> """
>> Migration failed, resuming at sender.
>> xc: error: Domain not in suspended state: Internal error
>> libxl: error: libxl.c:402:libxl__domain_resume: xc_domain_resume failed for domain 92: Interrupted system call
>> """
>>
>> libxl__domain_resume prints errno, but it is stale for this case.
>> xc_domain_resume_cooperative could swallow modify_returncode's error,
>> bypass issuing XEN_DOMCTL_resumedomain, and return success to avoid the
>> libxl error message.
>>
>> ---
>>  tools/libxc/xc_resume.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
>> index 18b4818..9ec6a59 100644
>> --- a/tools/libxc/xc_resume.c
>> +++ b/tools/libxc/xc_resume.c
>> @@ -39,6 +39,12 @@ static int modify_returncode(xc_interface *xch, uint32_t domid)
>>          return -1;
>>      }
>>
>> +    if ( !info.shutdown || (info.shutdown_reason != SHUTDOWN_suspend) )
>> +    {
>> +        ERROR("Domain not in suspended state");
>> +        return 1;
>> +    }
>> +
>>      if ( info.hvm )
>>      {
>>          /* HVM guests without PV drivers have no return code to modify. */
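
For reference, a rough sketch of the "silently ignore" alternative, where the caller treats a return value of 1 from modify_returncode as "nothing to do" and skips the resume request. This is a standalone illustration with stubs, not the libxc implementation: struct dom_sketch, the RC_* names and both *_sketch functions are hypothetical, and the XEN_DOMCTL_resumedomain call is replaced by a counter.

```c
#include <stdbool.h>

/* Return-code convention discussed in this thread. */
enum {
    RC_ERROR         = -1, /* could not query the domain */
    RC_OK            =  0, /* return code modified */
    RC_NOT_SUSPENDED =  1  /* domain is running, nothing modified */
};

/* Hypothetical stand-in for a domain and the side effects applied to it. */
struct dom_sketch {
    bool info_ok;      /* whether the domain info query succeeds */
    bool suspended;    /* shut down with reason "suspend" */
    int  eax;          /* stand-in for the guest's return register */
    int  resume_calls; /* counts stand-in XEN_DOMCTL_resumedomain calls */
};

static int modify_returncode_sketch(struct dom_sketch *d)
{
    if ( !d->info_ok )
        return RC_ERROR;

    /* Never touch the registers of a domain that is not suspended. */
    if ( !d->suspended )
        return RC_NOT_SUSPENDED;

    d->eax = 1;
    return RC_OK;
}

static int resume_cooperative_sketch(struct dom_sketch *d)
{
    int rc = modify_returncode_sketch(d);

    /* Already running: skip the resume request and report success. */
    if ( rc == RC_NOT_SUSPENDED )
        return 0;

    if ( rc != RC_OK )
        return rc;

    d->resume_calls++; /* stand-in for issuing XEN_DOMCTL_resumedomain */
    return 0;
}
```

The trade-off is the one raised in the notes: this avoids the libxl error message for a domain that never suspended, at the cost of hiding the fact that the caller asked to resume a running domain.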