From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH RFC] libxc: Protect xc_domain_resume from clobbering domain registers Date: Mon, 19 May 2014 13:26:31 +0100 Message-ID: <5379F877.4000506@citrix.com> References: <1400342483-18476-1-git-send-email-andryuk@aero.org> <5379D0C7.6020309@citrix.com> <5379F3EE.4050809@aero.org> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5379F3EE.4050809@aero.org> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jason Andryuk Cc: xen-devel@lists.xen.org, Ian Jackson , Ian Campbell , Stefano Stabellini List-Id: xen-devel@lists.xenproject.org On 19/05/14 13:07, Jason Andryuk wrote: > On 5/19/2014 5:37 AM, Andrew Cooper wrote: >> On 17/05/14 17:01, Jason Andryuk wrote: >>> xc_domain_resume() expects the guest to be in state SHUTDOWN_suspend. >>> However, nothing verifies the state before modify_returncode() modifies >>> the domain's registers. This will crash guest processes or the kernel >>> itself. >>> >>> This can be demonstrated with `LIBXL_SAVE_HELPER=/bin/false xl migrate`. >>> >>> Signed-off-by: Jason Andryuk >> Hmm. >> >> There is no possible way whatsoever that migration can work if a PV >> guest is not in SHUTDOWN_suspend. PV guests have to leave an MFN in edx >> which the toolstack rewrites with a new MFN on resume. >> >> By default, there is no need for knowledge from the HVM guest for >> migrate. XenServer is perfectly capable of migrating HVM VMs without PV >> drivers. I suspect therefore that we never use cooperative resume. > I've only used 64-bit PV domUs, so I haven't really thought about HVM. If info.shutdown_reason == SHUTDOWN_suspend is expected for all HVM cases, then the hunk can stand. Otherwise it should be moved later after HVM without PV drivers has exited. I disagree. It is unconditionally an error to be in this function with a guest which is not in SHUTDOWN_suspend, even if there is a codepath through the function which would not corrupt state. I would leave the hunk as it stands, although you might consider setting errno so EINVAL (or something more appropriate). > >> This cooperative resume which modifies guest register state therefore >> imposes the same SHUTDOWN_suspend restriction on HVM guests as it does >> for PV guests. As a result, your patch below is correct as a fallback >> safety measure, and should be taken. >> >> However the caller of modify_returncode is also at fault for attempting >> to resume an already-running domain. I think there needs to be a bugfix >> there as well. I presume that some piece of code is assuming that >> despite libxl-save-helper failing, xc_domain_safe() paused the guest, >> which is clearly not true in this case. > Agreed. modify_returncode was already making the call to xc_domain_info (and doing the damage), so adding a check there was easy. > > The patch was posted RFC since I was looking for guidance on whether xc_domain_resume on a running domain is an error or should it be treated as success? The original modify_returncode returns 0 on success or -1 on error. This patch returns 1 for the already running case. This could be handled differently by the caller to bypass XEN_DOMCTL_resumedomain without returning an error. > > -Jason I still think the real problem is higher up. The toolstack absolutely shouldn't running xc_domain_resume() on a running domain. ~Andrew