From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: [PATCH] x86: expose XENMEM_get_pod_target to subject domain
Date: Wed, 2 Apr 2014 16:10:24 +0100
Message-ID: <533C2860.2070206@eu.citrix.com>
References: <530B57A7020000780011ECAC@nat28.tlf.novell.com>
 <530C5C60020000780011F076@nat28.tlf.novell.com>
 <1393406711.6506.15.camel@kazak.uk.xensource.com>
 <530DC5EF020000780011F6D6@nat28.tlf.novell.com>
 <1393410525.18730.17.camel@kazak.uk.xensource.com>
 <533BEBF2020000780000492D@nat28.tlf.novell.com>
 <533C1D83.3080207@eu.citrix.com>
 <533C42B80200007800004C64@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1WVMoD-00076f-S0 for xen-devel@lists.xenproject.org;
 Wed, 02 Apr 2014 15:10:34 +0000
In-Reply-To: <533C42B80200007800004C64@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Keir Fraser, Ian Campbell, Stefano Stabellini, xen-devel, Daniel De Graaf
List-Id: xen-devel@lists.xenproject.org

On 04/02/2014 04:02 PM, Jan Beulich wrote:
>>>> On 02.04.14 at 16:24, wrote:
>> I understand the sentiment; but as I said, the real problem is a lack of
>> clarity about what exactly the toolstack is asking the VM to do. This
>> is obviously a particular problem in the case of PoD, but it's still a
>> problem even for non-PoD guests; it's just that the outcomes are less
>> severe. If we solve the general problem, we'll solve the PoD problem.
>>
>> The other thing is that the whole point of PoD is to be transparent to
>> the guest.
>> Xen is already careful in how it handles post-creation
>> adjustments to the PoD size -- always increasing the PoD cache, never
>> decreasing it -- specifically so that the guest doesn't need to know.
>>
>> What should really happen at some point is for PoD to just become a
>> special case of swapping. In a sense, that's almost the same issue: you
>> could have a situation where the toolstack asks a guest to balloon down,
>> and the guest does so; but not as low as the toolstack expected, so the
>> toolstack labels the guest as "misbehaving" and tells Xen to swap out
>> pages until it reaches what the toolstack thinks is the correct value.
>> The guest won't crash, but performance will be impacted.
>>
>> The target in xenstore could be made tightly coupled: if the toolstack
>> always wrote into xenstore exactly what it reported to Xen, then it
>> would be the same.
>>
>> Alternatively, since now Xen is involved with ballooning targets --
>> whether you're doing PoD or swapping -- maybe we should consider moving
>> the "target" into Xen entirely. Then there would be no chance for
>> "drift", as Xen and the balloon driver would be working from the same
>> data. This would be basically repurposing the get/set pod_target
>> hypercall to something specifically for ballooning.
> This all reads like something that won't happen soon, and wouldn't
> likely be reasonably backportable. Yet we have the problem in
> shipping code, and hence alongside a proper long term solution we
> should also (and perhaps first) try to find a simple and sufficiently
> correct short term one. (But yes, the present "balloon down much
> further than needed" model might be perceived as that short term
> one, albeit personally I don't like it.)

There are three things I mentioned:

1. Make the static-max / target something rational and usable by the
   balloon driver.
2. Move "target" from xenstore into the hypervisor, and make a proper
   interface for it there.
3. Re-implement PoD as a special case of hypervisor swap.

#3 is unlikely to happen soon; but it's not a solution to your problem
anyway. It just changes the failure mode from "guest crashes" to
"guest experiences performance degradation".

Either #1 or #2 should be straightforward to implement and backport; #1
would probably be the easier of the two to backport. (Yet another reason
to prefer it over #2.)

 -George
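
To make the "drift" concern concrete, here is a rough toy model in Python. It is only a sketch under loud assumptions: ToyGuest, its fields, and the page counts are all made up for illustration, and nothing in it corresponds to a real Xen, xenstore, or balloon driver interface. It only shows why a guest that balloons to a number different from the one Xen was given can end up able to touch more pages than the PoD cache can back.

```python
from dataclasses import dataclass


@dataclass
class ToyGuest:
    """Toy PoD guest; every name here is hypothetical, not a Xen interface."""
    static_max: int   # pages the guest sees at boot
    pod_cache: int    # pages actually available to back guest memory

    def balloon_to(self, target: int) -> int:
        """Balloon down to `target`; return pages the guest may still touch."""
        return min(self.static_max, target)

    def outcome(self, target_seen_by_guest: int) -> str:
        touchable = self.balloon_to(target_seen_by_guest)
        # If the guest can touch more pages than the cache can back,
        # a PoD guest can run the cache dry (the "guest crashes" mode).
        return "ok" if touchable <= self.pod_cache else "cache can run dry"


# Two copies of the target: the toolstack tells Xen one number and
# writes a slightly different one to xenstore (the "drift" case).
xen_target, xenstore_target = 1000, 1024
guest = ToyGuest(static_max=2048, pod_cache=xen_target)
print(guest.outcome(xenstore_target))   # cache can run dry

# One copy of the target: guest and Xen work from the same number.
print(guest.outcome(xen_target))        # ok
```

In this toy, either tightly coupling the two numbers (#1) or keeping a single copy in the hypervisor (#2) makes the first case impossible; the model says nothing about which is easier to implement.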