From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhigang Wang
Subject: Re: Proposed new "memory capacity claim" hypercall/feature
Date: Mon, 05 Nov 2012 17:58:13 -0500
Message-ID: <50984485.3010405@oracle.com>
References: <50939A1502000078000A5F61@nat28.tlf.novell.com>
 <7481128d-3f65-4cc3-ad96-1d4e9cd25094@default>
 <20121104203532.GA11377@ocelot.phlegethon.org>
 <26f55aab-7523-4681-9a61-a5e5740d43a9@default>
 <1352111399.25014.101.camel@hastur.hellion.org.uk>
 <5517445d-ed2e-4592-9a2b-98fc612f1ae8@default>
 <1352154288.7253.25.camel@hastur.hellion.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1352154288.7253.25.camel@hastur.hellion.org.uk>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: Dan Magenheimer, "Keir (Xen.org)", Konrad Wilk, George Dunlap,
 Ian Jackson, "Tim (Xen.org)", Olaf Hering, "xen-devel@lists.xen.org",
 George Shuklin, Dario Faggioli, Jan Beulich, Kurt Hackel
List-Id: xen-devel@lists.xenproject.org

On 11/05/2012 05:24 PM, Ian Campbell wrote:
> On Mon, 2012-11-05 at 14:54 +0000, Dan Magenheimer wrote:
>>> On Mon, 2012-11-05 at 00:23 +0000, Dan Magenheimer wrote:
>>>> There is no "free up enough memory on that host". Tmem doesn't start
>>>> ballooning out enough memory to start the VM... the guests are
>>>> responsible for doing the ballooning and it is _already done_. The
>>>> machine either has sufficient free+freeable memory or it does not;
>>> How does one go about deciding which host in a multi thousand host
>>> deployment to try the claim hypercall on?
> I guess I don't see how your proposed claim hypercall is useful if you
> can't decide which machine you should call it on, whether it's 10s, 100s
> or 1000s of hosts. Surely you aren't suggesting that the toolstack try
> it on all (or even a subset) of them and see which sticks?
>
> By ignoring this part of the problem I think you are ignoring one of the
> most important bits of the story, without which it is very hard to make
> a useful and informed determination about the validity of the use cases
> you are describing for the new call.

Planned implementation:

 1. Every Server (dom0) sends memory statistics to the Manager every 20
    seconds (tunable).
 2. At any given time, the Manager selects a Server to run the VM based
    on its latest snapshot of Server memory. The selected Server should
    have either enough free memory for the VM, or
    free + freeable memory > VM memory.

Two ways to handle failures:

 1. Try start_vm on the first selected Server. If that fails, try the
    second one.
 2. Try to reserve memory on the first Server. If that fails, try the
    second one. If it succeeds, run start_vm on that Server.

From a high level, Dan's proposal could help with 2). If memory
allocation is fast enough (VM start fails or succeeds very quickly),
then 1) is preferred.

Thanks,

Zhigang
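
P.S. To make the two options concrete, here is a rough sketch of the Manager-side logic. It is only an illustration under assumptions, not the actual implementation: the names ServerStats, candidates, place_by_start and place_by_reserve are hypothetical, memory is assumed to be counted in KB, and start_vm / reserve stand in for whatever call the Manager makes to a Server (reserve would be where something like the proposed claim fits). Ordering candidates by free + freeable memory, and moving on to the next Server if start_vm fails after a successful reserve, are also my assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class ServerStats:
    """One entry of the Manager's memory snapshot (hypothetical shape)."""
    name: str
    free_kb: int
    freeable_kb: int


def candidates(snapshot: Iterable[ServerStats], vm_kb: int) -> List[ServerStats]:
    """Servers with enough free memory, or free + freeable > VM memory."""
    fits = [s for s in snapshot
            if s.free_kb >= vm_kb or s.free_kb + s.freeable_kb > vm_kb]
    # Assumption: try the Server with the most free + freeable memory first.
    return sorted(fits, key=lambda s: s.free_kb + s.freeable_kb, reverse=True)


def place_by_start(snapshot: Iterable[ServerStats], vm_kb: int,
                   start_vm: Callable[[str, int], bool]) -> Optional[str]:
    """Option 1: try start_vm on each candidate until one succeeds."""
    for s in candidates(snapshot, vm_kb):
        if start_vm(s.name, vm_kb):
            return s.name
    return None


def place_by_reserve(snapshot: Iterable[ServerStats], vm_kb: int,
                     reserve: Callable[[str, int], bool],
                     start_vm: Callable[[str, int], bool]) -> Optional[str]:
    """Option 2: reserve memory first, then start_vm only where that worked."""
    for s in candidates(snapshot, vm_kb):
        # Assumption: if either step fails, move on to the next candidate.
        if reserve(s.name, vm_kb) and start_vm(s.name, vm_kb):
            return s.name
    return None
```

The only difference between the two is what a failed attempt costs: in option 1 it costs a full failed VM start, in option 2 only a failed reservation. That is why option 1 is fine if start_vm fails quickly.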