From: George Dunlap
Subject: Re: [PATCH 2 of 2 RFC] xl: allow for moving the domain's memory when changing vcpu affinity
Date: Fri, 6 Jul 2012 14:30:24 +0100
Message-ID: <4FF6E870.9040702@eu.citrix.com>
In-Reply-To: <1341581104.32747.43.camel@zakaz.uk.xensource.com>
References: <89aba27edf62271a4862.1341568445@Solace> <4FF6DFE5.7060403@eu.citrix.com> <1341581104.32747.43.camel@zakaz.uk.xensource.com>
To: Ian Campbell
Cc: "Zhang, Yang Z", Ian Jackson, Dario Faggioli, Andre Przywara, xen-devel
List-Id: xen-devel@lists.xenproject.org

On 06/07/12 14:25, Ian Campbell wrote:
> On Fri, 2012-07-06 at 13:53 +0100, George Dunlap wrote:
>> On 06/07/12 10:54, Dario Faggioli wrote:
>>> By introducing a new '-M' option to the `xl vcpu-pin' command. The actual
>>> memory "movement" is achieved by suspending the domain to a temporary file
>>> and resuming it with the new vcpu-affinity.
>> Hmm... this will work and be reliable, but it seems a bit clunky. Long
>> term we want to be able to do node migration in the background without
>> shutting down a VM, right? If that can be done in the 4.3 timeframe,
>> then it seems unnecessary to implement something like this.
> We could do something cleverer for HVM (or hybrid) guests to migrate
> pages while the guest is live, but migrating a page under a PV guest's
> feet requires quiescing it in the style of a migrate.
>
> We could probably manage to make it such that you just need the
> pause/frob/unpause phases without actually writing RAM to disk, and that
> infrastructure might be useful for other reasons I suppose. (e.g. I
> think bad page offlining currently uses a similar save/restore trick)

Yes, I think doing this in chunks is likely to be much preferable to
writing the whole thing to disk and reading it out again. Doing it
"live" may end up with more total overhead, but it won't require the VM
actually being down for the suspend/resume cycle, which on a large
guest may be tens of seconds.

 -George
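
[Editorial note: a minimal sketch of how the proposed option might be invoked, going by the patch description quoted above. The domain name is hypothetical, and the exact placement of the new '-M' flag relative to the existing `xl vcpu-pin <Domain> <VCpu|all> <Cpus|all>' arguments is an assumption, not taken from the patch itself.]

    # Pin all of domain "guest1"'s vcpus to pcpus 0-3 and, with the proposed
    # -M option, also move its memory by suspending the domain to a temporary
    # file and resuming it with the new affinity (per the RFC description;
    # flag position and domain name are assumed for illustration).
    xl vcpu-pin -M guest1 all 0-3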