From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Levon
Subject: save/restore race
Date: Tue, 23 Jan 2007 22:01:45 +0000
Message-ID: <20070123220145.GA22372@totally.trollied.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

save requires a valid arch.pfn_to_mfn_frame_list_list MFN. However,
there is no guarantee that this is up to date, since a previous restore
is considered complete as soon as the domain is unpaused:

    if not paused:
        dominfo.unpause()

    dominfo.completeRestore(handler.store_mfn, handler.console_mfn)

It seems that Linux is being lucky here, in that rebuilding the MFNs is
the first thing it does after suspend(). On Solaris, it occurs somewhat
later in the resume process due to constraints on locking within our MMU
code.

This doesn't seem specific to migration either, a save just after a
restore has completed can hit this race as far as I can see.

I'm short on ideas that don't involve a new interface (like the domain
writing back a xenstore value when it's done resuming). Suggestions?

regards
john
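
For illustration only, the "domain writes back a xenstore value when it's
done resuming" idea could be modelled roughly as below. This is a toy
sketch and not xend code: FakeStore is an in-memory stand-in for xenstore,
and the key path, the "done" value, and the guest_resume() and save()
helpers are all hypothetical names made up for the example. The point is
only the ordering: save blocks until the guest has announced that its MFN
lists are valid again, instead of trusting the unpause.

```python
import threading
import time


class FakeStore:
    """In-memory stand-in for xenstore (hypothetical, not the real API)."""

    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def write(self, key, value):
        # Store the value and wake anyone waiting on the store.
        with self._cond:
            self._data[key] = value
            self._cond.notify_all()

    def wait_for(self, key, value, timeout):
        """Block until key holds value; return False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._data.get(key) == value, timeout)


def resume_key(domid):
    # Hypothetical path; no such node is defined by the original message.
    return "/local/domain/%d/control/resume-state" % domid


def guest_resume(store, domid, delay):
    """Model a guest that rebuilds its MFN lists some time after unpause."""
    time.sleep(delay)  # stands in for the rest of the resume path
    # Only at this point would pfn_to_mfn_frame_list_list be valid again.
    store.write(resume_key(domid), "done")


def save(store, domid, timeout=5.0):
    """Refuse to save until the guest has reported that resume is complete."""
    if not store.wait_for(resume_key(domid), "done", timeout):
        raise RuntimeError("domain %d has not finished resuming" % domid)
    return "saved"


if __name__ == "__main__":
    store = FakeStore()
    guest = threading.Thread(target=guest_resume, args=(store, 1, 0.1))
    guest.start()
    print(save(store, 1))
    guest.join()
```

In this model a save issued right after the restore simply waits for the
guest's write; a guest that never writes the key makes the save fail with
a timeout rather than read a stale frame list.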