From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55334)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Zuj16-0002qO-Rw for qemu-devel@nongnu.org;
	Fri, 06 Nov 2015 10:33:30 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Zuj13-0001cT-IP for qemu-devel@nongnu.org;
	Fri, 06 Nov 2015 10:33:28 -0500
Received: from e28smtp07.in.ibm.com ([122.248.162.7]:35656)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Zuj12-0001be-TF for qemu-devel@nongnu.org;
	Fri, 06 Nov 2015 10:33:25 -0500
Received: from /spool/local by e28smtp07.in.ibm.com
	with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ; Fri, 6 Nov 2015 21:03:22 +0530
Received: from d28relay04.in.ibm.com (d28relay04.in.ibm.com [9.184.220.61])
	by d28dlp02.in.ibm.com (Postfix) with ESMTP id E6E353940019
	for ; Fri, 6 Nov 2015 21:03:18 +0530 (IST)
Received: from d28av03.in.ibm.com (d28av03.in.ibm.com [9.184.220.65])
	by d28relay04.in.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id tA6FXIrO62587090
	for ; Fri, 6 Nov 2015 21:03:18 +0530
Received: from d28av03.in.ibm.com (localhost [127.0.0.1])
	by d28av03.in.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id tA6FXHGB013840
	for ; Fri, 6 Nov 2015 21:03:18 +0530
Date: Fri, 6 Nov 2015 21:03:14 +0530
From: Bharata B Rao
Message-ID: <20151106153314.GA14232@in.ibm.com>
References: <1446747083-18205-1-git-send-email-dgilbert@redhat.com>
	<20151106034846.GC29481@in.ibm.com>
	<20151106090952.GA2459@work-vm>
	<20151106122222.GF2459@work-vm>
	<20151106134341.GG2459@work-vm>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151106134341.GG2459@work-vm>
Subject: Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation
Reply-To: bharata@linux.vnet.ibm.com
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Dr. David Alan Gilbert"
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, quintela@redhat.com,
	liang.z.li@intel.com, "qemu-devel@nongnu.org" , luis@cs.umu.se,
	Bharata B Rao , "amit.shah@redhat.com" , Paolo Bonzini , David Gibson

On Fri, Nov 06, 2015 at 01:43:42PM +0000, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > * Bharata B Rao (bharata.rao@gmail.com) wrote:
> > > On Fri, Nov 6, 2015 at 2:39 PM, Dr. David Alan Gilbert
> > > wrote:
> > > > * Bharata B Rao (bharata@linux.vnet.ibm.com) wrote:
> > > >> On Thu, Nov 05, 2015 at 06:10:27PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > >> > From: "Dr. David Alan Gilbert"
> > > >> >
> > > >> > This is the 9th cut of my version of postcopy.
> > > >> >
> > > >> > The userfaultfd linux kernel code is now in the upstream kernel
> > > >> > tree, and so 4.3 can be used without modification.
> > > >> >
> > > >> > This qemu series can be found at:
> > > >> > https://github.com/orbitfp7/qemu.git
> > > >> > on the wp3-postcopy-v9 tag
> > > >> >
> > > >> > Testing status:
> > > >> > * Tested heavily on x86
> > > >> > * Smoke tested on aarch64 (so it does work on different page sizes)
> > > >>
> > > >> Tested minimally on ppc64 with back and forth postcopy migration of
> > > >> unloaded pseries guest within the localhost - works as expected.
> > > >>
> > > >> However I am seeing a failure in one case. I am not sure if this is
> > > >> a user error or a real issue in postcopy migration. If I switch to postcopy
> > > >> migration immediately after starting the migration, I see the migration
> > > >> failing with error:
> > > >>
> > > >> qemu-system-ppc64: qemu_savevm_send_packaged: Unreasonably large packaged state: 25905005
> > > >
> > > > I put an arbitrary limit of 16MB (see MAX_VM_CMD_PACKAGED_SIZE in include/sysemu/sysemu.h)
> > > > on the size of the data accepted into the packaged blob. How big is the htab data likely to be?
> > >
> > > HTAB size is a variable and depends on maxmem size. It will be 1/128
> > > th of maxmem. So for a 32G guest, HTAB will be 256M in size.
> >
> > OK, that does get a bit big.
> > Two possible fixes;
> > 1 - postcopy htab (I don't know htab to know how hard that is)
> > 2 - do one pass of iterable/non-postcopiable devices before we start the package;
> > I'm just writing a patch to try that; I'll send it to you to let
> > you try once I get it to not-break normal migration.
> >
>
> Hi Bharata,
> Can you try the patch below and let me know if it solves the problem;
> if it doesn't, I'd be interested to know when the HTAB routines get
> called in the precopy/postcopy phases.
>
> Dave
>
> >From 0f965d4dec7b188aec5324c3350704f993517cc8 Mon Sep 17 00:00:00 2001
> From: "Dr. David Alan Gilbert"
> Date: Fri, 6 Nov 2015 12:06:16 +0000
> Subject: [PATCH] Finish non-postcopiable iterative devices before package
>
> Where we have iterable, but non-postcopiable devices (e.g. htab
> or block migration), complete them before forming the 'package'
> but with the CPUs stopped. This stops them filling up the package.

That helps, and the migration now succeeds when I switch to postcopy
immediately after starting the migration.

However, after postcopy migration, when I attempt to start an incoming
instance again to migrate the guest back, I see this failure:

qemu-system-ppc64: cannot set up guest memory 'ppc_spapr.ram': Cannot allocate memory

The same doesn't happen with normal migration. I will see if I can debug
this more tomorrow.

Regards,
Bharata.
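
PS: For reference, the size arithmetic discussed above can be sketched
roughly as below. This is only an illustration, not code from the QEMU
tree: the helper names and PACKAGED_LIMIT_BYTES are made up here, and the
"16MB" limit mentioned in the thread is assumed to mean 16 MiB.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the "16MB" package limit from the thread, read as 16 MiB.
 * This is a local stand-in, not the actual MAX_VM_CMD_PACKAGED_SIZE. */
#define PACKAGED_LIMIT_BYTES (16ULL * 1024 * 1024)

/* HTAB size as described in the thread: 1/128th of maxmem. */
static uint64_t htab_size_bytes(uint64_t maxmem_bytes)
{
    return maxmem_bytes / 128;
}

/* Nonzero if a blob of this size would be rejected by the assumed limit. */
static int exceeds_packaged_limit(uint64_t blob_bytes)
{
    return blob_bytes > PACKAGED_LIMIT_BYTES;
}

/* Sanity checks using only the numbers quoted in the thread. */
static void check_thread_numbers(void)
{
    /* 32G of maxmem gives a 256M HTAB. */
    assert(htab_size_bytes(32ULL << 30) == (256ULL << 20));
    /* A 256M HTAB alone is far above the assumed 16 MiB limit. */
    assert(exceeds_packaged_limit(htab_size_bytes(32ULL << 30)));
    /* The reported packaged state of 25905005 bytes is also above it. */
    assert(exceeds_packaged_limit(25905005ULL));
}
```

So under these assumptions a full HTAB for a 32G guest could never fit in
the package, which is consistent with completing the HTAB before the
package is formed, as the patch does.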