From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45968)
	by lists.gnu.org with esmtp (Exim 4.71) id 1ZuhJ4-0003km-8L
	for qemu-devel@nongnu.org; Fri, 06 Nov 2015 08:43:55 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	id 1ZuhJ1-0000iI-1i
	for qemu-devel@nongnu.org; Fri, 06 Nov 2015 08:43:54 -0500
Received: from mx1.redhat.com ([209.132.183.28]:41952)
	by eggs.gnu.org with esmtp (Exim 4.71) id 1ZuhJ0-0000hz-OM
	for qemu-devel@nongnu.org; Fri, 06 Nov 2015 08:43:50 -0500
Date: Fri, 6 Nov 2015 13:43:42 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20151106134341.GG2459@work-vm>
References: <1446747083-18205-1-git-send-email-dgilbert@redhat.com>
	<20151106034846.GC29481@in.ibm.com>
	<20151106090952.GA2459@work-vm>
	<20151106122222.GF2459@work-vm>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151106122222.GF2459@work-vm>
Subject: Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation
To: Bharata B Rao
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, quintela@redhat.com,
	liang.z.li@intel.com, "qemu-devel@nongnu.org", luis@cs.umu.se,
	Bharata B Rao, "amit.shah@redhat.com", Paolo Bonzini, David Gibson

* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Bharata B Rao (bharata.rao@gmail.com) wrote:
> > On Fri, Nov 6, 2015 at 2:39 PM, Dr. David Alan Gilbert wrote:
> > > * Bharata B Rao (bharata@linux.vnet.ibm.com) wrote:
> > >> On Thu, Nov 05, 2015 at 06:10:27PM +0000, Dr. David Alan Gilbert (git) wrote:
> > >> > From: "Dr. David Alan Gilbert"
> > >> >
> > >> > This is the 9th cut of my version of postcopy.
> > >> >
> > >> > The userfaultfd linux kernel code is now in the upstream kernel
> > >> > tree, and so 4.3 can be used without modification.
> > >> >
> > >> > This qemu series can be found at:
> > >> >   https://github.com/orbitfp7/qemu.git
> > >> > on the wp3-postcopy-v9 tag
> > >> >
> > >> > Testing status:
> > >> >   * Tested heavily on x86
> > >> >   * Smoke tested on aarch64 (so it does work on different page sizes)
> > >>
> > >> Tested minimally on ppc64 with back and forth postcopy migration of
> > >> unloaded pseries guest within the localhost - works as expected.
> > >>
> > >> However I am seeing a failure in one case. I am not sure if this is
> > >> a user error or a real issue in postcopy migration. If I switch to postcopy
> > >> migration immediately after starting the migration, I see the migration
> > >> failing with error:
> > >>
> > >> qemu-system-ppc64: qemu_savevm_send_packaged: Unreasonably large packaged state: 25905005
> > >
> > > I put an arbitrary limit of 16MB (see MAX_VM_CMD_PACKAGED_SIZE in include/sysemu/sysemu.h)
> > > on the size of the data accepted into the packaged blob. How big is the htab data likely to be?
> >
> > HTAB size is a variable and depends on maxmem size. It will be 1/128
> > th of maxmem. So for a 32G guest, HTAB will be 256M in size.
>
> OK, that does get a bit big.
> Two possible fixes;
>   1 - postcopy htab (I don't know htab to know how hard that is)
>   2 - do one pass of iterable/non-postcopiable devices before we start the package;
>       I'm just writing a patch to try that; I'll send it to you to let
>       you try once I get it to not-break normal migration.
>

Hi Bharata,
  Can you try the patch below and let me know if it solves the problem;
if it doesn't, I'd be interested to know when the HTAB routines get called
in the precopy/postcopy phases.

Dave

From 0f965d4dec7b188aec5324c3350704f993517cc8 Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert"
Date: Fri, 6 Nov 2015 12:06:16 +0000
Subject: [PATCH] Finish non-postcopiable iterative devices before package

Where we have iterable, but non-postcopiable devices (e.g.
htab or block migration), complete them before forming the 'package'
but with the CPUs stopped. This stops them filling up the package.

Signed-off-by: Dr. David Alan Gilbert
---
 include/sysemu/sysemu.h |  2 +-
 migration/migration.c   | 10 ++++++++--
 migration/savevm.c      | 10 ++++++++--
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index f992494..3bb8897 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -112,7 +112,7 @@ void qemu_savevm_state_header(QEMUFile *f);
 int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
 void qemu_savevm_state_cleanup(void);
 void qemu_savevm_state_complete_postcopy(QEMUFile *f);
-void qemu_savevm_state_complete_precopy(QEMUFile *f);
+void qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only);
 void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
                                uint64_t *res_non_postcopiable,
                                uint64_t *res_postcopiable);
diff --git a/migration/migration.c b/migration/migration.c
index fd51d79..1d382ce 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1429,6 +1429,12 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
     }
 
     /*
+     * Cause any non-postcopiable, but iterative devices to
+     * send out their final data.
+     */
+    qemu_savevm_state_complete_precopy(ms->file, true);
+
+    /*
      * in Finish migrate and with the io-lock held everything should
      * be quiet, but we've potentially still got dirty pages and we
      * need to tell the destination to throw any pages it's already received
@@ -1471,7 +1477,7 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
      */
     qemu_savevm_send_postcopy_listen(fb);
 
-    qemu_savevm_state_complete_precopy(fb);
+    qemu_savevm_state_complete_precopy(fb, false);
     qemu_savevm_send_ping(fb, 3);
 
     qemu_savevm_send_postcopy_run(fb);
@@ -1538,7 +1544,7 @@ static void migration_completion(MigrationState *s, int current_active_state,
             ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
             if (ret >= 0) {
                 qemu_file_set_rate_limit(s->file, INT64_MAX);
-                qemu_savevm_state_complete_precopy(s->file);
+                qemu_savevm_state_complete_precopy(s->file, false);
             }
         }
         qemu_mutex_unlock_iothread();
diff --git a/migration/savevm.c b/migration/savevm.c
index e5c8482..7e43923 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1026,7 +1026,7 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
     qemu_fflush(f);
 }
 
-void qemu_savevm_state_complete_precopy(QEMUFile *f)
+void qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only)
 {
     QJSON *vmdesc;
     int vmdesc_len;
@@ -1041,9 +1041,11 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f)
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
         if (!se->ops ||
             (in_postcopy && se->ops->save_live_complete_postcopy) ||
+            (in_postcopy && !iterable_only) ||
             !se->ops->save_live_complete_precopy) {
             continue;
         }
+
         if (se->ops && se->ops->is_active) {
             if (!se->ops->is_active(se->opaque)) {
                 continue;
@@ -1062,6 +1064,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f)
         }
     }
 
+    if (iterable_only) {
+        return;
+    }
+
     vmdesc = qjson_new();
     json_prop_int(vmdesc, "page_size", TARGET_PAGE_SIZE);
     json_start_array(vmdesc, "devices");
@@ -1176,7 +1182,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 
     ret = qemu_file_get_error(f);
     if (ret == 0) {
-        qemu_savevm_state_complete_precopy(f);
+        qemu_savevm_state_complete_precopy(f, false);
         ret = qemu_file_get_error(f);
     }
     if (ret != 0) {
-- 
2.5.0

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
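
For reference, the sizes quoted in the thread can be checked with a few lines of C. This is only a rough sketch: the 1/128 ratio and the 16MB cap come from the messages above, and "MB"/"G" are assumed to be binary units. The helper names and the PACKAGED_LIMIT_BYTES macro are invented for illustration and are not QEMU identifiers (the thread names the real limit as MAX_VM_CMD_PACKAGED_SIZE in include/sysemu/sysemu.h).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 16MB cap mentioned in the thread. */
#define PACKAGED_LIMIT_BYTES (UINT64_C(16) << 20)

/* HTAB size as described in the thread: 1/128th of maxmem. */
uint64_t htab_bytes_for_maxmem(uint64_t maxmem_bytes)
{
    return maxmem_bytes / 128;
}

/* True if a blob of this size would be accepted under the assumed cap. */
bool fits_in_package(uint64_t blob_bytes)
{
    return blob_bytes <= PACKAGED_LIMIT_BYTES;
}

/* Worked numbers from the thread, checked with assert(). */
void check_thread_numbers(void)
{
    /* A 32G guest gives a 256M HTAB. */
    assert(htab_bytes_for_maxmem(UINT64_C(32) << 30) == (UINT64_C(256) << 20));
    /* A 256M HTAB cannot fit in a 16MB package. */
    assert(!fits_in_package(UINT64_C(256) << 20));
    /* Neither does the 25905005-byte state from the reported error. */
    assert(!fits_in_package(UINT64_C(25905005)));
}
```

Calling check_thread_numbers() from a main() should return without tripping an assertion: under these assumptions both the 256M HTAB and the 25905005-byte state exceed the cap, which is consistent with the "Unreasonably large packaged state" error reported above.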