From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47810)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgibson@ozlabs.org>) id 1Zvea6-0003wA-4J
	for qemu-devel@nongnu.org; Mon, 09 Nov 2015 00:01:27 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dgibson@ozlabs.org>) id 1Zvea4-0008HX-TC
	for qemu-devel@nongnu.org; Mon, 09 Nov 2015 00:01:26 -0500
Received: from ozlabs.org ([2401:3900:2:1::2]:58867)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgibson@ozlabs.org>) id 1Zvea4-0008GZ-96
	for qemu-devel@nongnu.org; Mon, 09 Nov 2015 00:01:24 -0500
Date: Mon, 9 Nov 2015 15:13:59 +1100
From: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20151109041359.GC18558@voom.redhat.com>
References: <1446747083-18205-1-git-send-email-dgilbert@redhat.com>
	<20151106034846.GC29481@in.ibm.com> <20151106090952.GA2459@work-vm>
	<CAGZKiBrHLKvOfxrB9juKzS3v_gdp-jfKFGH8TGNB-4zOiDUWbA@mail.gmail.com>
	<20151106122222.GF2459@work-vm>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="z4+8/lEcDcG5Ke9S"
Content-Disposition: inline
In-Reply-To: <20151106122222.GF2459@work-vm>
Subject: Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, quintela@redhat.com, liang.z.li@intel.com, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, Bharata B Rao <bharata.rao@gmail.com>, luis@cs.umu.se, Bharata B Rao <bharata@linux.vnet.ibm.com>, "amit.shah@redhat.com" <amit.shah@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>


--z4+8/lEcDcG5Ke9S
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Nov 06, 2015 at 12:22:23PM +0000, Dr. David Alan Gilbert wrote:
> * Bharata B Rao (bharata.rao@gmail.com) wrote:
> > On Fri, Nov 6, 2015 at 2:39 PM, Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > > * Bharata B Rao (bharata@linux.vnet.ibm.com) wrote:
> > >> On Thu, Nov 05, 2015 at 06:10:27PM +0000, Dr. David Alan Gilbert (git) wrote:
> > >> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > >> >
> > >> >   This is the 9th cut of my version of postcopy.
> > >> >
> > >> > The userfaultfd linux kernel code is now in the upstream kernel
> > >> > tree, and so 4.3 can be used without modification.
> > >> >
> > >> > This qemu series can be found at:
> > >> > https://github.com/orbitfp7/qemu.git
> > >> > on the wp3-postcopy-v9 tag
> > >> >
> > >> > Testing status:
> > >> >   * Tested heavily on x86
> > >> >   * Smoke tested on aarch64 (so it does work on different page sizes)
> > >>
> > >> Tested minimally on ppc64 with back and forth postcopy migration of
> > >> unloaded pseries guest within the localhost - works as expected.
> > >>
> > >> However I am seeing a failure in one case. I am not sure if this is
> > >> a user error or a real issue in postcopy migration. If I switch to postcopy
> > >> migration immediately after starting the migration, I see the migration
> > >> failing with error:
> > >>
> > >> qemu-system-ppc64: qemu_savevm_send_packaged: Unreasonably large packaged state: 25905005
> > >
> > > I put an arbitrary limit of 16MB (see MAX_VM_CMD_PACKAGED_SIZE in include/sysemu/sysemu.h)
> > > on the size of the data accepted into the packaged blob.  How big is the htab data likely to be?
> >
> > HTAB size is a variable and depends on maxmem size. It will be 1/128
> > th of maxmem. So for a 32G guest, HTAB will be 256M in size.
>
> OK, that does get a bit big.
> Two possible fixes;
>  1 - postcopy htab (I don't know htab to know how hard that is)

It's.. awkward.  We'd need a way to set up the mappings on the
destination so that faults on bits of the hash table not yet up to
date get flagged and handed to qemu, rather than causing a fatal fault
in the guest.  I suspect that will need host kernel changes, although
maybe there's a way of setting up the htab on destination so that
unmapped things look like MMIO (which already goes to qemu).

>  2 - do one pass of iterable/non-postcopiable devices before we start the package;
>      I'm just writing a patch to try that; I'll send it to you to let
>      you try once I get it to not-break normal migration.

Hm.  So, depends a bit on what you mean by "one pass".  If we've had
one complete pass through the hash table, I'd expect that to be enough
to get the package down to a reasonable size.  But one pass through
the full hash table can be multiple calls to the htab iterator.
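
To illustrate the distinction, here's a toy sketch (Python, not QEMU
code; HtabIterator, chunk and the sizes are made up for illustration).
One call to the iterator only sends a bounded chunk, so a full pass
through the table takes several calls:

```python
class HtabIterator:
    """Toy stand-in for an iterative htab save handler (hypothetical)."""

    def __init__(self, n_entries, chunk):
        if chunk <= 0:
            raise ValueError("chunk must be positive")
        self.n_entries = n_entries  # total hash table entries to send
        self.chunk = chunk          # max entries sent per call
        self.index = 0              # next entry not yet sent

    def iterate(self):
        """Send up to `chunk` entries; return True once a full pass is done."""
        self.index = min(self.index + self.chunk, self.n_entries)
        return self.index >= self.n_entries


def calls_for_full_pass(n_entries, chunk):
    """Count the iterator calls needed to cover the whole table once."""
    it = HtabIterator(n_entries, chunk)
    calls = 1
    while not it.iterate():
        calls += 1
    return calls
```

So if "one pass" means one iterator call, most of the table could still
be left for the package.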

Which makes me think it's a bit odd that we're not already getting
most of the htab data across during the precopy phase.  Don't we
already delay entering the postcopy phase until precopy is "complete"
in the sense that the remaining non-postcopiable data is below the
downtime limit?  I would have thought that would also ensure we'd only
have a reasonable number of remaining htab updates for the package.
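
Rough numbers for the sizes involved, as a sketch (Python; the 16MB
limit and the 1/128 ratio are taken from the quoted text, the function
names are made up, and I'm assuming MB means 2^20 bytes):

```python
MAX_PACKAGED_SIZE = 16 * 1024 * 1024  # the 16MB limit quoted above (assumed binary MB)
GiB = 1024 ** 3


def htab_size(maxmem_bytes):
    """HTAB size as 1/128th of maxmem, per the quoted description."""
    return maxmem_bytes // 128


def fits_in_package(remaining_bytes):
    """True if the remaining non-postcopiable data fits under the limit."""
    return remaining_bytes <= MAX_PACKAGED_SIZE


full_htab = htab_size(32 * GiB)
print(full_htab // (1024 * 1024))   # 256 (MB) for a 32G guest
print(fits_in_package(full_htab))   # False: a wholly unsent htab is far too big
print(fits_in_package(25905005))    # False: the size from the reported failure
```

So the package only stays under the limit if nearly all of the htab has
already gone across before it is built.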

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

--z4+8/lEcDcG5Ke9S
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJWQB2HAAoJEGw4ysog2bOSHLAP/jE3fVgjhxecQCjHtZsSExEC
HkiqXzBE+z8XDuPSC+O/0+JTshSvSGIL5cWMRdj1n1xF6xO1IkouZb0QKNEEKjpR
G+7YIWhE5gNuNroigH1ydpuFbCDdKj6AZB94Oim5mVm1zUXuoHEGqkf+1sMuXFWw
/W7GSBSNm64eEKUEAoD7NCvBmBEr6RwJrO5LdwmJvuKWiwU+zLEfgXCzBnR8wOsX
OBUKhvjGLTOaGMSgfZdEmybH2KwBcZEDzapGIUCMHDG9XjcbIr11ulHyRXThze7R
iGySN7ggjlTSnCHOlZ7Lupov+T0qr3pzFxSaQKhre2w6jdgUlSAwCyqLxrLMq1Yq
BowUteliSUO+UL26sOhKeu5K+R9427esOuhcUHfttyI3UW/kwzeWGsy2HbI3KjPQ
2Kq7eCQBexRq2N2vP0L6GnVQ5OllBvEpjaolNrv/VSEwlWNb6G2IcGt16qp9UqWb
9JnvxmeXTfupnMtHhnRs7EKcN0mzNmoEmJG9icEhU6PfYm7fKz1KqnSeiG7eTrOf
Wq8nefPaEgI/yelK9vnS0ORO6U5fqmoqEQ5WwY8BbTOdQqA2sIUYEte5nTQQwBxi
YV1q6497kPxwb2yk4zznxYvItce77pUwRhHEGD6fmaa1Js7pktlHRa8AutyR/LLI
Ww0dzxlZZAwDavBV9mOi
=UTl6
-----END PGP SIGNATURE-----

--z4+8/lEcDcG5Ke9S--