From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:49605)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <eblake@redhat.com>) id 1Sh2r8-0005ne-D9
	for qemu-devel@nongnu.org; Tue, 19 Jun 2012 14:08:52 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <eblake@redhat.com>) id 1Sh2r3-00019V-6N
	for qemu-devel@nongnu.org; Tue, 19 Jun 2012 14:08:45 -0400
Received: from mx1.redhat.com ([209.132.183.28]:21751)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <eblake@redhat.com>) id 1Sh2r2-00018w-U6
	for qemu-devel@nongnu.org; Tue, 19 Jun 2012 14:08:41 -0400
Message-ID: <4FE0C01B.1020000@redhat.com>
Date: Tue, 19 Jun 2012 12:08:27 -0600
From: Eric Blake <eblake@redhat.com>
MIME-Version: 1.0
References: <1340120601-24747-1-git-send-email-owasserm@redhat.com>
	<1340120601-24747-11-git-send-email-owasserm@redhat.com>
In-Reply-To: <1340120601-24747-11-git-send-email-owasserm@redhat.com>
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature";
	boundary="------------enig64CD8AF1A86F457DE222E512"
Subject: Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and
 xbzrle_decode_buffer functions
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Orit Wasserman <owasserm@redhat.com>
Cc: peter.maydell@linaro.org, aliguori@us.ibm.com, quintela@redhat.com, stefanha@gmail.com, qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, blauwirbel@gmail.com, Petter Svard <petters@cs.umu.se>, Benoit Hudzia <benoit.hudzia@sap.com>, avi@redhat.com, Aidan Shribman <aidan.shribman@sap.com>, pbonzini@redhat.com, chegu_vinod@hp.com

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig64CD8AF1A86F457DE222E512
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 06/19/2012 09:43 AM, Orit Wasserman wrote:
> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
> Signed-off-by: Petter Svard <petters@cs.umu.se>
> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  savevm.c |   91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++=
++++++++
>  1 files changed, 91 insertions(+), 0 deletions(-)
>=20

> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,=

> +                         uint8_t *dst, int dlen)
> +{
> +    uint32_t zrun_len =3D 0, nzrun_len =3D 0;
> +    int d =3D 0 , i =3D 0;
> +    uint8_t *nzrun_start =3D NULL;
> +
> +    while (i < slen) {
> +        /* overflow */
> +        if (d + 2 > dlen) {
> +            return -1;
> +        }
> +
> +        while (!(old_buf[i] ^ new_buf[i]) && ++i <=3D slen) {
> +            zrun_len++;
> +        }

Can we vectorize this to compare a word at a time for improved speed?
It would be really cool if glibc had an interface like memcmp(), but
which returned the offset of the first difference instead of a
comparison of the overall memory regions.  But even without such an
interface, pseudocode such as:

while (i < slen) {
  check for overflow
  if (not aligned)
    up to sizeof(long) byte comparisons
  while (long comparisons && not end)
    increment i by sizeof(long)
  if (end of buffer) {
    either buffer unchanged, or we have trailing zrun
    break;
  }
  // found a word that differed
  do byte-wise comparisons on that word
  then back to word-size xor, checking for no zero bytes in the xor (or
anywhere that a zero byte occurs in isolation, it is more compact to
pass that lone byte as part of the xor encoding instead of breaking out
to a zrun)
}

> +
> +        zrun_len =3D 0;
> +        nzrun_start =3D new_buf + i;
> +        while ((old_buf[i] ^ new_buf[i]) !=3D 0 && ++i <=3D slen) {
> +            nzrun_len++;
> +        }

Note that the encoding for a single unchanged byte with bytes changed on
both sides is more compact to pass the unchanged byte as part of the
nzrun, instead of breaking out into a zrun and starting a second nzrun.
 If I'm analyzing things correctly, that's also true for a run of two
bytes, and it's not until we get to a run of three or more unchanged
bytes where we start to get compression (except for the corner case of
ending on a 1- or 2-byte zrun).  Should we be altering this algorithm to
take advantage of this?

> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dle=
n)
> +{
> +    int i =3D 0, d =3D 0;
> +    uint32_t count =3D 0;
> +
> +    while (i < slen) {
> +
> +        /* zrun */
> +        i +=3D uleb128_decode_small(src + i, &count);
> +        d +=3D count;
> +
> +        /* overflow */
> +        g_assert(d <=3D dlen);

Again, is g_assert() appropriate for parsing user-supplied input, or are
there more appropriate ways for gracefully failing an incoming migration
attempt from corrupt data?

> +
> +        /* completed decoding */
> +        if (i =3D=3D slen - 1) {
> +            return d;
> +        }
> +
> +        /* nzrun */
> +        i +=3D uleb128_decode_small(src + i, &count);
> +
> +        g_assert(d + count <=3D dlen);
> +
> +        memcpy(dst + d , src + i, count);

Am I correct that you are using XOR to compute the locations of nzrun,
but then transferring the original data (and not the xor data) over the
wire?  This was not made clear in the doc patch of 3/13.  I would
suggest adding an example to the docs of a 4k page that consists only of
a short string at the beginning of the page (and NUL bytes thereafter),
then show both the before and after version of that page to demonstrate
sparse memory changes, as well as the XBZRLE encoding of that page.

--=20
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


--------------enig64CD8AF1A86F457DE222E512
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBCAAGBQJP4MAbAAoJEKeha0olJ0NqzrAIAKjHioAT4VoG5Y/X6YmQM5Fq
NLueP2OcnwVIFP9xImmVz/RqsAm+VpR7+rUT5x59or1pHVFcDVydGZ2C2YkdoANP
7md9fCM5VNJIjO0lOFjKIxk7TMEBcuu0omuxptLNpQGKMVRZvyAqLU0c7OPmr0Ut
HyH9M/t5JubBvsKNWE7kkXiHoiIDDp0pZWc0ihH2/cZ0YIicPgVYGVLZSHBAnr0M
7eULPaBko2L7kxkTUPMiOu8W21Z1icqbAVuHyWkHH69eOPi4pHxzK9k1oYOmbLq2
bRSLAIiSvhSSXFHu/SUkLc2M8VeZvYpC5YAzg2HBLquLZyXwcgw35LfjLq+E+qE=
=bdTR
-----END PGP SIGNATURE-----

--------------enig64CD8AF1A86F457DE222E512--