From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:41738) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SmAhq-0000Qz-Oa for qemu-devel@nongnu.org; Tue, 03 Jul 2012 17:32:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SmAho-00037H-A9 for qemu-devel@nongnu.org; Tue, 03 Jul 2012 17:32:22 -0400 Received: from mx1.redhat.com ([209.132.183.28]:40523) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SmAho-00036l-19 for qemu-devel@nongnu.org; Tue, 03 Jul 2012 17:32:20 -0400 Message-ID: <4FF364D8.4020407@redhat.com> Date: Tue, 03 Jul 2012 15:32:08 -0600 From: Eric Blake MIME-Version: 1.0 References: <1341323574-23206-1-git-send-email-owasserm@redhat.com> <1341323574-23206-11-git-send-email-owasserm@redhat.com> In-Reply-To: <1341323574-23206-11-git-send-email-owasserm@redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enig6FB6C486FEDA83EEA5DC64D6" Subject: Re: [Qemu-devel] [PATCH v14 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Orit Wasserman Cc: peter.maydell@linaro.org, aliguori@us.ibm.com, quintela@redhat.com, stefanha@gmail.com, qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, blauwirbel@gmail.com, Petter Svard , Benoit Hudzia , avi@redhat.com, Aidan Shribman , pbonzini@redhat.com, chegu_vinod@hp.com This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig6FB6C486FEDA83EEA5DC64D6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 07/03/2012 07:52 AM, Orit Wasserman wrote: > Signed-off-by: Benoit Hudzia > Signed-off-by: Petter Svard > Signed-off-by: Aidan Shribman > Signed-off-by: Orit Wasserman > +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,= > + uint8_t *dst, int dlen) > +{ > + uint32_t zrun_len =3D 0, nzrun_len =3D 0; > + int d =3D 0, i =3D 0, start; > + long res, xor; > + uint8_t *nzrun_start =3D NULL; > + > + g_assert(!((uintptr_t)old_buf & (sizeof(long) - 1)) && > + !((uintptr_t)new_buf & (sizeof(long) - 1)) && > + !(slen & (sizeof(long) - 1))); I know I suggested this in v13, so now I'm bike-shedding my own review, but I think it might be a bit more legible as: g_assert(!(((uintptr_t)old_buf) | ((uintptr_t)new_buf) | slen) & (sizeof(long) - 1))); > + > + while (i < slen) { > + /* overflow */ > + if (d + 2 > dlen) { > + return -1; > + } > + > + /* not aligned to sizeof(long) */ > + res =3D (slen - i) % sizeof(long); > + if (res) { > + start =3D i; > + while (!(old_buf[i] ^ new_buf[i]) && (i - start) <=3D res)= { > + zrun_len++; > + i++; > + } > + } I'm not sure why you need 'start'; this same block will do the same amount of computation with less typing and register pressure: res =3D (slen - i) % sizeof(long); while (res && old_buf[i] =3D=3D new_buf[i]) { zrun_len++; i++; res--; } > + > + if (zrun_len =3D=3D res) { If you use my above rewrite, then this condition changes to: if (!res) { > + while (i <=3D (slen - sizeof(long)) && I'd shorten this to: while (i < slen && > + (*(long *)(old_buf + i)) =3D=3D (*(long *)(new_buf = + i))) { > + i +=3D sizeof(long); > + zrun_len +=3D sizeof(long); > + } > + > + /* go over the rest */ > + while (old_buf[i] =3D=3D new_buf[i] && ++i <=3D slen) { Buffer overrun: if the page ends on a zrun, then at this point, i =3D=3D slen, and old_buf[i] is now pointing one past the end of the array. You need to swap the conditions, and only increment 'i' inside the {}, as in:= while (i < slen && old_buf[i] =3D=3D new_buf[i]) { i++; > + zrun_len++; > + } > + } > + > + /* buffer unchanged */ > + if (zrun_len =3D=3D slen) { > + return 0; > + } > + > + /* skip last zero run */ > + if (i =3D=3D slen + 1) { Buffer overrun. The loop should terminate at 'i =3D=3D slen', not at 'i = =3D=3D slen + 1', since old_buf[slen - 1] is the last addressable byte. > + return d; > + } > + > + d +=3D uleb128_encode_small(dst + d, zrun_len); > + > + /* no nzrun */ > + if (i =3D=3D slen) { > + return d; > + } This is the correct code but repeats the intent of what you were trying just before the uleb128_encode_small(). > + > + zrun_len =3D 0; > + nzrun_start =3D new_buf + i; Missing an overflow check here. If there are fewer than 2 bytes before we hit dlen, then there is no way that we can fit in an nzrun, so we might as well quit here. > + > + /* not aligned to sizeof(long) */ > + res =3D (slen - i) % sizeof(long); > + if (res) { > + start =3D i; > + while (old_buf[i] !=3D new_buf[i] && i - start < res) { > + i++; > + nzrun_len++; > + } > + } Again, you can exploit the fact that slen is aligned, and shorten to: res =3D (slen - i) % sizeof(long); while (res && old_bf[i] !=3D new_buf[i]) { i++; nzrun_len++; res--; } > + > + if (nzrun_len =3D=3D res) { and this would change to: if (!res) { > + /* truncation to 32-bit long okay */ > + long mask =3D 0x0101010101010101ULL; > + xor =3D *(long *)(old_buf + i) ^ *(long *)(new_buf + i); > + printf("cont\n"); Stray debugging comment. > + while (i <=3D (slen - sizeof(long))) { shorter to write as: while (i < slen) { > + if ((xor - mask) & ~xor & (mask << 7)) { > + /* found the end of an nzrun within the current lo= ng */ > + while (old_buf[i] !=3D new_buf[i] && ++i <=3D slen= ) { No need to compare against slen here, since the outer while guarantees that. You can simplify to: while (old_buf[i] !=3D new_buf[i]) { i++; > + nzrun_len++; > + } > + break; > + } else { > + i +=3D sizeof(long); > + nzrun_len +=3D sizeof(long); > + xor =3D *(long *)(old_buf + i) ^ *(long *)(new_buf= + i); > + } > + } > + At this point, you are guaranteed to have found the end of the nzrun (either you found two equal bytes, or you hit slen), which means... > + while (old_buf[i] !=3D new_buf[i] && ++i <=3D slen) { > + nzrun_len++; > + } this while loop is dead code. Furthermore, this while loop is a potential out-of-bounds-read (remember, old_buf[slen] is not valid memory), so nuking it is the way to go. > + } > + > + /* overflow */ > + if (d + nzrun_len + 2 > dlen) { > + return -1; There is a corner case where this reports overflow even when there was none - that is when the old_buf ends in an nzrun of 0x7f bytes or less, so the encoding occupies only 1 byte, and the encoded version would still fit in dlen. I think you want to defer your overflow checking until after... > + } > + > + d +=3D uleb128_encode_small(dst + d, nzrun_len); =2E..you know the encoded size. But that only works if you made sure earlier that there was room to encode nzrun_len (up to 2 bytes) in the first place. > + memcpy(dst + d, nzrun_start, nzrun_len); > + d +=3D nzrun_len; > + nzrun_len =3D 0; > + } > + > + return d; > +} > + Do you have any unit tests that pass in various inputs and compare against expected outputs? > +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dle= n) > +{ > + int i =3D 0, d =3D 0; > + int ret; > + uint32_t count =3D 0; > + > + while (i < slen) { > + > + /* zrun */ > + if ((slen - i) < 2 && *(src + i) & 0x80) { > + return -1; > + } You can simplify this - the protocol demands an nzrun after every zrun, which means at this point in the decode, there must be at least 3 bytes to be valid. It's simpler to write: if (slen - i < 2) { return -1; } > + > + ret =3D uleb128_decode_small(src + i, &count); > + if (ret < 0) { > + return -1; > + } I'd also add a check that we only ever get a zero-length zrun at the start of a buffer (anywhere else, a zero-length zrun is invalid); as in: if (!count && i) { return -1; } > + i +=3D ret; > + d +=3D count; > + > + /* overflow */ > + if (d > dlen) { > + return -1; > + } > + > + /* completed decoding */ No, if we got here, then we were handed an invalid stream. Remember, the protocol says that every zrun must be followed by an nzrun. > + if (i =3D=3D slen - 1) { > + return d; > + } > + > + /* nzrun */ > + if ((slen - i) < 2 && *(src + i) & 0x80) { > + return -1; > + } Again, a valid nzrun is at least 2 bytes, so no need to worry about the contents of *src, it is sufficient to check: if (slen - i < 2) { return -1; } > + ret =3D uleb128_decode_small(src + i, &count); > + if (ret < 0) { An nzrun should be a non-zero value; I'd write this as (ret <=3D 0) to rule out an attempt to pass a zero-length nzrun. > + return -1; > + } > + i +=3D ret; > + > + /* overflow */ > + if (d + count > dlen) { > + return -1; > + } > + > + memcpy(dst + d , src + i, count); > + d +=3D count; > + i +=3D count; > + } > + > + return d; > +} >=20 --=20 Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --------------enig6FB6C486FEDA83EEA5DC64D6 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBCAAGBQJP82TYAAoJEKeha0olJ0NqJKwH/2oHs+u3yheZsadmnotsm7Dg IhRCfaeIxtUHYJn4hb8ZZjT62mmTsra23qMkgbaprya+4RhMs5rUjxaGExbf7EZ/ SifAGfPbZTTOQ1cZa7EgG5Op+DT7pjChdFZ9x1MtHK1ueTD5WYETk8LwLcv5qvXf anYjeJRiXF4d+LasKt+kl2Kwb5dQUHLvPHOn4JBD4ztRQ0/EBahTvAfkdOwpLxQM CEvk4x+AFvE0ZaamAyJZ2jN0G/gzS0hifZi3tTJcCvuYA2reCBiyKmouW6EL9v4i bgxm6UVwmOu2/UzRM/JW86W0ywq5c2dqbtC9EPl+c+1DLSifD8H6cDdHCAgNf3o= =WUxm -----END PGP SIGNATURE----- --------------enig6FB6C486FEDA83EEA5DC64D6--