All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: arei.gonglei@huawei.com
Cc: ChenLiang <chenliang88@huawei.com>,
	pbonzini@redhat.com, weidong.huang@huawei.com,
	qemu-devel@nongnu.org, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy
Date: Fri, 28 Mar 2014 19:59:09 +0000	[thread overview]
Message-ID: <20140328195908.GK2450@work-vm> (raw)
In-Reply-To: <1395890303-7880-8-git-send-email-arei.gonglei@huawei.com>

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Reducing data copy can reduce cpu overhead.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> ---
>  arch_init.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 84a4bd3..94b62e2 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -373,11 +373,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>  
>      prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
>  
> -    /* save current buffer into memory */
> -    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> -
>      /* XBZRLE encoding (if there is no overflow) */
> -    encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> +    encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
>                                         TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
>                                         TARGET_PAGE_SIZE);

Hi Gonglei,

I've got a world which has this patch series on, and it's producing some XBZRLE errors,
and I suspect that it's down to my original worries of running xbzrle_encode_buffer
on changing data.

My setup is a pair of machines, with a guest with 4GB RAM running SPECjbb, 1GbE
between them and a 2GB xbzrle cache, and I can reliably trigger:

Failed to load XBZRLE page - decode error!


so I added some debug and saw:

qemu-system-x86_64: xbzrle_decode_buffer: Failed decode (uleb128 a) ret=1 i=45 count=0

xbzrle data:: 0000:  38 04 c0 d4  a4 fc 5c 04  d9 d4 a4 fc  5c 04 f2 d4
xbzrle data:: 0010:  a4 fc 88 02  03 c1 dd 79  01 01 03 17  10 87 b2 a3
xbzrle data:: 0020:  e8 8e b2 a3  e8 95 b2 a3  e8 9c b2 a3  e8 00*0c a3  <--- * corresponds to i=45
xbzrle data:: 0030:  b2 a3 e8 aa  b2 a3 e8 b1  b2 a3 e8 00  10 b8 b2 a3
xbzrle data:: 0040:  e8 bf b2 a3  e8 c6 b2 a3  e8 d2 ca a3  e8 00 0c d9
xbzrle data:: 0050:  ca a3 e8 e0  ca a3 e8 e7  ca a3 e8 00  08 ee ca a3
xbzrle data:: 0060:  e8 f5 ca a3  e8 00 08 fc  ca a3 e8 03  cb a3 e8 00
xbzrle data:: 0070:  0c 0a cb a3  e8 11 cb a3  e8 18 cb a3  e8 00 08 1f
xbzrle data:: 0080:  cb a3 e8 26  cb a3 e8 40  03 29 de 79  e9 02 01 82
xbzrle data:: 0090:  03 04 00 00  00 00 14 01  00 03 01 52  07 04 00 00
xbzrle data:: 00a0:  00 00 28 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00b0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00c0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00d0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00e0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00f0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0100:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0110:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0120:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0130:  00 00 e0 08  01 5a 03 01  b1 01 01 b1

If I understand this correctly the zero-run was found to be '0' length, and
that should never happen - xbzrle should always output non-0 lengths
for both it's zero and nz run lengths according to my reading of the code
and the error check on the decode.

So I added:
@@ -73,6 +74,11 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
             return d;
         }
 
+        if (zrun_len == 0) {
+            error_report("xbzrle_encode_buffer: zrun_len=0! i=%d\n",i);
+            return -1;
+        }
+
         d += uleb128_encode_small(dst + d, zrun_len);

on the encode side, and yes that's triggering (I also added it for the
nzrun version)

The code in xbzrle.c is basically like:

a  loop {
b     while *ptr == 0   increment
c     save count of 0's
d     while *ptr != 0   increment
e     save count of none=0's
f  }

With your patch the data can be changing during this loop since
the code now runs directly on current_data, so that a byte that might
have read as none-0 by loop (b) above, gets changed by the guest
to 0 just after (b) exits.  When it enters (d) it reads the byte
find it's 0 and outputs a '0' length count which is invalid format.

Dave

>      if (encoded_len == 0) {
> @@ -396,7 +393,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>  
>      /* we need to update the data in the cache, in order to get the same data */
>      if (!last_stage) {
> -        memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
> +        xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
> +                             TARGET_PAGE_SIZE);
>      }
>  
>      /* Send XBZRLE based compressed page */
> -- 
> 1.7.12.4
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2014-03-28 19:59 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues arei.gonglei
2014-03-27  3:26   ` [Qemu-devel] for 2.0? " Eric Blake
2014-03-27  3:45     ` Gonglei (Arei)
2014-03-27 12:31       ` Paolo Bonzini
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 2/8] migration: Add counts of updating the dirty bitmap arei.gonglei
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user arei.gonglei
2014-03-27  3:27   ` Eric Blake
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate arei.gonglei
2014-03-27  3:28   ` Eric Blake
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
2014-04-01 16:47   ` Dr. David Alan Gilbert
2014-04-02  9:01     ` Gonglei (Arei)
2014-04-03 14:37     ` 陈梁
2014-04-03 14:40       ` Dr. David Alan Gilbert
2014-04-03 15:36         ` 陈梁
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 6/8] XBZRLE: rebuild the cache_is_cached function arei.gonglei
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy arei.gonglei
2014-03-28 19:59   ` Dr. David Alan Gilbert [this message]
2014-03-29  7:51     ` Gonglei (Arei)
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 8/8] migration: clear the dead code arei.gonglei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140328195908.GK2450@work-vm \
    --to=dgilbert@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=chenliang88@huawei.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=weidong.huang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.