* [Qemu-devel] [PATCH RESEND for 2.3 1/6] xbzrle: optimize XBZRLE to decrease the cache misses
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
@ 2014-11-24 11:55 ` arei.gonglei
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 2/6] xbzrle: rebuild the cache_is_cached function arei.gonglei
` (5 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: arei.gonglei @ 2014-11-24 11:55 UTC (permalink / raw)
To: qemu-devel
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, dgilbert,
Gonglei, amit.shah, pbonzini
From: ChenLiang <chenliang88@huawei.com>
Avoid hot pages in the cache being replaced by colder ones; this
remarkably decreases cache misses.
Sample results with the test program (quoted from docs/xbzrle.txt) running
in the VM (migration bandwidth: 1 GbE, xbzrle cache size: 8 MB):
the test program:
#include <stdlib.h>
#include <stdio.h>
int main()
{
    char *buf = (char *) calloc(4096, 4096);
    while (1) {
        int i;
        for (i = 0; i < 4096 * 4; i++) {
            buf[i * 4096 / 4]++;
        }
        printf(".");
    }
}
before this patch:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":1020,"xbzrle-cache":{"bytes":1108284,
"cache-size":8388608,"cache-miss-rate":0.987013,"pages":18297,"overflow":8,
"cache-miss":1228737},"status":"active","setup-time":10,"total-time":52398,
"ram":{"total":12466991104,"remaining":1695744,"mbps":935.559472,
"transferred":5780760580,"dirty-sync-counter":271,"duplicate":2878530,
"dirty-pages-rate":29130,"skipped":0,"normal-bytes":5748592640,
"normal":1403465}},"id":"libvirt-706"}
18k pages were sent compressed in 52 seconds.
The cache-miss-rate is 98.7%: almost every page misses.
after this patch:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":2054,"xbzrle-cache":{"bytes":5066763,
"cache-size":8388608,"cache-miss-rate":0.485924,"pages":194823,"overflow":0,
"cache-miss":210653},"status":"active","setup-time":11,"total-time":18729,
"ram":{"total":12466991104,"remaining":3895296,"mbps":937.663549,
"transferred":1615042219,"dirty-sync-counter":98,"duplicate":2869840,
"dirty-pages-rate":58781,"skipped":0,"normal-bytes":1588404224,
"normal":387794}},"id":"libvirt-266"}
194k pages were sent compressed in 18 seconds.
The cache-miss-rate decreases to 48.59%.
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
arch_init.c | 8 +++++---
docs/xbzrle.txt | 8 ++++++++
include/migration/page_cache.h | 10 +++++++---
page_cache.c | 23 +++++++++++++++++++----
4 files changed, 39 insertions(+), 10 deletions(-)
diff --git a/arch_init.c b/arch_init.c
index 7680d28..846e4c5 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -346,7 +346,8 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
/* We don't care if this fails to allocate a new cache page
* as long as it updated an old one */
- cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE);
+ cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE,
+ bitmap_sync_count);
}
#define ENCODING_FLAG_XBZRLE 0x1
@@ -358,10 +359,11 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
int encoded_len = 0, bytes_sent = -1;
uint8_t *prev_cached_page;
- if (!cache_is_cached(XBZRLE.cache, current_addr)) {
+ if (!cache_is_cached(XBZRLE.cache, current_addr, bitmap_sync_count)) {
acct_info.xbzrle_cache_miss++;
if (!last_stage) {
- if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
+ if (cache_insert(XBZRLE.cache, current_addr, *current_data,
+ bitmap_sync_count) == -1) {
return -1;
} else {
/* update *current_data when the page has been
diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
index cc3a26a..52c8511 100644
--- a/docs/xbzrle.txt
+++ b/docs/xbzrle.txt
@@ -71,6 +71,14 @@ encoded buffer:
encoded length 24
e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
+Cache update strategy
+=====================
+Keeping the hot pages in the cache is effective for decreasing cache
+misses. XBZRLE uses a counter as the age of each page. The counter will
+increase after each ram dirty bitmap sync. When a cache conflict is
+detected, XBZRLE will only evict pages in the cache that are older than
+a threshold.
+
Usage
======================
1. Verify the destination QEMU version is able to decode the new format.
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index 2d5ce2d..10ed532 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -43,8 +43,10 @@ void cache_fini(PageCache *cache);
*
* @cache pointer to the PageCache struct
* @addr: page addr
+ * @current_age: current bitmap generation
*/
-bool cache_is_cached(const PageCache *cache, uint64_t addr);
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+ uint64_t current_age);
/**
* get_cached_data: Get the data cached for an addr
@@ -60,13 +62,15 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr);
* cache_insert: insert the page into the cache. the page cache
* will dup the data on insert. the previous value will be overwritten
*
- * Returns -1 on error
+ * Returns -1 when the page isn't inserted into cache
*
* @cache pointer to the PageCache struct
* @addr: page address
* @pdata: pointer to the page
+ * @current_age: current bitmap generation
*/
-int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
+ uint64_t current_age);
/**
* cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index 89bb1ec..aa68192 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -33,6 +33,9 @@
do { } while (0)
#endif
+/* the page in cache will not be replaced in two cycles */
+#define CACHED_PAGE_LIFETIME 2
+
typedef struct CacheItem CacheItem;
struct CacheItem {
@@ -122,7 +125,8 @@ static size_t cache_get_cache_pos(const PageCache *cache,
return pos;
}
-bool cache_is_cached(const PageCache *cache, uint64_t addr)
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+ uint64_t current_age)
{
size_t pos;
@@ -131,7 +135,12 @@ bool cache_is_cached(const PageCache *cache, uint64_t addr)
pos = cache_get_cache_pos(cache, addr);
- return (cache->page_cache[pos].it_addr == addr);
+ if (cache->page_cache[pos].it_addr == addr) {
+ /* update the it_age when the cache hit */
+ cache->page_cache[pos].it_age = current_age;
+ return true;
+ }
+ return false;
}
static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
@@ -151,7 +160,8 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
return cache_get_by_addr(cache, addr)->it_data;
}
-int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
+ uint64_t current_age)
{
CacheItem *it = NULL;
@@ -162,6 +172,11 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
/* actual update of entry */
it = cache_get_by_addr(cache, addr);
+ if (it->it_data && it->it_addr != addr &&
+ it->it_age + CACHED_PAGE_LIFETIME > current_age) {
+ /* the cache page is fresh, don't replace it */
+ return -1;
+ }
/* allocate page */
if (!it->it_data) {
it->it_data = g_try_malloc(cache->page_size);
@@ -174,7 +189,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
memcpy(it->it_data, pdata, cache->page_size);
- it->it_age = ++cache->max_item_age;
+ it->it_age = current_age;
it->it_addr = addr;
return 0;
--
1.7.12.4
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH RESEND for 2.3 2/6] xbzrle: rebuild the cache_is_cached function
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 1/6] xbzrle: optimize XBZRLE to decrease the cache misses arei.gonglei
@ 2014-11-24 11:55 ` arei.gonglei
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 3/6] xbzrle: don't check the value in the vm ram repeatedly arei.gonglei
` (4 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: arei.gonglei @ 2014-11-24 11:55 UTC (permalink / raw)
To: qemu-devel
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, dgilbert,
Gonglei, amit.shah, pbonzini
From: ChenLiang <chenliang88@huawei.com>
Rebuild the cache_is_cached function on top of cache_get_by_addr, and
drop the asserts because the callers already assert the same thing.
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
page_cache.c | 38 ++++++++++++++++----------------------
1 file changed, 16 insertions(+), 22 deletions(-)
diff --git a/page_cache.c b/page_cache.c
index aa68192..cf8878d 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -125,24 +125,6 @@ static size_t cache_get_cache_pos(const PageCache *cache,
return pos;
}
-bool cache_is_cached(const PageCache *cache, uint64_t addr,
- uint64_t current_age)
-{
- size_t pos;
-
- g_assert(cache);
- g_assert(cache->page_cache);
-
- pos = cache_get_cache_pos(cache, addr);
-
- if (cache->page_cache[pos].it_addr == addr) {
- /* update the it_age when the cache hit */
- cache->page_cache[pos].it_age = current_age;
- return true;
- }
- return false;
-}
-
static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
{
size_t pos;
@@ -160,14 +142,26 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
return cache_get_by_addr(cache, addr)->it_data;
}
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+ uint64_t current_age)
+{
+ CacheItem *it;
+
+ it = cache_get_by_addr(cache, addr);
+
+ if (it->it_addr == addr) {
+ /* update the it_age when the cache hit */
+ it->it_age = current_age;
+ return true;
+ }
+ return false;
+}
+
int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
uint64_t current_age)
{
- CacheItem *it = NULL;
-
- g_assert(cache);
- g_assert(cache->page_cache);
+ CacheItem *it;
/* actual update of entry */
it = cache_get_by_addr(cache, addr);
--
1.7.12.4
* [Qemu-devel] [PATCH RESEND for 2.3 3/6] xbzrle: don't check the value in the vm ram repeatedly
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 1/6] xbzrle: optimize XBZRLE to decrease the cache misses arei.gonglei
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 2/6] xbzrle: rebuild the cache_is_cached function arei.gonglei
@ 2014-11-24 11:55 ` arei.gonglei
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene arei.gonglei
` (3 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: arei.gonglei @ 2014-11-24 11:55 UTC (permalink / raw)
To: qemu-devel
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, dgilbert,
Gonglei, amit.shah, pbonzini
From: ChenLiang <chenliang88@huawei.com>
xbzrle_encode_buffer reads the same bytes of VM RAM repeatedly. That is
risky when xbzrle_encode_buffer runs on concurrently changing data.
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
xbzrle.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/xbzrle.c b/xbzrle.c
index 8e220bf..d27a140 100644
--- a/xbzrle.c
+++ b/xbzrle.c
@@ -27,7 +27,7 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
uint8_t *dst, int dlen)
{
uint32_t zrun_len = 0, nzrun_len = 0;
- int d = 0, i = 0;
+ int d = 0, i = 0, j;
long res;
uint8_t *nzrun_start = NULL;
@@ -82,6 +82,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
if (d + 2 > dlen) {
return -1;
}
+ i++;
+ nzrun_len++;
/* not aligned to sizeof(long) */
res = (slen - i) % sizeof(long);
while (res && old_buf[i] != new_buf[i]) {
@@ -96,14 +98,18 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
unsigned long mask = (unsigned long)0x0101010101010101ULL;
while (i < slen) {
unsigned long xor;
+ uint8_t *xor_ptr = (uint8_t *)(&xor);
xor = *(unsigned long *)(old_buf + i)
^ *(unsigned long *)(new_buf + i);
if ((xor - mask) & ~xor & (mask << 7)) {
/* found the end of an nzrun within the current long */
- while (old_buf[i] != new_buf[i]) {
- nzrun_len++;
- i++;
+ for (j = 0; j < sizeof(long); j++) {
+ if (0 == xor_ptr[j]) {
+ break;
+ }
}
+ i += j;
+ nzrun_len += j;
break;
} else {
i += sizeof(long);
@@ -120,6 +126,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
memcpy(dst + d, nzrun_start, nzrun_len);
d += nzrun_len;
nzrun_len = 0;
+ i++;
+ zrun_len++;
}
return d;
--
1.7.12.4
* [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
` (2 preceding siblings ...)
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 3/6] xbzrle: don't check the value in the vm ram repeatedly arei.gonglei
@ 2014-11-24 11:55 ` arei.gonglei
2014-12-10 3:18 ` Amit Shah
2014-12-10 10:05 ` Juan Quintela
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy arei.gonglei
` (2 subsequent siblings)
6 siblings, 2 replies; 22+ messages in thread
From: arei.gonglei @ 2014-11-24 11:55 UTC (permalink / raw)
To: qemu-devel
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, dgilbert,
Gonglei, amit.shah, pbonzini
From: ChenLiang <chenliang88@huawei.com>
The logic of the old code is correct, but checking byte by byte wastes
time when the data was changed concurrently during encoding.
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
xbzrle.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/xbzrle.c b/xbzrle.c
index d27a140..0477367 100644
--- a/xbzrle.c
+++ b/xbzrle.c
@@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
/* word at a time for speed */
if (!res) {
- while (i < slen &&
- (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
- i += sizeof(long);
- zrun_len += sizeof(long);
- }
-
- /* go over the rest */
- while (i < slen && old_buf[i] == new_buf[i]) {
- zrun_len++;
- i++;
+ while (i < slen) {
+ if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
+ i += sizeof(long);
+ zrun_len += sizeof(long);
+ } else {
+ /* go over the rest */
+ for (j = 0; j < sizeof(long); j++) {
+ if (old_buf[i] == new_buf[i]) {
+ i++;
+ zrun_len++;
+ } else {
+ break;
+ }
+ }
+ if (j != sizeof(long)) {
+ break;
+ }
+ }
}
}
--
1.7.12.4
* Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene arei.gonglei
@ 2014-12-10 3:18 ` Amit Shah
2014-12-10 3:55 ` ChenLiang
2014-12-10 10:05 ` Juan Quintela
1 sibling, 1 reply; 22+ messages in thread
From: Amit Shah @ 2014-12-10 3:18 UTC (permalink / raw)
To: arei.gonglei
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, qemu-devel,
pbonzini, dgilbert
On (Mon) 24 Nov 2014 [19:55:50], arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
>
> The logic of old code is correct. But Checking byte by byte will
> consume time after an concurrency scene.
>
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> xbzrle.c | 28 ++++++++++++++++++----------
> 1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/xbzrle.c b/xbzrle.c
> index d27a140..0477367 100644
> --- a/xbzrle.c
> +++ b/xbzrle.c
> @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>
> /* word at a time for speed */
> if (!res) {
> - while (i < slen &&
> - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> - i += sizeof(long);
> - zrun_len += sizeof(long);
> - }
> -
> - /* go over the rest */
> - while (i < slen && old_buf[i] == new_buf[i]) {
> - zrun_len++;
> - i++;
> + while (i < slen) {
> + if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> + i += sizeof(long);
> + zrun_len += sizeof(long);
> + } else {
> + /* go over the rest */
> + for (j = 0; j < sizeof(long); j++) {
> + if (old_buf[i] == new_buf[i]) {
> + i++;
> + zrun_len++;
I don't see how this is different from the code it's replacing. The
check and increments are all the same. Difficult to see why there'll
be a speed benefit. Can you please explain? Do you have any
performance numbers for before/after?
Thanks,
Amit
* Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-12-10 3:18 ` Amit Shah
@ 2014-12-10 3:55 ` ChenLiang
2014-12-10 10:02 ` Amit Shah
0 siblings, 1 reply; 22+ messages in thread
From: ChenLiang @ 2014-12-10 3:55 UTC (permalink / raw)
To: Amit Shah
Cc: weidong.huang, quintela, qemu-devel, dgilbert, arei.gonglei,
pbonzini, peter.huangpeng
On 2014/12/10 11:18, Amit Shah wrote:
> On (Mon) 24 Nov 2014 [19:55:50], arei.gonglei@huawei.com wrote:
>> From: ChenLiang <chenliang88@huawei.com>
>>
>> The logic of old code is correct. But Checking byte by byte will
>> consume time after an concurrency scene.
>>
>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> xbzrle.c | 28 ++++++++++++++++++----------
>> 1 file changed, 18 insertions(+), 10 deletions(-)
>>
>> diff --git a/xbzrle.c b/xbzrle.c
>> index d27a140..0477367 100644
>> --- a/xbzrle.c
>> +++ b/xbzrle.c
>> @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>>
>> /* word at a time for speed */
>> if (!res) {
>> - while (i < slen &&
>> - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> - i += sizeof(long);
>> - zrun_len += sizeof(long);
>> - }
>> -
>> - /* go over the rest */
>> - while (i < slen && old_buf[i] == new_buf[i]) {
>> - zrun_len++;
>> - i++;
>> + while (i < slen) {
>> + if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> + i += sizeof(long);
>> + zrun_len += sizeof(long);
>> + } else {
>> + /* go over the rest */
>> + for (j = 0; j < sizeof(long); j++) {
>> + if (old_buf[i] == new_buf[i]) {
>> + i++;
>> + zrun_len++;
>
> I don't see how this is different from the code it's replacing. The
> check and increments are all the same. Difficult to see why there'll
> be a speed benefit. Can you please explain? Do you have any
> performance numbers for before/after?
>
> Thanks,
>
> Amit
>
> .
>
Hi Amit:
+ for (j = 0; j < sizeof(long); j++) {
+ if (old_buf[i] == new_buf[i]) {
+ i++;
+ zrun_len++;
+ } else {
+ break;
+ }
+ }
+ if (j != sizeof(long)) {
+ break;
+ }
The branch *j != sizeof(long)* may not be taken after a concurrent
modification, so we can go back to doing
"(*(long *)(old_buf + i)) == (*(long *)(new_buf + i))".
The old code, on the other hand, keeps doing "old_buf[i] == new_buf[i]".
To be honest, this scene is rare.
Best regards
ChenLiang
* Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-12-10 3:55 ` ChenLiang
@ 2014-12-10 10:02 ` Amit Shah
2014-12-10 10:37 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 22+ messages in thread
From: Amit Shah @ 2014-12-10 10:02 UTC (permalink / raw)
To: ChenLiang
Cc: weidong.huang, quintela, qemu-devel, dgilbert, arei.gonglei,
pbonzini, peter.huangpeng
On (Wed) 10 Dec 2014 [11:55:49], ChenLiang wrote:
> On 2014/12/10 11:18, Amit Shah wrote:
>
> > On (Mon) 24 Nov 2014 [19:55:50], arei.gonglei@huawei.com wrote:
> >> From: ChenLiang <chenliang88@huawei.com>
> >>
> >> The logic of old code is correct. But Checking byte by byte will
> >> consume time after an concurrency scene.
> >>
> >> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >> ---
> >> xbzrle.c | 28 ++++++++++++++++++----------
> >> 1 file changed, 18 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/xbzrle.c b/xbzrle.c
> >> index d27a140..0477367 100644
> >> --- a/xbzrle.c
> >> +++ b/xbzrle.c
> >> @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
> >>
> >> /* word at a time for speed */
> >> if (!res) {
> >> - while (i < slen &&
> >> - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> >> - i += sizeof(long);
> >> - zrun_len += sizeof(long);
> >> - }
> >> -
> >> - /* go over the rest */
> >> - while (i < slen && old_buf[i] == new_buf[i]) {
> >> - zrun_len++;
> >> - i++;
> >> + while (i < slen) {
> >> + if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> >> + i += sizeof(long);
> >> + zrun_len += sizeof(long);
> >> + } else {
> >> + /* go over the rest */
> >> + for (j = 0; j < sizeof(long); j++) {
> >> + if (old_buf[i] == new_buf[i]) {
> >> + i++;
> >> + zrun_len++;
> >
> > I don't see how this is different from the code it's replacing. The
> > check and increments are all the same. Difficult to see why there'll
> > be a speed benefit. Can you please explain? Do you have any
> > performance numbers for before/after?
> >
> > Thanks,
> >
> > Amit
> >
> > .
> >
>
> Hi Amit:
>
> + for (j = 0; j < sizeof(long); j++) {
> + if (old_buf[i] == new_buf[i]) {
> + i++;
> + zrun_len++;
> + } else {
> + break;
> + }
> + }
> + if (j != sizeof(long)) {
> + break;
> + }
>
> The branch of *j != sizeof(long)* may not be hit after an concurrency scene.
> so we can continue doing "(*(long *)(old_buf + i)) == (*(long *)(new_buf + i))".
> On the another side the old code does "old_buf[i] == new_buf[i]".
Frankly, I still don't see it.
Earlier:
while..
match words
while..
match bytes
Now:
while..
match words
if word mismatch
match bytes
To me, that essentially looks the same.
I propose to drop this patch until we have a proper justification.
Amit
* Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-12-10 10:02 ` Amit Shah
@ 2014-12-10 10:37 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2014-12-10 10:37 UTC (permalink / raw)
To: Amit Shah
Cc: ChenLiang, weidong.huang, quintela, qemu-devel, peter.huangpeng,
arei.gonglei, pbonzini
* Amit Shah (amit.shah@redhat.com) wrote:
> On (Wed) 10 Dec 2014 [11:55:49], ChenLiang wrote:
> > On 2014/12/10 11:18, Amit Shah wrote:
> >
> > > On (Mon) 24 Nov 2014 [19:55:50], arei.gonglei@huawei.com wrote:
> > >> From: ChenLiang <chenliang88@huawei.com>
> > >>
> > >> The logic of old code is correct. But Checking byte by byte will
> > >> consume time after an concurrency scene.
> > >>
> > >> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> > >> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > >> ---
> > >> xbzrle.c | 28 ++++++++++++++++++----------
> > >> 1 file changed, 18 insertions(+), 10 deletions(-)
> > >>
> > >> diff --git a/xbzrle.c b/xbzrle.c
> > >> index d27a140..0477367 100644
> > >> --- a/xbzrle.c
> > >> +++ b/xbzrle.c
> > >> @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
> > >>
> > >> /* word at a time for speed */
> > >> if (!res) {
> > >> - while (i < slen &&
> > >> - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> > >> - i += sizeof(long);
> > >> - zrun_len += sizeof(long);
> > >> - }
> > >> -
> > >> - /* go over the rest */
> > >> - while (i < slen && old_buf[i] == new_buf[i]) {
> > >> - zrun_len++;
> > >> - i++;
> > >> + while (i < slen) {
> > >> + if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> > >> + i += sizeof(long);
> > >> + zrun_len += sizeof(long);
> > >> + } else {
> > >> + /* go over the rest */
> > >> + for (j = 0; j < sizeof(long); j++) {
> > >> + if (old_buf[i] == new_buf[i]) {
> > >> + i++;
> > >> + zrun_len++;
> > >
> > > I don't see how this is different from the code it's replacing. The
> > > check and increments are all the same. Difficult to see why there'll
> > > be a speed benefit. Can you please explain? Do you have any
> > > performance numbers for before/after?
> > >
> > > Thanks,
> > >
> > > Amit
> > >
> > > .
> > >
> >
> > Hi Amit:
> >
> > + for (j = 0; j < sizeof(long); j++) {
> > + if (old_buf[i] == new_buf[i]) {
> > + i++;
> > + zrun_len++;
> > + } else {
> > + break;
> > + }
> > + }
> > + if (j != sizeof(long)) {
> > + break;
> > + }
> >
> > The branch of *j != sizeof(long)* may not be hit after an concurrency scene.
> > so we can continue doing "(*(long *)(old_buf + i)) == (*(long *)(new_buf + i))".
> > On the another side the old code does "old_buf[i] == new_buf[i]".
>
> Frankly, I still don't see it.
>
> Earlier:
>
> while..
> match words
> while..
> match bytes
>
> Now:
>
> while..
> match words
> if word mismatch
> match bytes
>
> to me, essentially looks the same.
>
> I'll propose to drop this patch till we have a proper justification.
Watch for the next patch: patch 5 makes new_buf the live, volatile guest
memory. When that happens you can fall into the 'match bytes' path and
then find a whole word matching again, because the memory changed while
you were processing it. That is the change this loop makes: it flips
back to processing whole words at a time instead of getting stuck in the
byte loop. (It would be rare, I guess.)
Dave
>
>
> Amit
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene arei.gonglei
2014-12-10 3:18 ` Amit Shah
@ 2014-12-10 10:05 ` Juan Quintela
2014-12-10 10:41 ` Dr. David Alan Gilbert
1 sibling, 1 reply; 22+ messages in thread
From: Juan Quintela @ 2014-12-10 10:05 UTC (permalink / raw)
To: arei.gonglei
Cc: ChenLiang, weidong.huang, qemu-devel, dgilbert, amit.shah,
pbonzini, peter.huangpeng
<arei.gonglei@huawei.com> wrote:
> From: ChenLiang <chenliang88@huawei.com>
>
> The logic of old code is correct. But Checking byte by byte will
> consume time after an concurrency scene.
>
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> xbzrle.c | 28 ++++++++++++++++++----------
> 1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/xbzrle.c b/xbzrle.c
> index d27a140..0477367 100644
> --- a/xbzrle.c
> +++ b/xbzrle.c
> @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>
> /* word at a time for speed */
> if (!res) {
> - while (i < slen &&
> - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> - i += sizeof(long);
> - zrun_len += sizeof(long);
> - }
> -
> - /* go over the rest */
> - while (i < slen && old_buf[i] == new_buf[i]) {
> - zrun_len++;
> - i++;
> + while (i < slen) {
> + if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> + i += sizeof(long);
> + zrun_len += sizeof(long);
> + } else {
> + /* go over the rest */
> + for (j = 0; j < sizeof(long); j++) {
> + if (old_buf[i] == new_buf[i]) {
> + i++;
> + zrun_len++;
> + } else {
> + break;
> + }
> + }
> + if (j != sizeof(long)) {
> + break;
> + }
> + }
> }
> }
This still does misaligned reads. If we want to do aligned accesses,
something like this looks much better, no? Note that where I put
"break" I mean we have finished; but you get the idea. Or am I missing something?
while(i % sizeof(long) != 0) {
if (old_buf[i] == new_buf[i]) {
i++;
zrun_len++;
} else {
break;
}
}
while (i < slen) {
if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
i += sizeof(long);
zrun_len += sizeof(long);
} else {
break;
}
}
for (j = 0; j < sizeof(long); j++) {
if (old_buf[i] == new_buf[i]) {
i++;
zrun_len++;
} else {
break;
}
}
* Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
2014-12-10 10:05 ` Juan Quintela
@ 2014-12-10 10:41 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2014-12-10 10:41 UTC (permalink / raw)
To: Juan Quintela
Cc: ChenLiang, weidong.huang, qemu-devel, dgilbert, arei.gonglei,
amit.shah, pbonzini, peter.huangpeng
* Juan Quintela (quintela@redhat.com) wrote:
> <arei.gonglei@huawei.com> wrote:
> > From: ChenLiang <chenliang88@huawei.com>
> >
> > The logic of old code is correct. But Checking byte by byte will
> > consume time after an concurrency scene.
> >
> > Signed-off-by: ChenLiang <chenliang88@huawei.com>
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > ---
> > xbzrle.c | 28 ++++++++++++++++++----------
> > 1 file changed, 18 insertions(+), 10 deletions(-)
> >
> > diff --git a/xbzrle.c b/xbzrle.c
> > index d27a140..0477367 100644
> > --- a/xbzrle.c
> > +++ b/xbzrle.c
> > @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
> >
> > /* word at a time for speed */
> > if (!res) {
> > - while (i < slen &&
> > - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> > - i += sizeof(long);
> > - zrun_len += sizeof(long);
> > - }
> > -
> > - /* go over the rest */
> > - while (i < slen && old_buf[i] == new_buf[i]) {
> > - zrun_len++;
> > - i++;
> > + while (i < slen) {
> > + if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> > + i += sizeof(long);
> > + zrun_len += sizeof(long);
> > + } else {
> > + /* go over the rest */
> > + for (j = 0; j < sizeof(long); j++) {
> > + if (old_buf[i] == new_buf[i]) {
> > + i++;
> > + zrun_len++;
> > + } else {
> > + break;
> > + }
> > + }
> > + if (j != sizeof(long)) {
> > + break;
> > + }
> > + }
> > }
> > }
>
> This still does misaligned reads. If we want to do aligned stuff,
> something like that looks much better, no? Notice that where I put
> "break", I mean we have finished, but you get the idea. Or I am missing something?
You're missing the loop just above the changed code, which guarantees
alignment by the time this code starts.
Dave
>
> while(i % sizeof(long) != 0) {
> if (old_buf[i] == new_buf[i]) {
> i++;
> zrun_len++;
> } else {
> break;
> }
> }
>
> while (i < slen) {
> if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> i += sizeof(long);
> zrun_len += sizeof(long);
> } else {
> break;
> }
> }
>
> for (j = 0; j < sizeof(long); j++) {
> if (old_buf[i] == new_buf[i]) {
> i++;
> zrun_len++;
> } else {
> break;
> }
> }
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
` (3 preceding siblings ...)
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene arei.gonglei
@ 2014-11-24 11:55 ` arei.gonglei
2014-12-10 10:35 ` Juan Quintela
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 6/6] migration: clear the dead code arei.gonglei
2014-12-10 2:33 ` [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle Amit Shah
6 siblings, 1 reply; 22+ messages in thread
From: arei.gonglei @ 2014-11-24 11:55 UTC (permalink / raw)
To: qemu-devel
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, dgilbert,
Gonglei, amit.shah, pbonzini
From: ChenLiang <chenliang88@huawei.com>
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
arch_init.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch_init.c b/arch_init.c
index 846e4c5..0d0ba4a 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -376,11 +376,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
- /* save current buffer into memory */
- memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
-
/* XBZRLE encoding (if there is no overflow) */
- encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
+ encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
TARGET_PAGE_SIZE);
if (encoded_len == 0) {
@@ -399,7 +396,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
/* we need to update the data in the cache, in order to get the same data */
if (!last_stage) {
- memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
+ xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
+ TARGET_PAGE_SIZE);
}
/* Send XBZRLE based compressed page */
--
1.7.12.4
* Re: [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy arei.gonglei
@ 2014-12-10 10:35 ` Juan Quintela
2014-12-10 10:39 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 22+ messages in thread
From: Juan Quintela @ 2014-12-10 10:35 UTC (permalink / raw)
To: arei.gonglei
Cc: ChenLiang, weidong.huang, qemu-devel, dgilbert, amit.shah,
pbonzini, peter.huangpeng
<arei.gonglei@huawei.com> wrote:
> From: ChenLiang <chenliang88@huawei.com>
>
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> arch_init.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch_init.c b/arch_init.c
> index 846e4c5..0d0ba4a 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -376,11 +376,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>
> prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
>
> - /* save current buffer into memory */
> - memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> -
I think this is wrong.
Remember that migration is now done in parallel with the guest running.
If the guest modifies the page while we are encoding it, we end up with
different contents in the cache and in the real page, and that causes
corruption.
This way, what we encoded was a "private copy" of the page, so we don't
have that problem.
Makes sense?
> /* XBZRLE encoding (if there is no overflow) */
> - encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> + encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
> TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
> TARGET_PAGE_SIZE);
> if (encoded_len == 0) {
> @@ -399,7 +396,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>
> /* we need to update the data in the cache, in order to get the same data */
> if (!last_stage) {
> - memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
> + xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
> + TARGET_PAGE_SIZE);
> }
>
> /* Send XBZRLE based compressed page */
* Re: [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy
2014-12-10 10:35 ` Juan Quintela
@ 2014-12-10 10:39 ` Dr. David Alan Gilbert
2014-12-11 2:34 ` ChenLiang
0 siblings, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2014-12-10 10:39 UTC (permalink / raw)
To: Juan Quintela
Cc: ChenLiang, weidong.huang, qemu-devel, peter.huangpeng,
arei.gonglei, amit.shah, pbonzini
* Juan Quintela (quintela@redhat.com) wrote:
> <arei.gonglei@huawei.com> wrote:
> > From: ChenLiang <chenliang88@huawei.com>
> >
> > Signed-off-by: ChenLiang <chenliang88@huawei.com>
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> > arch_init.c | 8 +++-----
> > 1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch_init.c b/arch_init.c
> > index 846e4c5..0d0ba4a 100644
> > --- a/arch_init.c
> > +++ b/arch_init.c
> > @@ -376,11 +376,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
> >
> > prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
> >
> > - /* save current buffer into memory */
> > - memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> > -
>
> I think this is wrong.
> Remember that now migration is done in parallel with the guest running.
> If the guest modifies the page while we are encoding it, we end with a
> different contents in the cache and in the real page, and that causes
> corruption.
>
> This way, what we encoded is a "private copy of the page, so we don't
> have that problem".
>
> Makes sense?
Kind of; see back in March I hit this while testing the 1st version of this
patch:
https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05631.html
but then we had some patches that fixed it; and the discussion was here:
https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05677.html
and then I summarized it as:
https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05768.html
* It's an interesting, if unusual, observation; it means we can send
* completely bogus data at this point because we know it will get
* overwritten later; I think the requirements are:
*
* 1) That we meet the protocol (which seems to require that the run lengths are
* not allowed to be 0)
* 2) That we don't get stuck in any loops or go over the end of the page
* (I think this means we have to be careful of those byte loops within
* the word-at-a-time cases)
* 3) The page that ends up in our xbzrle cache must match the destination
* page, since the next cycle of xbzrle will use it as reference.
*
Dave
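
[Editorial note: requirement 3 above can be demonstrated with a toy delta format — a list of (offset, byte) pairs, nothing like the real XBZRLE wire format. The point is only the invariant: as long as the source updates its cache by decoding exactly the bytes it sent (as patch 5/6 does), the cache and the destination page stay byte-identical, even when the guest rewrote the page mid-encode and the encoded data reflects a torn snapshot. `toy_encode`, `toy_decode`, and `PAGE` are illustrative names.]

```c
#include <stdint.h>

#define PAGE 16  /* toy page size; real pages are TARGET_PAGE_SIZE */

/* Toy delta encoder: emit (offset, byte) pairs for each byte that
 * differs from the reference page. Not QEMU's XBZRLE format; just
 * enough to demonstrate the cache-consistency invariant. */
static int toy_encode(const uint8_t *ref, const uint8_t *cur, uint8_t *out)
{
    int n = 0;
    for (int i = 0; i < PAGE; i++) {
        if (ref[i] != cur[i]) {
            out[n++] = (uint8_t)i;
            out[n++] = cur[i];
        }
    }
    return n;
}

/* Apply the delta in place. The destination and the source's cache
 * both run exactly this on the same bytes, so they end up identical
 * regardless of what the guest did to the live page meanwhile. */
static void toy_decode(const uint8_t *enc, int len, uint8_t *page)
{
    for (int i = 0; i < len; i += 2) {
        page[enc[i]] = enc[i + 1];
    }
}
```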
> > /* XBZRLE encoding (if there is no overflow) */
> > - encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> > + encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
> > TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
> > TARGET_PAGE_SIZE);
> > if (encoded_len == 0) {
> > @@ -399,7 +396,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
> >
> > /* we need to update the data in the cache, in order to get the same data */
> > if (!last_stage) {
> > - memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
> > + xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
> > + TARGET_PAGE_SIZE);
> > }
> >
> > /* Send XBZRLE based compressed page */
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy
2014-12-10 10:39 ` Dr. David Alan Gilbert
@ 2014-12-11 2:34 ` ChenLiang
2014-12-11 8:57 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 22+ messages in thread
From: ChenLiang @ 2014-12-11 2:34 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: weidong.huang, Juan Quintela, peter.huangpeng, qemu-devel,
arei.gonglei, amit.shah, pbonzini
On 2014/12/10 18:39, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> <arei.gonglei@huawei.com> wrote:
>>> From: ChenLiang <chenliang88@huawei.com>
>>>
>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> ---
>>> arch_init.c | 8 +++-----
>>> 1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch_init.c b/arch_init.c
>>> index 846e4c5..0d0ba4a 100644
>>> --- a/arch_init.c
>>> +++ b/arch_init.c
>>> @@ -376,11 +376,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>>>
>>> prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
>>>
>>> - /* save current buffer into memory */
>>> - memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
>>> -
>>
>> I think this is wrong.
>> Remember that now migration is done in parallel with the guest running.
>> If the guest modifies the page while we are encoding it, we end with a
>> different contents in the cache and in the real page, and that causes
>> corruption.
>>
>> This way, what we encoded is a "private copy of the page, so we don't
>> have that problem".
>>
>> Makes sense?
>
> Kind of; see back in March I hit this while testing the 1st version of this
> patch:
> https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05631.html
>
> but then we had some patches that fixed it; and the discussion was here:
> https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05677.html
> and then I summarized it as:
> https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05768.html
>
> * It's an interesting, if unusual, observation; it means we can send
> * completely bogus data at this point because we know it will get
> * overwritten later; I think the requirements are:
> *
> * 1) That we meet the protocol (which seems to require that the run lengths are
> * not allowed to be 0)
> * 2) That we don't get stuck in any loops or go over the end of the page
> * (I think this means we have to be careful of those byte loops within
> * the word-at-a-time cases)
> * 3) The page that ends up in our xbzrle cache must match the destination
> * page, since the next cycle of xbzrle will use it as reference.
> *
>
> Dave
>
Hi,
The discussion above is helpful for understanding the principle of
XBZRLE. Do you mind if I add it to xbzrle.txt?
Best regards
ChenLiang
>>> /* XBZRLE encoding (if there is no overflow) */
>>> - encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
>>> + encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
>>> TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
>>> TARGET_PAGE_SIZE);
>>> if (encoded_len == 0) {
>>> @@ -399,7 +396,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>>>
>>> /* we need to update the data in the cache, in order to get the same data */
>>> if (!last_stage) {
>>> - memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
>>> + xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
>>> + TARGET_PAGE_SIZE);
>>> }
>>>
>>> /* Send XBZRLE based compressed page */
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>
* Re: [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy
2014-12-11 2:34 ` ChenLiang
@ 2014-12-11 8:57 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2014-12-11 8:57 UTC (permalink / raw)
To: ChenLiang
Cc: weidong.huang, Juan Quintela, peter.huangpeng, qemu-devel,
arei.gonglei, amit.shah, pbonzini
* ChenLiang (chenliang88@huawei.com) wrote:
> On 2014/12/10 18:39, Dr. David Alan Gilbert wrote:
>
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> <arei.gonglei@huawei.com> wrote:
> >>> From: ChenLiang <chenliang88@huawei.com>
> >>>
> >>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>> ---
> >>> arch_init.c | 8 +++-----
> >>> 1 file changed, 3 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/arch_init.c b/arch_init.c
> >>> index 846e4c5..0d0ba4a 100644
> >>> --- a/arch_init.c
> >>> +++ b/arch_init.c
> >>> @@ -376,11 +376,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
> >>>
> >>> prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
> >>>
> >>> - /* save current buffer into memory */
> >>> - memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> >>> -
> >>
> >> I think this is wrong.
> >> Remember that now migration is done in parallel with the guest running.
> >> If the guest modifies the page while we are encoding it, we end with a
> >> different contents in the cache and in the real page, and that causes
> >> corruption.
> >>
> >> This way, what we encoded is a "private copy of the page, so we don't
> >> have that problem".
> >>
> >> Makes sense?
> >
> > Kind of; see back in March I hit this while testing the 1st version of this
> > patch:
> > https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05631.html
> >
> > but then we had some patches that fixed it; and the discussion was here:
> > https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05677.html
> > and then I summarized it as:
> > https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg05768.html
> >
> > * It's an interesting, if unusual, observation; it means we can send
> > * completely bogus data at this point because we know it will get
> > * overwritten later; I think the requirements are:
> > *
> > * 1) That we meet the protocol (which seems to require that the run lengths are
> > * not allowed to be 0)
> > * 2) That we don't get stuck in any loops or go over the end of the page
> > * (I think this means we have to be careful of those byte loops within
> > * the word-at-a-time cases)
> > * 3) The page that ends up in our xbzrle cache must match the destination
> > * page, since the next cycle of xbzrle will use it as reference.
> > *
> >
> > Dave
>
> >
> Hi
> The content that is discussed above is helpful to understand
> the principle of xbzrle. Do you mind that I add it into xbzrle.txt?
That's fine.
Dave
>
> Best regards
> ChenLiang
>
> >>> /* XBZRLE encoding (if there is no overflow) */
> >>> - encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> >>> + encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
> >>> TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
> >>> TARGET_PAGE_SIZE);
> >>> if (encoded_len == 0) {
> >>> @@ -399,7 +396,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
> >>>
> >>> /* we need to update the data in the cache, in order to get the same data */
> >>> if (!last_stage) {
> >>> - memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
> >>> + xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
> >>> + TARGET_PAGE_SIZE);
> >>> }
> >>>
> >>> /* Send XBZRLE based compressed page */
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> > .
> >
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* [Qemu-devel] [PATCH RESEND for 2.3 6/6] migration: clear the dead code
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
` (4 preceding siblings ...)
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 5/6] migration: optimize xbzrle by reducing data copy arei.gonglei
@ 2014-11-24 11:55 ` arei.gonglei
2014-12-10 2:33 ` [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle Amit Shah
6 siblings, 0 replies; 22+ messages in thread
From: arei.gonglei @ 2014-11-24 11:55 UTC (permalink / raw)
To: qemu-devel
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, dgilbert,
Gonglei, amit.shah, pbonzini
From: ChenLiang <chenliang88@huawei.com>
clear the dead code
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
arch_init.c | 12 ------------
page_cache.c | 58 ----------------------------------------------------------
2 files changed, 70 deletions(-)
diff --git a/arch_init.c b/arch_init.c
index 0d0ba4a..5d31def 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -167,8 +167,6 @@ static inline bool is_zero_range(uint8_t *p, uint64_t size)
static struct {
/* buffer used for XBZRLE encoding */
uint8_t *encoded_buf;
- /* buffer for storing page content */
- uint8_t *current_buf;
/* Cache for XBZRLE, Protected by lock. */
PageCache *cache;
QemuMutex lock;
@@ -750,10 +748,8 @@ static void migration_end(void)
if (XBZRLE.cache) {
cache_fini(XBZRLE.cache);
g_free(XBZRLE.encoded_buf);
- g_free(XBZRLE.current_buf);
XBZRLE.cache = NULL;
XBZRLE.encoded_buf = NULL;
- XBZRLE.current_buf = NULL;
}
XBZRLE_cache_unlock();
}
@@ -803,14 +799,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
return -1;
}
- XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
- if (!XBZRLE.current_buf) {
- error_report("Error allocating current_buf");
- g_free(XBZRLE.encoded_buf);
- XBZRLE.encoded_buf = NULL;
- return -1;
- }
-
acct_clear();
}
diff --git a/page_cache.c b/page_cache.c
index cf8878d..c0ce00b 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -48,7 +48,6 @@ struct PageCache {
CacheItem *page_cache;
unsigned int page_size;
int64_t max_num_items;
- uint64_t max_item_age;
int64_t num_items;
};
@@ -76,7 +75,6 @@ PageCache *cache_init(int64_t num_pages, unsigned int page_size)
}
cache->page_size = page_size;
cache->num_items = 0;
- cache->max_item_age = 0;
cache->max_num_items = num_pages;
DPRINTF("Setting cache buckets to %" PRId64 "\n", cache->max_num_items);
@@ -188,59 +186,3 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
return 0;
}
-
-int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
-{
- PageCache *new_cache;
- int64_t i;
-
- CacheItem *old_it, *new_it;
-
- g_assert(cache);
-
- /* cache was not inited */
- if (cache->page_cache == NULL) {
- return -1;
- }
-
- /* same size */
- if (pow2floor(new_num_pages) == cache->max_num_items) {
- return cache->max_num_items;
- }
-
- new_cache = cache_init(new_num_pages, cache->page_size);
- if (!(new_cache)) {
- DPRINTF("Error creating new cache\n");
- return -1;
- }
-
- /* move all data from old cache */
- for (i = 0; i < cache->max_num_items; i++) {
- old_it = &cache->page_cache[i];
- if (old_it->it_addr != -1) {
- /* check for collision, if there is, keep MRU page */
- new_it = cache_get_by_addr(new_cache, old_it->it_addr);
- if (new_it->it_data && new_it->it_age >= old_it->it_age) {
- /* keep the MRU page */
- g_free(old_it->it_data);
- } else {
- if (!new_it->it_data) {
- new_cache->num_items++;
- }
- g_free(new_it->it_data);
- new_it->it_data = old_it->it_data;
- new_it->it_age = old_it->it_age;
- new_it->it_addr = old_it->it_addr;
- }
- }
- }
-
- g_free(cache->page_cache);
- cache->page_cache = new_cache->page_cache;
- cache->max_num_items = new_cache->max_num_items;
- cache->num_items = new_cache->num_items;
-
- g_free(new_cache);
-
- return cache->max_num_items;
-}
--
1.7.12.4
* Re: [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle
2014-11-24 11:55 [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle arei.gonglei
` (5 preceding siblings ...)
2014-11-24 11:55 ` [Qemu-devel] [PATCH RESEND for 2.3 6/6] migration: clear the dead code arei.gonglei
@ 2014-12-10 2:33 ` Amit Shah
2014-12-10 10:09 ` Amit Shah
6 siblings, 1 reply; 22+ messages in thread
From: Amit Shah @ 2014-12-10 2:33 UTC (permalink / raw)
To: arei.gonglei
Cc: ChenLiang, weidong.huang, quintela, peter.huangpeng, qemu-devel,
pbonzini, dgilbert
On (Mon) 24 Nov 2014 [19:55:46], arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
>
> Hi,
>
> This set of patches rebase on
> https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg04956.html,
>
> These patches were reviewed half a year ago.
> I have now rebased them onto the master branch.
> I hope these optimizations can be merged in QEMU 2.3. Thanks!
Applied to my tree.
Thanks,
Amit
* Re: [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle
2014-12-10 2:33 ` [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle Amit Shah
@ 2014-12-10 10:09 ` Amit Shah
2014-12-11 2:24 ` ChenLiang
2014-12-16 15:01 ` Dr. David Alan Gilbert
0 siblings, 2 replies; 22+ messages in thread
From: Amit Shah @ 2014-12-10 10:09 UTC (permalink / raw)
To: arei.gonglei
Cc: ChenLiang, weidong.huang, quintela, qemu-devel, peter.huangpeng,
pbonzini, dgilbert
On (Wed) 10 Dec 2014 [08:03:33], Amit Shah wrote:
> On (Mon) 24 Nov 2014 [19:55:46], arei.gonglei@huawei.com wrote:
> > From: ChenLiang <chenliang88@huawei.com>
> >
> > Hi,
> >
> > This set of patches rebase on
> > https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg04956.html,
> >
> > Those patches have been reviewed before half a year.
> > For now I rebase them with the master branch.
> > Hope those optimization can be merged in Qemu-2.3, Thanks!
>
> Applied to my tree.
Sorry, spoke too soon; I've held this series for the moment due to
ongoing review.
Amit
* Re: [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle
2014-12-10 10:09 ` Amit Shah
@ 2014-12-11 2:24 ` ChenLiang
2014-12-16 15:01 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 22+ messages in thread
From: ChenLiang @ 2014-12-11 2:24 UTC (permalink / raw)
To: Amit Shah
Cc: weidong.huang, quintela, qemu-devel, peter.huangpeng,
arei.gonglei, pbonzini, dgilbert
On 2014/12/10 18:09, Amit Shah wrote:
> On (Wed) 10 Dec 2014 [08:03:33], Amit Shah wrote:
>> On (Mon) 24 Nov 2014 [19:55:46], arei.gonglei@huawei.com wrote:
>>> From: ChenLiang <chenliang88@huawei.com>
>>>
>>> Hi,
>>>
>>> This set of patches rebase on
>>> https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg04956.html,
>>>
>>> Those patches have been reviewed before half a year.
>>> For now I rebase them with the master branch.
>>> Hope those optimization can be merged in Qemu-2.3, Thanks!
>>
>> Applied to my tree.
>
> Sorry, spoke to soon; I've held this series for the moment due to
> ongoing review.
>
> Amit
>
> .
>
Hi,
OK. I'm happy to do whatever helps get this set of patches applied.
Let me know if you need me to do anything. :)
Best regards
ChenLiang
* Re: [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle
2014-12-10 10:09 ` Amit Shah
2014-12-11 2:24 ` ChenLiang
@ 2014-12-16 15:01 ` Dr. David Alan Gilbert
2014-12-22 13:00 ` ChenLiang
1 sibling, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2014-12-16 15:01 UTC (permalink / raw)
To: Amit Shah
Cc: ChenLiang, weidong.huang, quintela, qemu-devel, peter.huangpeng,
arei.gonglei, pbonzini
* Amit Shah (amit.shah@redhat.com) wrote:
> On (Wed) 10 Dec 2014 [08:03:33], Amit Shah wrote:
> > On (Mon) 24 Nov 2014 [19:55:46], arei.gonglei@huawei.com wrote:
> > > From: ChenLiang <chenliang88@huawei.com>
> > >
> > > Hi,
> > >
> > > This set of patches rebase on
> > > https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg04956.html,
> > >
> > > Those patches have been reviewed before half a year.
> > > For now I rebase them with the master branch.
> > > Hope those optimization can be merged in Qemu-2.3, Thanks!
> >
> > Applied to my tree.
>
> Sorry, spoke to soon; I've held this series for the moment due to
> ongoing review.
Why don't we split this series; 1 and 2 seem to be the least
controversial; then we can think about 3-5 separately if we're
still uncomfortable.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH RESEND for 2.3 0/6] xbzrle: optimize the xbzrle
2014-12-16 15:01 ` Dr. David Alan Gilbert
@ 2014-12-22 13:00 ` ChenLiang
0 siblings, 0 replies; 22+ messages in thread
From: ChenLiang @ 2014-12-22 13:00 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: weidong.huang, quintela, qemu-devel, peter.huangpeng,
arei.gonglei, Amit Shah, pbonzini
On 2014/12/16 23:01, Dr. David Alan Gilbert wrote:
> * Amit Shah (amit.shah@redhat.com) wrote:
>> On (Wed) 10 Dec 2014 [08:03:33], Amit Shah wrote:
>>> On (Mon) 24 Nov 2014 [19:55:46], arei.gonglei@huawei.com wrote:
>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>
>>>> Hi,
>>>>
>>>> This set of patches rebase on
>>>> https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg04956.html,
>>>>
>>>> Those patches have been reviewed before half a year.
>>>> For now I rebase them with the master branch.
>>>> Hope those optimization can be merged in Qemu-2.3, Thanks!
>>>
>>> Applied to my tree.
>>
>> Sorry, spoke to soon; I've held this series for the moment due to
>> ongoing review.
>
> Why don't we split this series; 1 and 2 seem to be the least
> controversial; then we can think about 3-5 separately if we're
> still uncomfortable.
>
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>
That sounds good. Patches 1 and 2 are the core of the series.
Best regards
ChenLiang