[Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue
@ 2014-03-27  3:18 arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues arei.gonglei
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

From: ChenLiang <chenliang88@huawei.com>

V3-->V4
* Excluding auto convergence changes, modify it later.

V2-->V3
* rename the bitmap_sync_cnt to bitmap_sync_counter
* expose xbzrle cache miss rate

V1-->V2
* expose the counter that logs the times of updating the dirty
  bitmap to end user.

a. Optimization the xbzrle remarkable decrease the cache misses.
    The efficiency of compress increases more than fifty times.
    Before the patch set, the cache almost totally miss when the
    number of cache item less than the dirty page number. Now the
    hot pages in the cache will not be replaced by other pages.

b. Reducing the data copy

c. Fix one corruption issues.

ChenLiang (8):
  XBZRLE: Fix one XBZRLE corruption issues
  migration: Add counts of updating the dirty bitmap
  migration: expose the bitmap_sync_count to the end user
  migration: expose xbzrle cache miss rate
  XBZRLE: optimize XBZRLE to decrease the cache misses
  XBZRLE: rebuild the cache_is_cached function
  migration: optimize xbzrle by reducing data copy
  migration: clear the dead code

 arch_init.c                    |  74 +++++++++++++++++-------------
 docs/xbzrle.txt                |   8 ++++
 hmp.c                          |   4 ++
 include/migration/migration.h  |   2 +
 include/migration/page_cache.h |  10 ++--
 migration.c                    |   3 ++
 page_cache.c                   | 101 +++++++++++------------------------------
 qapi-schema.json               |   9 +++-
 qmp-commands.hx                |  15 ++++--
 9 files changed, 111 insertions(+), 115 deletions(-)

-- 
1.7.12.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-03-27  3:26   ` [Qemu-devel] for 2.0? " Eric Blake
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 2/8] migration: Add counts of updating the dirty bitmap arei.gonglei
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

The page may not be inserted into cache after executing save_xbzrle_page.
In case of failure to insert, the original page should be sent rather
than the page in the cache.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 60c975d..2ac68c2 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -340,7 +340,7 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
 
 #define ENCODING_FLAG_XBZRLE 0x1
 
-static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
+static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
                             ram_addr_t current_addr, RAMBlock *block,
                             ram_addr_t offset, int cont, bool last_stage)
 {
@@ -348,19 +348,23 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
     uint8_t *prev_cached_page;
 
     if (!cache_is_cached(XBZRLE.cache, current_addr)) {
+        acct_info.xbzrle_cache_miss++;
         if (!last_stage) {
-            if (cache_insert(XBZRLE.cache, current_addr, current_data) == -1) {
+            if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
                 return -1;
+            } else {
+                /* update *current_data when the page has been
+                   inserted into cache */
+                *current_data = get_cached_data(XBZRLE.cache, current_addr);
             }
         }
-        acct_info.xbzrle_cache_miss++;
         return -1;
     }
 
     prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
 
     /* save current buffer into memory */
-    memcpy(XBZRLE.current_buf, current_data, TARGET_PAGE_SIZE);
+    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
 
     /* XBZRLE encoding (if there is no overflow) */
     encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
@@ -373,7 +377,10 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
         DPRINTF("Overflow\n");
         acct_info.xbzrle_overflows++;
         /* update data in the cache */
-        memcpy(prev_cached_page, current_data, TARGET_PAGE_SIZE);
+        if (!last_stage) {
+            memcpy(prev_cached_page, *current_data, TARGET_PAGE_SIZE);
+            *current_data = prev_cached_page;
+        }
         return -1;
     }
 
@@ -598,15 +605,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
                  */
                 xbzrle_cache_zero_page(current_addr);
             } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
-                bytes_sent = save_xbzrle_page(f, p, current_addr, block,
+                bytes_sent = save_xbzrle_page(f, &p, current_addr, block,
                                               offset, cont, last_stage);
                 if (!last_stage) {
-                    /* We must send exactly what's in the xbzrle cache
-                     * even if the page wasn't xbzrle compressed, so that
-                     * it's right next time.
-                     */
-                    p = get_cached_data(XBZRLE.cache, current_addr);
-
                     /* Can't send this cached data async, since the cache page
                      * might get updated before it gets to the wire
                      */
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 2/8] migration: Add counts of updating the dirty bitmap
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user arei.gonglei
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Add counts to log the times of updating the dirty bitmap.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 arch_init.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch_init.c b/arch_init.c
index 2ac68c2..200af0e 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -110,6 +110,8 @@ static bool mig_throttle_on;
 static int dirty_rate_high_cnt;
 static void check_guest_throttling(void);
 
+static uint64_t bitmap_sync_count;
+
 /***********************************************************/
 /* ram save/restore */
 
@@ -487,6 +489,8 @@ static void migration_bitmap_sync(void)
     int64_t end_time;
     int64_t bytes_xfer_now;
 
+    bitmap_sync_count++;
+
     if (!bytes_xfer_prev) {
         bytes_xfer_prev = ram_bytes_transferred();
     }
@@ -734,6 +738,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     migration_dirty_pages = ram_pages;
     mig_throttle_on = false;
     dirty_rate_high_cnt = 0;
+    bitmap_sync_count = 0;
 
     if (migrate_use_xbzrle()) {
         qemu_mutex_lock_iothread();
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 2/8] migration: Add counts of updating the dirty bitmap arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-03-27  3:27   ` Eric Blake
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate arei.gonglei
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

expose the count that logs the times of updating the dirty bitmap to
end user.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 arch_init.c                   |  1 +
 hmp.c                         |  2 ++
 include/migration/migration.h |  1 +
 migration.c                   |  2 ++
 qapi-schema.json              |  4 +++-
 qmp-commands.hx               | 13 +++++++++----
 6 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 200af0e..a7c87de 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -536,6 +536,7 @@ static void migration_bitmap_sync(void)
         s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE;
         start_time = end_time;
         num_dirty_pages_period = 0;
+        s->dirty_sync_count = bitmap_sync_count;
     }
 }
 
diff --git a/hmp.c b/hmp.c
index 2f279c4..77a8d18 100644
--- a/hmp.c
+++ b/hmp.c
@@ -188,6 +188,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->ram->normal);
         monitor_printf(mon, "normal bytes: %" PRIu64 " kbytes\n",
                        info->ram->normal_bytes >> 10);
+        monitor_printf(mon, "dirty sync count: %" PRIu64 "\n",
+                       info->ram->dirty_sync_count);
         if (info->ram->dirty_pages_rate) {
             monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
                            info->ram->dirty_pages_rate);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3e1e6c7..9e62b99 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -61,6 +61,7 @@ struct MigrationState
     bool enabled_capabilities[MIGRATION_CAPABILITY_MAX];
     int64_t xbzrle_cache_size;
     int64_t setup_time;
+    int64_t dirty_sync_count;
 };
 
 void process_incoming_migration(QEMUFile *f);
diff --git a/migration.c b/migration.c
index e0e24d4..e8f8ca5 100644
--- a/migration.c
+++ b/migration.c
@@ -226,6 +226,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
         info->ram->normal_bytes = norm_mig_bytes_transferred();
         info->ram->dirty_pages_rate = s->dirty_pages_rate;
         info->ram->mbps = s->mbps;
+        info->ram->dirty_sync_count = s->dirty_sync_count;
 
         if (blk_mig_active()) {
             info->has_disk = true;
@@ -259,6 +260,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
         info->ram->normal = norm_mig_pages_transferred();
         info->ram->normal_bytes = norm_mig_bytes_transferred();
         info->ram->mbps = s->mbps;
+        info->ram->dirty_sync_count = s->dirty_sync_count;
         break;
     case MIG_STATE_ERROR:
         info->has_status = true;
diff --git a/qapi-schema.json b/qapi-schema.json
index b68cd44..4592f62 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -651,13 +651,15 @@
 #
 # @mbps: throughput in megabits/sec. (since 1.6)
 #
+# @dirty-sync-count: number of times that dirty ram was synchronized (since 2.1)
+#
 # Since: 0.14.0
 ##
 { 'type': 'MigrationStats',
   'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' ,
            'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
            'normal-bytes': 'int', 'dirty-pages-rate' : 'int',
-           'mbps' : 'number' } }
+           'mbps' : 'number', 'dirty-sync-count' : 'int' } }
 
 ##
 # @XBZRLECacheStats
diff --git a/qmp-commands.hx b/qmp-commands.hx
index a22621f..0820510 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2967,6 +2967,7 @@ The main json-object contains the following:
             pages. This is just normal pages times size of one page,
             but this way upper levels don't need to care about page
             size (json-int)
+         - "dirty-sync-count": times that dirty ram was synchronized (json-int)
 - "disk": only present if "status" is "active" and it is a block migration,
   it is a json-object with the following disk information:
          - "transferred": amount transferred in bytes (json-int)
@@ -3004,7 +3005,8 @@ Examples:
           "downtime":12345,
           "duplicate":123,
           "normal":123,
-          "normal-bytes":123456
+          "normal-bytes":123456,
+          "dirty-sync-count":15
         }
      }
    }
@@ -3029,7 +3031,8 @@ Examples:
             "expected-downtime":12345,
             "duplicate":123,
             "normal":123,
-            "normal-bytes":123456
+            "normal-bytes":123456,
+            "dirty-sync-count":15
          }
       }
    }
@@ -3049,7 +3052,8 @@ Examples:
             "expected-downtime":12345,
             "duplicate":123,
             "normal":123,
-            "normal-bytes":123456
+            "normal-bytes":123456,
+            "dirty-sync-count":15
          },
          "disk":{
             "total":20971520,
@@ -3075,7 +3079,8 @@ Examples:
             "expected-downtime":12345,
             "duplicate":10,
             "normal":3333,
-            "normal-bytes":3412992
+            "normal-bytes":3412992,
+            "dirty-sync-count":15
          },
          "xbzrle-cache":{
             "cache-size":67108864,
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
                   ` (2 preceding siblings ...)
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-03-27  3:28   ` Eric Blake
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

expose xbzrle cache miss rate

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 arch_init.c                   | 18 ++++++++++++++++++
 hmp.c                         |  2 ++
 include/migration/migration.h |  1 +
 migration.c                   |  1 +
 qapi-schema.json              |  5 ++++-
 qmp-commands.hx               |  2 ++
 6 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index a7c87de..15ca4c0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -235,6 +235,7 @@ typedef struct AccountingInfo {
     uint64_t xbzrle_bytes;
     uint64_t xbzrle_pages;
     uint64_t xbzrle_cache_miss;
+    double xbzrle_cache_miss_rate;
     uint64_t xbzrle_overflows;
 } AccountingInfo;
 
@@ -290,6 +291,11 @@ uint64_t xbzrle_mig_pages_cache_miss(void)
     return acct_info.xbzrle_cache_miss;
 }
 
+double xbzrle_mig_cache_miss_rate(void)
+{
+    return acct_info.xbzrle_cache_miss_rate;
+}
+
 uint64_t xbzrle_mig_pages_overflow(void)
 {
     return acct_info.xbzrle_overflows;
@@ -488,6 +494,8 @@ static void migration_bitmap_sync(void)
     static int64_t num_dirty_pages_period;
     int64_t end_time;
     int64_t bytes_xfer_now;
+    static uint64_t xbzrle_cache_miss_prev;
+    static uint64_t iterations_prev;
 
     bitmap_sync_count++;
 
@@ -531,6 +539,16 @@ static void migration_bitmap_sync(void)
         } else {
              mig_throttle_on = false;
         }
+        if (migrate_use_xbzrle()) {
+            if (iterations_prev != 0) {
+                acct_info.xbzrle_cache_miss_rate =
+                   (double)(acct_info.xbzrle_cache_miss -
+                            xbzrle_cache_miss_prev) /
+                   (acct_info.iterations - iterations_prev);
+            }
+            iterations_prev = acct_info.iterations;
+            xbzrle_cache_miss_prev = acct_info.xbzrle_cache_miss;
+        }
         s->dirty_pages_rate = num_dirty_pages_period * 1000
             / (end_time - start_time);
         s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE;
diff --git a/hmp.c b/hmp.c
index 77a8d18..18a850d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -214,6 +214,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->xbzrle_cache->pages);
         monitor_printf(mon, "xbzrle cache miss: %" PRIu64 "\n",
                        info->xbzrle_cache->cache_miss);
+        monitor_printf(mon, "xbzrle cache miss rate: %0.2f\n",
+                       info->xbzrle_cache->cache_miss_rate);
         monitor_printf(mon, "xbzrle overflow : %" PRIu64 "\n",
                        info->xbzrle_cache->overflow);
     }
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 9e62b99..430e48d 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -126,6 +126,7 @@ uint64_t xbzrle_mig_bytes_transferred(void);
 uint64_t xbzrle_mig_pages_transferred(void);
 uint64_t xbzrle_mig_pages_overflow(void);
 uint64_t xbzrle_mig_pages_cache_miss(void);
+double xbzrle_mig_cache_miss_rate(void);
 
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
diff --git a/migration.c b/migration.c
index e8f8ca5..9477576 100644
--- a/migration.c
+++ b/migration.c
@@ -185,6 +185,7 @@ static void get_xbzrle_cache_stats(MigrationInfo *info)
         info->xbzrle_cache->bytes = xbzrle_mig_bytes_transferred();
         info->xbzrle_cache->pages = xbzrle_mig_pages_transferred();
         info->xbzrle_cache->cache_miss = xbzrle_mig_pages_cache_miss();
+        info->xbzrle_cache->cache_miss_rate = xbzrle_mig_cache_miss_rate();
         info->xbzrle_cache->overflow = xbzrle_mig_pages_overflow();
     }
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 4592f62..c6f043d 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -674,13 +674,16 @@
 #
 # @cache-miss: number of cache miss
 #
+# @cache-miss-rate: rate of cache miss (since 2.1)
+#
 # @overflow: number of overflows
 #
 # Since: 1.2
 ##
 { 'type': 'XBZRLECacheStats',
   'data': {'cache-size': 'int', 'bytes': 'int', 'pages': 'int',
-           'cache-miss': 'int', 'overflow': 'int' } }
+           'cache-miss': 'int', 'cache-miss-rate': 'number',
+           'overflow': 'int' } }
 
 ##
 # @MigrationInfo
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 0820510..3096c31 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2979,6 +2979,7 @@ The main json-object contains the following:
          - "bytes": number of bytes transferred for XBZRLE compressed pages
          - "pages": number of XBZRLE compressed pages
          - "cache-miss": number of XBRZRLE page cache misses
+         - "cache-miss-rate": rate of XBRZRLE page cache misses
          - "overflow": number of times XBZRLE overflows.  This means
            that the XBZRLE encoding was bigger than just sent the
            whole page, and then we sent the whole page instead (as as
@@ -3087,6 +3088,7 @@ Examples:
             "bytes":20971520,
             "pages":2444343,
             "cache-miss":2244,
+            "cache-miss-rate":0.123,
             "overflow":34434
          }
       }
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
                   ` (3 preceding siblings ...)
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-04-01 16:47   ` Dr. David Alan Gilbert
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 6/8] XBZRLE: rebuild the cache_is_cached function arei.gonglei
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Avoid hot pages being replaced by others to remarkably decrease cache
misses

Sample results with the test program which quote from xbzrle.txt ran in
vm:(migrate bandwidth:1GE and xbzrle cache size 8MB)

the test program:

include <stdlib.h>
include <stdio.h>
int main()
 {
        char *buf = (char *) calloc(4096, 4096);
        while (1) {
            int i;
            for (i = 0; i < 4096 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
        }
 }

before this patch:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":1020,"xbzrle-cache":{"bytes":1108284,
"cache-size":8388608,"cache-miss-rate":0.987013,"pages":18297,"overflow":8,
"cache-miss":1228737},"status":"active","setup-time":10,"total-time":52398,
"ram":{"total":12466991104,"remaining":1695744,"mbps":935.559472,
"transferred":5780760580,"dirty-sync-counter":271,"duplicate":2878530,
"dirty-pages-rate":29130,"skipped":0,"normal-bytes":5748592640,
"normal":1403465}},"id":"libvirt-706"}

18k pages sent compressed
cache-miss-rate is 98.7%, totally miss.

after optimizing:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":2054,"xbzrle-cache":{"bytes":5066763,
"cache-size":8388608,"cache-miss-rate":0.485924,"pages":194823,"overflow":0,
"cache-miss":210653},"status":"active","setup-time":11,"total-time":18729,
"ram":{"total":12466991104,"remaining":3895296,"mbps":937.663549,
"transferred":1615042219,"dirty-sync-counter":98,"duplicate":2869840,
"dirty-pages-rate":58781,"skipped":0,"normal-bytes":1588404224,
"normal":387794}},"id":"libvirt-266"}

194k pages sent compressed
The value of cache-miss-rate decrease to 48.59%.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 arch_init.c                    |  8 +++++---
 docs/xbzrle.txt                |  8 ++++++++
 include/migration/page_cache.h | 10 +++++++---
 page_cache.c                   | 23 +++++++++++++++++++----
 4 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 15ca4c0..84a4bd3 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -343,7 +343,8 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
 
     /* We don't care if this fails to allocate a new cache page
      * as long as it updated an old one */
-    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE);
+    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE,
+                 bitmap_sync_count);
 }
 
 #define ENCODING_FLAG_XBZRLE 0x1
@@ -355,10 +356,11 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
     int encoded_len = 0, bytes_sent = -1;
     uint8_t *prev_cached_page;
 
-    if (!cache_is_cached(XBZRLE.cache, current_addr)) {
+    if (!cache_is_cached(XBZRLE.cache, current_addr, bitmap_sync_count)) {
         acct_info.xbzrle_cache_miss++;
         if (!last_stage) {
-            if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
+            if (cache_insert(XBZRLE.cache, current_addr, *current_data,
+                             bitmap_sync_count) == -1) {
                 return -1;
             } else {
                 /* update *current_data when the page has been
diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
index cc3a26a..52c8511 100644
--- a/docs/xbzrle.txt
+++ b/docs/xbzrle.txt
@@ -71,6 +71,14 @@ encoded buffer:
 encoded length 24
 e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
 
+Cache update strategy
+=====================
+Keeping the hot pages in the cache is effective for decreased cache
+misses. XBZRLE uses a counter as the age of each page. The counter will
+increase after each ram dirty bitmap sync. When a cache conflict is
+detected, XBZRLE will only evict pages in the cache that are older than
+a threshold.
+
 Usage
 ======================
 1. Verify the destination QEMU version is able to decode the new format.
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index 2d5ce2d..10ed532 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -43,8 +43,10 @@ void cache_fini(PageCache *cache);
  *
  * @cache pointer to the PageCache struct
  * @addr: page addr
+ * @current_age: current bitmap generation
  */
-bool cache_is_cached(const PageCache *cache, uint64_t addr);
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+                     uint64_t current_age);
 
 /**
  * get_cached_data: Get the data cached for an addr
@@ -60,13 +62,15 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr);
  * cache_insert: insert the page into the cache. the page cache
  * will dup the data on insert. the previous value will be overwritten
  *
- * Returns -1 on error
+ * Returns -1 when the page isn't inserted into cache
  *
  * @cache pointer to the PageCache struct
  * @addr: page address
  * @pdata: pointer to the page
+ * @current_age: current bitmap generation
  */
-int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
+                 uint64_t current_age);
 
 /**
  * cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index b033681..c78157b 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -33,6 +33,9 @@
     do { } while (0)
 #endif
 
+/* the page in cache will not be replaced in two cycles */
+#define CACHED_PAGE_LIFETIME 2
+
 typedef struct CacheItem CacheItem;
 
 struct CacheItem {
@@ -121,7 +124,8 @@ static size_t cache_get_cache_pos(const PageCache *cache,
     return pos;
 }
 
-bool cache_is_cached(const PageCache *cache, uint64_t addr)
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+                     uint64_t current_age)
 {
     size_t pos;
 
@@ -130,7 +134,12 @@ bool cache_is_cached(const PageCache *cache, uint64_t addr)
 
     pos = cache_get_cache_pos(cache, addr);
 
-    return (cache->page_cache[pos].it_addr == addr);
+    if (cache->page_cache[pos].it_addr == addr) {
+        /* update the it_age when the cache hit */
+        cache->page_cache[pos].it_age = current_age;
+        return true;
+    }
+    return false;
 }
 
 static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
@@ -150,7 +159,8 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
     return cache_get_by_addr(cache, addr)->it_data;
 }
 
-int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
+                 uint64_t current_age)
 {
 
     CacheItem *it = NULL;
@@ -161,6 +171,11 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
     /* actual update of entry */
     it = cache_get_by_addr(cache, addr);
 
+    if (it->it_data &&
+        it->it_age + CACHED_PAGE_LIFETIME > current_age) {
+        /* the cache page is fresh, don't replace it */
+        return -1;
+    }
     /* allocate page */
     if (!it->it_data) {
         it->it_data = g_try_malloc(cache->page_size);
@@ -173,7 +188,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
 
     memcpy(it->it_data, pdata, cache->page_size);
 
-    it->it_age = ++cache->max_item_age;
+    it->it_age = current_age;
     it->it_addr = addr;
 
     return 0;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 6/8] XBZRLE: rebuild the cache_is_cached function
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
                   ` (4 preceding siblings ...)
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy arei.gonglei
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 8/8] migration: clear the dead code arei.gonglei
  7 siblings, 0 replies; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Rebuild the cache_is_cached function by cache_get_by_addr. And
drops the asserts because the caller is also asserting the same
thing.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 page_cache.c | 38 ++++++++++++++++----------------------
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/page_cache.c b/page_cache.c
index c78157b..3190c55 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -124,24 +124,6 @@ static size_t cache_get_cache_pos(const PageCache *cache,
     return pos;
 }
 
-bool cache_is_cached(const PageCache *cache, uint64_t addr,
-                     uint64_t current_age)
-{
-    size_t pos;
-
-    g_assert(cache);
-    g_assert(cache->page_cache);
-
-    pos = cache_get_cache_pos(cache, addr);
-
-    if (cache->page_cache[pos].it_addr == addr) {
-        /* update the it_age when the cache hit */
-        cache->page_cache[pos].it_age = current_age;
-        return true;
-    }
-    return false;
-}
-
 static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
 {
     size_t pos;
@@ -159,14 +141,26 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
     return cache_get_by_addr(cache, addr)->it_data;
 }
 
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+                     uint64_t current_age)
+{
+    CacheItem *it;
+
+    it = cache_get_by_addr(cache, addr);
+
+    if (it->it_addr == addr) {
+        /* update the it_age when the cache hit */
+        it->it_age = current_age;
+        return true;
+    }
+    return false;
+}
+
 int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
                  uint64_t current_age)
 {
 
-    CacheItem *it = NULL;
-
-    g_assert(cache);
-    g_assert(cache->page_cache);
+    CacheItem *it;
 
     /* actual update of entry */
     it = cache_get_by_addr(cache, addr);
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
                   ` (5 preceding siblings ...)
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 6/8] XBZRLE: rebuild the cache_is_cached function arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  2014-03-28 19:59   ` Dr. David Alan Gilbert
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 8/8] migration: clear the dead code arei.gonglei
  7 siblings, 1 reply; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Reducing data copy can reduce cpu overhead.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 84a4bd3..94b62e2 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -373,11 +373,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
 
     prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
 
-    /* save current buffer into memory */
-    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
-
     /* XBZRLE encoding (if there is no overflow) */
-    encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
+    encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
                                        TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
                                        TARGET_PAGE_SIZE);
     if (encoded_len == 0) {
@@ -396,7 +393,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
 
     /* we need to update the data in the cache, in order to get the same data */
     if (!last_stage) {
-        memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
+        xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
+                             TARGET_PAGE_SIZE);
     }
 
     /* Send XBZRLE based compressed page */
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v4 8/8] migration: clear the dead code
  2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
                   ` (6 preceding siblings ...)
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy arei.gonglei
@ 2014-03-27  3:18 ` arei.gonglei
  7 siblings, 0 replies; 21+ messages in thread
From: arei.gonglei @ 2014-03-27  3:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

clear the dead code

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c  | 13 -------------
 page_cache.c | 58 ----------------------------------------------------------
 2 files changed, 71 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 94b62e2..db38c9a 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -164,14 +164,11 @@ static inline bool is_zero_range(uint8_t *p, uint64_t size)
 static struct {
     /* buffer used for XBZRLE encoding */
     uint8_t *encoded_buf;
-    /* buffer for storing page content */
-    uint8_t *current_buf;
     /* Cache for XBZRLE, Protected by lock. */
     PageCache *cache;
     QemuMutex lock;
 } XBZRLE = {
     .encoded_buf = NULL,
-    .current_buf = NULL,
     .cache = NULL,
 };
 /* buffer used for XBZRLE decoding */
@@ -723,10 +720,8 @@ static void migration_end(void)
         cache_fini(XBZRLE.cache);
         g_free(XBZRLE.cache);
         g_free(XBZRLE.encoded_buf);
-        g_free(XBZRLE.current_buf);
         XBZRLE.cache = NULL;
         XBZRLE.encoded_buf = NULL;
-        XBZRLE.current_buf = NULL;
     }
     XBZRLE_cache_unlock();
 }
@@ -779,14 +774,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
             return -1;
         }
 
-        XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
-        if (!XBZRLE.current_buf) {
-            DPRINTF("Error allocating current_buf\n");
-            g_free(XBZRLE.encoded_buf);
-            XBZRLE.encoded_buf = NULL;
-            return -1;
-        }
-
         acct_clear();
     }
 
diff --git a/page_cache.c b/page_cache.c
index 3190c55..7dfa762 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -48,7 +48,6 @@ struct PageCache {
     CacheItem *page_cache;
     unsigned int page_size;
     int64_t max_num_items;
-    uint64_t max_item_age;
     int64_t num_items;
 };
 
@@ -76,7 +75,6 @@ PageCache *cache_init(int64_t num_pages, unsigned int page_size)
     }
     cache->page_size = page_size;
     cache->num_items = 0;
-    cache->max_item_age = 0;
     cache->max_num_items = num_pages;
 
     DPRINTF("Setting cache buckets to %" PRId64 "\n", cache->max_num_items);
@@ -187,59 +185,3 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
 
     return 0;
 }
-
-int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
-{
-    PageCache *new_cache;
-    int64_t i;
-
-    CacheItem *old_it, *new_it;
-
-    g_assert(cache);
-
-    /* cache was not inited */
-    if (cache->page_cache == NULL) {
-        return -1;
-    }
-
-    /* same size */
-    if (pow2floor(new_num_pages) == cache->max_num_items) {
-        return cache->max_num_items;
-    }
-
-    new_cache = cache_init(new_num_pages, cache->page_size);
-    if (!(new_cache)) {
-        DPRINTF("Error creating new cache\n");
-        return -1;
-    }
-
-    /* move all data from old cache */
-    for (i = 0; i < cache->max_num_items; i++) {
-        old_it = &cache->page_cache[i];
-        if (old_it->it_addr != -1) {
-            /* check for collision, if there is, keep MRU page */
-            new_it = cache_get_by_addr(new_cache, old_it->it_addr);
-            if (new_it->it_data && new_it->it_age >= old_it->it_age) {
-                /* keep the MRU page */
-                g_free(old_it->it_data);
-            } else {
-                if (!new_it->it_data) {
-                    new_cache->num_items++;
-                }
-                g_free(new_it->it_data);
-                new_it->it_data = old_it->it_data;
-                new_it->it_age = old_it->it_age;
-                new_it->it_addr = old_it->it_addr;
-            }
-        }
-    }
-
-    g_free(cache->page_cache);
-    cache->page_cache = new_cache->page_cache;
-    cache->max_num_items = new_cache->max_num_items;
-    cache->num_items = new_cache->num_items;
-
-    g_free(new_cache);
-
-    return cache->max_num_items;
-}
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] for 2.0? Re: [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues arei.gonglei
@ 2014-03-27  3:26   ` Eric Blake
  2014-03-27  3:45     ` Gonglei (Arei)
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2014-03-27  3:26 UTC (permalink / raw)
  To: arei.gonglei, qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

On 03/26/2014 09:18 PM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> The page may not be inserted into cache after executing save_xbzrle_page.
> In case of failure to insert, the original page should be sent rather
> than the page in the cache.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> ---
>  arch_init.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)

Should this patch be included in 2.0 as a bug fix?  The rest of the
series is probably better off in 2.1.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user arei.gonglei
@ 2014-03-27  3:27   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2014-03-27  3:27 UTC (permalink / raw)
  To: arei.gonglei, qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

[-- Attachment #1: Type: text/plain, Size: 784 bytes --]

On 03/26/2014 09:18 PM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> expose the count that logs the times of updating the dirty bitmap to
> end user.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  arch_init.c                   |  1 +
>  hmp.c                         |  2 ++
>  include/migration/migration.h |  1 +
>  migration.c                   |  2 ++
>  qapi-schema.json              |  4 +++-
>  qmp-commands.hx               | 13 +++++++++----
>  6 files changed, 18 insertions(+), 5 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate arei.gonglei
@ 2014-03-27  3:28   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2014-03-27  3:28 UTC (permalink / raw)
  To: arei.gonglei, qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

[-- Attachment #1: Type: text/plain, Size: 737 bytes --]

On 03/26/2014 09:18 PM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> expose xbzrle cache miss rate
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  arch_init.c                   | 18 ++++++++++++++++++
>  hmp.c                         |  2 ++
>  include/migration/migration.h |  1 +
>  migration.c                   |  1 +
>  qapi-schema.json              |  5 ++++-
>  qmp-commands.hx               |  2 ++
>  6 files changed, 28 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] for 2.0? Re: [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues
  2014-03-27  3:26   ` [Qemu-devel] for 2.0? " Eric Blake
@ 2014-03-27  3:45     ` Gonglei (Arei)
  2014-03-27 12:31       ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Gonglei (Arei) @ 2014-03-27  3:45 UTC (permalink / raw)
  To: Eric Blake, qemu-devel@nongnu.org
  Cc: chenliang (T), Huangweidong (C), quintela@redhat.com,
	dgilbert@redhat.com, owasserm@redhat.com, pbonzini@redhat.com

> >  arch_init.c | 25 +++++++++++++------------
> >  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> Should this patch be included in 2.0 as a bug fix?  The rest of the
> series is probably better off in 2.1.
> 
Yes, it should be, but I am not so clear how to do it. 
Eric, Could you give me some advices? Thank you so much.

Best regards,
-Gonglei

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] for 2.0? Re: [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues
  2014-03-27  3:45     ` Gonglei (Arei)
@ 2014-03-27 12:31       ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2014-03-27 12:31 UTC (permalink / raw)
  To: Gonglei (Arei), Eric Blake, qemu-devel@nongnu.org
  Cc: owasserm@redhat.com, chenliang (T), Huangweidong (C),
	dgilbert@redhat.com, quintela@redhat.com

Il 27/03/2014 04:45, Gonglei (Arei) ha scritto:
>>>  arch_init.c | 25 +++++++++++++------------
>>>  1 file changed, 13 insertions(+), 12 deletions(-)
>>
>> Should this patch be included in 2.0 as a bug fix?  The rest of the
>> series is probably better off in 2.1.
>>
> Yes, it should be, but I am not so clear how to do it.
> Eric, Could you give me some advices? Thank you so much.

David and Juan will sort that out. :)

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy arei.gonglei
@ 2014-03-28 19:59   ` Dr. David Alan Gilbert
  2014-03-29  7:51     ` Gonglei (Arei)
  0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2014-03-28 19:59 UTC (permalink / raw)
  To: arei.gonglei; +Cc: ChenLiang, pbonzini, weidong.huang, qemu-devel, quintela

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Reducing data copy can reduce cpu overhead.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> ---
>  arch_init.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 84a4bd3..94b62e2 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -373,11 +373,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>  
>      prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
>  
> -    /* save current buffer into memory */
> -    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> -
>      /* XBZRLE encoding (if there is no overflow) */
> -    encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> +    encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
>                                         TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
>                                         TARGET_PAGE_SIZE);

Hi Gonglei,

I've got a world which has this patch series on, and it's producing some XBZRLE errors,
and I suspect that it's down to my original worries of running xbzrle_encode_buffer
on changing data.

My setup is a pair of machines, with a guest with 4GB RAM running SPECjbb, 1GbE
between them and a 2GB xbzrle cache, and I can reliably trigger:

Failed to load XBZRLE page - decode error!


so I added some debug and saw:

qemu-system-x86_64: xbzrle_decode_buffer: Failed decode (uleb128 a) ret=1 i=45 count=0

xbzrle data:: 0000:  38 04 c0 d4  a4 fc 5c 04  d9 d4 a4 fc  5c 04 f2 d4
xbzrle data:: 0010:  a4 fc 88 02  03 c1 dd 79  01 01 03 17  10 87 b2 a3
xbzrle data:: 0020:  e8 8e b2 a3  e8 95 b2 a3  e8 9c b2 a3  e8 00*0c a3  <--- * corresponds to i=45
xbzrle data:: 0030:  b2 a3 e8 aa  b2 a3 e8 b1  b2 a3 e8 00  10 b8 b2 a3
xbzrle data:: 0040:  e8 bf b2 a3  e8 c6 b2 a3  e8 d2 ca a3  e8 00 0c d9
xbzrle data:: 0050:  ca a3 e8 e0  ca a3 e8 e7  ca a3 e8 00  08 ee ca a3
xbzrle data:: 0060:  e8 f5 ca a3  e8 00 08 fc  ca a3 e8 03  cb a3 e8 00
xbzrle data:: 0070:  0c 0a cb a3  e8 11 cb a3  e8 18 cb a3  e8 00 08 1f
xbzrle data:: 0080:  cb a3 e8 26  cb a3 e8 40  03 29 de 79  e9 02 01 82
xbzrle data:: 0090:  03 04 00 00  00 00 14 01  00 03 01 52  07 04 00 00
xbzrle data:: 00a0:  00 00 28 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00b0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00c0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00d0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00e0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 00f0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0100:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0110:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0120:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
xbzrle data:: 0130:  00 00 e0 08  01 5a 03 01  b1 01 01 b1

If I understand this correctly the zero-run was found to be '0' length, and
that should never happen - xbzrle should always output non-0 lengths
for both it's zero and nz run lengths according to my reading of the code
and the error check on the decode.

So I added:
@@ -73,6 +74,11 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
             return d;
         }
 
+        if (zrun_len == 0) {
+            error_report("xbzrle_encode_buffer: zrun_len=0! i=%d\n",i);
+            return -1;
+        }
+
         d += uleb128_encode_small(dst + d, zrun_len);

on the encode side, and yes that's triggering (I also added it for the
nzrun version)

The code in xbzrle.c is basically like:

a  loop {
b     while *ptr == 0   increment
c     save count of 0's
d     while *ptr != 0   increment
e     save count of none=0's
f  }

With your patch the data can be changing during this loop since
the code now runs directly on current_data, so that a byte that might
have read as none-0 by loop (b) above, gets changed by the guest
to 0 just after (b) exits.  When it enters (d) it reads the byte
find it's 0 and outputs a '0' length count which is invalid format.

Dave

>      if (encoded_len == 0) {
> @@ -396,7 +393,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>  
>      /* we need to update the data in the cache, in order to get the same data */
>      if (!last_stage) {
> -        memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
> +        xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
> +                             TARGET_PAGE_SIZE);
>      }
>  
>      /* Send XBZRLE based compressed page */
> -- 
> 1.7.12.4
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy
  2014-03-28 19:59   ` Dr. David Alan Gilbert
@ 2014-03-29  7:51     ` Gonglei (Arei)
  0 siblings, 0 replies; 21+ messages in thread
From: Gonglei (Arei) @ 2014-03-29  7:51 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: chenliang (T), pbonzini@redhat.com, Huangweidong (C),
	qemu-devel@nongnu.org, quintela@redhat.com

> 
> Hi Gonglei,
> 
> I've got a world which has this patch series on, and it's producing some XBZRLE
> errors,
> and I suspect that it's down to my original worries of running
> xbzrle_encode_buffer
> on changing data.
> 
> My setup is a pair of machines, with a guest with 4GB RAM running SPECjbb,
> 1GbE
> between them and a 2GB xbzrle cache, and I can reliably trigger:
> 
> Failed to load XBZRLE page - decode error!
> 
> 
> so I added some debug and saw:
> 
> qemu-system-x86_64: xbzrle_decode_buffer: Failed decode (uleb128 a) ret=1
> i=45 count=0
> 
> xbzrle data:: 0000:  38 04 c0 d4  a4 fc 5c 04  d9 d4 a4 fc  5c 04 f2 d4
> xbzrle data:: 0010:  a4 fc 88 02  03 c1 dd 79  01 01 03 17  10 87 b2 a3
> xbzrle data:: 0020:  e8 8e b2 a3  e8 95 b2 a3  e8 9c b2 a3  e8 00*0c a3
> <--- * corresponds to i=45
> xbzrle data:: 0030:  b2 a3 e8 aa  b2 a3 e8 b1  b2 a3 e8 00  10 b8 b2 a3
> xbzrle data:: 0040:  e8 bf b2 a3  e8 c6 b2 a3  e8 d2 ca a3  e8 00 0c d9
> xbzrle data:: 0050:  ca a3 e8 e0  ca a3 e8 e7  ca a3 e8 00  08 ee ca a3
> xbzrle data:: 0060:  e8 f5 ca a3  e8 00 08 fc  ca a3 e8 03  cb a3 e8 00
> xbzrle data:: 0070:  0c 0a cb a3  e8 11 cb a3  e8 18 cb a3  e8 00 08 1f
> xbzrle data:: 0080:  cb a3 e8 26  cb a3 e8 40  03 29 de 79  e9 02 01 82
> xbzrle data:: 0090:  03 04 00 00  00 00 14 01  00 03 01 52  07 04 00 00
> xbzrle data:: 00a0:  00 00 28 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 00b0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 00c0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 00d0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 00e0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 00f0:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 0100:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 0110:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 0120:  00 00 24 01  00 03 01 52  07 08 00 00  00 00 00 00
> xbzrle data:: 0130:  00 00 e0 08  01 5a 03 01  b1 01 01 b1
> 
> If I understand this correctly the zero-run was found to be '0' length, and
> that should never happen - xbzrle should always output non-0 lengths
> for both it's zero and nz run lengths according to my reading of the code
> and the error check on the decode.
> 
> So I added:
> @@ -73,6 +74,11 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t
> *new_buf, int slen,
>              return d;
>          }
> 
> +        if (zrun_len == 0) {
> +            error_report("xbzrle_encode_buffer: zrun_len=0! i=%d\n",i);
> +            return -1;
> +        }
> +
>          d += uleb128_encode_small(dst + d, zrun_len);
> 
> on the encode side, and yes that's triggering (I also added it for the
> nzrun version)
> 
> The code in xbzrle.c is basically like:
> 
> a  loop {
> b     while *ptr == 0   increment
> c     save count of 0's
> d     while *ptr != 0   increment
> e     save count of none=0's
> f  }
> 
> With your patch the data can be changing during this loop since
> the code now runs directly on current_data, so that a byte that might
> have read as none-0 by loop (b) above, gets changed by the guest
> to 0 just after (b) exits.  When it enters (d) it reads the byte
> find it's 0 and outputs a '0' length count which is invalid format.
> 
> Dave
Nice catch. I will send another patch to fix it. Thanks!

Best regards,
-Gonglei

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
@ 2014-04-01 16:47   ` Dr. David Alan Gilbert
  2014-04-02  9:01     ` Gonglei (Arei)
  2014-04-03 14:37     ` 陈梁
  0 siblings, 2 replies; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-01 16:47 UTC (permalink / raw)
  To: arei.gonglei; +Cc: ChenLiang, weidong.huang, qemu-devel, pbonzini

I've got a world with just patches 1..5 on that's seeing corruptions, but
I've not seen where the problem is.  So far the world with 1..4 on hasn't
hit those corruption, but maybe I need to test more.

Have you tested this set with google stressapptest?

Let it migrate for a few cycles with stress apptest running, then ctrl-z
the stressapptest program to let the migration complete, then fg it
to collect the results.

Dave

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Avoid hot pages being replaced by others to remarkably decrease cache
> misses
> 
> Sample results with the test program which quote from xbzrle.txt ran in
> vm:(migrate bandwidth:1GE and xbzrle cache size 8MB)
> 
> the test program:
> 
> include <stdlib.h>
> include <stdio.h>
> int main()
>  {
>         char *buf = (char *) calloc(4096, 4096);
>         while (1) {
>             int i;
>             for (i = 0; i < 4096 * 4; i++) {
>                 buf[i * 4096 / 4]++;
>             }
>             printf(".");
>         }
>  }
> 
> before this patch:
> virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
> {"return":{"expected-downtime":1020,"xbzrle-cache":{"bytes":1108284,
> "cache-size":8388608,"cache-miss-rate":0.987013,"pages":18297,"overflow":8,
> "cache-miss":1228737},"status":"active","setup-time":10,"total-time":52398,
> "ram":{"total":12466991104,"remaining":1695744,"mbps":935.559472,
> "transferred":5780760580,"dirty-sync-counter":271,"duplicate":2878530,
> "dirty-pages-rate":29130,"skipped":0,"normal-bytes":5748592640,
> "normal":1403465}},"id":"libvirt-706"}
> 
> 18k pages sent compressed
> cache-miss-rate is 98.7%, totally miss.
> 
> after optimizing:
> virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
> {"return":{"expected-downtime":2054,"xbzrle-cache":{"bytes":5066763,
> "cache-size":8388608,"cache-miss-rate":0.485924,"pages":194823,"overflow":0,
> "cache-miss":210653},"status":"active","setup-time":11,"total-time":18729,
> "ram":{"total":12466991104,"remaining":3895296,"mbps":937.663549,
> "transferred":1615042219,"dirty-sync-counter":98,"duplicate":2869840,
> "dirty-pages-rate":58781,"skipped":0,"normal-bytes":1588404224,
> "normal":387794}},"id":"libvirt-266"}
> 
> 194k pages sent compressed
> The value of cache-miss-rate decrease to 48.59%.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  arch_init.c                    |  8 +++++---
>  docs/xbzrle.txt                |  8 ++++++++
>  include/migration/page_cache.h | 10 +++++++---
>  page_cache.c                   | 23 +++++++++++++++++++----
>  4 files changed, 39 insertions(+), 10 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 15ca4c0..84a4bd3 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -343,7 +343,8 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
>  
>      /* We don't care if this fails to allocate a new cache page
>       * as long as it updated an old one */
> -    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE);
> +    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE,
> +                 bitmap_sync_count);
>  }
>  
>  #define ENCODING_FLAG_XBZRLE 0x1
> @@ -355,10 +356,11 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>      int encoded_len = 0, bytes_sent = -1;
>      uint8_t *prev_cached_page;
>  
> -    if (!cache_is_cached(XBZRLE.cache, current_addr)) {
> +    if (!cache_is_cached(XBZRLE.cache, current_addr, bitmap_sync_count)) {
>          acct_info.xbzrle_cache_miss++;
>          if (!last_stage) {
> -            if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
> +            if (cache_insert(XBZRLE.cache, current_addr, *current_data,
> +                             bitmap_sync_count) == -1) {
>                  return -1;
>              } else {
>                  /* update *current_data when the page has been
> diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
> index cc3a26a..52c8511 100644
> --- a/docs/xbzrle.txt
> +++ b/docs/xbzrle.txt
> @@ -71,6 +71,14 @@ encoded buffer:
>  encoded length 24
>  e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
>  
> +Cache update strategy
> +=====================
> +Keeping the hot pages in the cache is effective for decreased cache
> +misses. XBZRLE uses a counter as the age of each page. The counter will
> +increase after each ram dirty bitmap sync. When a cache conflict is
> +detected, XBZRLE will only evict pages in the cache that are older than
> +a threshold.
> +
>  Usage
>  ======================
>  1. Verify the destination QEMU version is able to decode the new format.
> diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
> index 2d5ce2d..10ed532 100644
> --- a/include/migration/page_cache.h
> +++ b/include/migration/page_cache.h
> @@ -43,8 +43,10 @@ void cache_fini(PageCache *cache);
>   *
>   * @cache pointer to the PageCache struct
>   * @addr: page addr
> + * @current_age: current bitmap generation
>   */
> -bool cache_is_cached(const PageCache *cache, uint64_t addr);
> +bool cache_is_cached(const PageCache *cache, uint64_t addr,
> +                     uint64_t current_age);
>  
>  /**
>   * get_cached_data: Get the data cached for an addr
> @@ -60,13 +62,15 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr);
>   * cache_insert: insert the page into the cache. the page cache
>   * will dup the data on insert. the previous value will be overwritten
>   *
> - * Returns -1 on error
> + * Returns -1 when the page isn't inserted into cache
>   *
>   * @cache pointer to the PageCache struct
>   * @addr: page address
>   * @pdata: pointer to the page
> + * @current_age: current bitmap generation
>   */
> -int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata);
> +int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
> +                 uint64_t current_age);
>  
>  /**
>   * cache_resize: resize the page cache. In case of size reduction the extra
> diff --git a/page_cache.c b/page_cache.c
> index b033681..c78157b 100644
> --- a/page_cache.c
> +++ b/page_cache.c
> @@ -33,6 +33,9 @@
>      do { } while (0)
>  #endif
>  
> +/* the page in cache will not be replaced in two cycles */
> +#define CACHED_PAGE_LIFETIME 2
> +
>  typedef struct CacheItem CacheItem;
>  
>  struct CacheItem {
> @@ -121,7 +124,8 @@ static size_t cache_get_cache_pos(const PageCache *cache,
>      return pos;
>  }
>  
> -bool cache_is_cached(const PageCache *cache, uint64_t addr)
> +bool cache_is_cached(const PageCache *cache, uint64_t addr,
> +                     uint64_t current_age)
>  {
>      size_t pos;
>  
> @@ -130,7 +134,12 @@ bool cache_is_cached(const PageCache *cache, uint64_t addr)
>  
>      pos = cache_get_cache_pos(cache, addr);
>  
> -    return (cache->page_cache[pos].it_addr == addr);
> +    if (cache->page_cache[pos].it_addr == addr) {
> +        /* update the it_age when the cache hit */
> +        cache->page_cache[pos].it_age = current_age;
> +        return true;
> +    }
> +    return false;
>  }
>  
>  static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
> @@ -150,7 +159,8 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
>      return cache_get_by_addr(cache, addr)->it_data;
>  }
>  
> -int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
> +int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
> +                 uint64_t current_age)
>  {
>  
>      CacheItem *it = NULL;
> @@ -161,6 +171,11 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
>      /* actual update of entry */
>      it = cache_get_by_addr(cache, addr);
>  
> +    if (it->it_data &&
> +        it->it_age + CACHED_PAGE_LIFETIME > current_age) {
> +        /* the cache page is fresh, don't replace it */
> +        return -1;
> +    }
>      /* allocate page */
>      if (!it->it_data) {
>          it->it_data = g_try_malloc(cache->page_size);
> @@ -173,7 +188,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
>  
>      memcpy(it->it_data, pdata, cache->page_size);
>  
> -    it->it_age = ++cache->max_item_age;
> +    it->it_age = current_age;
>      it->it_addr = addr;
>  
>      return 0;
> -- 
> 1.7.12.4
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-04-01 16:47   ` Dr. David Alan Gilbert
@ 2014-04-02  9:01     ` Gonglei (Arei)
  2014-04-03 14:37     ` 陈梁
  1 sibling, 0 replies; 21+ messages in thread
From: Gonglei (Arei) @ 2014-04-02  9:01 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: chenliang (T), Huangweidong (C), qemu-devel@nongnu.org,
	pbonzini@redhat.com

> Subject: Re: [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to
> decrease the cache misses
> 
> I've got a world with just patches 1..5 on that's seeing corruptions, but
> I've not seen where the problem is.  So far the world with 1..4 on hasn't
> hit those corruption, but maybe I need to test more.
> 
> Have you tested this set with google stressapptest?
> 
> Let it migrate for a few cycles with stress apptest running, then ctrl-z
> the stressapptest program to let the migration complete, then fg it
> to collect the results.
> 
xbzrle_cache_zero_page may insert a page which has been in the cache. Thanks,

> Dave
> 

Best regards,
-Gonglei

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-04-01 16:47   ` Dr. David Alan Gilbert
  2014-04-02  9:01     ` Gonglei (Arei)
@ 2014-04-03 14:37     ` 陈梁
  2014-04-03 14:40       ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 21+ messages in thread
From: 陈梁 @ 2014-04-03 14:37 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ChenLiang, weidong.huang, arei.gonglei, qemu-devel,
	陈梁, pbonzini


> I've got a world with just patches 1..5 on that's seeing corruptions, but
> I've not seen where the problem is.  So far the world with 1..4 on hasn't
> hit those corruption, but maybe I need to test more.
> 
> Have you tested this set with google stressapptest?
> 
> Let it migrate for a few cycles with stress apptest running, then ctrl-z
> the stressapptest program to let the migration complete, then fg it
> to collect the results.
> 
> Dave
> 
> * arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
>> From: ChenLiang <chenliang88@huawei.com>
>> 
>> Avoid hot pages being replaced by others to remarkably decrease cache
>> misses
>> 
>> Sample results with the test program which quote from xbzrle.txt ran in
>> vm:(migrate bandwidth:1GE and xbzrle cache size 8MB)
>> 
>> the test program:
>> 
>> include <stdlib.h>
>> include <stdio.h>
>> int main()
>> {
>>        char *buf = (char *) calloc(4096, 4096);
>>        while (1) {
>>            int i;
>>            for (i = 0; i < 4096 * 4; i++) {
>>                buf[i * 4096 / 4]++;
>>            }
>>            printf(".");
>>        }
>> }
>> 
>> before this patch:
>> virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
>> {"return":{"expected-downtime":1020,"xbzrle-cache":{"bytes":1108284,
>> "cache-size":8388608,"cache-miss-rate":0.987013,"pages":18297,"overflow":8,
>> "cache-miss":1228737},"status":"active","setup-time":10,"total-time":52398,
>> "ram":{"total":12466991104,"remaining":1695744,"mbps":935.559472,
>> "transferred":5780760580,"dirty-sync-counter":271,"duplicate":2878530,
>> "dirty-pages-rate":29130,"skipped":0,"normal-bytes":5748592640,
>> "normal":1403465}},"id":"libvirt-706"}
>> 
>> 18k pages sent compressed
>> cache-miss-rate is 98.7%, totally miss.
>> 
>> after optimizing:
>> virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
>> {"return":{"expected-downtime":2054,"xbzrle-cache":{"bytes":5066763,
>> "cache-size":8388608,"cache-miss-rate":0.485924,"pages":194823,"overflow":0,
>> "cache-miss":210653},"status":"active","setup-time":11,"total-time":18729,
>> "ram":{"total":12466991104,"remaining":3895296,"mbps":937.663549,
>> "transferred":1615042219,"dirty-sync-counter":98,"duplicate":2869840,
>> "dirty-pages-rate":58781,"skipped":0,"normal-bytes":1588404224,
>> "normal":387794}},"id":"libvirt-266"}
>> 
>> 194k pages sent compressed
>> The value of cache-miss-rate decrease to 48.59%.
>> 
>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>> arch_init.c                    |  8 +++++---
>> docs/xbzrle.txt                |  8 ++++++++
>> include/migration/page_cache.h | 10 +++++++---
>> page_cache.c                   | 23 +++++++++++++++++++----
>> 4 files changed, 39 insertions(+), 10 deletions(-)
>> 
>> diff --git a/arch_init.c b/arch_init.c
>> index 15ca4c0..84a4bd3 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -343,7 +343,8 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
>> 
>>     /* We don't care if this fails to allocate a new cache page
>>      * as long as it updated an old one */
>> -    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE);
>> +    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE,
>> +                 bitmap_sync_count);
>> }
>> 
>> #define ENCODING_FLAG_XBZRLE 0x1
>> @@ -355,10 +356,11 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>>     int encoded_len = 0, bytes_sent = -1;
>>     uint8_t *prev_cached_page;
>> 
>> -    if (!cache_is_cached(XBZRLE.cache, current_addr)) {
>> +    if (!cache_is_cached(XBZRLE.cache, current_addr, bitmap_sync_count)) {
>>         acct_info.xbzrle_cache_miss++;
>>         if (!last_stage) {
>> -            if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
>> +            if (cache_insert(XBZRLE.cache, current_addr, *current_data,
>> +                             bitmap_sync_count) == -1) {
>>                 return -1;
>>             } else {
>>                 /* update *current_data when the page has been
>> diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
>> index cc3a26a..52c8511 100644
>> --- a/docs/xbzrle.txt
>> +++ b/docs/xbzrle.txt
>> @@ -71,6 +71,14 @@ encoded buffer:
>> encoded length 24
>> e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
>> 
>> +Cache update strategy
>> +=====================
>> +Keeping the hot pages in the cache is effective for decreased cache
>> +misses. XBZRLE uses a counter as the age of each page. The counter will
>> +increase after each ram dirty bitmap sync. When a cache conflict is
>> +detected, XBZRLE will only evict pages in the cache that are older than
>> +a threshold.
>> +
>> Usage
>> ======================
>> 1. Verify the destination QEMU version is able to decode the new format.
>> diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
>> index 2d5ce2d..10ed532 100644
>> --- a/include/migration/page_cache.h
>> +++ b/include/migration/page_cache.h
>> @@ -43,8 +43,10 @@ void cache_fini(PageCache *cache);
>>  *
>>  * @cache pointer to the PageCache struct
>>  * @addr: page addr
>> + * @current_age: current bitmap generation
>>  */
>> -bool cache_is_cached(const PageCache *cache, uint64_t addr);
>> +bool cache_is_cached(const PageCache *cache, uint64_t addr,
>> +                     uint64_t current_age);
>> 
>> /**
>>  * get_cached_data: Get the data cached for an addr
>> @@ -60,13 +62,15 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr);
>>  * cache_insert: insert the page into the cache. the page cache
>>  * will dup the data on insert. the previous value will be overwritten
>>  *
>> - * Returns -1 on error
>> + * Returns -1 when the page isn't inserted into cache
>>  *
>>  * @cache pointer to the PageCache struct
>>  * @addr: page address
>>  * @pdata: pointer to the page
>> + * @current_age: current bitmap generation
>>  */
>> -int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata);
>> +int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
>> +                 uint64_t current_age);
>> 
>> /**
>>  * cache_resize: resize the page cache. In case of size reduction the extra
>> diff --git a/page_cache.c b/page_cache.c
>> index b033681..c78157b 100644
>> --- a/page_cache.c
>> +++ b/page_cache.c
>> @@ -33,6 +33,9 @@
>>     do { } while (0)
>> #endif
>> 
>> +/* the page in cache will not be replaced in two cycles */
>> +#define CACHED_PAGE_LIFETIME 2
>> +
>> typedef struct CacheItem CacheItem;
>> 
>> struct CacheItem {
>> @@ -121,7 +124,8 @@ static size_t cache_get_cache_pos(const PageCache *cache,
>>     return pos;
>> }
>> 
>> -bool cache_is_cached(const PageCache *cache, uint64_t addr)
>> +bool cache_is_cached(const PageCache *cache, uint64_t addr,
>> +                     uint64_t current_age)
>> {
>>     size_t pos;
>> 
>> @@ -130,7 +134,12 @@ bool cache_is_cached(const PageCache *cache, uint64_t addr)
>> 
>>     pos = cache_get_cache_pos(cache, addr);
>> 
>> -    return (cache->page_cache[pos].it_addr == addr);
>> +    if (cache->page_cache[pos].it_addr == addr) {
>> +        /* update the it_age when the cache hit */
>> +        cache->page_cache[pos].it_age = current_age;
>> +        return true;
>> +    }
>> +    return false;
>> }
>> 
>> static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
>> @@ -150,7 +159,8 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
>>     return cache_get_by_addr(cache, addr)->it_data;
>> }
>> 
>> -int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
>> +int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
>> +                 uint64_t current_age)
>> {
>> 
>>     CacheItem *it = NULL;
>> @@ -161,6 +171,11 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
>>     /* actual update of entry */
>>     it = cache_get_by_addr(cache, addr);
>> 
>> +    if (it->it_data &&
>> +        it->it_age + CACHED_PAGE_LIFETIME > current_age) {
>> +        /* the cache page is fresh, don't replace it */

Hi Dave, is it ok? 

-    if (it->it_data &&
+    if (it->it_data && it->it_addr != addr &&
        it->it_age + CACHED_PAGE_LIFETIME > current_age) {

ChenLiang
>> +        return -1;
>> +    }
>>     /* allocate page */
>>     if (!it->it_data) {
>>         it->it_data = g_try_malloc(cache->page_size);
>> @@ -173,7 +188,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
>> 
>>     memcpy(it->it_data, pdata, cache->page_size);
>> 
>> -    it->it_age = ++cache->max_item_age;
>> +    it->it_age = current_age;
>>     it->it_addr = addr;
>> 
>>     return 0;
>> -- 
>> 1.7.12.4
>> 
>> 
>> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-04-03 14:37     ` 陈梁
@ 2014-04-03 14:40       ` Dr. David Alan Gilbert
  2014-04-03 15:36         ` 陈梁
  0 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-03 14:40 UTC (permalink / raw)
  To: ????; +Cc: ChenLiang, arei.gonglei, weidong.huang, qemu-devel, pbonzini

* ???? (chenliang0016@icloud.com) wrote:

<snip>

> Hi Dave, is it ok? 
> 
> -    if (it->it_data &&
> +    if (it->it_data && it->it_addr != addr &&
>         it->it_age + CACHED_PAGE_LIFETIME > current_age) {
> 

I've not had a chance to retry it yet; did you try google stressapptest on it?

Dave

> ChenLiang
> >> +        return -1;
> >> +    }
> >>     /* allocate page */
> >>     if (!it->it_data) {
> >>         it->it_data = g_try_malloc(cache->page_size);
> >> @@ -173,7 +188,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
> >> 
> >>     memcpy(it->it_data, pdata, cache->page_size);
> >> 
> >> -    it->it_age = ++cache->max_item_age;
> >> +    it->it_age = current_age;
> >>     it->it_addr = addr;
> >> 
> >>     return 0;
> >> -- 
> >> 1.7.12.4
> >> 
> >> 
> >> 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-04-03 14:40       ` Dr. David Alan Gilbert
@ 2014-04-03 15:36         ` 陈梁
  0 siblings, 0 replies; 21+ messages in thread
From: 陈梁 @ 2014-04-03 15:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ChenLiang, weidong.huang, arei.gonglei, qemu-devel,
	陈梁, pbonzini


> * ???? (chenliang0016@icloud.com) wrote:
> 
> <snip>
> 
>> Hi Dave, is it ok? 
>> 
>> -    if (it->it_data &&
>> +    if (it->it_data && it->it_addr != addr &&
>>        it->it_age + CACHED_PAGE_LIFETIME > current_age) {
>> 
> 
> I've not had a chance to retry it yet; did you try google stressapptest on it?
> 
> Dave
yes, I try google stressapptest on it. The problem has gone.
> 
>> ChenLiang
>>>> +        return -1;
>>>> +    }
>>>>    /* allocate page */
>>>>    if (!it->it_data) {
>>>>        it->it_data = g_try_malloc(cache->page_size);
>>>> @@ -173,7 +188,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
>>>> 
>>>>    memcpy(it->it_data, pdata, cache->page_size);
>>>> 
>>>> -    it->it_age = ++cache->max_item_age;
>>>> +    it->it_age = current_age;
>>>>    it->it_addr = addr;
>>>> 
>>>>    return 0;
>>>> -- 
>>>> 1.7.12.4
>>>> 
>>>> 
>>>> 
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>> 
>> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-04-03 15:37 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-27  3:18 [Qemu-devel] [PATCH v4 0/8] migration: Optimizate the xbzrle and fix one corruption issue arei.gonglei
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 1/8] XBZRLE: Fix one XBZRLE corruption issues arei.gonglei
2014-03-27  3:26   ` [Qemu-devel] for 2.0? " Eric Blake
2014-03-27  3:45     ` Gonglei (Arei)
2014-03-27 12:31       ` Paolo Bonzini
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 2/8] migration: Add counts of updating the dirty bitmap arei.gonglei
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 3/8] migration: expose the bitmap_sync_count to the end user arei.gonglei
2014-03-27  3:27   ` Eric Blake
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 4/8] migration: expose xbzrle cache miss rate arei.gonglei
2014-03-27  3:28   ` Eric Blake
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 5/8] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
2014-04-01 16:47   ` Dr. David Alan Gilbert
2014-04-02  9:01     ` Gonglei (Arei)
2014-04-03 14:37     ` 陈梁
2014-04-03 14:40       ` Dr. David Alan Gilbert
2014-04-03 15:36         ` 陈梁
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 6/8] XBZRLE: rebuild the cache_is_cached function arei.gonglei
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 7/8] migration: optimize xbzrle by reducing data copy arei.gonglei
2014-03-28 19:59   ` Dr. David Alan Gilbert
2014-03-29  7:51     ` Gonglei (Arei)
2014-03-27  3:18 ` [Qemu-devel] [PATCH v4 8/8] migration: clear the dead code arei.gonglei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).