qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v2 00/10] migration: improve and cleanup compression
@ 2018-03-27  9:10 guangrong.xiao
  2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 01/10] migration: stop compressing page in migration thread guangrong.xiao
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: guangrong.xiao @ 2018-03-27  9:10 UTC (permalink / raw)
  To: pbonzini, mst, mtosatti
  Cc: qemu-devel, kvm, dgilbert, peterx, jiang.biao2, wei.w.wang,
	Xiao Guangrong

From: Xiao Guangrong <xiaoguangrong@tencent.com>

Changelog in v2:
Thanks to the reviews from Dave, Peter, Wei and Jiang Biao, the changes
in this version are:
1) include the performance numbers in the cover letter
2) add some comments to explain how z_stream->opaque is used in the
   patchset
3) allocate an internal buffer per thread to store the data to
   be compressed
4) add a new patch that moves some code to ram_save_host_page() so
   that 'goto' can be omitted gracefully
5) split the compression and decompression optimizations into two
   separate patches
6) refine and correct code style


This is the first part of our work to improve compression and make it
more useful in production.

The first patch resolves the problem that the migration thread spends
too much CPU time compressing memory when it jumps to a new block,
which leaves the network badly underutilized.

The second patch fixes a performance issue where too many VM-exits
happen during live migration when compression is used. It is caused by
large amounts of memory being returned to the kernel frequently, as
memory is allocated and freed for every single call to compress2().

The remaining patches clean up the code dramatically.

Performance numbers:
We tested it on a desktop machine, i7-4790 + 16G memory, by locally
live migrating a VM with 8 vCPUs + 6G memory, with max-bandwidth
limited to 350. During the migration, a workload with 8 threads
repeatedly wrote to the whole 6G of memory in the VM.

Before this patchset, the bandwidth is ~25 mbps; after applying it, the
bandwidth is ~50 mbps.

We also collected perf data for patches 2 and 3 on our production
systems. Before the patchset:
+  57.88%  kqemu  [kernel.kallsyms]        [k] queued_spin_lock_slowpath
+  10.55%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
+   4.83%  kqemu  [kernel.kallsyms]        [k] flush_tlb_func_common

-   1.16%  kqemu  [kernel.kallsyms]        [k] lock_acquire
   - lock_acquire
      - 15.68% _raw_spin_lock
         + 29.42% __schedule
         + 29.14% perf_event_context_sched_out
         + 23.60% tdp_page_fault
         + 10.54% do_anonymous_page
         + 2.07% kvm_mmu_notifier_invalidate_range_start
         + 1.83% zap_pte_range
         + 1.44% kvm_mmu_notifier_invalidate_range_end


After applying our work:
+  51.92%  kqemu  [kernel.kallsyms]        [k] queued_spin_lock_slowpath
+  14.82%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
+   1.47%  kqemu  [kernel.kallsyms]        [k] mark_lock.clone.0
+   1.46%  kqemu  [kernel.kallsyms]        [k] native_sched_clock
+   1.31%  kqemu  [kernel.kallsyms]        [k] lock_acquire
+   1.24%  kqemu  libc-2.12.so             [.] __memset_sse2

-  14.82%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
   - __lock_acquire
      - 99.75% lock_acquire
         - 18.38% _raw_spin_lock
            + 39.62% tdp_page_fault
            + 31.32% __schedule
            + 27.53% perf_event_context_sched_out
            + 0.58% hrtimer_interrupt


We can see that the TLB-flush overhead and the mmu-lock contention are
gone.

Xiao Guangrong (10):
  migration: stop compressing page in migration thread
  migration: stop compression to allocate and free memory frequently
  migration: stop decompression to allocate and free memory frequently
  migration: detect compression and decompression errors
  migration: introduce control_save_page()
  migration: move some code ram_save_host_page
  migration: move calling control_save_page to the common place
  migration: move calling save_zero_page to the common place
  migration: introduce save_normal_page()
  migration: remove ram_save_compressed_page()

 migration/qemu-file.c |  43 ++++-
 migration/qemu-file.h |   6 +-
 migration/ram.c       | 479 ++++++++++++++++++++++++++++++--------------------
 3 files changed, 322 insertions(+), 206 deletions(-)

-- 
2.14.3


Thread overview: 22+ messages (thread ends ~2018-04-02  3:32 UTC)
2018-03-27  9:10 [Qemu-devel] [PATCH v2 00/10] migration: improve and cleanup compression guangrong.xiao
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 01/10] migration: stop compressing page in migration thread guangrong.xiao
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 02/10] migration: stop compression to allocate and free memory frequently guangrong.xiao
2018-03-28  9:25   ` Peter Xu
2018-03-29  3:41     ` Xiao Guangrong
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 03/10] migration: stop decompression " guangrong.xiao
2018-03-28  9:42   ` Peter Xu
2018-03-29  3:43     ` Xiao Guangrong
2018-03-29  4:14       ` Peter Xu
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 04/10] migration: detect compression and decompression errors guangrong.xiao
2018-03-28  9:59   ` Peter Xu
2018-03-29  3:51     ` Xiao Guangrong
2018-03-29  4:25       ` Peter Xu
2018-03-30  3:11         ` Xiao Guangrong
2018-04-02  4:26           ` Peter Xu
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 05/10] migration: introduce control_save_page() guangrong.xiao
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 06/10] migration: move some code ram_save_host_page guangrong.xiao
2018-03-28 10:05   ` Peter Xu
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 07/10] migration: move calling control_save_page to the common place guangrong.xiao
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 08/10] migration: move calling save_zero_page " guangrong.xiao
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 09/10] migration: introduce save_normal_page() guangrong.xiao
2018-03-27  9:10 ` [Qemu-devel] [PATCH v2 10/10] migration: remove ram_save_compressed_page() guangrong.xiao
