From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Xiao Guangrong <guangrong.xiao@gmail.com>,
	mst@redhat.com, mtosatti@redhat.com, qemu-devel@nongnu.org,
	kvm@vger.kernel.org, peterx@redhat.com, jiang.biao2@zte.com.cn,
	wei.w.wang@intel.com, Xiao Guangrong <xiaoguangrong@tencent.com>,
	Stefan Hajnoczi <stefanha@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v3 00/10] migration: improve and cleanup compression
Date: Mon, 9 Apr 2018 20:30:52 +0100
Message-ID: <20180409193052.GM2449@work-vm>
In-Reply-To: <b8ec9404-b34f-bd4f-df02-8227683f0ef8@redhat.com>

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 08/04/2018 05:19, Xiao Guangrong wrote:
> > 
> > Hi Paolo, Michael, Stefan and others,
> > 
> > Could anyone merge this patchset if it is okay to you guys?
> 
> Hi Guangrong,
> 
> Dave and Juan will take care of merging it.  However, right now QEMU is
> in freeze so they may wait a week or two.  If they have reviewed it,
> it's certainly on their radar!

Yep, one of us will get it at the start of 2.13.

Dave

> Thanks,
> 
> Paolo
> 
> > On 03/30/2018 03:51 PM, guangrong.xiao@gmail.com wrote:
> >> From: Xiao Guangrong <xiaoguangrong@tencent.com>
> >>
> >> Changelog in v3:
> >> Following changes are from Peter's review:
> >> 1) use comp_param[i].file and decomp_param[i].compbuf to indicate
> >>     whether the thread has been properly initialized (see the sketch
> >>     below)
> >> 2) save the file used by the ram loader in a global variable instead
> >>     of caching it per decompression thread
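> >>
> >> A minimal sketch of that init-indicator pattern (simplified,
> >> hypothetical names -- the real structures live in migration/ram.c):
> >> cleanup only tears down threads whose resource pointer is non-NULL,
> >> so a partially failed setup can be unwound safely.
> >>
> >>     typedef struct QEMUFile QEMUFile;   /* QEMU's file abstraction */
> >>
> >>     #define MAX_COMPRESS_THREADS 16     /* hypothetical bound */
> >>
> >>     typedef struct CompressParam {
> >>         QEMUFile *file;    /* NULL until the thread is fully set up */
> >>         /* ... other per-thread state ... */
> >>     } CompressParam;
> >>
> >>     static CompressParam comp_param[MAX_COMPRESS_THREADS];
> >>
> >>     static void compress_threads_save_cleanup(void)
> >>     {
> >>         for (int i = 0; i < MAX_COMPRESS_THREADS; i++) {
> >>             if (!comp_param[i].file) {
> >>                 continue;  /* thread i was never initialized */
> >>             }
> >>             /* join thread i, free its buffers, close its file ... */
> >>         }
> >>     }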
> >>
> >> Changelog in v2:
> >> Thanks to reviews from Dave, Peter, Wei and Jiang Biao, the changes
> >> in this version are:
> >> 1) include the performance number in the cover letter
> >> 2) add some comments to explain how z_stream->opaque is used in the
> >>     patchset
> >> 3) allocate an internal buffer per thread to store the data to
> >>     be compressed
> >> 4) add a new patch that moves some code to ram_save_host_page() so
> >>     that 'goto' can be omitted gracefully
> >> 5) split the compression and decompression optimizations into two
> >>     separate patches
> >> 6) refine and correct code styles
> >>
> >>
> >> This is the first part of our work to improve compression and make it
> >> more useful in production.
> >>
> >> The first patch resolves the problem that the migration thread spends
> >> too much CPU time compressing memory when it jumps to a new block,
> >> which leaves the network badly underutilized.
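> >>
> >> A rough sketch of the idea (hypothetical helper names, not the
> >> actual patch logic): keep the migration thread out of the
> >> compression work, and fall back to sending the raw page when no
> >> compression thread is free, so the NIC stays busy.
> >>
> >>     /* called from the migration thread for each dirty page;
> >>      * RAMBlock / ram_addr_t are QEMU's types */
> >>     static int save_target_page(RAMBlock *block, ram_addr_t offset)
> >>     {
> >>         if (compress_thread_idle()) {
> >>             /* hand the page to a worker thread; the migration
> >>              * thread goes back to feeding the socket */
> >>             return queue_page_for_compression(block, offset);
> >>         }
> >>         /* compressing inline here would burn migration-thread CPU
> >>          * while the network sits idle; ship the page uncompressed */
> >>         return save_raw_page(block, offset);
> >>     }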
> >>
> >> The second patch fixes the performance issue that too many VM-exits
> >> happen during live migration when compression is used; it is caused
> >> by large amounts of memory being returned to the kernel frequently,
> >> as memory is allocated and freed for every single call to compress2().
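> >>
> >> In plain zlib terms the fix looks roughly like this (a sketch
> >> assuming zlib's standard API; the patchset does the equivalent
> >> inside QEMU's compression threads): initialize one z_stream per
> >> thread once, then deflateReset() it between pages, instead of
> >> calling compress2(), which sets up and tears down its internal
> >> state on every call.
> >>
> >>     #include <zlib.h>
> >>
> >>     static __thread z_stream stream;  /* one stream per thread */
> >>
> >>     static int compress_thread_setup(void)
> >>     {
> >>         stream.zalloc = Z_NULL;  /* custom allocators (and their
> >>                                   * z_stream->opaque cookie) would
> >>                                   * hook in here */
> >>         stream.zfree  = Z_NULL;
> >>         stream.opaque = Z_NULL;
> >>         return deflateInit(&stream, Z_BEST_SPEED) == Z_OK ? 0 : -1;
> >>     }
> >>
> >>     /* returns compressed size, or -1 on error / page did not fit */
> >>     static long compress_page(unsigned char *dst, unsigned long dst_len,
> >>                               unsigned char *src, unsigned long src_len)
> >>     {
> >>         stream.next_in   = src;
> >>         stream.avail_in  = src_len;
> >>         stream.next_out  = dst;
> >>         stream.avail_out = dst_len;
> >>
> >>         if (deflate(&stream, Z_FINISH) != Z_STREAM_END) {
> >>             deflateReset(&stream);
> >>             return -1;
> >>         }
> >>         long produced = dst_len - stream.avail_out;
> >>         deflateReset(&stream);   /* keep the buffers: no malloc/free,
> >>                                   * no memory bounced to the kernel */
> >>         return produced;
> >>     }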
> >>
> >> The remaining patches clean the code up dramatically.
> >>
> >> Performance numbers:
> >> We tested it on my desktop (i7-4790 + 16G) by locally live migrating
> >> a VM with 8 vCPUs + 6G of memory, with max-bandwidth limited to
> >> 350. During the migration, a workload with 8 threads repeatedly
> >> wrote to the whole 6G of memory in the VM.
> >>
> >> Before this patchset the bandwidth was ~25 mbps; after applying it,
> >> the bandwidth is ~50 mbps.
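> >>
> >> For reference, a setup like this can be driven from the QEMU
> >> monitor with the standard migration commands (the exact bandwidth
> >> value/units and thread count used above are not spelled out, so
> >> they are left open):
> >>
> >>     (qemu) migrate_set_capability compress on
> >>     (qemu) migrate_set_parameter compress-threads <n>
> >>     (qemu) migrate_set_parameter max-bandwidth <value>
> >>     (qemu) migrate -d tcp:127.0.0.1:4444
> >>     (qemu) info migrate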
> >>
> >> We also collected perf data for patches 2 and 3 on our production
> >> systems. Before the patchset:
> >> +  57.88%  kqemu  [kernel.kallsyms]        [k] queued_spin_lock_slowpath
> >> +  10.55%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
> >> +   4.83%  kqemu  [kernel.kallsyms]        [k] flush_tlb_func_common
> >>
> >> -   1.16%  kqemu  [kernel.kallsyms]        [k] lock_acquire
> >>        - lock_acquire
> >>           - 15.68% _raw_spin_lock
> >>              + 29.42% __schedule
> >>              + 29.14% perf_event_context_sched_out
> >>              + 23.60% tdp_page_fault
> >>              + 10.54% do_anonymous_page
> >>              + 2.07% kvm_mmu_notifier_invalidate_range_start
> >>              + 1.83% zap_pte_range
> >>              + 1.44% kvm_mmu_notifier_invalidate_range_end
> >>
> >>
> >> After applying our work:
> >> +  51.92%  kqemu  [kernel.kallsyms]        [k] queued_spin_lock_slowpath
> >> +  14.82%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
> >> +   1.47%  kqemu  [kernel.kallsyms]        [k] mark_lock.clone.0
> >> +   1.46%  kqemu  [kernel.kallsyms]        [k] native_sched_clock
> >> +   1.31%  kqemu  [kernel.kallsyms]        [k] lock_acquire
> >> +   1.24%  kqemu  libc-2.12.so             [.] __memset_sse2
> >>
> >> -  14.82%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
> >>        - __lock_acquire
> >>           - 99.75% lock_acquire
> >>              - 18.38% _raw_spin_lock
> >>                 + 39.62% tdp_page_fault
> >>                 + 31.32% __schedule
> >>                 + 27.53% perf_event_context_sched_out
> >>                 + 0.58% hrtimer_interrupt
> >>
> >>
> >> We can see that the TLB flush and mmu-lock contention are gone.
> >>
> >> Xiao Guangrong (10):
> >>    migration: stop compressing page in migration thread
> >>    migration: stop compression to allocate and free memory frequently
> >>    migration: stop decompression to allocate and free memory frequently
> >>    migration: detect compression and decompression errors
> >>    migration: introduce control_save_page()
> >>    migration: move some code to ram_save_host_page
> >>    migration: move calling control_save_page to the common place
> >>    migration: move calling save_zero_page to the common place
> >>    migration: introduce save_normal_page()
> >>    migration: remove ram_save_compressed_page()
> >>
> >>   migration/qemu-file.c |  43 ++++-
> >>   migration/qemu-file.h |   6 +-
> >>   migration/ram.c       | 482 ++++++++++++++++++++++++++++++--------------------
> >>   3 files changed, 324 insertions(+), 207 deletions(-)
> >>
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 16+ messages
2018-03-30  7:51 [Qemu-devel] [PATCH v3 00/10] migration: improve and cleanup compression guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 01/10] migration: stop compressing page in migration thread guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 02/10] migration: stop compression to allocate and free memory frequently guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 03/10] migration: stop decompression " guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 04/10] migration: detect compression and decompression errors guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 05/10] migration: introduce control_save_page() guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 06/10] migration: move some code to ram_save_host_page guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 07/10] migration: move calling control_save_page to the common place guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 08/10] migration: move calling save_zero_page " guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 09/10] migration: introduce save_normal_page() guangrong.xiao
2018-03-30  7:51 ` [Qemu-devel] [PATCH v3 10/10] migration: remove ram_save_compressed_page() guangrong.xiao
2018-03-31  8:22 ` [Qemu-devel] [PATCH v3 00/10] migration: improve and cleanup compression no-reply
2018-04-08  3:19 ` Xiao Guangrong
2018-04-09  9:17   ` Paolo Bonzini
2018-04-09 19:30     ` Dr. David Alan Gilbert [this message]
2018-04-25 17:04 ` Dr. David Alan Gilbert
