Re: [PATCH 07/12] migration: hold the lock only if it is really needed

From: Xiao Guangrong <guangrong.xiao@gmail.com>
To: Peter Xu <peterx@redhat.com>
Cc: kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com,
	Xiao Guangrong <xiaoguangrong@tencent.com>,
	dgilbert@redhat.com, qemu-devel@nongnu.org, wei.w.wang@intel.com,
	jiang.biao2@zte.com.cn, pbonzini@redhat.com
Subject: Re: [PATCH 07/12] migration: hold the lock only if it is really needed
Date: Wed, 18 Jul 2018 16:56:13 +0800
Message-ID: <7383ea26-99ba-3264-32e9-a60be451a40c@gmail.com>
In-Reply-To: <20180712082602.GA11068@xz-mi>



On 07/12/2018 04:26 PM, Peter Xu wrote:
> On Thu, Jul 12, 2018 at 03:47:57PM +0800, Xiao Guangrong wrote:
>>
>>
>> On 07/11/2018 04:21 PM, Peter Xu wrote:
>>> On Thu, Jun 28, 2018 at 05:33:58PM +0800, Xiao Guangrong wrote:
>>>>
>>>>
>>>> On 06/19/2018 03:36 PM, Peter Xu wrote:
>>>>> On Mon, Jun 04, 2018 at 05:55:15PM +0800, guangrong.xiao@gmail.com wrote:
>>>>>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>>>>>
>>>>>> Try to hold src_page_req_mutex only if the queue is not
>>>>>> empty
>>>>>
>>>>> Pure question: how much would this patch help?  Basically, if you are
>>>>> running compression tests then I think it means you are using precopy
>>>>> (since postcopy cannot work with compression yet), so here the lock
>>>>> has no contention at all.
>>>>
>>>> Yes, you are right; however, we can observe that it is among the top
>>>> functions (after reverting this patch):
>>>>
>>>> Samples: 29K of event 'cycles', Event count (approx.): 22263412260
>>>> +   7.99%  kqemu  qemu-system-x86_64       [.] ram_bytes_total
>>>> +   6.95%  kqemu  [kernel.kallsyms]        [k] copy_user_enhanced_fast_string
>>>> +   6.23%  kqemu  qemu-system-x86_64       [.] qemu_put_qemu_file
>>>> +   6.20%  kqemu  qemu-system-x86_64       [.] qemu_event_set
>>>> +   5.80%  kqemu  qemu-system-x86_64       [.] __ring_put
>>>> +   4.82%  kqemu  qemu-system-x86_64       [.] compress_thread_data_done
>>>> +   4.11%  kqemu  qemu-system-x86_64       [.] ring_is_full
>>>> +   3.07%  kqemu  qemu-system-x86_64       [.] threads_submit_request_prepare
>>>> +   2.83%  kqemu  qemu-system-x86_64       [.] ring_mp_get
>>>> +   2.71%  kqemu  qemu-system-x86_64       [.] __ring_is_full
>>>> +   2.46%  kqemu  qemu-system-x86_64       [.] buffer_zero_sse2
>>>> +   2.40%  kqemu  qemu-system-x86_64       [.] add_to_iovec
>>>> +   2.21%  kqemu  qemu-system-x86_64       [.] ring_get
>>>> +   1.96%  kqemu  [kernel.kallsyms]        [k] __lock_acquire
>>>> +   1.90%  kqemu  libc-2.12.so             [.] memcpy
>>>> +   1.55%  kqemu  qemu-system-x86_64       [.] ring_len
>>>> +   1.12%  kqemu  libpthread-2.12.so       [.] pthread_mutex_unlock
>>>> +   1.11%  kqemu  qemu-system-x86_64       [.] ram_find_and_save_block
>>>> +   1.07%  kqemu  qemu-system-x86_64       [.] ram_save_host_page
>>>> +   1.04%  kqemu  qemu-system-x86_64       [.] qemu_put_buffer
>>>> +   0.97%  kqemu  qemu-system-x86_64       [.] compress_page_with_multi_thread
>>>> +   0.96%  kqemu  qemu-system-x86_64       [.] ram_save_target_page
>>>> +   0.93%  kqemu  libpthread-2.12.so       [.] pthread_mutex_lock
>>>
>>> (sorry to respond late; I was busy with other stuff for the
>>>    release...)
>>>
>>
>> You're welcome. :)
>>
>>> I am trying to find out anything related to unqueue_page() but I
>>> failed.  Did I miss anything obvious there?
>>>
>>
>> unqueue_page() is indeed not listed here. I think the function
>> itself is light enough (a check followed by an immediate return)
>> that it did not leave a trace here.
>>
>> This perf data was collected after reverting this patch, i.e., it is
>> based on the lockless multithread model, so unqueue_page() is
>> the only place that uses a mutex in the main thread.
>>
>> And you can see that the mutex overhead was gone after applying
>> this patch in the mail I sent in reply to Dave.
> 
> I see.  It's not a big portion of CPU resource, though of course I
> don't have reason to object to this change as well.
> 
> Actually what interested me more is why ram_bytes_total() is such a
> hot spot.  AFAIU it's only called in ram_find_and_save_block() per
> call, and it should be mostly a constant if we don't plug/unplug
> memory.  Not sure whether that means it is a better spot to work on.
> 

I noticed it too. That could be another thing for us to work on. :)
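
For readers following the thread, here is a minimal sketch of the idea
behind this patch: check whether the request queue is empty before
taking the mutex, so the common precopy case (no queued requests) never
touches the lock. This is not the QEMU code. The names struct page_req,
struct req_queue, req_queue_push() and req_queue_pop() are hypothetical,
as is the took_lock out-parameter, which exists only to make the fast
path observable. It assumes C11 atomics and POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request node; not QEMU's actual structure. */
struct page_req {
    struct page_req *next;
    unsigned long offset;
};

/* Hypothetical queue: the list is protected by 'lock', while 'count'
 * is only modified under the lock but may be read without it. */
struct req_queue {
    pthread_mutex_t lock;
    struct page_req *head;
    struct page_req **tail;
    atomic_size_t count;
};

static void req_queue_init(struct req_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    q->tail = &q->head;
    atomic_init(&q->count, 0);
}

static void req_queue_push(struct req_queue *q, struct page_req *req)
{
    assert(req != NULL);
    req->next = NULL;

    pthread_mutex_lock(&q->lock);
    *q->tail = req;
    q->tail = &req->next;
    atomic_fetch_add_explicit(&q->count, 1, memory_order_release);
    pthread_mutex_unlock(&q->lock);
}

/* Returns the oldest request, or NULL if the queue is empty.
 * *took_lock reports whether the mutex was acquired. */
static struct page_req *req_queue_pop(struct req_queue *q, bool *took_lock)
{
    struct page_req *req;

    *took_lock = false;

    /* Fast path: nothing queued, so skip the mutex entirely. */
    if (atomic_load_explicit(&q->count, memory_order_acquire) == 0) {
        return NULL;
    }

    /* Slow path: re-check under the lock, since another consumer
     * may have emptied the queue in the meantime. */
    pthread_mutex_lock(&q->lock);
    *took_lock = true;
    req = q->head;
    if (req != NULL) {
        q->head = req->next;
        if (q->head == NULL) {
            q->tail = &q->head;
        }
        atomic_fetch_sub_explicit(&q->count, 1, memory_order_relaxed);
    }
    pthread_mutex_unlock(&q->lock);
    return req;
}
```

The unlocked check can miss a request that is being queued at that very
moment. In this sketch that is acceptable only if the caller polls
again later, which is an assumption about the caller rather than
something the code enforces.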
