Re: [PATCH v7 04/11] KVM: MMU: zap pages in batch

From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: gleb@redhat.com, avi.kivity@gmail.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v7 04/11] KVM: MMU: zap pages in batch
Date: Wed, 29 May 2013 21:09:09 +0800
Message-ID: <51A5FDF5.8020003@linux.vnet.ibm.com>
In-Reply-To: <20130529111132.GA5931@amt.cnet>

On 05/29/2013 07:11 PM, Marcelo Tosatti wrote:
> On Tue, May 28, 2013 at 11:02:09PM +0800, Xiao Guangrong wrote:
>> On 05/28/2013 08:18 AM, Marcelo Tosatti wrote:
>>> On Mon, May 27, 2013 at 10:20:12AM +0800, Xiao Guangrong wrote:
>>>> On 05/25/2013 04:34 AM, Marcelo Tosatti wrote:
>>>>> On Thu, May 23, 2013 at 03:55:53AM +0800, Xiao Guangrong wrote:
>>>>>> Zap at least 10 pages before releasing mmu-lock to reduce the overload
>>>>>> caused by acquiring the lock
>>>>>>
>>>>>> After the patch, kvm_zap_obsolete_pages can make forward progress anyway,
>>>>>> so update the comments
>>>>>>
>>>>>> [ It improves kernel building 0.6% ~ 1% ]
>>>>>
>>>>> Can you please describe the overload in more detail? Under what scenario
>>>>> is kernel building improved?
>>>>
>>>> Yes.
>>>>
>>>> The scenario is a kernel build running while the PCI ROM is read
>>>> repeatedly, once per second.
>>>>
>>>> [
>>>>    echo 1 > /sys/bus/pci/devices/0000\:00\:03.0/rom
>>>>    cat /sys/bus/pci/devices/0000\:00\:03.0/rom > /dev/null
>>>> ]
>>>
>>> Can't see why it reflects real world scenario (or a real world
>>> scenario with same characteristics regarding kvm_mmu_zap_all vs faults)?
>>>
>>> Point is, it would be good to understand why this change 
>>> is improving performance? What are these cases where breaking out of
>>> kvm_mmu_zap_all due to either (need_resched || spin_needbreak) on zapped
>>> < 10 ?
>>
>> When the guest reads the ROM, qemu sets up the memory mapping for the
>> device's firmware, which is why kvm_mmu_zap_all can be called in this scenario.
>>
>> The reasons why it hurts performance are:
>> 1): Qemu uses a global io-lock to sync all vcpus, so the io-lock is held
>>     while we do kvm_mmu_zap_all(). If kvm_mmu_zap_all() is not efficient, all
>>     other vcpus need to wait a long time to do I/O.
>>
>> 2): kvm_mmu_zap_all() is triggered in vcpu context, so it can block IPI
>>     requests from other vcpus.
>>
>> Is it enough?
> 
> That is no problem. The problem is why you chose "10" as the minimum number of
> pages to zap before considering reschedule. I would expect the need to

Well, my description above explained why batch-zapping is needed: we do
not want a vcpu to spend a long time zapping all pages, because that hurts
the other running vcpus.

But as for why the batch size is "10"... I cannot answer that. I just guessed
that "10" would keep the vcpu from spending a long time in zap_all_pages without
making mmu-lock too contended. "10" is a speculative value and I am not sure
it is the best one, but at least I think it can work.

> reschedule to be rare enough that one kvm_mmu_zap_all instance (between
> schedule in and schedule out) to be able to release no less than a
> thousand pages.

Unfortunately, no.

This is what I replied to Gleb in his mail, where he raised the question of
why "collapse tlb flush" is needed:

======
It seems not.
Since we reload the mmu before zapping the obsolete pages, the mmu-lock
is easily contended. I did a simple trace:

+       int num = 0;
 restart:
        list_for_each_entry_safe_reverse(sp, node,
              &kvm->arch.active_mmu_pages, link) {
@@ -4265,6 +4265,7 @@ restart:
                if (batch >= BATCH_ZAP_PAGES &&
                      cond_resched_lock(&kvm->mmu_lock)) {
                        batch = 0;
+                       num++;
                        goto restart;
                }

@@ -4277,6 +4278,7 @@ restart:
         * may use the pages.
         */
        kvm_mmu_commit_zap_page(kvm, &invalid_list);
+       printk("lock-break: %d.\n", num);
 }

I read the PCI ROM while building a kernel in a guest that has 1G of
memory and 4 vcpus with ept enabled; this is a normal workload and a
normal configuration.

# dmesg
[ 2338.759099] lock-break: 8.
[ 2339.732442] lock-break: 5.
[ 2340.904446] lock-break: 3.
[ 2342.513514] lock-break: 3.
[ 2343.452229] lock-break: 3.
[ 2344.981599] lock-break: 4.

Basically, we need to break many times.
======

You can see that we have to break at least 3 times to zap all pages even though
we zap 10 pages per batch. Obviously, it would need to break even more times
without batch-zapping.
