From: Marcelo Tosatti <mtosatti@redhat.com>
To: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
Cc: avi@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp
Subject: Re: [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched()
Date: Thu, 12 Apr 2012 19:56:45 -0300 [thread overview]
Message-ID: <20120412225645.GA19030@amt.cnet> (raw)
In-Reply-To: <20120411202207.2c575e8eab2a6d279c1ec65a@gmail.com>
On Wed, Apr 11, 2012 at 08:22:07PM +0900, Takuya Yoshikawa wrote:
> I am now testing the following patch.
>
> Note: this technique is used in several subsystems, e.g. jbd.
>
> Although people tend to say that holding mmu_lock during get_dirty is
> always a problem, my impression is slightly different.
Beyond any potential performance improvement, the worst case scenario,
where mmu_lock is held for hundreds of milliseconds at the beginning
of migration of huge guests, must be fixed.
> When we call get_dirty, most of hot memory pages have already been
> written at least once and faults are becoming rare.
>
> Actually I rarely saw rescheduling due to mmu_lock contention when
> I tested this patch locally -- though my testing was not extensive.
>
> In contrast, if we do O(1) write protection, we need to write protect
> 511 pages soon after the get_dirty, and the chance of mmu_lock
> contention may increase if multiple VCPUs try to write to memory.
>
> Anyway, this patch is small and seems effective.
>
> Takuya
>
> ===
> From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
>
> get_dirty_log() needs to hold mmu_lock while write protecting dirty
> pages, and this can take a long time when there are many dirty pages
> to protect.
>
> As the guest can fault during that time, this may result in a
> severe latency problem which would prevent the system from scaling.
>
> This patch mitigates this by checking for mmu_lock contention once for
> every 2K dirty pages we protect: we selected this value since it took
> about 100us to process 2K dirty pages.
>
> TODO: more numbers.
>
> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> ---
> arch/x86/include/asm/kvm_host.h | 6 +++---
> arch/x86/kvm/mmu.c | 12 +++++++++---
> arch/x86/kvm/x86.c | 18 +++++++++++++-----
> 3 files changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f624ca7..26b39c1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -712,9 +712,9 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
>
> int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
> void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> - struct kvm_memory_slot *slot,
> - gfn_t gfn_offset, unsigned long mask);
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> + struct kvm_memory_slot *slot,
> + gfn_t gfn_offset, unsigned long mask);
> void kvm_mmu_zap_all(struct kvm *kvm);
> unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
> void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 29ad6f9..b88c5cc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1081,20 +1081,26 @@ static int __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, int level
> *
> * Used when we do not need to care about huge page mappings: e.g. during dirty
> * logging we do not have any such mappings.
> + *
> + * Returns the number of pages protected by this.
> */
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> - struct kvm_memory_slot *slot,
> - gfn_t gfn_offset, unsigned long mask)
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> + struct kvm_memory_slot *slot,
> + gfn_t gfn_offset, unsigned long mask)
> {
> unsigned long *rmapp;
> + int nr_protected = 0;
>
> while (mask) {
> rmapp = &slot->rmap[gfn_offset + __ffs(mask)];
> __rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL);
> + ++nr_protected;
>
> /* clear the first set bit */
> mask &= mask - 1;
> }
> +
> + return nr_protected;
> }
>
> static int rmap_write_protect(struct kvm *kvm, u64 gfn)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0d9a578..b636669 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3092,7 +3092,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> unsigned long n, i;
> unsigned long *dirty_bitmap;
> unsigned long *dirty_bitmap_buffer;
> - bool is_dirty = false;
> + int nr_protected = 0;
>
> mutex_lock(&kvm->slots_lock);
>
> @@ -3121,15 +3121,23 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> if (!dirty_bitmap[i])
> continue;
>
> - is_dirty = true;
> -
> mask = xchg(&dirty_bitmap[i], 0);
> dirty_bitmap_buffer[i] = mask;
>
> offset = i * BITS_PER_LONG;
> - kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
> + nr_protected += kvm_mmu_write_protect_pt_masked(kvm, memslot,
> + offset, mask);
> + if (nr_protected > 2048) {
Can you expand on the reasoning behind this?
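For reference, here is a minimal userspace sketch of the two loops being
modified: the set-bit walk in kvm_mmu_write_protect_pt_masked() and the
get_dirty_log() loop with the proposed 2K threshold. The names and the
simulated lock-break counter are illustrative only, not the actual kvm
code; the kernel's __ffs(mask) corresponds to ffsl(mask) - 1 here, and
the break point is where the patch would flush TLBs and
cond_resched_lock(&kvm->mmu_lock).

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* ffsl() */

/* Sketch of kvm_mmu_write_protect_pt_masked(): walk the set bits of one
 * bitmap word, "protecting" one page per bit, clearing the lowest set
 * bit each iteration.  Returns the number of pages protected. */
static int protect_masked(unsigned long mask)
{
	int nr_protected = 0;

	while (mask) {
		long bit = ffsl((long)mask) - 1;

		(void)bit;          /* kernel indexes slot->rmap[offset + bit] */
		++nr_protected;
		mask &= mask - 1;   /* clear the lowest set bit */
	}
	return nr_protected;
}

/* Sketch of the get_dirty_log() loop with the proposed threshold: after
 * each word, if more than 2048 pages were protected since the last
 * break, drop the (simulated) lock and reset the counter.  Returns how
 * many lock breaks this bitmap would cause. */
static int count_lock_breaks(const unsigned long *bitmap, size_t nwords)
{
	int nr_protected = 0, breaks = 0;
	size_t i;

	for (i = 0; i < nwords; ++i) {
		if (!bitmap[i])
			continue;
		nr_protected += protect_masked(bitmap[i]);
		if (nr_protected > 2048) {
			/* kernel: flush TLBs, cond_resched_lock(&mmu_lock) */
			++breaks;
			nr_protected = 0;
		}
	}
	return breaks;
}
```

With a fully dirty 64-word (4096-page) bitmap this yields a single lock
break: the counter first exceeds 2048 after the 33rd word (2112 pages),
and the remaining 31 words contribute only 1984 more.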
Thread overview: 3+ messages
2012-04-11 11:22 [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched() Takuya Yoshikawa
2012-04-12 22:56 ` Marcelo Tosatti [this message]
2012-04-14 0:35 ` Takuya Yoshikawa