Re: [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marcelo Tosatti <mtosatti@redhat.com>
To: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
Cc: avi@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp
Subject: Re: [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched()
Date: Thu, 12 Apr 2012 19:56:45 -0300	[thread overview]
Message-ID: <20120412225645.GA19030@amt.cnet> (raw)
In-Reply-To: <20120411202207.2c575e8eab2a6d279c1ec65a@gmail.com>

On Wed, Apr 11, 2012 at 08:22:07PM +0900, Takuya Yoshikawa wrote:
> I am now testing the following patch.
> 
> Note: this technique is used in several subsystems, e.g. jbd.
> 
> Although people tend to say that holding mmu_lock during get_dirty is
> always a problem, my impression is slightly different.

Other than potential performance improvement, the worst case scenario
of holding mmu_lock for hundreds of milliseconds at the beginning 
of migration of huge guests must be fixed.

> When we call get_dirty, most of hot memory pages have already been
> written at least once and faults are becoming rare.
> 
> Actually I rarely saw rescheduling due to mmu_lock contention when
> I tested this patch locally -- though not enough.
> 
> In contrast, if we do O(1), we need to write protect 511 pages soon
> after the get_dirty and the chance of mmu_lock contention may increase
> if multiple VCPUs try to write to memory.
> 
> Anyway, this patch is small and seems effective.
> 
> 	Takuya
> 
> ===
> From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> 
> get_dirty_log() needs to hold mmu_lock during write protecting dirty
> pages and this can be long when there are many dirty pages to protect.
> 
> As the guest can get faulted during that time, this may result in a
> severe latency problem which would prevent the system to scale.
> 
> This patch mitigates this by checking mmu_lock contention for every 2K
> dirty pages we protect: we have selected this value since it took about
> 100us to get 2K dirty pages.
> 
> TODO: more numbers.
> 
> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> ---
>  arch/x86/include/asm/kvm_host.h |    6 +++---
>  arch/x86/kvm/mmu.c              |   12 +++++++++---
>  arch/x86/kvm/x86.c              |   18 +++++++++++++-----
>  3 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f624ca7..26b39c1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -712,9 +712,9 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
>  
>  int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
>  void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> -				     struct kvm_memory_slot *slot,
> -				     gfn_t gfn_offset, unsigned long mask);
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +				    struct kvm_memory_slot *slot,
> +				    gfn_t gfn_offset, unsigned long mask);
>  void kvm_mmu_zap_all(struct kvm *kvm);
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 29ad6f9..b88c5cc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1081,20 +1081,26 @@ static int __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, int level
>   *
>   * Used when we do not need to care about huge page mappings: e.g. during dirty
>   * logging we do not have any such mappings.
> + *
> + * Returns the number of pages protected by this.
>   */
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> -				     struct kvm_memory_slot *slot,
> -				     gfn_t gfn_offset, unsigned long mask)
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +				    struct kvm_memory_slot *slot,
> +				    gfn_t gfn_offset, unsigned long mask)
>  {
>  	unsigned long *rmapp;
> +	int nr_protected = 0;
>  
>  	while (mask) {
>  		rmapp = &slot->rmap[gfn_offset + __ffs(mask)];
>  		__rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL);
> +		++nr_protected;
>  
>  		/* clear the first set bit */
>  		mask &= mask - 1;
>  	}
> +
> +	return nr_protected;
>  }
>  
>  static int rmap_write_protect(struct kvm *kvm, u64 gfn)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0d9a578..b636669 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3092,7 +3092,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  	unsigned long n, i;
>  	unsigned long *dirty_bitmap;
>  	unsigned long *dirty_bitmap_buffer;
> -	bool is_dirty = false;
> +	int nr_protected = 0;
>  
>  	mutex_lock(&kvm->slots_lock);
>  
> @@ -3121,15 +3121,23 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  		if (!dirty_bitmap[i])
>  			continue;
>  
> -		is_dirty = true;
> -
>  		mask = xchg(&dirty_bitmap[i], 0);
>  		dirty_bitmap_buffer[i] = mask;
>  
>  		offset = i * BITS_PER_LONG;
> -		kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
> +		nr_protected += kvm_mmu_write_protect_pt_masked(kvm, memslot,
> +								offset, mask);
> +		if (nr_protected > 2048) {

Can you expand on the reasoning behind this?

next prev parent reply	other threads:[~2012-04-12 23:00 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-11 11:22 [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched() Takuya Yoshikawa
2012-04-12 22:56 ` Marcelo Tosatti [this message]
2012-04-14  0:35   ` Takuya Yoshikawa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120412225645.GA19030@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=takuya.yoshikawa@gmail.com \
    --cc=yoshikawa.takuya@oss.ntt.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.