public inbox for kvm@vger.kernel.org
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
Cc: avi@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp
Subject: Re: [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched()
Date: Thu, 12 Apr 2012 19:56:45 -0300	[thread overview]
Message-ID: <20120412225645.GA19030@amt.cnet> (raw)
In-Reply-To: <20120411202207.2c575e8eab2a6d279c1ec65a@gmail.com>

On Wed, Apr 11, 2012 at 08:22:07PM +0900, Takuya Yoshikawa wrote:
> I am now testing the following patch.
> 
> Note: this technique is used in several subsystems, e.g. jbd.
> 
> Although people tend to say that holding mmu_lock during get_dirty is
> always a problem, my impression is slightly different.

Beyond the potential performance improvement, the worst-case scenario
of holding mmu_lock for hundreds of milliseconds at the beginning
of migration of huge guests must be fixed.

> When we call get_dirty, most of the hot memory pages have already been
> written to at least once and faults are becoming rare.
> 
> Actually I rarely saw rescheduling due to mmu_lock contention when
> I tested this patch locally -- though my testing was not extensive enough.
> 
> In contrast, if we do O(1) write protection, we need to write-protect
> 511 pages soon after get_dirty, and the chance of mmu_lock contention
> may increase if multiple VCPUs try to write to memory.
> 
> Anyway, this patch is small and seems effective.
> 
> 	Takuya
> 
> ===
> From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> 
> get_dirty_log() needs to hold mmu_lock during write protecting dirty
> pages and this can be long when there are many dirty pages to protect.
> 
> As the guest can take write faults during that time, this may result in
> a severe latency problem which would prevent the system from scaling.
> 
> This patch mitigates this by checking for mmu_lock contention after
> every 2K dirty pages we protect: we selected this value because it
> took about 100us to protect 2K dirty pages.
> 
> TODO: more numbers.
> 
> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> ---
>  arch/x86/include/asm/kvm_host.h |    6 +++---
>  arch/x86/kvm/mmu.c              |   12 +++++++++---
>  arch/x86/kvm/x86.c              |   18 +++++++++++++-----
>  3 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f624ca7..26b39c1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -712,9 +712,9 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
>  
>  int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
>  void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> -				     struct kvm_memory_slot *slot,
> -				     gfn_t gfn_offset, unsigned long mask);
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +				    struct kvm_memory_slot *slot,
> +				    gfn_t gfn_offset, unsigned long mask);
>  void kvm_mmu_zap_all(struct kvm *kvm);
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 29ad6f9..b88c5cc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1081,20 +1081,26 @@ static int __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, int level
>   *
>   * Used when we do not need to care about huge page mappings: e.g. during dirty
>   * logging we do not have any such mappings.
> + *
> + * Returns the number of pages protected by this.
>   */
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> -				     struct kvm_memory_slot *slot,
> -				     gfn_t gfn_offset, unsigned long mask)
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +				    struct kvm_memory_slot *slot,
> +				    gfn_t gfn_offset, unsigned long mask)
>  {
>  	unsigned long *rmapp;
> +	int nr_protected = 0;
>  
>  	while (mask) {
>  		rmapp = &slot->rmap[gfn_offset + __ffs(mask)];
>  		__rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL);
> +		++nr_protected;
>  
>  		/* clear the first set bit */
>  		mask &= mask - 1;
>  	}
> +
> +	return nr_protected;
>  }
>  
>  static int rmap_write_protect(struct kvm *kvm, u64 gfn)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0d9a578..b636669 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3092,7 +3092,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  	unsigned long n, i;
>  	unsigned long *dirty_bitmap;
>  	unsigned long *dirty_bitmap_buffer;
> -	bool is_dirty = false;
> +	int nr_protected = 0;
>  
>  	mutex_lock(&kvm->slots_lock);
>  
> @@ -3121,15 +3121,23 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  		if (!dirty_bitmap[i])
>  			continue;
>  
> -		is_dirty = true;
> -
>  		mask = xchg(&dirty_bitmap[i], 0);
>  		dirty_bitmap_buffer[i] = mask;
>  
>  		offset = i * BITS_PER_LONG;
> -		kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
> +		nr_protected += kvm_mmu_write_protect_pt_masked(kvm, memslot,
> +								offset, mask);
> +		if (nr_protected > 2048) {

Can you expand on the reasoning behind this?



Thread overview: 3+ messages
2012-04-11 11:22 [PATCH being tested] KVM: Reduce mmu_lock contention during dirty logging by cond_resched() Takuya Yoshikawa
2012-04-12 22:56 ` Marcelo Tosatti [this message]
2012-04-14  0:35   ` Takuya Yoshikawa
