kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
	Paul Mackerras <paulus@ozlabs.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Janosch Frank <frankja@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/8] KVM: Introduce memslots hva tree
Date: Wed, 19 May 2021 23:07:03 +0000	[thread overview]
Message-ID: <YKWaFwgMNSaQQuQP@google.com> (raw)
In-Reply-To: <cf1695b3e1ba495a4d23cbdc66e0fa9b7b535cc3.1621191551.git.maciej.szmigiero@oracle.com>

Nit: something like "KVM: Use interval tree to do fast hva lookup in memslots"
would be more helpful when perusing the shortlogs.  Stating that a tree is being
added doesn't provide any hint as to why, or even the what is somewhat unclear.

On Sun, May 16, 2021, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> 
> The current memslots implementation only allows quick binary search by gfn,
> quick lookup by hva is not possible - the implementation has to do a linear
> scan of the whole memslots array, even though the operation being performed
> might apply just to a single memslot.
> 
> This significantly hurts performance of per-hva operations with higher
> memslot counts.
> 
> Since hva ranges can overlap between memslots an interval tree is needed
> for tracking them.
> 
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> ---

...

> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d3a35646dfd8..f59847b6e9b3 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -27,6 +27,7 @@
>  #include <linux/rcuwait.h>
>  #include <linux/refcount.h>
>  #include <linux/nospec.h>
> +#include <linux/interval_tree.h>
>  #include <linux/hashtable.h>
>  #include <asm/signal.h>
>  
> @@ -358,6 +359,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
>  
>  struct kvm_memory_slot {
>  	struct hlist_node id_node;
> +	struct interval_tree_node hva_node;
>  	gfn_t base_gfn;
>  	unsigned long npages;
>  	unsigned long *dirty_bitmap;
> @@ -459,6 +461,7 @@ static inline int kvm_arch_vcpu_memslots_id(struct kvm_vcpu *vcpu)
>   */
>  struct kvm_memslots {
>  	u64 generation;
> +	struct rb_root_cached hva_tree;
>  	/* The mapping table from slot id to the index in memslots[]. */
>  	DECLARE_HASHTABLE(id_hash, 7);
>  	atomic_t lru_slot;
> @@ -679,6 +682,11 @@ static inline struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu)
>  	return __kvm_memslots(vcpu->kvm, as_id);
>  }
>  
> +#define kvm_for_each_hva_range_memslot(node, slots, start, last)	     \

kvm_for_each_memslot_in_range()?  Or kvm_for_each_memslot_in_hva_range()?

Please add a comment about whether start is inclusive or exclusive.

I'd also be in favor of hiding this in kvm_main.c, just above the MMU notifier
usage.  It'd be nice to discourage arch code from adding lookups that more than
likely belong in generic code.

> +	for (node = interval_tree_iter_first(&slots->hva_tree, start, last); \
> +	     node;							     \
> +	     node = interval_tree_iter_next(node, start, last))	     \
> +
>  static inline
>  struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
>  {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 50f9bc9bb1e0..a55309432c9a 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -488,6 +488,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
>  	struct kvm_memslots *slots;
>  	int i, idx;
>  
> +	if (range->end == range->start || WARN_ON(range->end < range->staart))

I'm pretty sure both of these are WARNable offenses, i.e. they can be combined.
It'd also be a good idea to use WARN_ON_ONCE(); if a caller does manage to
trigger this, odds are good it will get spammed.

Also, does interval_tree_iter_first() explode if given bad inputs?  If not, I'd
probably say just omit this entirely.  If it does explode, it might be a good idea
to work the sanity check into the macro, even if the macro is hidden here.

> +		return 0;
> +
>  	/* A null handler is allowed if and only if on_lock() is provided. */
>  	if (WARN_ON_ONCE(IS_KVM_NULL_FN(range->on_lock) &&
>  			 IS_KVM_NULL_FN(range->handler)))
> @@ -507,15 +510,18 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
>  	}
>  
>  	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
> +		struct interval_tree_node *node;
> +
>  		slots = __kvm_memslots(kvm, i);
> -		kvm_for_each_memslot(slot, slots) {
> +		kvm_for_each_hva_range_memslot(node, slots,
> +					       range->start, range->end - 1) {
>  			unsigned long hva_start, hva_end;
>  
> +			slot = container_of(node, struct kvm_memory_slot,
> +					    hva_node);

Eh, let that poke out.  The 80 limit is more of a guideline.

>  			hva_start = max(range->start, slot->userspace_addr);
>  			hva_end = min(range->end, slot->userspace_addr +
>  						  (slot->npages << PAGE_SHIFT));
> -			if (hva_start >= hva_end)
> -				continue;
>  
>  			/*
>  			 * To optimize for the likely case where the address
> @@ -787,6 +793,7 @@ static struct kvm_memslots *kvm_alloc_memslots(void)
>  	if (!slots)
>  		return NULL;
>  
> +	slots->hva_tree = RB_ROOT_CACHED;
>  	hash_init(slots->id_hash);
>  
>  	return slots;
> @@ -1113,10 +1120,14 @@ static inline void kvm_memslot_delete(struct kvm_memslots *slots,
>  		atomic_set(&slots->lru_slot, 0);
>  
>  	for (i = dmemslot - mslots; i < slots->used_slots; i++) {
> +		interval_tree_remove(&mslots[i].hva_node, &slots->hva_tree);
>  		hash_del(&mslots[i].id_node);

I think it would make sense to add helpers for these?  Not sure I like the names,
but it would certainly dedup the code a bit.

static void kvm_memslot_remove(struct kvm_memslots *slots,
			       struct kvm_memslot *memslot)
{
	interval_tree_remove(&memslot->hva_node, &slots->hva_tree);
	hash_del(&memslot->id_node);
}

static void kvm_memslot_insert(struct kvm_memslots *slots,
			       struct kvm_memslot *memslot)
{
	interval_tree_insert(&memslot->hva_node, &slots->hva_tree);
	hash_add(slots->id_hash, &memslot->id_node, memslot->id);
}

> +
>  		mslots[i] = mslots[i + 1];
> +		interval_tree_insert(&mslots[i].hva_node, &slots->hva_tree);
>  		hash_add(slots->id_hash, &mslots[i].id_node, mslots[i].id);
>  	}
> +	interval_tree_remove(&mslots[i].hva_node, &slots->hva_tree);
>  	hash_del(&mslots[i].id_node);
>  	mslots[i] = *memslot;
>  }

  reply	other threads:[~2021-05-19 23:07 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-16 21:44 [PATCH v3 0/8] KVM: Scalable memslots implementation Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 1/8] KVM: x86: Cache total page count to avoid traversing the memslot array Maciej S. Szmigiero
2021-05-19 21:00   ` Sean Christopherson
2021-05-21  7:03     ` Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 2/8] KVM: Integrate gfn_to_memslot_approx() into search_memslots() Maciej S. Szmigiero
2021-05-19 21:24   ` Sean Christopherson
2021-05-21  7:03     ` Maciej S. Szmigiero
2021-06-10 16:17     ` Paolo Bonzini
2021-05-16 21:44 ` [PATCH v3 3/8] KVM: Resolve memslot ID via a hash table instead of via a static array Maciej S. Szmigiero
2021-05-19 22:31   ` Sean Christopherson
2021-05-21  7:05     ` Maciej S. Szmigiero
2021-05-22 11:11       ` Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 4/8] KVM: Introduce memslots hva tree Maciej S. Szmigiero
2021-05-19 23:07   ` Sean Christopherson [this message]
2021-05-21  7:06     ` Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 5/8] KVM: s390: Introduce kvm_s390_get_gfn_end() Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 6/8] KVM: Keep memslots in tree-based structures instead of array-based ones Maciej S. Szmigiero
2021-05-19 23:10   ` Sean Christopherson
2021-05-21  7:06     ` Maciej S. Szmigiero
2021-05-25 23:21   ` Sean Christopherson
2021-06-01 20:24     ` Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 7/8] KVM: Optimize gfn lookup in kvm_zap_gfn_range() Maciej S. Szmigiero
2021-05-26 17:33   ` Sean Christopherson
2021-06-01 20:25     ` Maciej S. Szmigiero
2021-05-16 21:44 ` [PATCH v3 8/8] KVM: Optimize overlapping memslots check Maciej S. Szmigiero

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKWaFwgMNSaQQuQP@google.com \
    --to=seanjc@google.com \
    --cc=aleksandar.qemu.devel@gmail.com \
    --cc=borntraeger@de.ibm.com \
    --cc=chenhuacai@kernel.org \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=imammedo@redhat.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=james.morse@arm.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=paulus@ozlabs.org \
    --cc=pbonzini@redhat.com \
    --cc=suzuki.poulose@arm.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).