From: Avi Kivity <avi@redhat.com>
To: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: mtosatti@redhat.com, kvm@vger.kernel.org
Subject: Re: [PATCH 2/3] KVM: MMU: Improve iteration over sptes linked by rmap
Date: Thu, 15 Mar 2012 11:44:52 +0200 [thread overview]
Message-ID: <4F61BA14.4070305@redhat.com> (raw)
In-Reply-To: <20120315182038.329d5ad7.yoshikawa.takuya@oss.ntt.co.jp>
On 03/15/2012 11:20 AM, Takuya Yoshikawa wrote:
> Iteration using rmap_next(), the actual body is pte_list_next(), is
> inefficient: every time we call it we start from checking whether rmap
> holds a single spte or points to a descriptor which links more sptes.
>
> In the case of shadow paging, this quadratic total iteration cost is a
> problem. Even for two dimensional paging, with EPT/NPT on, in which we
> almost always have a single spte, the extra checks at the end of the
> iteration should be eliminated.
>
> This patch fixes this by introducing rmap_iterator which keeps the
> iteration context for the next search. Furthermore the implementation
> of rmap_next() is splitted into two functions - rmap_get_first() and
> rmap_get_next() - to avoid repeatedly checking whether the rmap being
> iterated on has only one spte.
>
> Note: we just remove pte_list_next() because we can think of parent_ptes
> as a reverse mapping.
>
> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> ---
> arch/x86/kvm/mmu.c | 198 ++++++++++++++++++++++++++++------------------
> arch/x86/kvm/mmu_audit.c | 8 +-
> 2 files changed, 124 insertions(+), 82 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 384d3c0..d042087 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -842,32 +842,6 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte,
> return count;
> }
>
> -static u64 *pte_list_next(unsigned long *pte_list, u64 *spte)
> -{
> - struct pte_list_desc *desc;
> - u64 *prev_spte;
> - int i;
> -
> - if (!*pte_list)
> - return NULL;
> - else if (!(*pte_list & 1)) {
> - if (!spte)
> - return (u64 *)*pte_list;
> - return NULL;
> - }
> - desc = (struct pte_list_desc *)(*pte_list & ~1ul);
> - prev_spte = NULL;
> - while (desc) {
> - for (i = 0; i < PTE_LIST_EXT && desc->sptes[i]; ++i) {
> - if (prev_spte == spte)
> - return desc->sptes[i];
> - prev_spte = desc->sptes[i];
> - }
> - desc = desc->more;
> - }
> - return NULL;
> -}
> -
> static void
> pte_list_desc_remove_entry(unsigned long *pte_list, struct pte_list_desc *desc,
> int i, struct pte_list_desc *prev_desc)
> @@ -988,11 +962,6 @@ static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
> return pte_list_add(vcpu, spte, rmapp);
> }
>
> -static u64 *rmap_next(unsigned long *rmapp, u64 *spte)
> -{
> - return pte_list_next(rmapp, spte);
> -}
> -
> static void rmap_remove(struct kvm *kvm, u64 *spte)
> {
> struct kvm_mmu_page *sp;
> @@ -1005,6 +974,72 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
> pte_list_remove(spte, rmapp);
> }
>
> +/*
> + * Used by the following functions to iterate over the sptes linked by a rmap.
> + * Only sptep can be used outside of these functions.
> + */
> +struct rmap_iterator {
> + u64 *sptep; /* points to the current spte */
> + /* private fields */
> + struct pte_list_desc *desc; /* holds the sptep if not NULL */
> + int pos; /* index of the sptep */
> +};
> +
> +/*
> + * Iteration must be started by this function. This should also be used after
> + * removing/dropping sptes from rmap because in such cases the information in
> + * the itererator may not be valid.
Note: this suggests rmap_remove(struct rmap_iterator *ri) to remove an
rmap from the iterator while keeping it valid. Converts a potentially
quadratic kvm_mmu_rmap_write_protect() to linear.
> + *
> + * Returns true if spte is found, false otherwise.
> + */
> +static bool rmap_get_first(unsigned long rmap, struct rmap_iterator *iter)
> +{
> + if (!rmap) {
> + iter->sptep = NULL;
> + return false;
> + }
> +
> + if (!(rmap & 1)) {
> + iter->sptep = (u64 *)rmap;
> + iter->desc = NULL;
> + } else {
> + iter->desc = (struct pte_list_desc *)(rmap & ~1ul);
> + iter->pos = 0;
> + iter->sptep = iter->desc->sptes[iter->pos];
> + }
> +
> + return true;
> +}
Might be simplified to return the sptep or NULL; so you don't have to
copy it to the iterator.
--
error compiling committee.c: too many arguments to function
next prev parent reply other threads:[~2012-03-15 9:44 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-15 9:18 [PATCH 0/3] KVM: MMU: Improve rmap handling Takuya Yoshikawa
2012-03-15 9:19 ` [PATCH 1/3] KVM: MMU: Make pte_list_desc fit cache lines well Takuya Yoshikawa
2012-03-15 9:22 ` Avi Kivity
2012-03-15 9:20 ` [PATCH 2/3] KVM: MMU: Improve iteration over sptes linked by rmap Takuya Yoshikawa
2012-03-15 9:44 ` Avi Kivity [this message]
2012-03-15 10:25 ` Takuya Yoshikawa
2012-03-15 9:21 ` [PATCH 3/3] KVM: MMU: Separate trivial NULL check out from rmap_get_next() Takuya Yoshikawa
2012-03-15 9:49 ` Avi Kivity
2012-03-15 10:15 ` Takuya Yoshikawa
2012-03-15 12:01 ` Avi Kivity
2012-03-15 13:41 ` Takuya Yoshikawa
2012-03-15 13:45 ` Avi Kivity
2012-03-15 13:46 ` Avi Kivity
2012-03-20 6:37 ` Takuya Yoshikawa
2012-03-20 11:56 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F61BA14.4070305@redhat.com \
--to=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=yoshikawa.takuya@oss.ntt.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox