* [PATCH 1/3] KVM: Introduce a helper to check if gfn is in memslot @ 2011-04-18 18:32 Takuya Yoshikawa 2011-04-18 18:34 ` [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte Takuya Yoshikawa 2011-04-18 18:38 ` [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk Takuya Yoshikawa 0 siblings, 2 replies; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-18 18:32 UTC (permalink / raw) To: avi, mtosatti; +Cc: kvm, yoshikawa.takuya From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> This will be used later. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> --- include/linux/kvm_host.h | 5 +++++ virt/kvm/kvm_main.c | 6 ++---- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 0bc3d37..9101698 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -612,6 +612,11 @@ static inline unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE; } +static inline bool gfn_in_memslot(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return (gfn >= slot->base_gfn) && (gfn < slot->base_gfn + slot->npages); +} + static inline gpa_t gfn_to_gpa(gfn_t gfn) { return (gpa_t)gfn << PAGE_SHIFT; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 5814645..6df199d 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -940,8 +940,7 @@ static struct kvm_memory_slot *__gfn_to_memslot(struct kvm_memslots *slots, for (i = 0; i < slots->nmemslots; ++i) { struct kvm_memory_slot *memslot = &slots->memslots[i]; - if (gfn >= memslot->base_gfn - && gfn < memslot->base_gfn + memslot->npages) + if (gfn_in_memslot(memslot, gfn)) return memslot; } return NULL; @@ -964,8 +963,7 @@ int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn) if (memslot->flags & KVM_MEMSLOT_INVALID) continue; - if (gfn >= memslot->base_gfn - && gfn < memslot->base_gfn + memslot->npages) + if (gfn_in_memslot(memslot, gfn)) return 1; } return 0; -- 1.7.1 ^ permalink raw reply related [flat|nested] 27+ messages in thread
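A side note on the new helper's semantics: a slot covers the half-open gfn range [base_gfn, base_gfn + npages), so the last valid gfn is base_gfn + npages - 1. A standalone sketch illustrating this, with simplified stand-in types rather than the kernel's real gfn_t and struct kvm_memory_slot:

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long long gfn_t;

struct kvm_memory_slot {
	gfn_t base_gfn;
	unsigned long npages;
};

/* A slot covers the half-open range [base_gfn, base_gfn + npages). */
static bool gfn_in_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
{
	return (gfn >= slot->base_gfn) && (gfn < slot->base_gfn + slot->npages);
}

int main(void)
{
	struct kvm_memory_slot slot = { .base_gfn = 0x100, .npages = 16 };

	printf("%d %d %d\n",
	       gfn_in_memslot(&slot, 0x100),   /* 1: first page of the slot */
	       gfn_in_memslot(&slot, 0x10f),   /* 1: last page of the slot */
	       gfn_in_memslot(&slot, 0x110));  /* 0: one past the end */
	return 0;
}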
* [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-18 18:32 [PATCH 1/3] KVM: Introduce a helper to check if gfn is in memslot Takuya Yoshikawa @ 2011-04-18 18:34 ` Takuya Yoshikawa 2011-04-20 9:07 ` Avi Kivity 2011-04-18 18:38 ` [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk Takuya Yoshikawa 1 sibling, 1 reply; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-18 18:34 UTC (permalink / raw) To: avi, mtosatti; +Cc: kvm, yoshikawa.takuya From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> This will be optimized later. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> --- arch/x86/kvm/paging_tmpl.h | 12 +++++++++--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index 74f8567..109939a 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -109,6 +109,14 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte) return access; } +static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, + gfn_t table_gfn, int offset, pt_element_t *ptep) +{ + return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, + offset, sizeof(*ptep), + PFERR_USER_MASK | PFERR_WRITE_MASK); +} + /* * Fetch a guest pte for a guest virtual address */ @@ -160,9 +168,7 @@ walk: walker->table_gfn[walker->level - 1] = table_gfn; walker->pte_gpa[walker->level - 1] = pte_gpa; - if (kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, &pte, - offset, sizeof(pte), - PFERR_USER_MASK|PFERR_WRITE_MASK)) { + if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, offset, &pte)) { present = false; break; } -- 1.7.1 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-18 18:34 ` [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte Takuya Yoshikawa @ 2011-04-20 9:07 ` Avi Kivity 2011-04-20 9:35 ` Roedel, Joerg 0 siblings, 1 reply; 27+ messages in thread From: Avi Kivity @ 2011-04-20 9:07 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: mtosatti, kvm, yoshikawa.takuya, Roedel, Joerg On 04/18/2011 09:34 PM, Takuya Yoshikawa wrote: > From: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > > This will be optimized later. > > Signed-off-by: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > --- > arch/x86/kvm/paging_tmpl.h | 12 +++++++++--- > 1 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h > index 74f8567..109939a 100644 > --- a/arch/x86/kvm/paging_tmpl.h > +++ b/arch/x86/kvm/paging_tmpl.h > @@ -109,6 +109,14 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte) > return access; > } > > +static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > + gfn_t table_gfn, int offset, pt_element_t *ptep) > +{ > + return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, > + offset, sizeof(*ptep), > + PFERR_USER_MASK | PFERR_WRITE_MASK); > +} > + > /* > * Fetch a guest pte for a guest virtual address > */ > @@ -160,9 +168,7 @@ walk: > walker->table_gfn[walker->level - 1] = table_gfn; > walker->pte_gpa[walker->level - 1] = pte_gpa; > > - if (kvm_read_guest_page_mmu(vcpu, mmu, table_gfn,&pte, > - offset, sizeof(pte), > - PFERR_USER_MASK|PFERR_WRITE_MASK)) { > + if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, offset,&pte)) { > present = false; > break; > } I think it's better to avoid a separate function for this. The reason is I'd like to use ptep_user for cmpxchg_gpte() later on in walk_addr_generic(), so we use the same calculation for both read and write. So please just inline the new code in walk_addr_generic(). In fact there's probably a bug there for nested npt - we use gfn_to_page(table_gfn), but table_gfn is actually an ngfn, not a gfn. Joerg, am I right here? -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
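For reference, the shape Avi is asking for might look roughly like this: compute the user pointer to the gpte once in the walker and reuse exactly that pointer for both the read and the later accessed-bit cmpxchg. This is only a simplified userspace model of the idea; pt_element_t, the direct dereference, and the GCC atomic builtin are stand-ins, not the real kernel code, which at this point still kmaps the guest page for the cmpxchg:

#include <stdint.h>
#include <stdio.h>

typedef uint64_t pt_element_t;

#define PT_ACCESSED_MASK (1ULL << 5)	/* accessed bit, as on x86 */

static int walk_one_level(pt_element_t *ptep_user, pt_element_t *pte_out)
{
	pt_element_t pte, old;

	pte = *ptep_user;		/* models get_user(pte, ptep_user) */

	if (!(pte & PT_ACCESSED_MASK)) {
		/* The very same pointer is reused for the compare-exchange,
		 * so read and update are guaranteed to hit one address. */
		old = __sync_val_compare_and_swap(ptep_user, pte,
						  pte | PT_ACCESSED_MASK);
		if (old != pte)
			return 1;	/* lost a race: caller re-walks */
		pte |= PT_ACCESSED_MASK;
	}

	*pte_out = pte;
	return 0;
}

int main(void)
{
	pt_element_t table[512] = { [7] = 0x1000 | 1 };	/* present gpte */
	pt_element_t pte;

	if (!walk_one_level(&table[7], &pte))
		printf("pte=%#llx, accessed bit set: %d\n",
		       (unsigned long long)pte,
		       !!(table[7] & PT_ACCESSED_MASK));
	return 0;
}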
* Re: [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-20 9:07 ` Avi Kivity @ 2011-04-20 9:35 ` Roedel, Joerg 2011-04-20 10:05 ` Avi Kivity 0 siblings, 1 reply; 27+ messages in thread From: Roedel, Joerg @ 2011-04-20 9:35 UTC (permalink / raw) To: Avi Kivity Cc: Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp On Wed, Apr 20, 2011 at 05:07:12AM -0400, Avi Kivity wrote: > On 04/18/2011 09:34 PM, Takuya Yoshikawa wrote: > > From: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > > > > This will be optimized later. > > > > Signed-off-by: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > > --- > > arch/x86/kvm/paging_tmpl.h | 12 +++++++++--- > > 1 files changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h > > index 74f8567..109939a 100644 > > --- a/arch/x86/kvm/paging_tmpl.h > > +++ b/arch/x86/kvm/paging_tmpl.h > > @@ -109,6 +109,14 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte) > > return access; > > } > > > > +static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > > + gfn_t table_gfn, int offset, pt_element_t *ptep) > > +{ > > + return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, > > + offset, sizeof(*ptep), > > + PFERR_USER_MASK | PFERR_WRITE_MASK); > > +} > > + > > /* > > * Fetch a guest pte for a guest virtual address > > */ > > @@ -160,9 +168,7 @@ walk: > > walker->table_gfn[walker->level - 1] = table_gfn; > > walker->pte_gpa[walker->level - 1] = pte_gpa; > > > > - if (kvm_read_guest_page_mmu(vcpu, mmu, table_gfn,&pte, > > - offset, sizeof(pte), > > - PFERR_USER_MASK|PFERR_WRITE_MASK)) { > > + if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, offset,&pte)) { > > present = false; > > break; > > } > > > I think it's better to avoid a separate function for this. The reason > is I'd like to use ptep_user for cmpxchg_gpte() later on in > walk_addr_generic(), so we use the same calculation for both read and > write. So please just inline the new code in walk_addr_generic(). > > In fact there's probably a bug there for nested npt - we use > gfn_to_page(table_gfn), but table_gfn is actually an ngfn, not a gfn. > Joerg, am I right here? This patch seems only to introduce another wrapper around kvm_read_guest_page_mmu(), so I don't see a problem in this patch. The kvm_read_guest_page_mmu takes care whether it gets a l1-gfn or l2-gfn (by calling mmu->translate_gpa). Regards, Joerg -- AMD Operating System Research Center Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach General Managers: Alberto Bozzo, Andrew Bowd Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632 ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-20 9:35 ` Roedel, Joerg @ 2011-04-20 10:05 ` Avi Kivity 2011-04-20 11:06 ` Roedel, Joerg 0 siblings, 1 reply; 27+ messages in thread From: Avi Kivity @ 2011-04-20 10:05 UTC (permalink / raw) To: Roedel, Joerg Cc: Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp On 04/20/2011 12:35 PM, Roedel, Joerg wrote: > On Wed, Apr 20, 2011 at 05:07:12AM -0400, Avi Kivity wrote: > > On 04/18/2011 09:34 PM, Takuya Yoshikawa wrote: > > > From: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > > > > > > This will be optimized later. > > > > > > Signed-off-by: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > > > --- > > > arch/x86/kvm/paging_tmpl.h | 12 +++++++++--- > > > 1 files changed, 9 insertions(+), 3 deletions(-) > > > > > > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h > > > index 74f8567..109939a 100644 > > > --- a/arch/x86/kvm/paging_tmpl.h > > > +++ b/arch/x86/kvm/paging_tmpl.h > > > @@ -109,6 +109,14 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte) > > > return access; > > > } > > > > > > +static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > > > + gfn_t table_gfn, int offset, pt_element_t *ptep) > > > +{ > > > + return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, > > > + offset, sizeof(*ptep), > > > + PFERR_USER_MASK | PFERR_WRITE_MASK); > > > +} > > > + > > > /* > > > * Fetch a guest pte for a guest virtual address > > > */ > > > @@ -160,9 +168,7 @@ walk: > > > walker->table_gfn[walker->level - 1] = table_gfn; > > > walker->pte_gpa[walker->level - 1] = pte_gpa; > > > > > > - if (kvm_read_guest_page_mmu(vcpu, mmu, table_gfn,&pte, > > > - offset, sizeof(pte), > > > - PFERR_USER_MASK|PFERR_WRITE_MASK)) { > > > + if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, offset,&pte)) { > > > present = false; > > > break; > > > } > > > > > > I think it's better to avoid a separate function for this. The reason > > is I'd like to use ptep_user for cmpxchg_gpte() later on in > > walk_addr_generic(), so we use the same calculation for both read and > > write. So please just inline the new code in walk_addr_generic(). > > > > In fact there's probably a bug there for nested npt - we use > > gfn_to_page(table_gfn), but table_gfn is actually an ngfn, not a gfn. > > Joerg, am I right here? > > This patch seems only to introduce another wrapper around > kvm_read_guest_page_mmu(), so I don't see a problem in this patch. By patch 3, ptep_user will be computed in this function and no longer available for setting the accessed bit later on. > The kvm_read_guest_page_mmu takes care whether it gets a l1-gfn or > l2-gfn (by calling mmu->translate_gpa). But cmpxchg_gpte() does not. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-20 10:05 ` Avi Kivity @ 2011-04-20 11:06 ` Roedel, Joerg 2011-04-20 11:18 ` Avi Kivity 0 siblings, 1 reply; 27+ messages in thread From: Roedel, Joerg @ 2011-04-20 11:06 UTC (permalink / raw) To: Avi Kivity Cc: Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp On Wed, Apr 20, 2011 at 06:05:08AM -0400, Avi Kivity wrote: > On 04/20/2011 12:35 PM, Roedel, Joerg wrote: > > This patch seems only to introduce another wrapper around > > kvm_read_guest_page_mmu(), so I don't see a problem in this patch. > > By patch 3, ptep_user will be computed in this function and no longer > available for setting the accessed bit later on. > > > The kvm_read_guest_page_mmu takes care whether it gets a l1-gfn or > > l2-gfn (by calling mmu->translate_gpa). > > But cmpxchg_gpte() does not. You are right, cmpxchg_gpte needs to handle this too. But the bug was not > introduced by this patch set; it was there before. The cmpxchg_gpte function treats all table_gfns as l1-gfns. I'll send a fix soon. Regards, Joerg -- AMD Operating System Research Center Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach General Managers: Alberto Bozzo, Andrew Bowd Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632 ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-20 11:06 ` Roedel, Joerg @ 2011-04-20 11:18 ` Avi Kivity 2011-04-20 13:33 ` [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too Roedel, Joerg 2011-04-21 1:07 ` [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte Takuya Yoshikawa 0 siblings, 2 replies; 27+ messages in thread From: Avi Kivity @ 2011-04-20 11:18 UTC (permalink / raw) To: Roedel, Joerg Cc: Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp On 04/20/2011 02:06 PM, Roedel, Joerg wrote: > On Wed, Apr 20, 2011 at 06:05:08AM -0400, Avi Kivity wrote: > > On 04/20/2011 12:35 PM, Roedel, Joerg wrote: > > > > This patch seems only to introduce another wrapper around > > > kvm_read_guest_page_mmu(), so I don't see a problem in this patch. > > > > By patch 3, ptep_user will be computed in this function and no longer > > available for setting the accessed bit later on. > > > > > The kvm_read_guest_page_mmu takes care whether it gets a l1-gfn or > > > l2-gfn (by calling mmu->translate_gpa). > > > > But cmpxchg_gpte() does not. > > You are right, cmpxchg_gpte needs to handle this too. But the bug was not > introduced by this patch set; it was there before. Correct. The reason I don't want the helper is so we can use ptep_user in both places (not for efficiency, just to make sure it's exactly the same value). > The cmpxchg_gpte function treats all table_gfns as l1-gfns. I'll send a > fix soon. Thanks. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too 2011-04-20 11:18 ` Avi Kivity @ 2011-04-20 13:33 ` Roedel, Joerg 2011-04-21 1:02 ` Takuya Yoshikawa 0 siblings, 1 reply; 27+ messages in thread From: Roedel, Joerg @ 2011-04-20 13:33 UTC (permalink / raw) To: Avi Kivity Cc: Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp On Wed, Apr 20, 2011 at 07:18:12AM -0400, Avi Kivity wrote: > On 04/20/2011 02:06 PM, Roedel, Joerg wrote: > > The cmpxchg_gpte function treats all table_gfns as l1-gfns. I'll send a > > fix soon. > > Thanks. Here is a fix for review. I am out of office starting in about one hour, until next Tuesday, so the corrections will most likely not happen before then :) The patch is tested with npt and shadow paging as well as with npt-on-npt (64 bit with kvm). Regards, Joerg >From 6b1dcd9f17bbd482061180001d1f45c3adcef430 Mon Sep 17 00:00:00 2001 From: Joerg Roedel <joerg.roedel@amd.com> Date: Wed, 20 Apr 2011 15:22:21 +0200 Subject: [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too This patch makes the cmpxchg_gpte() function aware of the difference between l1-gfns and l2-gfns when nested virtualization is in use. This fixes a potential data-corruption problem in the l1-guest and makes the code work correct (at least as correct as the hardware which is emulated in this code) again. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> --- arch/x86/kvm/paging_tmpl.h | 30 +++++++++++++++++++++++------- 1 files changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index 74f8567..e442bf4 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -78,15 +78,21 @@ static gfn_t gpte_to_gfn_lvl(pt_element_t gpte, int lvl) return (gpte & PT_LVL_ADDR_MASK(lvl)) >> PAGE_SHIFT; } -static bool FNAME(cmpxchg_gpte)(struct kvm *kvm, +static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, gfn_t table_gfn, unsigned index, pt_element_t orig_pte, pt_element_t new_pte) { pt_element_t ret; pt_element_t *table; struct page *page; + gpa_t gpa; - page = gfn_to_page(kvm, table_gfn); + gpa = mmu->translate_gpa(vcpu, table_gfn << PAGE_SHIFT, + PFERR_USER_MASK|PFERR_WRITE_MASK); + if (gpa == UNMAPPED_GVA) + return -EFAULT; + + page = gfn_to_page(vcpu->kvm, gpa_to_gfn(gpa)); table = kmap_atomic(page, KM_USER0); ret = CMPXCHG(&table[index], orig_pte, new_pte); @@ -192,11 +198,17 @@ walk: #endif if (!eperm && !rsvd_fault && !(pte & PT_ACCESSED_MASK)) { + int ret; trace_kvm_mmu_set_accessed_bit(table_gfn, index, sizeof(pte)); - if (FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, - index, pte, pte|PT_ACCESSED_MASK)) + ret = FNAME(cmpxchg_gpte)(vcpu, mmu, table_gfn, + index, pte, pte|PT_ACCESSED_MASK); + if (ret < 0) { + present = false; + break; + } else if (ret) goto walk; + mark_page_dirty(vcpu->kvm, table_gfn); pte |= PT_ACCESSED_MASK; } @@ -245,13 +257,17 @@ walk: goto error; if (write_fault && !is_dirty_gpte(pte)) { - bool ret; + int ret; trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte)); - ret = FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, index, pte, + ret = FNAME(cmpxchg_gpte)(vcpu, mmu, table_gfn, index, pte, pte|PT_DIRTY_MASK); - if (ret) + if (ret < 0) { + present = false; + goto error; + } if (ret) goto walk; + mark_page_dirty(vcpu->kvm, table_gfn); pte |= PT_DIRTY_MASK; walker->ptes[walker->level - 1] = pte; -- 1.7.1 -- AMD Operating System 
Research Center Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach General Managers: Alberto Bozzo, Andrew Bowd Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too 2011-04-20 13:33 ` [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too Roedel, Joerg @ 2011-04-21 1:02 ` Takuya Yoshikawa 2011-04-21 8:11 ` Avi Kivity 0 siblings, 1 reply; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-21 1:02 UTC (permalink / raw) To: Roedel, Joerg Cc: Avi Kivity, Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org On Wed, 20 Apr 2011 15:33:16 +0200 "Roedel, Joerg" <Joerg.Roedel@amd.com> wrote: > @@ -245,13 +257,17 @@ walk: > goto error; > > if (write_fault && !is_dirty_gpte(pte)) { > - bool ret; > + int ret; > > trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte)); > - ret = FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, index, pte, > + ret = FNAME(cmpxchg_gpte)(vcpu, mmu, table_gfn, index, pte, > pte|PT_DIRTY_MASK); > - if (ret) > + if (ret < 0) { > + present = false; > + goto error; > + } if (ret) > goto walk; Preferably else if or another line ? :) Takuya > + > mark_page_dirty(vcpu->kvm, table_gfn); > pte |= PT_DIRTY_MASK; > walker->ptes[walker->level - 1] = pte; > -- > 1.7.1 > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too 2011-04-21 1:02 ` Takuya Yoshikawa @ 2011-04-21 8:11 ` Avi Kivity 0 siblings, 0 replies; 27+ messages in thread From: Avi Kivity @ 2011-04-21 8:11 UTC (permalink / raw) To: Takuya Yoshikawa Cc: Roedel, Joerg, Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org On 04/21/2011 04:02 AM, Takuya Yoshikawa wrote: > On Wed, 20 Apr 2011 15:33:16 +0200 > "Roedel, Joerg"<Joerg.Roedel@amd.com> wrote: > > > @@ -245,13 +257,17 @@ walk: > > goto error; > > > > if (write_fault&& !is_dirty_gpte(pte)) { > > - bool ret; > > + int ret; > > > > trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte)); > > - ret = FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, index, pte, > > + ret = FNAME(cmpxchg_gpte)(vcpu, mmu, table_gfn, index, pte, > > pte|PT_DIRTY_MASK); > > - if (ret) > > + if (ret< 0) { > > + present = false; > > + goto error; > > + } if (ret) > > goto walk; > > Preferably else if or another line ? :) Yes, I added an 'else' and applied. Thanks. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte 2011-04-20 11:18 ` Avi Kivity 2011-04-20 13:33 ` [PATCH] KVM: MMU: Make cmpxchg_gpte aware of nesting too Roedel, Joerg @ 2011-04-21 1:07 ` Takuya Yoshikawa 1 sibling, 0 replies; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-21 1:07 UTC (permalink / raw) To: Avi Kivity Cc: Roedel, Joerg, Takuya Yoshikawa, mtosatti@redhat.com, kvm@vger.kernel.org On Wed, 20 Apr 2011 14:18:12 +0300 Avi Kivity <avi@redhat.com> wrote: > Correct. The reason I don't want the helper is so we can use ptep_user > in both places (not for efficiency, just to make sure it's exactly the > same value). > Thank you for your explanation; now I've got the picture! I will send a new patch taking into account your advice. Takuya > > The cmpxchg_gpte function treats all table_gfns as l1-gfns. I'll send a > fix soon. > ^ permalink raw reply [flat|nested] 27+ messages in thread
* [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-18 18:32 [PATCH 1/3] KVM: Introduce a helper to check if gfn is in memslot Takuya Yoshikawa 2011-04-18 18:34 ` [PATCH 2/3] KVM: MMU: Introduce a helper to read guest pte Takuya Yoshikawa @ 2011-04-18 18:38 ` Takuya Yoshikawa 2011-04-18 18:52 ` Joerg Roedel ` (2 more replies) 1 sibling, 3 replies; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-18 18:38 UTC (permalink / raw) To: avi, mtosatti; +Cc: kvm, yoshikawa.takuya From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> We optimize multi level guest page table walk as follows: 1. We cache the memslot which, probably, includes the next guest page tables to avoid searching for it many times. 2. We use get_user() instead of copy_from_user(). Note that this is kind of a restricted way of Xiao's more generic work: "KVM: optimize memslots searching and cache GPN to GFN." With this patch applied, paging64_walk_addr_generic() has improved as the following tracing results show. Before: 3.169 us | paging64_walk_addr_generic(); 1.880 us | paging64_walk_addr_generic(); 1.243 us | paging64_walk_addr_generic(); 1.517 us | paging64_walk_addr_generic(); 3.009 us | paging64_walk_addr_generic(); 1.814 us | paging64_walk_addr_generic(); 1.340 us | paging64_walk_addr_generic(); 1.659 us | paging64_walk_addr_generic(); 1.748 us | paging64_walk_addr_generic(); 1.488 us | paging64_walk_addr_generic(); After: 1.714 us | paging64_walk_addr_generic(); 0.806 us | paging64_walk_addr_generic(); 0.664 us | paging64_walk_addr_generic(); 0.619 us | paging64_walk_addr_generic(); 0.645 us | paging64_walk_addr_generic(); 0.605 us | paging64_walk_addr_generic(); 1.388 us | paging64_walk_addr_generic(); 0.753 us | paging64_walk_addr_generic(); 0.594 us | paging64_walk_addr_generic(); 0.833 us | paging64_walk_addr_generic(); Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> --- arch/x86/kvm/paging_tmpl.h | 37 ++++++++++++++++++++++++++++++++----- 1 files changed, 32 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index 109939a..614aa3f 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -109,12 +109,37 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte) return access; } +/* + * Read the guest PTE refered to by table_gfn and offset and put it into ptep. + * + * *slot_hint, if not NULL, should point to a memslot which probably includes + * the guest PTE. The actual memslot will be put back into this so that + * callers can cache it. 
+ */ static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, - gfn_t table_gfn, int offset, pt_element_t *ptep) + gfn_t table_gfn, int offset, pt_element_t *ptep, + struct kvm_memory_slot **slot_hint) { - return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, - offset, sizeof(*ptep), - PFERR_USER_MASK | PFERR_WRITE_MASK); + unsigned long addr; + pt_element_t __user *ptep_user; + gfn_t real_gfn; + + real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn), + PFERR_USER_MASK | PFERR_WRITE_MASK); + if (real_gfn == UNMAPPED_GVA) + return -EFAULT; + + real_gfn = gpa_to_gfn(real_gfn); + + if (!(*slot_hint) || !gfn_in_memslot(*slot_hint, real_gfn)) + *slot_hint = gfn_to_memslot(vcpu->kvm, real_gfn); + + addr = gfn_to_hva_memslot(*slot_hint, real_gfn); + if (kvm_is_error_hva(addr)) + return -EFAULT; + + ptep_user = (pt_element_t __user *)((void *)addr + offset); + return get_user(*ptep, ptep_user); } /* @@ -130,6 +155,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, gpa_t pte_gpa; bool eperm, present, rsvd_fault; int offset, write_fault, user_fault, fetch_fault; + struct kvm_memory_slot *slot_cache = NULL; write_fault = access & PFERR_WRITE_MASK; user_fault = access & PFERR_USER_MASK; @@ -168,7 +194,8 @@ walk: walker->table_gfn[walker->level - 1] = table_gfn; walker->pte_gpa[walker->level - 1] = pte_gpa; - if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, offset, &pte)) { + if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, + offset, &pte, &slot_cache)) { present = false; break; } -- 1.7.1 ^ permalink raw reply related [flat|nested] 27+ messages in thread
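The slot_hint parameter is essentially a last-hit cache: page-table pages overwhelmingly sit in one big memslot, so remembering the previous answer makes most lookups hit without searching. A generic userspace sketch of the pattern, with mock types and a plain linear search standing in for gfn_to_memslot():

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned long long gfn_t;

struct memslot { gfn_t base_gfn; unsigned long npages; };

static bool gfn_in_slot(const struct memslot *s, gfn_t gfn)
{
	return gfn >= s->base_gfn && gfn < s->base_gfn + s->npages;
}

/* Full lookup, standing in for gfn_to_memslot(). */
static struct memslot *slot_lookup(struct memslot *slots, size_t n, gfn_t gfn)
{
	for (size_t i = 0; i < n; i++)
		if (gfn_in_slot(&slots[i], gfn))
			return &slots[i];
	return NULL;
}

/*
 * Last-hit cache: try *hint first, fall back to the full search and
 * refresh the hint on a miss. This mirrors what the patch does with
 * slot_hint across the levels of a single page-table walk.
 */
static struct memslot *slot_lookup_cached(struct memslot *slots, size_t n,
					  gfn_t gfn, struct memslot **hint)
{
	if (!*hint || !gfn_in_slot(*hint, gfn))
		*hint = slot_lookup(slots, n, gfn);
	return *hint;
}

int main(void)
{
	struct memslot slots[] = {
		{ 0x0, 0xa0 },
		{ 0x100, 0x7ff00 },	/* the big RAM slot */
	};
	struct memslot *hint = NULL;

	/* A walk touches several gptes in the same slot: only the first
	 * lookup searches, the rest hit the cached hint. */
	for (gfn_t gfn = 0x200; gfn < 0x204; gfn++)
		slot_lookup_cached(slots, 2, gfn, &hint);
	printf("hint base_gfn=%#llx\n", hint ? hint->base_gfn : 0);
	return 0;
}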
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-18 18:38 ` [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk Takuya Yoshikawa @ 2011-04-18 18:52 ` Joerg Roedel 2011-04-19 1:24 ` Takuya Yoshikawa 2011-04-19 1:42 ` Xiao Guangrong 2011-04-20 9:02 ` Avi Kivity 2 siblings, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2011-04-18 18:52 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: avi, mtosatti, kvm, yoshikawa.takuya On Tue, Apr 19, 2011 at 03:38:14AM +0900, Takuya Yoshikawa wrote: > From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> > > We optimize multi level guest page table walk as follows: > > 1. We cache the memslot which, probably, includes the next guest page > tables to avoid searching for it many times. > 2. We use get_user() instead of copy_from_user(). > > Note that this is kind of a restricted way of Xiao's more generic > work: "KVM: optimize memslots searching and cache GPN to GFN." > > With this patch applied, paging64_walk_addr_generic() has improved > as the following tracing results show. > > Before: > 3.169 us | paging64_walk_addr_generic(); > 1.880 us | paging64_walk_addr_generic(); > 1.243 us | paging64_walk_addr_generic(); > 1.517 us | paging64_walk_addr_generic(); > 3.009 us | paging64_walk_addr_generic(); > 1.814 us | paging64_walk_addr_generic(); > 1.340 us | paging64_walk_addr_generic(); > 1.659 us | paging64_walk_addr_generic(); > 1.748 us | paging64_walk_addr_generic(); > 1.488 us | paging64_walk_addr_generic(); > > After: > 1.714 us | paging64_walk_addr_generic(); > 0.806 us | paging64_walk_addr_generic(); > 0.664 us | paging64_walk_addr_generic(); > 0.619 us | paging64_walk_addr_generic(); > 0.645 us | paging64_walk_addr_generic(); > 0.605 us | paging64_walk_addr_generic(); > 1.388 us | paging64_walk_addr_generic(); > 0.753 us | paging64_walk_addr_generic(); > 0.594 us | paging64_walk_addr_generic(); > 0.833 us | paging64_walk_addr_generic(); Nice optimization! What scenarios have you used to test it? Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-18 18:52 ` Joerg Roedel @ 2011-04-19 1:24 ` Takuya Yoshikawa 2011-04-19 6:20 ` Joerg Roedel 0 siblings, 1 reply; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-19 1:24 UTC (permalink / raw) To: Joerg Roedel; +Cc: Takuya Yoshikawa, avi, mtosatti, kvm Joerg Roedel <joro@8bytes.org> wrote: > Nice optimization! What scenarios have you used to test it? I used my desktop Phenom II box, running the latest qemu-kvm, so NPT was probably ON by default. The guest was running a .ogg movie during that test. I am not an MMU expert, so I would be glad to know what scenarios should be tested for this patch! Can I test nested SVM easily, for example? Thanks, Takuya ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-19 1:24 ` Takuya Yoshikawa @ 2011-04-19 6:20 ` Joerg Roedel 0 siblings, 0 replies; 27+ messages in thread From: Joerg Roedel @ 2011-04-19 6:20 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: Takuya Yoshikawa, avi, mtosatti, kvm On Tue, Apr 19, 2011 at 10:24:10AM +0900, Takuya Yoshikawa wrote: > Can I test nested SVM easily, for example? Yes, nested SVM would be good to test too with those changes. We should make sure not to break anything there. Regards, Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-18 18:38 ` [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk Takuya Yoshikawa 2011-04-18 18:52 ` Joerg Roedel @ 2011-04-19 1:42 ` Xiao Guangrong 2011-04-19 3:47 ` Takuya Yoshikawa 2011-04-20 9:02 ` Avi Kivity 2 siblings, 1 reply; 27+ messages in thread From: Xiao Guangrong @ 2011-04-19 1:42 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: avi, mtosatti, kvm, yoshikawa.takuya On 04/19/2011 02:38 AM, Takuya Yoshikawa wrote: > From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> > > We optimize multi level guest page table walk as follows: > > 1. We cache the memslot which, probably, includes the next guest page > tables to avoid searching for it many times. Yeah, the hit is very high, after optimizing the algorithm of memslots (http://lwn.net/Articles/429308/), maybe the advantage is not so significant, could you apply this patchset and test again please? > 2. We use get_user() instead of copy_from_user(). > > Note that this is kind of a restricted way of Xiao's more generic > work: "KVM: optimize memslots searching and cache GPN to GFN." > > With this patch applied, paging64_walk_addr_generic() has improved > as the following tracing results show. > > Before: > 3.169 us | paging64_walk_addr_generic(); > 1.880 us | paging64_walk_addr_generic(); > 1.243 us | paging64_walk_addr_generic(); > 1.517 us | paging64_walk_addr_generic(); > 3.009 us | paging64_walk_addr_generic(); > 1.814 us | paging64_walk_addr_generic(); > 1.340 us | paging64_walk_addr_generic(); > 1.659 us | paging64_walk_addr_generic(); > 1.748 us | paging64_walk_addr_generic(); > 1.488 us | paging64_walk_addr_generic(); > > After: > 1.714 us | paging64_walk_addr_generic(); > 0.806 us | paging64_walk_addr_generic(); > 0.664 us | paging64_walk_addr_generic(); > 0.619 us | paging64_walk_addr_generic(); > 0.645 us | paging64_walk_addr_generic(); > 0.605 us | paging64_walk_addr_generic(); > 1.388 us | paging64_walk_addr_generic(); > 0.753 us | paging64_walk_addr_generic(); > 0.594 us | paging64_walk_addr_generic(); > 0.833 us | paging64_walk_addr_generic(); > > Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> > --- > arch/x86/kvm/paging_tmpl.h | 37 ++++++++++++++++++++++++++++++++----- > 1 files changed, 32 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h > index 109939a..614aa3f 100644 > --- a/arch/x86/kvm/paging_tmpl.h > +++ b/arch/x86/kvm/paging_tmpl.h > @@ -109,12 +109,37 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte) > return access; > } > > +/* > + * Read the guest PTE refered to by table_gfn and offset and put it into ptep. > + * > + * *slot_hint, if not NULL, should point to a memslot which probably includes > + * the guest PTE. The actual memslot will be put back into this so that > + * callers can cache it. 
> + */ > static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > - gfn_t table_gfn, int offset, pt_element_t *ptep) > + gfn_t table_gfn, int offset, pt_element_t *ptep, > + struct kvm_memory_slot **slot_hint) > { > - return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, > - offset, sizeof(*ptep), > - PFERR_USER_MASK | PFERR_WRITE_MASK); > + unsigned long addr; > + pt_element_t __user *ptep_user; > + gfn_t real_gfn; > + > + real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn), > + PFERR_USER_MASK | PFERR_WRITE_MASK); > + if (real_gfn == UNMAPPED_GVA) > + return -EFAULT; > + > + real_gfn = gpa_to_gfn(real_gfn); > + > + if (!(*slot_hint) || !gfn_in_memslot(*slot_hint, real_gfn)) > + *slot_hint = gfn_to_memslot(vcpu->kvm, real_gfn); > + You forgot to check the result. (if *slot_hint == NULL)? ... ;-) ^ permalink raw reply [flat|nested] 27+ messages in thread
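For illustration, the missing check Xiao points out would presumably be a small fixup on top of patch 3, along these lines (a hypothetical hunk for discussion, not a posted patch):

 	if (!(*slot_hint) || !gfn_in_memslot(*slot_hint, real_gfn))
 		*slot_hint = gfn_to_memslot(vcpu->kvm, real_gfn);
 
+	/* gfn_to_memslot() returns NULL when no slot covers real_gfn. */
+	if (!(*slot_hint))
+		return -EFAULT;
+
 	addr = gfn_to_hva_memslot(*slot_hint, real_gfn);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;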
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-19 1:42 ` Xiao Guangrong @ 2011-04-19 3:47 ` Takuya Yoshikawa 2011-04-20 9:09 ` Avi Kivity 0 siblings, 1 reply; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-19 3:47 UTC (permalink / raw) To: Xiao Guangrong; +Cc: Takuya Yoshikawa, avi, mtosatti, kvm Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote: > > We optimize multi level guest page table walk as follows: > > > > 1. We cache the memslot which, probably, includes the next guest page > > tables to avoid searching for it many times. > > Yeah, the hit is very high, after optimizing the algorithm of memslots > (http://lwn.net/Articles/429308/), maybe the advantage is not so significant, > could you apply this patchset and test again please? Any sorting strategy, including tree-based ones, has tradeoffs. Compared to that, what I wanted to do here was to improve the table walk locally without sacrificing other things. Of course, my strategy depends on the assumption that the page tables will be in the same slot with very high probability. So if a certain algorithm seems to be adopted, yes, I will test based on that. IIRC, no practically good algorithm has been found yet, right? > > > 2. We use get_user() instead of copy_from_user(). > > > > Note that this is kind of a restricted way of Xiao's more generic > > work: "KVM: optimize memslots searching and cache GPN to GFN." > > > > With this patch applied, paging64_walk_addr_generic() has improved > > as the following tracing results show. > > > > + > > + if (!(*slot_hint) || !gfn_in_memslot(*slot_hint, real_gfn)) > > + *slot_hint = gfn_to_memslot(vcpu->kvm, real_gfn); > > + > > You forgot to check the result. (if *slot_hint == NULL)? ... ;-) Thank you! I will check later. Takuya ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-19 3:47 ` Takuya Yoshikawa @ 2011-04-20 9:09 ` Avi Kivity 0 siblings, 0 replies; 27+ messages in thread From: Avi Kivity @ 2011-04-20 9:09 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: Xiao Guangrong, Takuya Yoshikawa, mtosatti, kvm On 04/19/2011 06:47 AM, Takuya Yoshikawa wrote: > So if a certain algorithm seems to be adopted, yes, I will test based > on that. IIRC, no practically good algorithm has been found yet, > right? I think a simple sort based on size will provide the same optimization (just the cache, not get_user()) without any downsides. Most memory in a guest is usually in just one or two slots; that's the reason for the high hit rate. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
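Sorting by size works because a guest's memory is normally dominated by one or two huge slots; putting those first means the plain linear search in __gfn_to_memslot() hits on the first probe almost every time. A userspace sketch of the idea, with assumed mock types and qsort standing in for whatever the kernel would run once per memslot update:

#include <stdlib.h>

typedef unsigned long long gfn_t;

struct memslot { gfn_t base_gfn; unsigned long npages; };

/* Descending by npages: the biggest slots get probed first. */
static int slot_cmp(const void *a, const void *b)
{
	const struct memslot *x = a, *y = b;

	if (x->npages != y->npages)
		return x->npages < y->npages ? 1 : -1;
	return 0;
}

/* Run once when the memslot array changes, not on every lookup. */
static void sort_memslots(struct memslot *slots, size_t n)
{
	qsort(slots, n, sizeof(*slots), slot_cmp);
}

int main(void)
{
	struct memslot slots[] = {
		{ 0xc0000, 0x40 }, { 0x100, 0x7ff00 }, { 0x0, 0xa0 },
	};

	sort_memslots(slots, 3);
	/* slots[0] is now the big RAM slot, so an unmodified linear
	 * search finds most gfns on its first iteration. */
	return 0;
}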
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-18 18:38 ` [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk Takuya Yoshikawa 2011-04-18 18:52 ` Joerg Roedel 2011-04-19 1:42 ` Xiao Guangrong @ 2011-04-20 9:02 ` Avi Kivity 2011-04-29 2:46 ` Andi Kleen 2 siblings, 1 reply; 27+ messages in thread From: Avi Kivity @ 2011-04-20 9:02 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: mtosatti, kvm, yoshikawa.takuya On 04/18/2011 09:38 PM, Takuya Yoshikawa wrote: > From: Takuya Yoshikawa<yoshikawa.takuya@oss.ntt.co.jp> > > We optimize multi level guest page table walk as follows: > > 1. We cache the memslot which, probably, includes the next guest page > tables to avoid searching for it many times. > 2. We use get_user() instead of copy_from_user(). > > Note that this is kind of a restricted way of Xiao's more generic > work: "KVM: optimize memslots searching and cache GPN to GFN." Good optimization. copy_from_user() really isn't optimized for short buffers, I expect much of the improvement comes from that. > +/* > + * Read the guest PTE refered to by table_gfn and offset and put it into ptep. > + * > + * *slot_hint, if not NULL, should point to a memslot which probably includes > + * the guest PTE. The actual memslot will be put back into this so that > + * callers can cache it. > + */ Please drop the slot_hint optimization. First, it belongs in a separate patch. Second, I prefer to see a generic slot sort instead of an ad-hoc cache. > static int FNAME(read_guest_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > - gfn_t table_gfn, int offset, pt_element_t *ptep) > + gfn_t table_gfn, int offset, pt_element_t *ptep, > + struct kvm_memory_slot **slot_hint) > { > - return kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, ptep, > - offset, sizeof(*ptep), > - PFERR_USER_MASK | PFERR_WRITE_MASK); > + unsigned long addr; > + pt_element_t __user *ptep_user; > + gfn_t real_gfn; > + > + real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn), > + PFERR_USER_MASK | PFERR_WRITE_MASK); > + if (real_gfn == UNMAPPED_GVA) > + return -EFAULT; > + > + real_gfn = gpa_to_gfn(real_gfn); > + > + if (!(*slot_hint) || !gfn_in_memslot(*slot_hint, real_gfn)) > + *slot_hint = gfn_to_memslot(vcpu->kvm, real_gfn); > + > + addr = gfn_to_hva_memslot(*slot_hint, real_gfn); > + if (kvm_is_error_hva(addr)) > + return -EFAULT; > + > + ptep_user = (pt_element_t __user *)((void *)addr + offset); > + return get_user(*ptep, ptep_user); > } > > /* > @@ -130,6 +155,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, > gpa_t pte_gpa; > bool eperm, present, rsvd_fault; > int offset, write_fault, user_fault, fetch_fault; > + struct kvm_memory_slot *slot_cache = NULL; > > write_fault = access& PFERR_WRITE_MASK; > user_fault = access& PFERR_USER_MASK; > @@ -168,7 +194,8 @@ walk: > walker->table_gfn[walker->level - 1] = table_gfn; > walker->pte_gpa[walker->level - 1] = pte_gpa; > > - if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, offset,&pte)) { > + if (FNAME(read_guest_pte)(vcpu, mmu, table_gfn, > + offset,&pte,&slot_cache)) { > present = false; > break; > } -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-20 9:02 ` Avi Kivity @ 2011-04-29 2:46 ` Andi Kleen 2011-04-29 5:38 ` Takuya Yoshikawa 0 siblings, 1 reply; 27+ messages in thread From: Andi Kleen @ 2011-04-29 2:46 UTC (permalink / raw) To: Avi Kivity; +Cc: Takuya Yoshikawa, mtosatti, kvm, yoshikawa.takuya Avi Kivity <avi@redhat.com> writes: > > Good optimization. copy_from_user() really isn't optimized for short > buffers, I expect much of the improvement comes from that. Actually it is equivalent to get_user for the lengths supported by get_user, assuming you pass in a constant length. You probably do not. -Andi -- ak@linux.intel.com -- Speaking for myself only ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-29 2:46 ` Andi Kleen @ 2011-04-29 5:38 ` Takuya Yoshikawa 2011-04-29 6:30 ` Takuya Yoshikawa 2011-04-29 6:59 ` Andi Kleen 0 siblings, 2 replies; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-29 5:38 UTC (permalink / raw) To: Andi Kleen; +Cc: Avi Kivity, mtosatti, kvm, yoshikawa.takuya On Thu, 28 Apr 2011 19:46:00 -0700 Andi Kleen <andi@firstfloor.org> wrote: > Avi Kivity <avi@redhat.com> writes: > > > > Good optimization. copy_from_user() really isn't optimized for short > > buffers, I expect much of the improvement comes from that. > > Actually it is equivalent to get_user for the lengths supported by > get_user, assuming you pass in a constant length. You probably do not. > > -Andi Weird, I actually measured some improvement even after dropping all the parts other than the get_user() usage. The only reason I can guess is the reduction of some function calls by inlining some functions. Takuya ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-29 5:38 ` Takuya Yoshikawa @ 2011-04-29 6:30 ` Takuya Yoshikawa 0 siblings, 0 replies; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-29 6:30 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: Andi Kleen, Avi Kivity, mtosatti, kvm, yoshikawa.takuya On Fri, 29 Apr 2011 14:38:08 +0900 Takuya Yoshikawa <takuya.yoshikawa@gmail.com> wrote: > On Thu, 28 Apr 2011 19:46:00 -0700 > Andi Kleen <andi@firstfloor.org> wrote: > > > Avi Kivity <avi@redhat.com> writes: > > > > > > Good optimization. copy_from_user() really isn't optimized for short > > > buffers, I expect much of the improvement comes from that. > > > > Actually it is equivalent to get_user for the lengths supported by > > get_user, assuming you pass in a constant length. You probably do not. > > > > -Andi > > > Weird, I actually measured some improvement even after dropping all the parts > other than the get_user() usage. > > The only reason I can guess is the reduction of some function calls > by inlining some functions. A bit to clarify: the original path was kvm_read_guest_page_mmu() -> kvm_read_guest_page() -> copy_from_user(). Takuya ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-29 5:38 ` Takuya Yoshikawa 2011-04-29 6:30 ` Takuya Yoshikawa @ 2011-04-29 6:59 ` Andi Kleen 2011-04-29 13:51 ` Takuya Yoshikawa 1 sibling, 1 reply; 27+ messages in thread From: Andi Kleen @ 2011-04-29 6:59 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: Andi Kleen, Avi Kivity, mtosatti, kvm, yoshikawa.takuya > The only reason I can guess is the reduction of some function calls > by inlining some functions. Yes, at one time cfu was inline too and just checked for the right sizes and then used g*u, but that got lost in the "icache over everything else" mania which has unfortunately been en vogue for quite some time in kernel land (aka measuring patches only based on their impact on the .text size, not the actual performance). But you might get better gains by fixing this general c*u() regression. -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-29 6:59 ` Andi Kleen @ 2011-04-29 13:51 ` Takuya Yoshikawa 2011-04-29 16:05 ` Andi Kleen 0 siblings, 1 reply; 27+ messages in thread From: Takuya Yoshikawa @ 2011-04-29 13:51 UTC (permalink / raw) To: Andi Kleen; +Cc: Avi Kivity, mtosatti, kvm, yoshikawa.takuya Andi Kleen <andi@firstfloor.org> wrote: > > The only reason I can guess is the reduction of some function calls > > by inlining some functions. > > Yes, at one time cfu was inline too and just checked for the right > sizes and then used g*u, but that got lost in the "icache over everything else" mania which has unfortunately been en vogue for quite some time in kernel land (aka > measuring patches only based on their impact on the .text > size, not the actual performance). > > But you might get better gains by fixing this general > c*u() regression. > What I mentioned was not only cfu but a chain of 3 function calls. Originally, we were doing the following function calls: walk_addr_generic() ---1 kvm_read_guest_page_mmu() ---2 kvm_read_guest_page() ---3 copy_from_user() ---4 And now, with my patch applied, we are no longer using the generic kvm_read_guest_page_mmu(), and some address calculations are done in walk_addr_generic(): walk_addr_generic() ---1' get_user() ---2' The length is passed in from walk_addr_generic(). Do you think the following case would not differ so much from (1' 2')? walk_addr_generic() ---1'' copy_from_user() ---2'' This would satisfy your "assuming you pass in a constant length" condition and be almost the same as get_user()? Takuya ^ permalink raw reply [flat|nested] 27+ messages in thread
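Concretely, the (1'' 2'') variant would presumably read the gpte straight from the walker with a compile-time-constant length, something like the following fragment (hypothetical, based on the ptep_user computed in patch 3):

	pt_element_t pte;

	/*
	 * sizeof(pte) is a compile-time constant here, so on x86-64 the
	 * copy_from_user() fast path (shown in the next message) should
	 * degenerate to the same code as get_user(pte, ptep_user).
	 */
	if (copy_from_user(&pte, ptep_user, sizeof(pte)))
		return -EFAULT;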
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-29 13:51 ` Takuya Yoshikawa @ 2011-04-29 16:05 ` Andi Kleen 2011-05-01 13:32 ` Avi Kivity 0 siblings, 1 reply; 27+ messages in thread From: Andi Kleen @ 2011-04-29 16:05 UTC (permalink / raw) To: Takuya Yoshikawa; +Cc: Andi Kleen, Avi Kivity, mtosatti, kvm, yoshikawa.takuya > Do you think the following case would not differ so much from (1' 2')? > > walk_addr_generic() ---1'' > copy_from_user() ---2'' Yes, it should be the same, and it is cleaner. If you do a make .../foo.i and look at the code coming out of the preprocessor, you'll see it expands to:

if (!__builtin_constant_p(size))
	return copy_user_generic(dst, (__force void *)src, size);
switch (size) {
case 1:__get_user_asm(*(u8 *)dst, (u8 __user *)src,
	      ret, "b", "b", "=q", 1);
	return ret;
case 2: ..
case 4: ..
case 8: ..
case 10: ..
case 16: ..
}

Ok, it looks like the 32bit kernel only handles 1/2/4. Maybe that was the problem if you ran on 32bit. -Andi ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-04-29 16:05 ` Andi Kleen @ 2011-05-01 13:32 ` Avi Kivity 2011-05-01 20:51 ` Andi Kleen 0 siblings, 1 reply; 27+ messages in thread From: Avi Kivity @ 2011-05-01 13:32 UTC (permalink / raw) To: Andi Kleen; +Cc: Takuya Yoshikawa, mtosatti, kvm, yoshikawa.takuya On 04/29/2011 07:05 PM, Andi Kleen wrote: > > Do you think the following case would not differ so much from (1' 2')? > > > > walk_addr_generic() ---1'' > > copy_from_user() ---2'' > > Yes, it should be the same, and it is cleaner. > > If you do a make .../foo.i and look at the code coming out of the > preprocessor, you'll see it expands to: > > if (!__builtin_constant_p(size)) > return copy_user_generic(dst, (__force void *)src, size); > switch (size) { > case 1:__get_user_asm(*(u8 *)dst, (u8 __user *)src, > ret, "b", "b", "=q", 1); > return ret; > case 2: .. > case 4: .. > case 8: .. > case 10: .. > case 16: .. > } > > Ok, it looks like the 32bit kernel only handles 1/2/4. Maybe that > was the problem if you ran on 32bit. I'm happy with a slower copy_from_user() for that particular case. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFC PATCH 3/3] KVM: MMU: Optimize guest page table walk 2011-05-01 13:32 ` Avi Kivity @ 2011-05-01 20:51 ` Andi Kleen 0 siblings, 0 replies; 27+ messages in thread From: Andi Kleen @ 2011-05-01 20:51 UTC (permalink / raw) To: Avi Kivity; +Cc: Andi Kleen, Takuya Yoshikawa, mtosatti, kvm, yoshikawa.takuya >> Ok, it looks like the 32bit kernel only handles 1/2/4. Maybe that >> was the problem if you ran on 32bit. > > I'm happy with a slower copy_from_user() for that particular case. It wouldn't be hard to fix. -Andi ^ permalink raw reply [flat|nested] 27+ messages in thread