From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
To: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
KVM list <kvm@vger.kernel.org>
Subject: Re: [PATCH 5/6] KVM: MMU: prefetch ptes when intercepted guest #PF
Date: Thu, 17 Jun 2010 15:25:00 +0800 [thread overview]
Message-ID: <4C19CDCC.4060404@cn.fujitsu.com> (raw)
In-Reply-To: <4C1766D6.3040000@redhat.com>
Avi Kivity wrote:
>> + if (*spte != shadow_trap_nonpresent_pte)
>> + continue;
>> +
>> + gfn = sp->gfn + (i << ((sp->role.level - 1) * PT64_LEVEL_BITS));
>>
>
Avi,
Thanks for your comment.
> Can calculate outside the loop and use +=.
>
It's nice, will do it in the next version.
> Can this in fact work for level != PT_PAGE_TABLE_LEVEL? We might start
> at PT_PAGE_DIRECTORY_LEVEL but get 4k pages while iterating.
Ah, I forgot that. We can't assume the host also supports huge pages for the
next gfn; as Marcelo suggested, we should "only map with level > 1 if
the host page matches the size".
Um, the problem is that getting the host page size requires holding 'mm->mmap_sem',
which can't be done in atomic context, and it's also a slow path; we want the
pte prefetch path to be fast.
How about only allowing prefetch for sp.level == 1 for now? I'll improve it in
the future; I think it needs more time :-)
>
>> +
>> + pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn);
>> + if (is_error_pfn(pfn)) {
>> + kvm_release_pfn_clean(pfn);
>> + break;
>> + }
>> + if (pte_prefetch_topup_memory_cache(vcpu))
>> + break;
>> +
>> + mmu_set_spte(vcpu, spte, ACC_ALL, ACC_ALL, 0, 0, 1, NULL,
>> + sp->role.level, gfn, pfn, true, false);
>> + }
>> +}
>>
>
> Nice. Direct prefetch should usually succeed.
>
> Can later augment to call get_users_pages_fast(..., PTE_PREFETCH_NUM,
> ...) to reduce gup overhead.
But we can't assume the gfns' hvas are consecutive; for example, gfn and gfn+1
may be in different slots.
>
>>
>> +static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, u64 *sptep)
>> +{
>> + struct kvm_mmu_page *sp;
>> + pt_element_t *table = NULL;
>> + int offset = 0, shift, index, i;
>> +
>> + sp = page_header(__pa(sptep));
>> + index = sptep - sp->spt;
>> +
>> + if (PTTYPE == 32) {
>> + shift = PAGE_SHIFT - (PT_LEVEL_BITS -
>> + PT64_LEVEL_BITS) * sp->role.level;
>> + offset = sp->role.quadrant << shift;
>> + }
>> +
>> + for (i = index + 1; i < min(PT64_ENT_PER_PAGE,
>> + index + PTE_PREFETCH_NUM); i++) {
>> + struct page *page;
>> + pt_element_t gpte;
>> + unsigned pte_access;
>> + u64 *spte = sp->spt + i;
>> + gfn_t gfn;
>> + pfn_t pfn;
>> + int dirty;
>> +
>> + if (*spte != shadow_trap_nonpresent_pte)
>> + continue;
>> +
>> + pte_access = sp->role.access;
>> + if (sp->role.direct) {
>> + dirty = 1;
>> + gfn = sp->gfn + (i << ((sp->role.level - 1) *
>> + PT64_LEVEL_BITS));
>> + goto gfn_mapping;
>> + }
>>
>
> Should just call direct_pte_prefetch.
>
OK, will fix it.
>> +
>> + if (!table) {
>> + page = gfn_to_page_atomic(vcpu->kvm, sp->gfn);
>> + if (is_error_page(page)) {
>> + kvm_release_page_clean(page);
>> + break;
>> + }
>> + table = kmap_atomic(page, KM_USER0);
>> + table = (pt_element_t *)((char *)table + offset);
>> + }
>>
>
> Why not kvm_read_guest_atomic()? Can do it outside the loop.
Do you mean reading all the prefetched sptes at one time?
If prefetching one spte fails, the later sptes we read are wasted, so I chose
to read the next spte only when the current spte has been prefetched successfully.
But I don't have a strong opinion on it, since reading all sptes at one time is
fast: in the worst case we only need to read 16 * 8 = 128 bytes.
>
>> +
>> + gpte = table[i];
>> + if (!(gpte & PT_ACCESSED_MASK))
>> + continue;
>> +
>> + if (!is_present_gpte(gpte)) {
>> + if (!sp->unsync)
>> + *spte = shadow_notrap_nonpresent_pte;
>>
>
> Need __set_spte().
Oops, will fix it.
>
>> + continue;
>> + }
>> + dirty = is_dirty_gpte(gpte);
>> + gfn = (gpte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
>> + pte_access = pte_access & FNAME(gpte_access)(vcpu, gpte);
>> +gfn_mapping:
>> + pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn);
>> + if (is_error_pfn(pfn)) {
>> + kvm_release_pfn_clean(pfn);
>> + break;
>> + }
>> +
>> + if (pte_prefetch_topup_memory_cache(vcpu))
>> + break;
>> + mmu_set_spte(vcpu, spte, sp->role.access, pte_access, 0, 0,
>> + dirty, NULL, sp->role.level, gfn, pfn,
>> + true, false);
>> + }
>> + if (table)
>> + kunmap_atomic((char *)table - offset, KM_USER0);
>> +}
>>
>
> I think lot of code can be shared with the pte prefetch in invlpg.
>
Yes, please allow me to clean up that code after my future patchset:
[PATCH v4 9/9] KVM MMU: optimize sync/update unsync-page
which is the last part of the 'allow multiple shadow pages' patchset.