From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: [patch 09/13] KVM: MMU: out of sync shadow core
Date: Thu, 11 Sep 2008 10:15:06 -0300
Message-ID: <20080911131506.GA20572@dmt.cnet>
References: <20080906184822.560099087@localhost.localdomain> <20080906192431.211131067@localhost.localdomain> <48C3B496.20905@qumranet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from mx1.redhat.com ([66.187.233.31]:59961 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750889AbYIKNQD (ORCPT ); Thu, 11 Sep 2008 09:16:03 -0400
Content-Disposition: inline
In-Reply-To: <48C3B496.20905@qumranet.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Sun, Sep 07, 2008 at 02:01:42PM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> Allow global and single-root, single-role-per-gfn leaf shadowed
>> pagetables to be unsynced.
>>
>> Global unsync pages are saved into a per-vm array, synced on cr4/cr0 writes.
>>
>>
>
> Why not a list?
>
>> Non-global unsync pages are linked off their root shadow page, synced
>> on cr3/cr4/cr0 writes.
>>
>> Some of this logic is simplistic and could be smarter (page_multimapped and
>> the full root sync on higher level pagetable sharing).
>>
>> Also unsyncing of non-leaf nodes might be interesting (but more complicated).
>>
>>
>> +static struct kvm_mmu_page *kvm_mmu_lookup_page_root(struct kvm_vcpu *vcpu,
>> +						      gfn_t gfn)
>> +{
>> +	unsigned index;
>> +	struct hlist_head *bucket;
>> +	struct kvm_mmu_page *sp;
>> +	struct hlist_node *node;
>> +	struct kvm *kvm = vcpu->kvm;
>> +	int level = vcpu->arch.mmu.root_level;
>> +	if (!is_long_mode(vcpu) && is_pae(vcpu))
>> +		level--;
>> +
>> +	pgprintk("%s: looking for gfn %lx\n", __func__, gfn);
>> +	index = kvm_page_table_hashfn(gfn);
>> +	bucket = &kvm->arch.mmu_page_hash[index];
>> +	hlist_for_each_entry(sp, node, bucket, hash_link)
>> +		if (sp->gfn == gfn && !sp->role.metaphysical
>> +		    && !sp->role.invalid && sp->role.level == level) {
>> +			pgprintk("%s: found role %x\n",
>> +				 __func__, sp->role.word);
>> +			return sp;
>> +		}
>> +	return NULL;
>> +}
>>
>
> I'm worried about the complexity this (and the rest) introduces.
>
> A possible alternative is:
>
> - for non-leaf pages, including roots, add a 'unsync_children' flag.
> - when marking a page unsync, set the flag recursively on all parents
> - when switching cr3, recursively descend to locate unsynced leaves,
>   clearing flags along the way
> - to speed this up, put a bitmap with 1 bit per pte in the pages (512
>   bits = 64 bytes)
> - the bitmap can be externally allocated to save space, or not
>
> This means we no longer have to worry about multiple roots, when a page
> acquires another root while it is unsynced, etc.

While trying to implement it, I found a pretty nasty issue with this. A
process can switch to a shadow root which has no present entries, and
whose children, invisible at cr3 switch time, may be unsynced. The Xen
implementation gets away with that by syncing all oos pages on cr3
switch (there are only a few of them).

The obvious way to fix it is to resync at kvm_mmu_get_page time,
whenever grabbing an unsynced page. But that greatly reduces the
out-of-sync time of a pagetable, breaking the ability to populate via
pagefault without resync.
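To make the problem concrete, here is a minimal userspace sketch of the
flag-propagation scheme and of the case that defeats it. Everything in
it is a simplification made up for illustration: struct shadow_page, the
single parent pointer (real shadow pages can have several parents), the
4-entry fan-out and the function names are assumptions, not code from
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTES 4	/* tiny fan-out for illustration; a real page has 512 ptes */

/* Hypothetical, simplified stand-in for struct kvm_mmu_page. */
struct shadow_page {
	bool unsync;			 /* leaf contents may be stale */
	bool unsync_children;		 /* some linked descendant is unsync */
	struct shadow_page *parent;	 /* single parent only, for brevity */
	struct shadow_page *child[PTES]; /* NULL means "not present" */
};

/* Mark a leaf unsync and set the flag on every ancestor linked to it. */
static void mark_unsync(struct shadow_page *sp)
{
	assert(sp != NULL);
	sp->unsync = true;
	for (struct shadow_page *p = sp->parent; p; p = p->parent)
		p->unsync_children = true;
}

/*
 * cr3 switch: descend from the root, sync unsynced leaves and clear
 * flags along the way. Returns the number of pages synced.
 */
static int sync_from_root(struct shadow_page *sp)
{
	int synced = 0;

	if (sp->unsync) {
		sp->unsync = false;	/* real code would re-read guest ptes */
		synced++;
	}
	if (!sp->unsync_children)
		return synced;
	sp->unsync_children = false;
	for (size_t i = 0; i < PTES; i++)
		if (sp->child[i])
			synced += sync_from_root(sp->child[i]);
	return synced;
}

/* Returns true if an unsynced page escaped the walk done at cr3 switch. */
static bool demo_invisible_child(void)
{
	struct shadow_page old_root = {0}, new_root = {0}, leaf = {0};
	int synced;

	/* The leaf is linked under old_root and goes out of sync. */
	old_root.child[0] = &leaf;
	leaf.parent = &old_root;
	mark_unsync(&leaf);

	/* cr3 switch to new_root, which has no present entries yet. */
	synced = sync_from_root(&new_root);

	/* A later pagefault links the still-unsynced leaf under new_root. */
	new_root.child[0] = &leaf;

	return synced == 0 && leaf.unsync;
}
```

In this model demo_invisible_child() returns true: the walk from
new_root finds nothing to sync, yet the leaf it later picks up is still
unsynced.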
The singleroot scheme avoids such "invisible" unsynced pages: all
out-of-sync pages in the root in question are reachable via
sp->unsynced_pages.

Any ideas on how to get around that situation?
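For comparison, a rough userspace sketch of the single-root bookkeeping
described above. It is a simplification under assumptions: a plain
singly linked list stands in for whatever list type the patch uses, and
struct root_page, struct oos_page and the function names are made up for
illustration. Only the unsynced_pages name comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the layout used in the patch. */
struct oos_page {
	bool unsync;
	struct oos_page *next_unsynced;	/* link in the owning root's list */
};

struct root_page {
	struct oos_page *unsynced_pages; /* head of this root's unsync list */
};

/* Unsync a leaf that belongs to exactly one root: record it on that root. */
static void root_mark_unsync(struct root_page *root, struct oos_page *sp)
{
	assert(root != NULL && sp != NULL);
	if (sp->unsync)
		return;			/* already on the list */
	sp->unsync = true;
	sp->next_unsynced = root->unsynced_pages;
	root->unsynced_pages = sp;
}

/* cr3/cr4/cr0 write: sync every page hanging off the root's list. */
static int root_sync_all(struct root_page *root)
{
	int synced = 0;

	while (root->unsynced_pages) {
		struct oos_page *sp = root->unsynced_pages;

		root->unsynced_pages = sp->next_unsynced;
		sp->next_unsynced = NULL;
		sp->unsync = false;	/* real code would re-read guest ptes */
		synced++;
	}
	return synced;
}
```

Because a page can only be unsynced while it has a single root, the list
on that root is complete by construction, so no tree walk is needed to
find the out-of-sync pages.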