From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: [patch 07/13] KVM: MMU: mode specific sync_page
Date: Mon, 8 Sep 2008 03:03:54 -0300
Message-ID: <20080908060354.GA1014@dmt.cnet>
References: <20080906184822.560099087@localhost.localdomain> <20080906192431.043506161@localhost.localdomain> <48C3A455.5080100@qumranet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: KVM list
To: Avi Kivity
Return-path:
Received: from mx1.redhat.com ([66.187.233.31]:57928 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752837AbYIHGF0 (ORCPT ); Mon, 8 Sep 2008 02:05:26 -0400
Content-Disposition: inline
In-Reply-To: <48C3A455.5080100@qumranet.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Sun, Sep 07, 2008 at 12:52:21PM +0300, Avi Kivity wrote:
> What if vcpu0 is in mode X, while vcpu1 is in mode Y. vcpu0 writes to
> some pagetable, causing both mode X and mode Y shadows to become
> unsynced, so on the next resync (either by vcpu0 or vcpu1) we need to
> sync both modes.

From the oos core patch:

-	hlist_for_each_entry(sp, node, bucket, hash_link)
-		if (sp->gfn == gfn && sp->role.word == role.word) {
+	hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
+		if (sp->gfn == gfn) {
+			/*
+			 * If a pagetable becomes referenced by more than one
+			 * root, or has multiple roles, unsync it and disable
+			 * oos. For higher level pgtables the entire tree
+			 * has to be synced.
+			 */
+			if (sp->root_gfn != root_gfn) {
+				kvm_set_pg_inuse(sp);
+				if (set_shared_mmu_page(vcpu, sp))
+					tmp = bucket->first;
+				kvm_clear_pg_inuse(sp);
+				unsyncable = 0;
+			}

So as soon as a pagetable is shadowed with different modes, it's resynced
and unsyncing is disabled.

> Same problem with kvm_mmu_pte_write(), which right now hacks around it.
>
> Maybe we need a ->ops member.
>
>> +		if (!is_present_pte(*pt)) {
>> +			rmap_remove(vcpu->kvm, &sp->spt[i]);
>> +			sp->spt[i] = shadow_notrap_nonpresent_pte;
>> +			pt++;
>> +			continue;
>> +		}
>>
>
> Are we missing a tlb flush? Or will the caller take care of it?

Yes, there's a local TLB flush missing, which can be collapsed into a
single kvm_x86_ops->tlb_flush in the caller.

>> +
>> +		pte_access = sp->role.access & FNAME(gpte_access)(vcpu, *pt);
>> +		/* user */
>> +		if (pte_access & ACC_USER_MASK)
>> +			spte |= shadow_user_mask;
>>
>
> There are some special cases involving cr0.wp=0 and the user mask. so
> spte.u is not correlated exactly with gpte.u.

How come?

>> +		/* guest->shadow accessed sync */
>> +		if (!(*pt & PT_ACCESSED_MASK))
>> +			spte &= ~PT_ACCESSED_MASK;
>>
>
> spte shouldn't be accessible at all if gpte is not accessed, so we can
> set gpte.a on the next access (similar to spte not being writeable if
> gpte is not dirty).

Right. Perhaps accessed bit synchronization to guest could be performed
lazily somehow, so as to avoid a vmexit on every first page access.

>> +		/* shadow->guest accessed sync */
>> +		if (spte & PT_ACCESSED_MASK)
>> +			set_bit(PT_ACCESSED_SHIFT, (unsigned long *)pt);
>>
>
> host accessed and guest accessed are very different. We shouldn't set
> host accessed unless we're sure the guest will access the page very soon.
>
>> +		set_shadow_pte(&sp->spt[i], spte);
>>
>
> What if permissions are reduced?

Then a local TLB flush is needed. Flushing the TLBs of remote vcpus
should be done by the guest AFAICS.

> You can use PT_* instead of shadow_* as this will never be called when
> ept is active.
>
> I'm worried about the duplication with kvm_mmu_set_pte(). Perhaps that
> can be refactored instead to be the inner loop.

Will look into that.
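
To illustrate the "single flush in the caller" idea from the replies
above, here is a minimal user-space sketch. It is not the patch's code:
the toy_* names, the 32-bit entries and the bit layout are all
hypothetical and chosen only for this example. The sync loop just
reports whether any shadow entry lost a permission (including being
zapped to non-present), and the caller issues at most one local flush
for the whole page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch only; not the kernel's masks. */
#define TOY_PRESENT   (1u << 0)
#define TOY_WRITABLE  (1u << 1)
#define TOY_USER      (1u << 2)
#define TOY_PERM_MASK (TOY_PRESENT | TOY_WRITABLE | TOY_USER)

/* Counts flushes so the collapsing in the caller can be observed. */
static unsigned int toy_flush_count;

/* Stand-in for a local TLB flush (hypothetical helper). */
static void toy_local_tlb_flush(void)
{
	toy_flush_count++;
}

/*
 * Resync n shadow entries from the guest entries.  Returns true if any
 * shadow entry lost a permission bit, in which case a stale translation
 * may still be cached and a local flush is required.
 */
static bool toy_sync_page(const uint32_t *gpt, uint32_t *spt, size_t n)
{
	bool need_flush = false;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t old = spt[i] & TOY_PERM_MASK;
		uint32_t cur = (gpt[i] & TOY_PRESENT) ?
			       (gpt[i] & TOY_PERM_MASK) : 0;

		/* A bit set in old but clear in cur is a reduction. */
		if (old & ~cur)
			need_flush = true;
		spt[i] = cur;
	}
	return need_flush;
}

/* Caller: one flush for the whole page rather than one per entry. */
static void toy_sync_and_flush(const uint32_t *gpt, uint32_t *spt, size_t n)
{
	if (toy_sync_page(gpt, spt, n))
		toy_local_tlb_flush();
}
```

Entries that only gain permissions do not set the flag in this sketch,
on the assumption that a too-restrictive cached translation just causes
a fault that can be fixed up. Remote vcpus are not handled here at all.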