From: Marcelo Tosatti
Subject: Re: [patch 3/3] KVM: MMU: prepopulate the shadow on invlpg
Date: Fri, 31 Oct 2008 20:33:11 -0200
To: Avi Kivity
Cc: kvm@vger.kernel.org

On Fri, Oct 31, 2008 at 09:58:17PM +0200, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>>>> +	sw->pte_gpa = (sp->gfn << PAGE_SHIFT);
>>>> +	sw->pte_gpa += (sptep - sp->spt) * sizeof(pt_element_t);
>>>> +
>>>> +	if (is_shadow_present_pte(*sptep)) {
>>>>  		rmap_remove(vcpu->kvm, sptep);
>>>> +		sw->pte_gpa = -1;
>>>>
>>> Why?  The pte could have been replaced (for example, a write access
>>> to a cow page).
>>>
>>
>> Well look-aheads on address space teardown will be useless. OTOH the
>> guest pte read cost is minimal compared to an exit.
>>
>
> Don't understand.  We will incur an exit if a pte is replaced and
> invlpg'ed due to a copy-on-write (do guests actually execute invlpg
> after a cow?  I don't think they have to).
>
> What is the downside?  A pagetable teardown that does not involve
> zeroing the page?  I don't think we'll see invlpg on that path, more
> likely a complete tlb flush.

Err, I'm on crack.

The assumption is that the common case is pte invalidation + invlpg:
kunmap_atomic, page aging clearing the accessed bit, page reclaim.

Linux COW will invalidate + invlpg (do_wp_page) first:

                entry = mk_pte(new_page, vma->vm_page_prot);
                entry = maybe_mkwrite(pte_mkdirty(entry), vma);
                /*
                 * Clear the pte entry and flush it first, before updating the
                 * pte with the new entry. This will avoid a race condition
                 * seen in the presence of one thread doing SMC and another
                 * thread doing COW.
                 */
                ptep_clear_flush_notify(vma, address, page_table);

Not sure about Windows.

>> Whatever you prefer. Learning guest behaviour as suggested earlier
>> would be optimal, but simple is good.
>>
>
> We're way past simple.  We can reclaim some of the complexity by always
> doing unsync, and dropping emulation and kvm_mmu_set_pte(), but need to
> make sure we don't regress on performance.  I think Windows does a pde
> write on context switch, which will add a vmexit, but Windows
> applications are not too context switch intensive AFAIK.
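
For reference, the pte_gpa arithmetic in the quoted hunk can be looked at
in isolation. What follows is a minimal standalone sketch, not the KVM
code: guest_pte_gpa() is a hypothetical helper, PAGE_SHIFT is assumed to
be 12 (4 KiB pages), and pt_element_t is assumed to be a 64-bit guest
pte. The index argument stands in for (sptep - sp->spt) from the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12            /* assumption: 4 KiB pages */

typedef uint64_t pt_element_t;   /* assumption: 64-bit guest ptes */
typedef uint64_t gfn_t;          /* guest frame number */
typedef uint64_t gpa_t;          /* guest physical address */

/*
 * Hypothetical helper mirroring the two lines of the quoted hunk:
 * take the base address of the guest page table page (gfn << PAGE_SHIFT)
 * and add the byte offset of entry number 'index' within that page.
 */
static gpa_t guest_pte_gpa(gfn_t gfn, unsigned int index)
{
	gpa_t gpa = (gpa_t)gfn << PAGE_SHIFT;

	gpa += (gpa_t)index * sizeof(pt_element_t);
	return gpa;
}
```

Under these assumptions, entry 5 of the guest page table at gfn 0x1234
sits at 0x1234000 + 5 * 8 = 0x1234028, which is the address the invlpg
path would read the guest pte from.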