From: Xiao Guangrong
Subject: Re: KVM: MMU: Tracking guest writes through EPT entries ?
Date: Thu, 30 Aug 2012 18:22:24 +0800
Message-ID: <503F3EE0.6080502@linux.vnet.ibm.com>
References: <501747A1.6000105@linux.vnet.ibm.com>
To: Felix
Cc: kvm@vger.kernel.org

On 08/28/2012 11:30 AM, Felix wrote:
> Xiao Guangrong linux.vnet.ibm.com> writes:
>
>>
>> On 07/31/2012 01:18 AM, Sunil wrote:
>>> Hello List,
>>>
>>> I am a KVM newbie and studying KVM mmu code.
>>>
>>> On the existing guest, I am trying to track all guest writes by
>>> marking page table entry as read-only in EPT entry [ I am using Intel
>>> machine with vmx and ept support ]. Looks like EPT support re-uses
>>> shadow page table(SPT) code and hence some of SPT routines.
>>>
>>> I was thinking of below possible approach. Use pte_list_walk() to
>>> traverse through list of sptes and use mmu_spte_update() to flip the
>>> PT_WRITABLE_MASK flag. But all SPTEs are not part of any single list;
>>> but on separate lists (based on gfn, page level, memory_slot).
>>> So, recording all the faulted guest GFN and then using above method work ?
>>>
>>
>> There are two ways to write-protect all sptes:
>> - use kvm_mmu_slot_remove_write_access() on all memslots
>> - walk the shadow page cache to get the shadow pages in the highest level
>>   (level = 4 on EPT), then write-protect its entries.
>>
>> If you just want to do it for the specified gfn, you can use
>> rmap_write_protect().
>>
>> Just inquisitive, what is your purpose? :)
>>
>
> Hi, Guangrong,
>
> I have done similar things to what Sunil did, simply for study purposes.
> However, I found some very weird situations. Basically, in the guest vm, I
> allocate a chunk of memory (one page in size) in a user level program.
> Through a guest kernel level module and my self-defined hypercall, I pass the
> gva of this memory to kvm. Then I try different methods in the hypercall
> handler to write protect this page of memory. You can see that I want to
> write protect it through EPT instead of write protecting it in the guest page
> tables.
>
> 1. I use kvm_mmu_gva_to_gpa_read to translate the gva into gpa. Based on the
> function kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I changed the code to
> read sptep (the pointer to spte) instead of spte, so I can modify the spte
> corresponding to this gpa. What I observe is that if I modify spte[0] (I think
> this is the lowest level page table entry corresponding to the EPT table; I
> can successfully modify it, as the changes are reflected in the result of
> calling kvm_mmu_get_spte_hierarchy again), my user level program in the vm
> can still write to this page.
>
> In this post of yours, you mentioned (the shadow pages in the highest level
> (level = 4 on EPT)), I don't understand this part.
> Does this mean I have to modify spte[3] instead of spte[0]? I just tried
> modifying spte[1] and spte[3]; both can cause a vmexit. So I am totally
> confused about the meaning of level used in the shadow page table and its
> relation to the shadow page table. Can you help me to understand this?
>
> 2. As suggested by this post, I also use rmap_write_protect() to write protect
> this page. With kvm_mmu_get_spte_hierarchy(vcpu, gpa, spte[4]), I can still
> see that spte[0] gives me a result like xxxxxx005, which means that the
> function was called successfully. But I can still write to this page.
>
> I even tried the function kvm_age_hva() to remove this spte. This gives me 0
> for spte[0], but I can still write to this page. So I am further confused
> about the level used in the shadow page?
>

kvm_mmu_get_spte_hierarchy reads the sptes outside of mmu-lock. You can hold
spin_lock(&vcpu->kvm->mmu_lock) and use for_each_shadow_entry instead.

And, after the change, did you flush all TLBs?

If it still does not work, please post your code.
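
To make the two points above concrete (walk from level 4 down to the level 1
entry while holding the lock, then flush), here is a minimal userspace sketch.
It is a toy model, not KVM code: every toy_* name, the one-entry TLB, and the
bit layout are assumptions made for illustration only, and a plain flag
stands in for mmu_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: none of these names are KVM symbols. */
#define TOY_LEVELS     4            /* level 4 = root table, level 1 = leaf */
#define TOY_ENTRIES    512          /* 9 index bits per level */
#define TOY_PAGE_SHIFT 12
#define TOY_PRESENT    (1ULL << 0)
#define TOY_WRITABLE   (1ULL << 1)

struct toy_table {
    uint64_t flags[TOY_ENTRIES];           /* permission bits per entry */
    struct toy_table *child[TOY_ENTRIES];  /* next-level table, unused at level 1 */
};

struct toy_mmu {
    bool lock_held;          /* stands in for mmu_lock */
    struct toy_table *root;  /* the level-4 table */
    bool tlb_valid;          /* one-entry "TLB" caching the last translation */
    uint64_t tlb_page;
    bool tlb_writable;
};

static void toy_init(struct toy_mmu *mmu)
{
    mmu->root = calloc(1, sizeof(*mmu->root));
    assert(mmu->root);
    mmu->lock_held = false;
    mmu->tlb_valid = false;
}

/* Index into the table at the given level for this guest physical address. */
static unsigned toy_index(uint64_t gpa, int level)
{
    return (gpa >> (TOY_PAGE_SHIFT + 9 * (level - 1))) & (TOY_ENTRIES - 1);
}

/* Map gpa as writable, allocating intermediate tables on demand. */
static void toy_map(struct toy_mmu *mmu, uint64_t gpa)
{
    struct toy_table *t = mmu->root;

    for (int level = TOY_LEVELS; level > 1; level--) {
        unsigned i = toy_index(gpa, level);
        if (!t->child[i]) {
            t->child[i] = calloc(1, sizeof(*t));
            assert(t->child[i]);
            t->flags[i] = TOY_PRESENT | TOY_WRITABLE;
        }
        t = t->child[i];
    }
    t->flags[toy_index(gpa, 1)] = TOY_PRESENT | TOY_WRITABLE;
}

/* Walk from the root (level 4) down to the leaf (level 1); NULL if unmapped. */
static uint64_t *toy_leaf(struct toy_mmu *mmu, uint64_t gpa)
{
    struct toy_table *t = mmu->root;

    for (int level = TOY_LEVELS; level > 1; level--) {
        unsigned i = toy_index(gpa, level);
        if (!(t->flags[i] & TOY_PRESENT))
            return NULL;
        t = t->child[i];
    }
    uint64_t *leaf = &t->flags[toy_index(gpa, 1)];
    return (*leaf & TOY_PRESENT) ? leaf : NULL;
}

/* Clear the writable bit of the leaf entry; returns true if a flush is needed. */
static bool toy_write_protect(struct toy_mmu *mmu, uint64_t gpa)
{
    bool changed = false;

    mmu->lock_held = true;                 /* take the lock before walking */
    uint64_t *leaf = toy_leaf(mmu, gpa);
    if (leaf && (*leaf & TOY_WRITABLE)) {
        *leaf &= ~TOY_WRITABLE;
        changed = true;
    }
    mmu->lock_held = false;
    return changed;
}

static void toy_flush_tlb(struct toy_mmu *mmu)
{
    mmu->tlb_valid = false;
}

/* A guest write consults the cached translation first, then the tables. */
static bool toy_guest_write(struct toy_mmu *mmu, uint64_t gpa)
{
    uint64_t page = gpa >> TOY_PAGE_SHIFT;

    if (mmu->tlb_valid && mmu->tlb_page == page)
        return mmu->tlb_writable;          /* cached, possibly stale */

    uint64_t *leaf = toy_leaf(mmu, gpa);
    mmu->tlb_valid = (leaf != NULL);
    mmu->tlb_page = page;
    mmu->tlb_writable = leaf && (*leaf & TOY_WRITABLE);
    return mmu->tlb_writable;
}
```

In this model, a write that was cached before toy_write_protect() keeps
succeeding until toy_flush_tlb() is called, which mirrors the symptom
described above: the entry reads back as read-only, yet the guest can still
write. Whether that is the actual cause here cannot be told without the code.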