From: Xiao Guangrong
Subject: Re: How KVM sync guest page table with corresponding shadow page table?
Date: Thu, 30 Aug 2012 18:44:16 +0800
Message-ID: <503F4400.6010302@linux.vnet.ibm.com>
References: <20120824072916.GA67464@cs.nctu.edu.tw>
In-Reply-To: <20120824072916.GA67464@cs.nctu.edu.tw>
To: 陳韋任 (Wei-Ren Chen)
Cc: kvm@vger.kernel.org

On 08/24/2012 03:29 PM, 陳韋任 (Wei-Ren Chen) wrote:
> Hi Guangrong,
>
>   I am not familiar with the terms used in the paging world, so I need to ask
> some dumb questions.
>
>> It is controlled by the shadow page table: guest page tables are write-protected
>> on shadow pages (the W bit on the PTE is cleared).
>
>   O.K.
>
>> There is a special case, called an unsync shadow page: if the page is only used
>> as a guest page structure on the lowest level (level = 1), we allow it to be
>> writable. It will be synced when the guest flushes the TLB (e.g. CR3 reload,
>> invlpg...)
>> because according to x86 TLB rules, the guest needs to flush the TLB to apply
>> the change.
>
>   What does "guest page structure on the lowest level" mean? Taking a simple two-level
> page hierarchy as an example, do you mean the page table (not the page directory)?

Yes.

>   In short, if the architecture requires that when the OS modifies an Nth-level page table
> entry, the OS has to flush the TLB, then we can unsync the shadow page since the modifying
> operations can be trapped when the TLB flush instruction is executed. Is that right?
>

Right. Note, as I mentioned above, only the Page Table (N0) can be unsync.

>> In the normal case, a guest write to its page table will generate a #PF since the
>> page is write-protected, as we mentioned above.
>
>   Let me try to illustrate what happens after HW generates the #PF, please correct
> me if I am wrong. :)
>
>   #PF causes a VMExit, and control returns to KVM. KVM will use the guest
> virtual address (I think this information is accompanied with #PF, stored in cr4
> on x86 for example) to walk the guest page table, and KVM will find the guest allows

It is in CR2 if there is no virtualization, or in the Exit qualification
if it is running in guest mode.

> this access, so something must be wrong in the shadow page table. Then KVM will
> go through the shadow page table and find that the corresponding entry has been set
> to write-protected. Now KVM knows it has to sync the guest page table and the
> shadow page table, but how does KVM know what the guest OS wants to write into the guest
> page table entry?

If the shadow page table entry can be marked writable, we do not need to
care "what the guest OS wants to write": just set the shadow page table
entry to writable and return to the guest, and let the guest re-execute
the instruction.

If the shadow page table entry cannot be marked writable (such as MMIO,
or the page is used as a Page Directory), we can emulate the instruction.

>
>   Let's assume KVM knows, what's next?
> KVM will make the guest page table
> writable by setting the shadow page table entry, continue to execute the guest
> instruction which triggered the #PF, then sync the shadow page table at the same

Yes, the guest re-executes the instruction, and there is no #PF this time
since the shadow page table entry has already been made writable.

> time? It would be great if you could point me to where I should look
> in the KVM source code as a starting point, so that I can understand the shadow
> page table stuff more concretely and don't have to guess how things
> work. :)

Welcome! :)
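
To make the two cases above concrete, here is a minimal toy sketch in C of
the decision on a write-protect fault and of the later resync on a guest
TLB flush. It is not KVM code: every name in it (struct shadow_page,
handle_write_protect_fault, handle_guest_tlb_flush, and so on) is
hypothetical, and it assumes the page is write-protected again once it has
been synced.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names come from the KVM source. */
enum page_level {
    LEVEL_PT = 1,   /* lowest level: page table */
    LEVEL_PD = 2    /* page directory */
};

enum fault_action {
    ACTION_UNSYNC_AND_RETRY,  /* make it writable, guest re-executes the write */
    ACTION_EMULATE            /* hypervisor emulates the faulting instruction */
};

struct shadow_page {
    enum page_level level;  /* level of the guest page structure being shadowed */
    bool is_mmio;           /* the faulting access targets MMIO */
    bool writable;          /* W bit of the shadow PTE mapping the guest page */
    bool unsync;            /* guest page table may differ from the shadow copy */
};

/* Guest wrote to a write-protected guest page table and took a #PF. */
enum fault_action handle_write_protect_fault(struct shadow_page *sp)
{
    /* Only a lowest-level page table that is not MMIO may go unsync. */
    if (!sp->is_mmio && sp->level == LEVEL_PT) {
        sp->writable = true;
        sp->unsync = true;
        return ACTION_UNSYNC_AND_RETRY;
    }
    /* Page directory or MMIO: keep it write-protected and emulate. */
    return ACTION_EMULATE;
}

/* Guest flushed its TLB (e.g. CR3 reload, invlpg): resync unsync pages. */
void handle_guest_tlb_flush(struct shadow_page *sp)
{
    if (sp->unsync) {
        /* A real implementation would re-read the guest PTEs here. */
        sp->unsync = false;
        sp->writable = false;  /* assumption: write-protect again after sync */
    }
}

/* Small usage example exercising both paths. */
void sketch_self_check(void)
{
    struct shadow_page pt = { LEVEL_PT, false, false, false };
    struct shadow_page pd = { LEVEL_PD, false, false, false };

    assert(handle_write_protect_fault(&pt) == ACTION_UNSYNC_AND_RETRY);
    assert(handle_write_protect_fault(&pd) == ACTION_EMULATE);
    handle_guest_tlb_flush(&pt);
    assert(!pt.unsync && !pt.writable);
}
```

The point of the split is that the unsync path never needs to know what
value the guest wrote; the shadow copy is only brought up to date when the
guest flushes the TLB.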