From: Kai Huang
Subject: Re: PML (Page Modification Logging) design for Xen
Date: Thu, 12 Feb 2015 10:35:29 +0800
Message-ID: <54DC1171.1030000@linux.intel.com>
References: <54DB129D.3060102@linux.intel.com> <54DB4294.1080406@citrix.com> <54DB6392020000780005F08B@mail.emea.novell.com>
In-Reply-To: <54DB6392020000780005F08B@mail.emea.novell.com>
To: Jan Beulich, Andrew Cooper
Cc: keir@xen.org, kevin.tian@intel.com, tim@xen.org, xen-devel@lists.xen.org

On 02/11/2015 09:13 PM, Jan Beulich wrote:
>>>> On 11.02.15 at 12:52, wrote:
>> On 11/02/15 08:28, Kai Huang wrote:
>>> With PML, we don't have to use write protection but just clear D-bit
>>> of EPT entry of guest memory to do dirty logging, with an additional
>>> PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
>>> reduce hypervisor overhead when guest is in dirty logging mode, and
>>> therefore more CPU cycles can be allocated to guest, so it's expected
>>> benchmarks in guest will have better performance comparing to non-PML.
>> One issue with basic EPT A/D tracking was the scan of the EPT tables.
>> Here, hardware will give us a list of affected gfns, but how is Xen
>> supposed to efficiently clear the dirty bits again? Using EPT
>> misconfiguration is no better than the existing fault path.
> Why not? The misconfiguration exit ought to clear the D bit for all
> 511 entries in the L1 table (and set it for the one entry that is
> currently serving the access). All further D bit handling will then
> be PML based.
Indeed, we clear the D-bit on EPT misconfiguration. As I understand it,
the sequence is as follows:

1) PML is enabled for the domain.
2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
3) The guest accesses a specific GPA (which has been invalidated by
   step 2), and an EPT misconfig is triggered.
4) resolve_misconfig is then called, which fixes up the GFN (the above
   GPA >> 12) to p2m_ram_logdirty and calls ept_p2m_type_to_flags, in
   which we clear the D-bit of the EPT entry (instead of clearing the
   W-bit) if the p2m type is p2m_ram_logdirty. From then on, dirty
   logging of this GFN is handled by PML.

Steps 2) to 4) are repeated whenever the log-dirty radix tree is cleared.

>
>>> - PML buffer flush
>>>
>>> There are two places we need to flush PML buffer. The first place is
>>> PML buffer full VMEXIT handler (apparently), and the second place is
>>> in paging_log_dirty_op (either peek or clean), as vcpus are running
>>> asynchronously along with paging_log_dirty_op is called from userspace
>>> via hypercall, and it's possible there are dirty GPAs logged in vcpus'
>>> PML buffers but not full. Therefore we'd better to flush all vcpus'
>>> PML buffers before reporting dirty GPAs to userspace.
>> Why apparently? It would be quite easy for a guest to dirty 512 frames
>> without otherwise taking a vmexit.
> I silently replaced apparently with obviously while reading...
>
>>> We handle above two cases by flushing PML buffer at the beginning of
>>> all VMEXITs. This solves the first case above, and it also solves the
>>> second case, as prior to paging_log_dirty_op, domain_pause is called,
>>> which kicks vcpus (that are in guest mode) out of guest mode via
>>> sending IPI, which cause VMEXIT, to them.
>>>
>>> This also makes log-dirty radix tree more updated as PML buffer is
>>> flushed on basis of all VMEXITs but not only PML buffer full VMEXIT.
>> My gut feeling is that this is substantial overhead on a common path,
>> but this largely depends on how the dirty bits can be cleared efficiently.
> I agree on the overhead part, but I don't see what relation this has
> to the dirty bit clearing - a PML buffer flush doesn't involve any
> alterations of D bits.
No, the flush is not related to dirty bit clearing. The PML buffer flush
just does the following (which I should have clarified in my design,
sorry):

1) Read out the PML index.
2) Loop over all GPAs logged in the PML buffer, as indicated by the PML
   index, and record them in the log-dirty radix tree.

I agree there is overhead on the common VMEXIT path, but it should not
be substantial compared to the overhead of the VMEXIT itself.

Thanks,
-Kai
>
> Jan
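
The two flush steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Xen code: `struct pml_sketch`, `sketch_flush_pml`, `sketch_mark_gfn_dirty`, `sketch_hw_log` and the flat `dirty_bitmap` (standing in for the log-dirty radix tree) are all hypothetical names made up for the example. It assumes the hardware behaviour described in the Intel SDM: the PML buffer holds 512 entries, the PML index starts at 511 and is decremented after each logged GPA, so valid entries sit at index+1 .. 511, and an out-of-range index means the buffer is full.

```c
#include <stdint.h>

#define PAGE_SHIFT      12
#define NR_PML_ENTRIES  512     /* one 4 KiB page of 64-bit GPAs */
#define SKETCH_MAX_GFN  4096    /* hypothetical size of the toy dirty bitmap */

/* Hypothetical stand-in for the per-vcpu PML state (buffer page + VMCS index). */
struct pml_sketch {
    uint64_t buf[NR_PML_ENTRIES];
    uint16_t index;
};

/* Hypothetical stand-in for the log-dirty radix tree: a flat bitmap. */
static uint8_t dirty_bitmap[SKETCH_MAX_GFN / 8];

static void sketch_mark_gfn_dirty(uint64_t gfn)
{
    if (gfn < SKETCH_MAX_GFN)
        dirty_bitmap[gfn / 8] |= (uint8_t)(1u << (gfn % 8));
}

static int sketch_gfn_is_dirty(uint64_t gfn)
{
    return (gfn < SKETCH_MAX_GFN) && ((dirty_bitmap[gfn / 8] >> (gfn % 8)) & 1);
}

/* Emulates the hardware logging one dirty GPA; only here to drive the sketch. */
static void sketch_hw_log(struct pml_sketch *p, uint64_t gpa)
{
    if (p->index >= NR_PML_ENTRIES)
        return;                 /* real hardware would take a PML-full VMEXIT */
    p->buf[p->index] = gpa & ~((1ULL << PAGE_SHIFT) - 1);
    p->index--;                 /* wraps to 0xFFFF after entry 0 is written */
}

/* Flush the buffer into the dirty log; returns the number of entries consumed. */
static unsigned int sketch_flush_pml(struct pml_sketch *p)
{
    unsigned int i, flushed = 0;
    uint16_t idx = p->index;    /* step 1: read out the PML index */

    if (idx == NR_PML_ENTRIES - 1)
        return 0;               /* nothing logged since the last flush */

    /* Out-of-range index: buffer is full. Otherwise idx is the next free slot. */
    i = (idx >= NR_PML_ENTRIES) ? 0 : (unsigned int)idx + 1;

    /* Step 2: walk the logged GPAs and record their GFNs as dirty. */
    for (; i < NR_PML_ENTRIES; i++) {
        sketch_mark_gfn_dirty(p->buf[i] >> PAGE_SHIFT);
        flushed++;
    }

    p->index = NR_PML_ENTRIES - 1;  /* reset so logging restarts from the top */
    return flushed;
}
```

Note that the loop only reads the buffer and updates the dirty log; no EPT entries are touched, which is why the flush cost is independent of D-bit clearing.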