From mboxrd@z Thu Jan 1 00:00:00 1970
From: Junaid Shahid <junaids@google.com>
Subject: [PATCH v2 5/5] kvm: x86: mmu: Update documentation for fast page fault mechanism
Date: Tue, 8 Nov 2016 15:00:30 -0800
Message-ID: <1478646030-101103-6-git-send-email-junaids@google.com>
References: <1478646030-101103-1-git-send-email-junaids@google.com>
Cc: andreslc@google.com, pfeiner@google.com, pbonzini@redhat.com,
	guangrong.xiao@linux.intel.com
To: kvm@vger.kernel.org
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-pf0-f171.google.com ([209.85.192.171]:33989 "EHLO
	mail-pf0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932262AbcKHXAg (ORCPT <rfc822;kvm@vger.kernel.org>);
	Tue, 8 Nov 2016 18:00:36 -0500
Received: by mail-pf0-f171.google.com with SMTP id n85so114982215pfi.1
	for <kvm@vger.kernel.org>; Tue, 08 Nov 2016 15:00:36 -0800 (PST)
In-Reply-To: <1478646030-101103-1-git-send-email-junaids@google.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Add a brief description of the lockless access tracking mechanism to the
documentation of fast page faults in locking.txt.

Signed-off-by: Junaid Shahid <junaids@google.com>
---
 Documentation/virtual/kvm/locking.txt | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/locking.txt b/Documentation/virtual/kvm/locking.txt
index f2491a8..e7a1f7c 100644
--- a/Documentation/virtual/kvm/locking.txt
+++ b/Documentation/virtual/kvm/locking.txt
@@ -12,9 +12,16 @@ KVM Lock Overview
 Fast page fault:
 
 Fast page fault is the fast path which fixes the guest page fault out of
-the mmu-lock on x86. Currently, the page fault can be fast only if the
-shadow page table is present and it is caused by write-protect, that means
-we just need change the W bit of the spte.
+the mmu-lock on x86. Currently, the page fault can be fast in one of the
+following two cases:
+
+1. Access Tracking: The SPTE is not present, but it is marked for access
+tracking, i.e. the VMX_EPT_TRACK_ACCESS mask is set. That means we need to
+restore the saved RWX bits. This is described in more detail below.
+
+2. Write-Protection: The SPTE is present and the fault is caused by
+write-protect. That means we just need to change the W bit of the
+spte.
 
 What we use to avoid all the race is the SPTE_HOST_WRITEABLE bit and
 SPTE_MMU_WRITEABLE bit on the spte:
@@ -24,7 +31,8 @@ SPTE_MMU_WRITEABLE bit on the spte:
   page write-protection.
 
 On fast page fault path, we will use cmpxchg to atomically set the spte W
-bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, this
+bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, or
+restore the saved RWX bits if the VMX_EPT_TRACK_ACCESS mask is set, or both. This
 is safe because whenever changing these bits can be detected by cmpxchg.
 
 But we need carefully check these cases:
@@ -128,6 +136,17 @@ Since the spte is "volatile" if it can be updated out of mmu-lock, we always
 atomically update the spte, the race caused by fast page fault can be avoided,
 See the comments in spte_has_volatile_bits() and mmu_spte_update().
 
+Lockless Access Tracking:
+
+This is used for Intel CPUs that are using EPT but do not support the EPT A/D
+bits. In this case, when the KVM MMU notifier is called to track accesses to a
+page (via kvm_mmu_notifier_clear_flush_young), it marks the PTE as not-present
+by clearing the RWX bits in the PTE and storing the original bits in some
+unused/ignored bits. In addition, the VMX_EPT_TRACK_ACCESS mask is also set on
+the PTE (also using unused/ignored bits). When the VM tries to access the page
+later on, a fault is generated and the fast page fault mechanism described
+above is used to atomically restore the PTE to its original state.
+
 3. Reference
 ------------
 
-- 
2.8.0.rc3.226.g39d4020
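
For readers who want to see the shape of the mechanism in code, below is a
minimal, self-contained C sketch of the two halves described above. It is an
illustration only: the mask values, bit positions and helper names
(mark_spte_for_access_track, fast_restore_acc_track_spte) are assumptions
invented for this example, and C11 atomics stand in for the kernel's
cmpxchg64. Refer to the earlier patches in this series for the real
implementation.

/*
 * Illustrative sketch only, NOT the actual KVM code. Bit positions and
 * names below are assumptions made for the example.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

#define PT64_RWX_MASK          0x7ULL  /* EPT Read/Write/eXecute bits   */
#define ACC_TRACK_SAVED_SHIFT  54      /* assumed ignored-bit position  */
#define ACC_TRACK_SAVED_MASK   (PT64_RWX_MASK << ACC_TRACK_SAVED_SHIFT)
#define VMX_EPT_TRACK_ACCESS   (1ULL << 57)  /* assumed marker bit      */

/*
 * MMU-notifier side: make the SPTE not-present by clearing its RWX bits,
 * saving the original bits in ignored bits and setting the marker.
 */
static u64 mark_spte_for_access_track(u64 spte)
{
	u64 saved_rwx = (spte & PT64_RWX_MASK) << ACC_TRACK_SAVED_SHIFT;

	spte &= ~(PT64_RWX_MASK | ACC_TRACK_SAVED_MASK);
	return spte | saved_rwx | VMX_EPT_TRACK_ACCESS;
}

/*
 * Fast-page-fault side: put the saved RWX bits back with one atomic
 * compare-and-exchange. If another CPU changed the SPTE since we read
 * old_spte, the exchange fails and the fault is simply retried; the
 * mmu-lock is never taken on this path.
 */
static bool fast_restore_acc_track_spte(_Atomic u64 *sptep, u64 old_spte)
{
	u64 new_spte = old_spte &
		       ~(ACC_TRACK_SAVED_MASK | VMX_EPT_TRACK_ACCESS);

	new_spte |= (old_spte & ACC_TRACK_SAVED_MASK) >> ACC_TRACK_SAVED_SHIFT;

	return atomic_compare_exchange_strong(sptep, &old_spte, new_spte);
}

The key property the sketch tries to show is that both the save and the
restore are single 64-bit SPTE updates, so the compare-and-exchange on the
restore path detects any concurrent modification, which is what makes the
whole sequence safe without holding the mmu-lock.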