From: Maxim Levitsky <mlevitsk@redhat.com>
To: kvm@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org,
Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
linux-kernel@vger.kernel.org, x86@kernel.org,
Maxim Levitsky <mlevitsk@redhat.com>
Subject: [PATCH 4/4] KVM: selftests: dirty_log_test: support multiple write retries
Date: Wed, 11 Dec 2024 14:37:06 -0500
Message-ID: <20241211193706.469817-5-mlevitsk@redhat.com>
In-Reply-To: <20241211193706.469817-1-mlevitsk@redhat.com>
When dirty_log_test runs nested, an entry can appear in the emulated PML
log before the corresponding memory write is committed to RAM, because
KVM retries a memory write in response to an MMU fault.
In very rare cases the retry can happen more than once. The test then
fails, because by the time the write is finally committed it may carry a
very outdated iteration value.
Detect this case and do not fail the test on it.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
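A minimal standalone sketch of the bookkeeping this patch adds, for readers
who want to follow the logic outside the selftest. It is not the selftest
code: the names last_page, prev_iteration_last_page, begin_collect(),
record_entry(), stale_value_tolerated() and demo() are hypothetical stand-ins
for dirty_ring_last_page, dirty_ring_prev_iteration_last_page and the
collect/verify paths, and it assumes one collection pass per iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the test's globals; UINT64_MAX means "none". */
static uint64_t last_page = UINT64_MAX;
static uint64_t prev_iteration_last_page = UINT64_MAX;

/* Start of a collection pass: remember the previous pass's last entry. */
static void begin_collect(void)
{
	prev_iteration_last_page = last_page;
}

/* Record one harvested dirty ring entry; the final call of a pass wins. */
static void record_entry(uint64_t page)
{
	last_page = page;
}

/* A stale iteration value is tolerated only for these two pages. */
static bool stale_value_tolerated(uint64_t page)
{
	return page == last_page || page == prev_iteration_last_page;
}

/* Replays the scenario: page 42 is logged last, then committed late. */
static void demo(void)
{
	begin_collect();
	record_entry(10);
	record_entry(42);	/* last entry of iteration N-1, not yet committed */

	begin_collect();
	record_entry(7);
	record_entry(42);	/* retried write shows up again, no longer last */
	record_entry(99);

	assert(stale_value_tolerated(42));	/* previous pass's last entry */
	assert(stale_value_tolerated(99));	/* current pass's last entry */
	assert(!stale_value_tolerated(7));	/* any other page must match */
}
```

In this sketch only the last entry of the current pass and the last entry of
the previous pass are exempt from the data check, so an unrelated page with a
stale value would still be reported as a failure.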
tools/testing/selftests/kvm/dirty_log_test.c | 52 +++++++++++++++++++-
1 file changed, 50 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c
index a9428076a681..f07126b0205d 100644
--- a/tools/testing/selftests/kvm/dirty_log_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_test.c
@@ -154,6 +154,7 @@ static atomic_t vcpu_sync_stop_requested;
* sem_vcpu_stop and before vcpu continues to run.
*/
static bool dirty_ring_vcpu_ring_full;
+
/*
* This is only used for verifying the dirty pages. Dirty ring has a very
* tricky case when the ring just got full, kvm will do userspace exit due to
@@ -168,7 +169,51 @@ static bool dirty_ring_vcpu_ring_full;
* dirty gfn we've collected, so that if a mismatch of data found later in the
* verifying process, we let it pass.
*/
-static uint64_t dirty_ring_last_page;
+static uint64_t dirty_ring_last_page = -1ULL;
+
+/*
+ * In addition to the above, it is possible (especially if this
+ * test is run nested) for the above scenario to repeat multiple times:
+ *
+ * The following can happen:
+ *
+ * - L1 vCPU: Memory write is logged to PML but not committed.
+ *
+ * - L1 test thread: Ignores the write because it is the last dirty ring entry.
+ * Resets the dirty ring, which:
+ * - Resets the A/D bits in EPT
+ * - Issues tlb flush (invept), which is intercepted by L0
+ *
+ * - L0: Frees the whole nested EPT MMU root in response to invept,
+ * which ensures that the retried memory write faults again.
+ *
+ * - L1 vCPU: Same memory write is logged to the PML but not committed again.
+ *
+ * - L1 test thread: Ignores the write because it is the last dirty ring entry (again).
+ * Resets the dirty ring, which:
+ * - Resets the A/D bits in EPT (again)
+ * - Issues tlb flush (again) which is intercepted by L0
+ *
+ * ...
+ *
+ * N times
+ *
+ * - L1 vCPU: Memory write is logged in the PML and then committed.
+ * Lots of other memory writes are logged and committed.
+ * ...
+ *
+ * - L1 test thread: Sees the memory write along with other memory writes
+ * in the dirty ring, and since the write is usually not
+ * the last entry in the dirty-ring and has a very outdated
+ * iteration, the test fails.
+ *
+ *
+ * Note that this is only possible when the write was the last log entry
+ * written during iteration N-1. Thus, remember the last log entry of the
+ * previous iteration and don't fail when it is reported in the next
+ * iteration together with an outdated iteration count.
+ */
+static uint64_t dirty_ring_prev_iteration_last_page;
enum log_mode_t {
/* Only use KVM_GET_DIRTY_LOG for logging */
@@ -320,6 +365,8 @@ static uint32_t dirty_ring_collect_one(struct kvm_dirty_gfn *dirty_gfns,
struct kvm_dirty_gfn *cur;
uint32_t count = 0;
+ dirty_ring_prev_iteration_last_page = dirty_ring_last_page;
+
while (true) {
cur = &dirty_gfns[*fetch_index % test_dirty_ring_count];
if (!dirty_gfn_is_dirtied(cur))
@@ -622,7 +669,8 @@ static void vm_dirty_log_verify(enum vm_guest_mode mode, unsigned long *bmap)
*/
min_iter = iteration - 1;
continue;
- } else if (page == dirty_ring_last_page) {
+ } else if (page == dirty_ring_last_page ||
+ page == dirty_ring_prev_iteration_last_page) {
/*
* Please refer to comments in
* dirty_ring_last_page.
--
2.26.3
Thread overview:
2024-12-11 19:37 [PATCH 0/4] KVM: selftests: dirty_log_test: fixes for running the test nested Maxim Levitsky
2024-12-11 19:37 ` [PATCH 1/4] KVM: VMX: read the PML log in the same order as it was written Maxim Levitsky
2024-12-12 0:44 ` Sean Christopherson
2024-12-12 21:37 ` Maxim Levitsky
2024-12-13 6:19 ` Sean Christopherson
2024-12-13 19:56 ` Maxim Levitsky
2024-12-13 20:31 ` Sean Christopherson
2024-12-11 19:37 ` [PATCH 2/4] KVM: selftests: dirty_log_test: Limit s390x workaround to s390x Maxim Levitsky
2024-12-11 19:37 ` [PATCH 3/4] KVM: selftests: dirty_log_test: run the guest until some dirty ring entries were harvested Maxim Levitsky
2024-12-11 19:37 ` Maxim Levitsky [this message]