From: David Matlack <dmatlack@google.com>
To: Jim Mattson <jmattson@google.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	linux-mm@kvack.org, Sean Christopherson <seanjc@google.com>,
	Emanuele Giuseppe Esposito <eesposit@redhat.com>
Subject: Re: The root cause of failure of access_tracking_perf_test in a nested guest
Date: Fri, 23 Sep 2022 13:28:35 -0700
Message-ID: <Yy4W86qofpjoh2LA@google.com>
In-Reply-To: <CALMp9eSJbb6sSmv4c8c3ebCtfgdAARgryq5jHXdRmhxm6fYQsw@mail.gmail.com>

On Fri, Sep 23, 2022 at 12:25:00PM -0700, Jim Mattson wrote:
> On Fri, Sep 23, 2022 at 3:16 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> >
> > Because of this, when the guest clears the accessed bit in its nested EPT entries, KVM doesn't
> > notice/intercept it and corresponding EPT sptes remain the same, thus later the guest access to
> > the memory is not intercepted and because of this doesn't turn back
> > the accessed bit in the guest EPT tables.
> 
> Does the guest execute an INVEPT after clearing the accessed bit?

No, that's the problem. In L1, access_tracking_perf_test is using
page_idle to mark guest memory as idle, which results in clear_young()
notifiers being sent to KVM to clear accessed bits. clear_young() is
explicitly allowed to omit flushes, so KVM happily obliges.

	/*
	 * clear_young is a lightweight version of clear_flush_young. Like the
	 * latter, it is supposed to test-and-clear the young/accessed bitflag
	 * in the secondary pte, but it may omit flushing the secondary tlb.
	 */
	int (*clear_young)(struct mmu_notifier *subscription,
			   struct mm_struct *mm,
			   unsigned long start,
			   unsigned long end);
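
To make the effect of the omitted flush concrete, here is a toy userspace
model, not KVM code. It assumes the usual hardware behavior that the
accessed bit is only set when a translation is filled into the TLB, and it
flushes unconditionally in the clear_flush_young-like helper, which is a
simplification. All names (toy_mmu, toy_access, toy_clear_young,
toy_clear_flush_young) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical model: one secondary page table plus a TLB caching it. */
struct toy_mmu {
	bool accessed[NPAGES];	/* accessed bit in the page-table entry */
	bool tlb_valid[NPAGES];	/* translation currently cached in the TLB */
};

/*
 * A memory access. The accessed bit is set only on a TLB miss, when the
 * page table is walked. A TLB hit leaves the page-table entry untouched.
 */
static void toy_access(struct toy_mmu *m, size_t page)
{
	assert(page < NPAGES);
	if (m->tlb_valid[page])
		return;
	m->accessed[page] = true;
	m->tlb_valid[page] = true;
}

/* Like clear_young(): test-and-clear the accessed bit, no TLB flush. */
static bool toy_clear_young(struct toy_mmu *m, size_t page)
{
	assert(page < NPAGES);
	bool young = m->accessed[page];

	m->accessed[page] = false;
	return young;
}

/* Like clear_flush_young(): also drop the cached translation. */
static bool toy_clear_flush_young(struct toy_mmu *m, size_t page)
{
	bool young = toy_clear_young(m, page);

	m->tlb_valid[page] = false;
	return young;
}
```

In this model, an access after toy_clear_young() hits the stale TLB entry
and the accessed bit stays clear, which matches what the test observes.
After toy_clear_flush_young() the next access misses and sets the bit again.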

We could modify page_idle so that KVM performs TLB flushes. For example,
add a mechanism for userspace to trigger a TLB flush. Or change
page_idle to use clear_flush_young() (although that would be incredibly
expensive since page_idle only allows clearing one pfn at a time). But
I'm not sure creating a new userspace API just for this test is really
worth it, especially with multigen LRU coming soon.
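
For reference, a rough sketch of how userspace marks a single pfn idle
through the page_idle bitmap, going by the layout in the idle page tracking
documentation: an array of native-endian 8-byte words, with pfn i at bit
i % 64 of word i / 64. The helper names are made up, and the caller is
assumed to have opened /sys/kernel/mm/page_idle/bitmap for writing.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset in the bitmap file of the 8-byte word holding pfn's bit. */
static uint64_t idle_word_offset(uint64_t pfn)
{
	return (pfn / 64) * 8;
}

/* Mask selecting pfn's bit within that word. */
static uint64_t idle_bit_mask(uint64_t pfn)
{
	return UINT64_C(1) << (pfn % 64);
}

/*
 * Mark one pfn idle by writing a word with only its bit set.
 * Returns 0 on success, -1 on a failed or short write.
 */
static int mark_pfn_idle(int fd, uint64_t pfn)
{
	uint64_t word = idle_bit_mask(pfn);
	ssize_t n;

	assert(fd >= 0);
	n = pwrite(fd, &word, sizeof(word), (off_t)idle_word_offset(pfn));
	return n == (ssize_t)sizeof(word) ? 0 : -1;
}
```

Each set bit written this way ends up in the clear_young() path described
above, so no TLB flush happens on behalf of the write.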
