From: Sean Christopherson <seanjc@google.com>
To: Yu Zhao <yuzhao@google.com>
Cc: James Houghton <jthoughton@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Paolo Bonzini <pbonzini@redhat.com>,
	Ankit Agrawal <ankita@nvidia.com>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	 David Matlack <dmatlack@google.com>,
	David Rientjes <rientjes@google.com>,
	 James Morse <james.morse@arm.com>,
	Jonathan Corbet <corbet@lwn.net>, Marc Zyngier <maz@kernel.org>,
	 Oliver Upton <oliver.upton@linux.dev>,
	Raghavendra Rao Ananta <rananta@google.com>,
	 Ryan Roberts <ryan.roberts@arm.com>,
	Shaoqin Huang <shahuang@redhat.com>,
	 Suzuki K Poulose <suzuki.poulose@arm.com>,
	Wei Xu <weixugc@google.com>,  Will Deacon <will@kernel.org>,
	Zenghui Yu <yuzenghui@huawei.com>,
	kvmarm@lists.linux.dev,  kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,  linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v5 8/9] mm: multi-gen LRU: Have secondary MMUs participate in aging
Date: Wed, 12 Jun 2024 10:23:38 -0700
Message-ID: <ZmnZmj8iVmcLf8fo@google.com>
In-Reply-To: <CAOUHufYCmYNngmS=rOSAQRB0N9ai+mA0aDrB9RopBvPHEK42Ng@mail.gmail.com>

On Wed, Jun 12, 2024, Yu Zhao wrote:
> On Wed, Jun 12, 2024 at 10:02 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Jun 11, 2024, James Houghton wrote:
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index e8fc5ecb59b2..24a3ff639919 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> > > @@ -870,13 +870,10 @@ static bool folio_referenced_one(struct folio *folio,
> > >                       continue;
> > >               }
> > >
> > > -             if (pvmw.pte) {
> > > -                     if (lru_gen_enabled() &&
> > > -                         pte_young(ptep_get(pvmw.pte))) {
> > > -                             lru_gen_look_around(&pvmw);
> > > +             if (lru_gen_enabled() && pvmw.pte) {
> > > +                     if (lru_gen_look_around(&pvmw))
> > >                               referenced++;
> > > -                     }
> > > -
> > > +             } else if (pvmw.pte) {
> > >                       if (ptep_clear_flush_young_notify(vma, address,
> > >                                               pvmw.pte))
> > >                               referenced++;
> >
> > Random question not really related to KVM/secondary MMU participation.  AFAICT,
> > the MGLRU approach doesn't flush TLBs after aging pages.  How does MGLRU mitigate
> > false negatives on pxx_young() due to the CPU not setting Accessed bits because
> > of stale TLB entries?
> 
> I do think there can be false negatives but we have not been able to
> measure their practical impacts since we disabled the flush on some
> host MMUs long ago (NOT by MGLRU), e.g., on x86 and ppc,
> ptep_clear_flush_young() is just ptep_test_and_clear_young().

Aha!  That's what I was missing, I somehow didn't see x86's ptep_clear_flush_young().

That raises the question, why does KVM flush TLBs on architectures that don't need
to?  And since kvm_mmu_notifier_clear_young() explicitly doesn't flush, are there
even any KVM-supported architectures for which the flush is mandatory?

Skipping the flush on KVM x86 seems like a complete no-brainer.

Will, Marc and/or Oliver, what are arm64's requirements in this area?  E.g. I see
that arm64's version of __ptep_clear_flush_young() does TLBI but not DSB.  Should
KVM be doing something similar?  Can KVM safely skip even the TLBI?
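
To make the false-negative concern concrete, here is a toy userspace model, not
kernel code.  It assumes the simplified behavior that a TLB entry is cached with
its Accessed state, so a later TLB hit does not write the Accessed bit back to
the in-memory PTE.  struct toy_mmu and the toy_*() helpers are hypothetical names
made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a single PTE plus a single TLB slot that may cache it. */
struct toy_mmu {
	bool pte_young;    /* Accessed bit in the in-memory PTE */
	bool tlb_valid;    /* a translation for the page is cached */
	bool tlb_accessed; /* the cached copy was filled with Accessed set */
};

/* The CPU touches the page. */
static void toy_access(struct toy_mmu *m)
{
	assert(m);

	/* TLB hit on an entry cached with Accessed set: the PTE is not updated. */
	if (m->tlb_valid && m->tlb_accessed)
		return;

	/* TLB miss: walk the page table, set Accessed, cache the translation. */
	m->pte_young = true;
	m->tlb_valid = true;
	m->tlb_accessed = true;
}

/* Aging without a flush: test and clear the Accessed bit, leave the TLB alone. */
static bool toy_test_and_clear_young(struct toy_mmu *m)
{
	bool young;

	assert(m);
	young = m->pte_young;
	m->pte_young = false;
	return young;
}

/* Aging with a flush: additionally drop the cached translation if it was young. */
static bool toy_clear_flush_young(struct toy_mmu *m)
{
	bool young = toy_test_and_clear_young(m);

	if (young) {
		m->tlb_valid = false;
		m->tlb_accessed = false;
	}
	return young;
}
```

In this model, access / toy_test_and_clear_young() / access / toy_test_and_clear_young()
reports the page as young only the first time, because the second access hits the
stale TLB entry.  Replacing the aging step with toy_clear_flush_young() reports
it as young both times.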

> The theoretical basis is that, given the TLB coverage trend (Figure 1 in
> [1]), when a system is running out of memory, it's unlikely to have
> many long-lived entries in its TLB. IOW, if that system had a stable
> working set (hot memory) that can fit into its TLB, it wouldn't hit
> page reclaim. Again, this is based on the theory (proposition) that
> for most systems, their TLB coverages are much smaller than their
> memory sizes.
> 
> If/when the above proposition doesn't hold, the next step in the page
> reclaim path, which is to unmap the PTE, will cause a page fault. The
> fault can be minor or major (requires IO), depending on the race
> between the reclaiming and accessing threads. In this case, the
> tradeoff, in a steady state, is between the PF cost of pages we
> shouldn't reclaim and the flush cost of pages we scan. The PF cost is
> higher than the flush cost per page. But we scan many pages and only
> reclaim a few of them; pages we shouldn't reclaim are a (small)
> portion of the latter.
> 
> [1] https://www.usenix.org/legacy/events/osdi02/tech/full_papers/navarro/navarro.pdf
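
The steady-state tradeoff described above can be written down as a small cost
comparison.  This is only a sketch of the argument: struct aging_costs and the
helper names are hypothetical, and the per-page costs are inputs the caller has
to supply, not measured values.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed per-page costs, in whatever unit the caller picks (e.g. cycles). */
struct aging_costs {
	double flush_cost; /* one TLB flush for a scanned page */
	double fault_cost; /* one page fault on a page that should not have been reclaimed */
};

/* Total cost of flushing on every scanned page. */
static double total_flush_cost(const struct aging_costs *c, unsigned long scanned)
{
	assert(c);
	return c->flush_cost * (double)scanned;
}

/* Total cost of the faults taken on wrongly reclaimed pages when not flushing. */
static double total_fault_cost(const struct aging_costs *c,
			       unsigned long wrongly_reclaimed)
{
	assert(c);
	return c->fault_cost * (double)wrongly_reclaimed;
}

/*
 * Skipping the flush pays off when the extra faults cost less than the flushes
 * would have.  A fault costs more than a flush per page, but many pages are
 * scanned and only a small share of the reclaimed ones are reclaimed wrongly.
 */
static bool skipping_flush_wins(const struct aging_costs *c,
				unsigned long scanned,
				unsigned long wrongly_reclaimed)
{
	return total_fault_cost(c, wrongly_reclaimed) <
	       total_flush_cost(c, scanned);
}
```

For example, with made-up costs of 1 per flush and 100 per fault, scanning 10000
pages and wrongly reclaiming 10 of them favors skipping the flush (1000 < 10000).
Scanning only 100 pages with the same 10 wrong reclaims does not (1000 > 100).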
