Re: [PATCH v7 2/2] arm64: support batched/deferred tlb shootdown during page reclamation

From: Catalin Marinas <catalin.marinas@arm.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Yicong Yang <yangyicong@huawei.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org, x86@kernel.org,
	will@kernel.org, anshuman.khandual@arm.com,
	linux-doc@vger.kernel.org, corbet@lwn.net, peterz@infradead.org,
	arnd@arndb.de, punit.agrawal@bytedance.com,
	linux-kernel@vger.kernel.org, darren@os.amperecomputing.com,
	yangyicong@hisilicon.com, huzhanyuan@oppo.com,
	lipeifeng@oppo.com, zhangshiming@oppo.com, guojian@oppo.com,
	realmz6@gmail.com, linux-mips@vger.kernel.org,
	openrisc@lists.librecores.org, linuxppc-dev@lists.ozlabs.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	wangkefeng.wang@huawei.com, xhao@linux.alibaba.com,
	prime.zeng@hisilicon.com, Barry Song <v-songbaohua@oppo.com>,
	Nadav Amit <namit@vmware.com>, Mel Gorman <mgorman@suse.de>
Subject: Re: [PATCH v7 2/2] arm64: support batched/deferred tlb shootdown during page reclamation
Date: Mon, 9 Jan 2023 17:19:00 +0000
Message-ID: <Y7xMhPTAwcUT4O6b@arm.com>
In-Reply-To: <CAGsJ_4yC0i6MYwvosRSrdQ1iT7n88ypmK3aOQJkuusqNKtddtg@mail.gmail.com>

On Sun, Jan 08, 2023 at 06:48:41PM +0800, Barry Song wrote:
> On Fri, Jan 6, 2023 at 2:15 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Thu, Nov 17, 2022 at 04:26:48PM +0800, Yicong Yang wrote:
> > > It is tested on 4, 8 and 128 CPU platforms and shown to be beneficial on
> > > large systems, but may bring no improvement on small systems such as
> > > a 4 CPU platform. So make ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH depend
> > > on CONFIG_EXPERT for this stage and disable it on systems
> > > with fewer than 8 CPUs. Users can modify this threshold according to
> > > their own platforms via CONFIG_NR_CPUS_FOR_BATCHED_TLB.
> >
> > What's the overhead of such batching on systems with 4 or fewer CPUs? If
> > it isn't noticeable, I'd rather have it always on than some number
> > chosen on whichever SoC you tested.
> 
> On the one hand, a TLB flush is cheap on a small system, so batching TLB
> flushes helps only marginally.

Yes, it probably won't help on small systems but I don't like config
options choosing the threshold, which may be different from system to
system even if they have the same number of CPUs. A run-time tunable
would be a better option.
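
As a rough illustration of the difference, a run-time tunable could look
something like the user-space sketch below. The names batched_tlb_min_cpus,
set_batched_tlb_min_cpus() and should_defer_tlb_flush() are hypothetical
and not taken from the patch; the default of 8 only mirrors the threshold
mentioned in the commit message, and the setter stands in for whatever
sysctl or debugfs handler would actually be used.

```c
#include <stdbool.h>

/* Hypothetical knob; a kernel version might expose it via sysctl or debugfs. */
static unsigned int batched_tlb_min_cpus = 8;

/* Hypothetical setter, standing in for the write handler of such a knob. */
static void set_batched_tlb_min_cpus(unsigned int n)
{
	batched_tlb_min_cpus = n;
}

/* Defer (batch) the flush only if the system has at least the tuned number of CPUs. */
static bool should_defer_tlb_flush(unsigned int nr_online_cpus)
{
	return nr_online_cpus >= batched_tlb_min_cpus;
}
```

With the default, should_defer_tlb_flush(4) is false and
should_defer_tlb_flush(8) is true; after set_batched_tlb_min_cpus(2), the
4 CPU case also batches, without rebuilding the kernel.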

> On the other hand, since we have batched the tlb flush, new PTEs might be
> invisible to others before the final broadcast is done and Ack-ed.

The new PTEs could indeed be invisible at the TLB level but not at the
memory (page table) level since this is done under the PTL IIUC.

> Thus, there
> is a risk that someone else might do mprotect or similar things on those
> deferred pages, which requires a read-modify-write of those deferred PTEs.

And this should be fine, we have things like the PTL in place for the
actual memory access to the page table.

> In this
> case, mm will do an explicit flush via flush_tlb_batched_pending(), which is
> not required if the TLB flush is not deferred.

I don't fully understand why it's needed, or at least why it would be
needed on arm64. At the end of an mprotect(), we have the final PTEs in
place and we just need to issue a TLBI for that range.
change_pte_range() for example has a tlb_flush_pte_range() if the PTE
was present and that won't be done lazily. If there are other TLBIs
pending for the same range, they'll be done later, likely unnecessarily,
but that is still cheaper than issuing a flush_tlb_mm().

> void flush_tlb_batched_pending(struct mm_struct *mm)
> {
>        int batch = atomic_read(&mm->tlb_flush_batched);
>        int pending = batch & TLB_FLUSH_BATCH_PENDING_MASK;
>        int flushed = batch >> TLB_FLUSH_BATCH_FLUSHED_SHIFT;
> 
>        if (pending != flushed) {
>                flush_tlb_mm(mm);
>                /*
>                 * If the new TLB flushing is pending during flushing, leave
>                 * mm->tlb_flush_batched as is, to avoid losing flushing.
>                 */
>                atomic_cmpxchg(&mm->tlb_flush_batched, batch,
>                               pending | (pending << TLB_FLUSH_BATCH_FLUSHED_SHIFT));
>        }
> }

I guess this works better on x86 as it avoids the IPIs if this flush
already happened. But on arm64 we already issued the TLBI, we just
didn't wait for it to complete via a DSB.
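
To make the "already happened" case concrete, here is a minimal
user-space model of the pending/flushed counter logic quoted above, using
C11 atomics. It is only a sketch: struct mm_model and the model_*()
helpers are made up for illustration, the full flush is replaced by a
counter, and the shift and mask values are assumptions about the field
layout rather than something stated in this thread.

```c
#include <stdatomic.h>

/*
 * Assumed layout: the low bits count batched (pending) flushes, the high
 * bits record the pending count seen at the last full flush.
 */
#define TLB_FLUSH_BATCH_FLUSHED_SHIFT	16
#define TLB_FLUSH_BATCH_PENDING_MASK \
	((1 << (TLB_FLUSH_BATCH_FLUSHED_SHIFT - 1)) - 1)

/* Hypothetical stand-in for the relevant part of struct mm_struct. */
struct mm_model {
	atomic_int tlb_flush_batched;
	int full_flushes;	/* number of stubbed flush_tlb_mm() calls */
};

/* Stub for flush_tlb_mm(): just count how often it would run. */
static void model_flush_tlb_mm(struct mm_model *mm)
{
	mm->full_flushes++;
}

/* Reclaim side: note that one more batched flush is outstanding. */
static void model_set_batched_pending(struct mm_model *mm)
{
	atomic_fetch_add(&mm->tlb_flush_batched, 1);
}

/* Same control flow as the quoted flush_tlb_batched_pending(). */
static void model_flush_batched_pending(struct mm_model *mm)
{
	int batch = atomic_load(&mm->tlb_flush_batched);
	int pending = batch & TLB_FLUSH_BATCH_PENDING_MASK;
	int flushed = batch >> TLB_FLUSH_BATCH_FLUSHED_SHIFT;

	if (pending != flushed) {
		model_flush_tlb_mm(mm);
		/* Leave the counter alone if a new batch arrived meanwhile. */
		atomic_compare_exchange_strong(&mm->tlb_flush_batched, &batch,
			pending | (pending << TLB_FLUSH_BATCH_FLUSHED_SHIFT));
	}
}
```

In this model, calling model_flush_batched_pending() twice after a single
model_set_batched_pending() performs only one full flush; the second call
sees pending == flushed and returns without flushing. That skipped flush
is what saves the IPIs on x86.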

> I believe Anshuman has contributed many points on this in those previous
> discussions.

Yeah, I should re-read the old threads.

-- 
Catalin
