From: Byungchul Park <byungchul@sk.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel_team@skhynix.com, akpm@linux-foundation.org,
namit@vmware.com, xhao@linux.alibaba.com,
mgorman@techsingularity.net, hughd@google.com,
willy@infradead.org, david@redhat.com, peterz@infradead.org,
luto@kernel.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com
Subject: Re: [v4 0/3] Reduce TLB flushes under some specific conditions
Date: Fri, 10 Nov 2023 10:32:24 +0900 [thread overview]
Message-ID: <20231110013224.GD72073@system.software.com> (raw)
In-Reply-To: <87il6bijtu.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Thu, Nov 09, 2023 at 01:20:29PM +0800, Huang, Ying wrote:
> Byungchul Park <byungchul@sk.com> writes:
>
> > Hi everyone,
> >
> > While working with CXL memory, I have been facing migration overhead,
> > especially TLB shootdown, on promotion or demotion between different
> > tiers. Most TLB shootdowns on migration through hinting faults can
> > already be avoided thanks to Huang Ying's work, commit 4d4b6d66db
> > ("mm,unmap: avoid flushing TLB in batch if PTE is inaccessible").
> >
> > However, that only covers migrations triggered by hinting faults. It
> > would be much better to have a general mechanism that reduces the
> > number of TLB flushes and TLB misses and can be applied to any type of
> > migration. For now, though, I have tried it only for tiering migration.
> >
> > I'm suggesting a mechanism that reduces TLB flushes by keeping both the
> > source and destination folios of a migration alive until all required
> > TLB flushes have been done, but only if none of those folios are mapped
> > by any PTE with write permission. The work is based on v6.6-rc5.
> >
> > Can you believe it? I saw the number of full TLB flushes reduced by
> > about 80% and iTLB misses reduced by about 50%, and wall-clock
> > performance consistently improved by at least 1% with the workload I
> > tested, XSBench. I believe it would help even more with other or real
> > workloads. I'd appreciate it if you let me know if I'm missing
> > something.
>
> Could you help test the effect of commit 7e12beb8ca2a ("migrate_pages:
> batch flushing TLB") for your test case? To test it, you can revert it
> and compare the performance before and after the revert.
I will.
> And how do you trigger migration when testing XSBench? Do you use a
> tiered memory system and migrate pages back and forth between DRAM and
> CXL memory? If so, how many pages are migrated in each migration?
Honestly, I've been focusing on the number of migrations and TLB flushes.
I will get back to you.
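
By the way, to make the mechanism described in the cover letter above
easier to discuss, here is a rough sketch of the intended behavior. This
is simplified pseudocode with made-up helper names; none of these helpers
appear in the series under these names or signatures:

    /*
     * Simplified sketch only: all helpers below are hypothetical.
     */
    static void migrate_one_folio(struct folio *src, struct folio *dst)
    {
            if (folio_mapped_readonly_only(src)) {  /* hypothetical check */
                    /*
                     * No writable PTE maps 'src', so a stale TLB entry
                     * can only be used for reads, and those reads still
                     * see the old data in 'src'.  Defer the TLB flush and
                     * keep both folios until the deferred flush is done.
                     */
                    migrc_keep_folio_pair(src, dst);        /* hypothetical */
            } else {
                    /* Writable mappings: flush right away, as today. */
                    flush_tlb_now_for(src);                 /* hypothetical */
            }
    }

    /* Later, once for a whole batch of such migrations: */
    static void migrc_flush_and_free(void)
    {
            flush_tlb_for_whole_batch();    /* one flush covers many folios */
            free_kept_source_folios();      /* only now is freeing safe */
    }
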
Byungchul
> --
> Best Regards,
> Huang, Ying
>
> >
> > Byungchul
> >
> > ---
> >
> > Changes from v3:
> >
> > 1. Drop the kconfig option, CONFIG_MIGRC, and remove the sysctl knob,
> > migrc_enable. (feedback from Nadav)
> > 2. Remove the optimization that skips CPUs which have already
> > performed the needed TLB flushes for any reason when migrc performs
> > its TLB flushes, because I could not measure any performance
> > difference with and without the optimization.
> > (feedback from Nadav)
> > 3. Minimize arch-specific code. While at it, move all the migrc
> > declarations and inline functions from include/linux/mm.h to
> > mm/internal.h. (feedback from Dave Hansen, Nadav)
> > 4. Split out the part that pauses migrc under high memory pressure
> > into a separate patch. (feedback from Nadav)
> > 5. Rename:
> > a. arch_tlbbatch_clean() to arch_tlbbatch_clear(),
> > b. tlb_ubc_nowr to tlb_ubc_ro,
> > c. migrc_try_flush_free_folios() to migrc_flush_free_folios(),
> > d. migrc_stop to migrc_pause.
> > (feedback from Nadav)
> > 6. Use the ->lru list_head instead of introducing a new llist_head.
> > (feedback from Nadav)
> > 7. Use non-atomic page-flag operations where it's safe.
> > (feedback from Nadav)
> > 8. Use the stack instead of keeping a pointer to 'struct migrc_req'
> > in struct task, which is only manipulated locally.
> > (feedback from Nadav)
> > 9. Replace many simple functions with inline functions placed in a
> > header, mm/internal.h. (feedback from Nadav)
> > 10. Add more comments where needed. (feedback from Nadav)
> > 11. Remove many wrapper functions. (feedback from Nadav)
> >
> > Changes from RFC v2:
> >
> > 1. Remove the additional field in struct page. To do that, migrc's
> > list is unioned with the lru field and a new page flag is added. I
> > know adding a page flag is unwelcome, but there is no choice because
> > migrc has to distinguish folios under its control from others. To
> > limit the impact, migrc is restricted to 64-bit systems.
> > 2. Remove the internal object allocator that I had introduced to
> > minimize the impact on the system, since extensive testing showed it
> > made no difference.
> > 3. Stop migrc from working when the system is under high memory
> > pressure, e.g. about to perform direct reclaim. Without this control,
> > I found the system regressed when the swap mechanism was heavily
> > used.
> > 4. Exclude folios with pte_dirty() == true from migrc's scope so that
> > migrc can work more simply.
> > 5. Combine several tightly coupled patches into one.
> > 6. Add sufficient comments for better review.
> > 7. Manage migrc's requests per node (previously globally).
> > 8. Add TLB miss improvement numbers to the commit message.
> > 9. Test with more CPUs (4 -> 16) to see a bigger improvement.
> >
> > Changes from RFC:
> >
> > 1. Fix a bug triggered when a destination folio of a previous
> > migration becomes a source folio of the next migration before it has
> > been handled properly enough to take part in another migration; the
> > folio's state was inconsistent.
> > 2. Split the patch set into more pieces so that folks can review it
> > more easily. (Feedback from Nadav Amit)
> > 3. Fix wrong usage of barriers, e.g. smp_mb__after_atomic().
> > (Feedback from Nadav Amit)
> > 4. Add more comments to explain the patch set better.
> > (Feedback from Nadav Amit)
> >
> > Byungchul Park (3):
> > mm/rmap: Recognize read-only TLB entries during batched TLB flush
> > mm: Defer TLB flush by keeping both src and dst folios at migration
> > mm: Pause migrc mechanism at high memory pressure
> >
> > arch/x86/include/asm/tlbflush.h | 3 +
> > arch/x86/mm/tlb.c | 11 ++
> > include/linux/mm_types.h | 21 +++
> > include/linux/mmzone.h | 9 ++
> > include/linux/page-flags.h | 4 +
> > include/linux/sched.h | 7 +
> > include/trace/events/mmflags.h | 3 +-
> > mm/internal.h | 78 ++++++++++
> > mm/memory.c | 11 ++
> > mm/migrate.c | 266 ++++++++++++++++++++++++++++++++
> > mm/page_alloc.c | 30 +++-
> > mm/rmap.c | 35 ++++-
> > 12 files changed, 475 insertions(+), 3 deletions(-)