From: Byungchul Park <byungchul@sk.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel_team@skhynix.com, akpm@linux-foundation.org,
ying.huang@intel.com, vernhao@tencent.com,
mgorman@techsingularity.net, hughd@google.com,
willy@infradead.org, david@redhat.com, peterz@infradead.org,
luto@kernel.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: Re: [PATCH v10 00/12] LUF(Lazy Unmap Flush) reducing tlb numbers over 90%
Date: Mon, 27 May 2024 10:57:32 +0900
Message-ID: <20240527015732.GA61604@system.software.com>
In-Reply-To: <982317c0-7faa-45f0-82a1-29978c3c9f4d@intel.com>
On Fri, May 24, 2024 at 10:16:39AM -0700, Dave Hansen wrote:
> On 5/9/24 23:51, Byungchul Park wrote:
> > To achieve that:
> >
> > 1. For the folios that map only to non-writable tlb entries, prevent
> > tlb flush during unmapping but perform it just before the folios
> > actually become used, out of buddy or pcp.
>
> Is this just _pure_ unmapping (like MADV_DONTNEED), or does it apply to
> changing the memory map, like munmap() itself?
I think it can be applied to any unmapping of read-only mappings, but
for now LUF works only with unmapping during folio migration and reclaim.
> > 2. When any non-writable ptes change to writable e.g. through fault
> > handler, give up luf mechanism and perform tlb flush required
> > right away.
> >
> > 3. When a writable mapping is created e.g. through mmap(), give up
> > luf mechanism and perform tlb flush required right away.
>
> Let's say you do this:
>
> fd = open("/some/file", O_RDONLY);
> ptr1 = mmap(-1, size, PROT_READ, ..., fd, ...);
> foo1 = *ptr1;
>
> You now have a read-only PTE pointing to the first page of /some/file.
> Let's say try_to_unmap() comes along and decides it can_luf_folio().
> The page gets pulled out of the page cache and freed, the PTE is zeroed.
> But the TLB is never flushed.
>
> Now, someone does:
>
> fd2 = open("/some/other/file", O_RDONLY);
> ptr2 = mmap(ptr1, size, PROT_READ, MAP_FIXED, fd, ...);
> foo2 = *ptr2;
>
> and they overwrite the old VMA. Does foo2 have the contents of the new
> "/some/other/file" or the old "/some/file"? How does the new mmap()
Good point. LUF should have given up at the 2nd mmap() in this case.
I will fix it by introducing a new flag in task_struct indicating whether
LUF has left stale mappings for the task, so that LUF can give up and
flush right away in mmap().
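A minimal userspace sketch of that idea, not the actual patch. Every name
below (struct luf_task, luf_stale, luf_task_unmap(), luf_task_mmap()) is
made up for illustration and does not exist in the kernel; the "flush" is
just a counter so the behavior can be checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; stands in for the relevant bits of task_struct. */
struct luf_task {
	bool luf_stale;		/* hypothetical flag: LUF left stale ro tlb entries */
	unsigned int nr_flush;	/* number of (modelled) tlb flushes performed */
};

/* Stand-in for a real tlb flush; nothing is stale afterwards. */
static void luf_task_flush(struct luf_task *t)
{
	t->nr_flush++;
	t->luf_stale = false;
}

/* Unmap path: LUF skips the flush and records that stale entries may remain. */
static void luf_task_unmap(struct luf_task *t)
{
	t->luf_stale = true;
}

/* mmap() path: give up LUF and flush before the new mapping becomes visible. */
static void luf_task_mmap(struct luf_task *t)
{
	if (t->luf_stale)
		luf_task_flush(t);
}

/* Returns the number of flushes after unmap -> mmap -> mmap. */
static unsigned int luf_task_demo(void)
{
	struct luf_task t = { .luf_stale = false, .nr_flush = 0 };

	luf_task_unmap(&t);	/* flush deferred by LUF */
	luf_task_mmap(&t);	/* stale entries pending: flush now */
	luf_task_mmap(&t);	/* nothing pending: no extra flush */
	return t.nr_flush;
}
```

In this model only the first mmap() after a deferred unmap pays for a
flush, which is the property the flag is meant to give.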
> know that there was something to flush?
>
> BTW, the same thing could happen without a new mmap(). Someone could
> modify the file in the middle, maybe even from another process.
Thank you for pointing that out. I will fix it too by introducing a new
flag in the inode, or something similar, to make LUF aware of whether an
update of the file has been attempted, so that LUF can give up and flush
right away in that case.
In addition, I will add another give-up in the code that changes the
permission of a vma to writable.
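To make the file-update case concrete, here is a small userspace model of
the stale read and of the intended give-up. It is only a sketch under
simplifying assumptions: one pte, one cached translation, and the updated
file contents living in a new page that gets mapped immediately. The names
(struct luf_mm, luf_pending, luf_mm_file_update(), ...) are hypothetical
and not kernel APIs, and the vma permission case is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct luf_page { int data; };

/* Userspace model of one mapping and its cached translation. */
struct luf_mm {
	struct luf_page *pte;	/* current mapping, NULL when unmapped */
	struct luf_page *tlb;	/* cached translation, may be stale */
	bool luf_pending;	/* hypothetical: a flush was deferred by LUF */
};

/* Modelled tlb flush: drop the cached translation. */
static void luf_mm_flush(struct luf_mm *mm)
{
	mm->tlb = NULL;
	mm->luf_pending = false;
}

/* Read through the mapping; a tlb hit never looks at the pte. */
static int luf_mm_read(struct luf_mm *mm)
{
	if (!mm->tlb)
		mm->tlb = mm->pte;	/* tlb miss: walk the page table */
	return mm->tlb ? mm->tlb->data : -1;	/* -1: nothing mapped */
}

/* LUF unmap: the pte is cleared but the flush is deferred. */
static void luf_mm_unmap(struct luf_mm *mm)
{
	mm->pte = NULL;
	mm->luf_pending = true;
}

/* The file changes; with give_up set, LUF flushes before the update. */
static void luf_mm_file_update(struct luf_mm *mm, struct luf_page *newp,
			       bool give_up)
{
	if (give_up && mm->luf_pending)
		luf_mm_flush(mm);
	mm->pte = newp;		/* simplification: refault elided */
}

/* Replays the quoted sequence and returns what foo2 would read. */
static int luf_mm_demo(bool give_up)
{
	struct luf_page oldp = { .data = 1 }, newp = { .data = 2 };
	struct luf_mm mm = { .pte = &oldp, .tlb = NULL, .luf_pending = false };

	(void)luf_mm_read(&mm);			/* foo1 = *ptr1; fills the tlb */
	luf_mm_unmap(&mm);			/* LUF happens here */
	luf_mm_file_update(&mm, &newp, give_up);	/* "/some/file" changes */
	return luf_mm_read(&mm);		/* foo2 = *ptr1; */
}
```

Without the give-up, luf_mm_demo(false) still reads the old page through
the stale translation; with it, luf_mm_demo(true) reads the new contents.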
Thank you very much.
Byungchul
> fd = open("/some/file", O_RDONLY);
> ptr1 = mmap(-1, size, PROT_READ, ..., fd, ...);
> foo1 = *ptr1;
> // LUF happens here
> // "/some/file" changes
> foo2 = *ptr1; // Does this see the change?