From: Byungchul Park <byungchul@sk.com>
To: Nadav Amit <namit@vmware.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
"kernel_team@skhynix.com" <kernel_team@skhynix.com>,
Andrew Morton <akpm@linux-foundation.org>,
"ying.huang@intel.com" <ying.huang@intel.com>,
"xhao@linux.alibaba.com" <xhao@linux.alibaba.com>,
"mgorman@techsingularity.net" <mgorman@techsingularity.net>,
"hughd@google.com" <hughd@google.com>,
"willy@infradead.org" <willy@infradead.org>,
"david@redhat.com" <david@redhat.com>,
"peterz@infradead.org" <peterz@infradead.org>,
Andy Lutomirski <luto@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
"mingo@redhat.com" <mingo@redhat.com>,
"bp@alien8.de" <bp@alien8.de>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
Subject: Re: [v3 2/3] mm: Defer TLB flush by keeping both src and dst folios at migration
Date: Fri, 10 Nov 2023 10:02:01 +0900
Message-ID: <20231110010201.GA72073@system.software.com>
In-Reply-To: <C47A7C40-BE3E-4F0F-B854-D40D4795A236@vmware.com>
On Thu, Nov 09, 2023 at 10:16:57AM +0000, Nadav Amit wrote:
>
>
> > On Nov 8, 2023, at 6:12 AM, Byungchul Park <byungchul@sk.com> wrote:
> >
> > !! External Email
> >
> > On Mon, Oct 30, 2023 at 09:51:30PM +0900, Byungchul Park wrote:
> >>>> diff --git a/mm/memory.c b/mm/memory.c
> >>>> index 6c264d2f969c..75dc48b6e15f 100644
> >>>> --- a/mm/memory.c
> >>>> +++ b/mm/memory.c
> >>>> @@ -3359,6 +3359,19 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >>>> if (vmf->page)
> >>>> folio = page_folio(vmf->page);
> >>>>
> >>>> + /*
> >>>> + * This folio has its read copy to prevent inconsistency while
> >>>> + * deferring TLB flushes. However, the problem might arise if
> >>>> + * it's going to become writable.
> >>>> + *
> >>>> + * To prevent it, give up the deferring TLB flushes and perform
> >>>> + * TLB flush right away.
> >>>> + */
> >>>> + if (folio && migrc_pending_folio(folio)) {
> >>>> + migrc_unpend_folio(folio);
> >>>> + migrc_try_flush_free_folios(NULL);
> >>>
> >>> So many potential function calls… Probably they should have been combined
> >>> into one and at least migrc_pending_folio() should have been an inline
> >>> function in the header.
> >>
> >> I will try to change it as you mention.
> >>
> >>>> + }
> >>>> +
> >>>
> >>> What about mprotect? I thought David has changed it so it can set writable
> >>> PTEs.
> >>
> >> I will check it out.
> >
> > I found mprotect stuff is already performing TLB flushes needed for it.
> > So some redundant TLB flushes might happen by migrc but it's not that
> > harmful I think. Thanks.
>
> Let me explain the scenario I am concerned with. Assume page P is RO, and
> moves from Psrc to Pdst. Pointer “p” points to P. Initially (*p == 0).
>
> Let’s also assume we also have an atomic variable “a”. Initially (a == 0).
>
> I hope I got the migration function names right, but I hope the problem
> itself can be clear regardless.
>
> CPU0 CPU1 CPU2 CPU3
> ---- ---- ---- ----
> (user-mode) (user-mode)
>
> Access *p
> [Psrc cached in TLB]
>
> migrate_pages_batch()
> -> migrate_folio_unmap()
>
> [ PTE updated,
> still no flush ]
>
> mprotect(p,
> RW)
Here,

   mprotect()
      do_mprotect_pkey()
         tlb_finish_mmu()
            tlb_flush_mmu()

I thought the TLB flush for mprotect() is performed by tlb_flush_mmu(),
so any stale TLB entries cached on other CPUs get a chance to be
updated. Please correct me if I have this wrong. Thanks.

Byungchul
>
> [ Psrc is
> RW ]
>
> [ flush
> deferred]
>
>
> *p = 1 # Pdst
>
> xchg(&a, 1)
> mfence
> if (a == 1)
> assert(*p == 1);
>
>
>
> Now at this point the assertion might fail. CPU2 wrote into Pdst, whereas
> CPU1 reads from Psrc. But based on x86 memory model, userspace might not
> expect this scenario to be possible, hence leading to bugs.
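
For reference, the userspace side of the scenario quoted above can be
sketched roughly as below. This is only an illustration under stated
assumptions, not code from the patch: run_litmus(), reader(), writer() and
the "ready" flag are hypothetical names, the reader spins on "a" instead of
checking it once, and page migration itself cannot be triggered from here.
On a kernel that flushes correctly the function is expected to report zero
violations.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile int *p;      /* "p": points into page P */
static atomic_int a;         /* "a": the flag from the scenario */
static atomic_int ready;     /* sketch-only: reader has touched *p */

/* Reader role: touch *p while P is read-only, then wait for the flag. */
static void *reader(void *unused)
{
	(void)unused;
	int before = *p;                 /* translation may now be cached */
	atomic_store(&ready, 1);
	while (atomic_load(&a) != 1)     /* spin instead of a single check */
		sched_yield();
	/* a == 1 was observed, so the store *p = 1 must be visible too. */
	return (before == 0 && *p == 1) ? NULL : (void *)1;
}

/* Writer role: store through the now-writable mapping, then publish. */
static void *writer(void *unused)
{
	(void)unused;
	*p = 1;
	atomic_exchange(&a, 1);          /* xchg(&a, 1) */
	return NULL;
}

/* Returns the number of violations seen, or -1 on a setup error. */
int run_litmus(int iterations)
{
	long page = sysconf(_SC_PAGESIZE);
	int violations = 0;

	assert(page > 0);
	for (int i = 0; i < iterations; i++) {
		pthread_t r, w;
		void *res = NULL;
		/* Page P starts out read-only and zero-filled. */
		void *mem = mmap(NULL, (size_t)page, PROT_READ,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (mem == MAP_FAILED)
			return -1;
		p = mem;
		atomic_store(&a, 0);
		atomic_store(&ready, 0);

		if (pthread_create(&r, NULL, reader, NULL))
			return -1;
		while (!atomic_load(&ready))
			sched_yield();
		/* mprotect(p, RW): error paths are not cleaned up here. */
		if (mprotect(mem, (size_t)page, PROT_READ | PROT_WRITE))
			return -1;
		if (pthread_create(&w, NULL, writer, NULL))
			return -1;
		pthread_join(w, NULL);
		pthread_join(r, &res);
		if (res != NULL)
			violations++;
		munmap(mem, (size_t)page);
	}
	return violations;
}
```

Calling run_litmus(1000) from a small main() and printing the result is
enough to try it. A non-zero count would mean the reader saw a == 1 but
still read the old value through p, which is the outcome the assertion in
the quoted scenario is meant to rule out.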