Re: Bug: Performance regression in 1013af4f585f: mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race


From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>,
	"Uschakow, Stanislav" <suschako@amazon.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"trix@redhat.com" <trix@redhat.com>,
	"ndesaulniers@google.com" <ndesaulniers@google.com>,
	"nathan@kernel.org" <nathan@kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"muchun.song@linux.dev" <muchun.song@linux.dev>,
	"mike.kravetz@oracle.com" <mike.kravetz@oracle.com>,
	"liam.howlett@oracle.com" <liam.howlett@oracle.com>,
	"osalvador@suse.de" <osalvador@suse.de>,
	"vbabka@suse.cz" <vbabka@suse.cz>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: Bug: Performance regression in 1013af4f585f: mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
Date: Wed, 19 Nov 2025 17:29:19 +0100
Message-ID: <968d5458-7d2b-4a8d-a2a6-0931cd87898f@kernel.org>
In-Reply-To: <1d53ef79-c88c-4c5b-af82-1eb22306993b@lucifer.local>

>>
>> So what I am currently looking into is simply reducing (batching) the number
>> of IPIs.
> 
> As in the IPIs we are now generating in tlb_remove_table_sync_one()?
> 
> Or something else?

Yes, for now. I'm essentially reducing the number of 
tlb_remove_table_sync_one() calls.

> 
> This bug is only an issue when we don't use IPIs for pgtable freeing
> (e.g. CONFIG_MMU_GATHER_RCU_TABLE_FREE is set), right? Otherwise
> tlb_remove_table_sync_one() is a no-op.

Right. But it's still confusing: I think for page table unsharing we
always need an IPI one way or the other to make sure any concurrent
GUP-fast has finished.

At least for preventing that anybody would be able to reuse the page 
table in the meantime.

That is either:

(a) The TLB shootdown implied an IPI

(b) We manually send one

But that's where it gets confusing: nowadays x86 also selects 
MMU_GATHER_RCU_TABLE_FREE, meaning we would get a double IPI?

This is so complicated, so I might be missing something.

But it's the same behavior we have in collapse_huge_page() where we first

> 
>>
>> In essence, we only have to send one IPI when unsharing multiple page
>> tables, and we only have to send one when we are the last one sharing the
>> page table (before it can get reused).
> 
> Right, hopefully that significantly cuts down on the amount generated.

I'd assume the problem with the current approach is that when we fork
a child and it exits, we call __unmap_hugepage_range(). If the
range is large enough to cover many PMD tables (multiple gigabytes?), we
essentially send one IPI per PMD table we are unsharing, when we really
only have to send one.

That's the theory ...
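
To put rough numbers on that, here is a minimal userspace sketch, not
kernel code. It assumes x86-64 with 4 KiB base pages, where one PMD table
maps 1 GiB, and it assumes every PMD table in the range is shared and gets
unshared. fake_sync_one() is a hypothetical stand-in that only counts how
often tlb_remove_table_sync_one() would be called; unshare_per_table(),
unshare_batched() and example() are likewise made-up names.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: x86-64, 4 KiB base pages: 512 PMD entries * 2 MiB = 1 GiB. */
#define PMD_TABLE_SPAN (1ULL << 30)

/* Hypothetical stand-in for tlb_remove_table_sync_one(): counts calls only. */
static unsigned long ipi_broadcasts;

static void fake_sync_one(void)
{
	ipi_broadcasts++;
}

/* Behaviour as described above: one sync per PMD table being unshared. */
static unsigned long unshare_per_table(unsigned long long range_bytes)
{
	unsigned long long off;

	ipi_broadcasts = 0;
	for (off = 0; off < range_bytes; off += PMD_TABLE_SPAN)
		fake_sync_one();
	return ipi_broadcasts;
}

/* Batched idea: only note that we unshared something, sync once at the end. */
static unsigned long unshare_batched(unsigned long long range_bytes)
{
	unsigned long long off;
	int unshared_any = 0;

	ipi_broadcasts = 0;
	for (off = 0; off < range_bytes; off += PMD_TABLE_SPAN)
		unshared_any = 1;
	if (unshared_any)
		fake_sync_one();
	return ipi_broadcasts;
}

/* Usage example: an 8 GiB range spans 8 PMD tables under the assumption. */
static void example(void)
{
	unsigned long per_table = unshare_per_table(8ULL << 30);
	unsigned long batched = unshare_batched(8ULL << 30);

	assert(per_table == 8);
	assert(batched == 1);
	printf("per-table: %lu, batched: %lu\n", per_table, batched);
}
```

The sketch only models the counting. It says nothing about when the
single sync has to happen relative to the page table becoming reusable,
which is the part that actually matters for correctness.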

-- 
Cheers

David
