From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Lance Yang <lance.yang@linux.dev>, akpm@linux-foundation.org
Cc: will@kernel.org, aneesh.kumar@kernel.org, npiggin@gmail.com,
peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
hpa@zytor.com, arnd@arndb.de, lorenzo.stoakes@oracle.com,
ziy@nvidia.com, baolin.wang@linux.alibaba.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
dev.jain@arm.com, baohua@kernel.org, ioworker0@gmail.com,
shy828301@gmail.com, riel@surriel.com, jannh@google.com,
linux-arch@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC 1/3] mm/tlb: allow architectures to skip redundant TLB sync IPIs
Date: Thu, 18 Dec 2025 14:08:50 +0100
Message-ID: <f7552c2e-e12c-47c8-9b60-ec645cddf804@kernel.org>
In-Reply-To: <4ff8abad-186a-41b7-a269-70e9b1dc61e5@linux.dev>

On 12/15/25 06:48, Lance Yang wrote:
>
>
> On 2025/12/13 16:00, Lance Yang wrote:
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> When unsharing hugetlb PMD page tables, we currently send two IPIs:
>> one for TLB invalidation, and another to synchronize with concurrent
>> GUP-fast walkers.
>>
>> However, if the TLB flush already reaches all CPUs, the second IPI is
>> redundant. GUP-fast runs with IRQs disabled, so when the TLB flush IPI
>> completes, any concurrent GUP-fast must have finished.
>>
>> Add tlb_table_flush_implies_ipi_broadcast() to let architectures indicate
>> their TLB flush provides full synchronization, enabling the redundant IPI
>> to be skipped.
>>
>> The default implementation returns false to maintain current behavior.
>>
>> Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>> include/asm-generic/tlb.h | 22 +++++++++++++++++++++-
>> 1 file changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
>> index 324a21f53b64..3f0add95604f 100644
>> --- a/include/asm-generic/tlb.h
>> +++ b/include/asm-generic/tlb.h
>> @@ -248,6 +248,21 @@ static inline void tlb_remove_table(struct mmu_gather *tlb, void *table)
>> #define tlb_needs_table_invalidate() (true)
>> #endif
>>
>> +/*
>> + * Architectures can override if their TLB flush already broadcasts IPIs to all
>> + * CPUs when freeing or unsharing page tables.
>> + *
>> + * Return true only when the flush guarantees:
>> + * - IPIs reach all CPUs with potentially stale paging-structure cache entries
>> + * - Synchronization with IRQ-disabled code like GUP-fast
>> + */
>> +#ifndef tlb_table_flush_implies_ipi_broadcast
>> +static inline bool tlb_table_flush_implies_ipi_broadcast(void)
>> +{
>> +	return false;
>> +}
>> +#endif
>
> As the kernel test robot reported[1][2], the compiler is unhappy with
> patch #3:
>
> ```
> mm/khugepaged.c: In function 'collapse_huge_page':
>>>>> mm/khugepaged.c:1185:14: error: implicit declaration of function 'tlb_table_flush_implies_ipi_broadcast' [-Werror=implicit-function-declaration]
> 1185 | if (!tlb_table_flush_implies_ipi_broadcast())
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> cc1: some warnings being treated as errors
> ```
>
> I'll move tlb_table_flush_implies_ipi_broadcast() outside of
> CONFIG_MMU_GATHER_RCU_TABLE_FREE in the next version, making the compiler
> happy on architectures that don't enable that config ;)

Yeah, that's probably cleanest.
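
For illustration only, a rough userspace sketch of the shape this could take:
the default sits outside the CONFIG_MMU_GATHER_RCU_TABLE_FREE block, so a
caller compiles whether or not the option is set. This is not the actual
header or the next version of the patch; needs_sync_ipi() is a made-up caller
loosely modelled on the check quoted from patch #3, and the override
convention (function plus same-named macro) is assumed from the #ifndef in
the quoted hunk.

```c
#include <stdbool.h>

/*
 * Userspace model, not kernel code. Leave CONFIG_MMU_GATHER_RCU_TABLE_FREE
 * undefined to model an architecture that does not enable the option.
 */
#ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
/* RCU table-free helpers would live here; omitted in this sketch. */
#endif

/*
 * Default kept outside the config guard so every caller sees a definition.
 * Assumption: an architecture overrides it by providing the function and a
 * same-named macro before this point.
 */
#ifndef tlb_table_flush_implies_ipi_broadcast
static inline bool tlb_table_flush_implies_ipi_broadcast(void)
{
	return false;
}
#endif

/* Hypothetical caller: the extra sync IPI is needed unless the flush implies it. */
static inline bool needs_sync_ipi(void)
{
	return !tlb_table_flush_implies_ipi_broadcast();
}
```

With no architecture override, the default returns false and the hypothetical
caller still asks for the separate sync IPI, i.e. current behavior is kept.
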
--
Cheers
David