From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Lance Yang <lance.yang@linux.dev>, Peter Zijlstra <peterz@infradead.org>
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
aneesh.kumar@kernel.org, arnd@arndb.de, baohua@kernel.org,
baolin.wang@linux.alibaba.com, boris.ostrovsky@oracle.com,
bp@alien8.de, dave.hansen@intel.com, dave.hansen@linux.intel.com,
dev.jain@arm.com, hpa@zytor.com, hughd@google.com,
ioworker0@gmail.com, jannh@google.com, jgross@suse.com,
kvm@vger.kernel.org, linux-arch@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
lorenzo.stoakes@oracle.com, mingo@redhat.com, npache@redhat.com,
npiggin@gmail.com, pbonzini@redhat.com, riel@surriel.com,
ryan.roberts@arm.com, seanjc@google.com, shy828301@gmail.com,
tglx@linutronix.de, virtualization@lists.linux.dev,
will@kernel.org, x86@kernel.org, ypodemsk@redhat.com,
ziy@nvidia.com
Subject: Re: [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table
Date: Thu, 5 Feb 2026 14:25:04 +0100
Message-ID: <06d48a52-e4ec-47cd-b3fb-0fccd4dc49f4@kernel.org>
In-Reply-To: <d6944cd8-d3b7-4b16-ab52-a61e7dc2221c@linux.dev>

On 2/2/26 16:52, Lance Yang wrote:
>
>
> On 2026/2/2 23:09, Peter Zijlstra wrote:
>> On Mon, Feb 02, 2026 at 10:37:39PM +0800, Lance Yang wrote:
>>>
>>>
>>>
>>> PT_RECLAIM=y does have IPI for unshare/collapse — those paths call
>>> tlb_flush_unshared_tables() (for hugetlb unshare) and collapse_huge_page()
>>> (in khugepaged collapse), which already send IPIs today (broadcast to all
>>> CPUs via tlb_remove_table_sync_one()).
>>>
>>> What PT_RECLAIM=y doesn't need IPI for is table freeing
>>> (__tlb_remove_table_one() uses call_rcu() instead). But table modification
>>> (unshare, collapse) still needs IPI to synchronize with lockless walkers,
>>> regardless of PT_RECLAIM.
>>>
>>> So PT_RECLAIM=y is not broken; it already has IPI where needed. This series
>>> just makes those IPIs targeted instead of broadcast. Does that clarify?
>>
>> Oh bah, reading is hard. I had missed they had more table_sync_one() calls,
>> rather than remove_table_one().
>>
>> So you *can* replace table_sync_one() with rcu_sync(), that will provide
>> the same guarantees. It's just a 'little' bit slower on the update side,
>> but does not incur the read side cost.
>
> Yep, we could replace the IPI with synchronize_rcu() on the sync side:
>
> - Currently: TLB flush → send IPI → wait for walkers to finish
> - With synchronize_rcu(): TLB flush → synchronize_rcu() → wait for grace
>   period
>
> Lockless walkers (e.g. GUP-fast) use local_irq_disable(); synchronize_rcu()
> also waits for regions with preemption/interrupts disabled, so it should
> work, IIUC.
>
> And then, the trade-off would be:
> - Read side: zero cost (no per-CPU tracking)
> - Write side: wait for RCU grace period (potentially slower)
>
> For collapse/unshare, that write-side latency might be acceptable :)
>
> @David, what do you think?

Given that we just fixed the write-side latency from breaking Oracle's
databases completely, we have to be a bit careful here :)

The thing is: on many x86 configs we don't need *any* TLB flushes or RCU
syncs.

So "how much slower" are we talking about, especially on bigger/loaded
systems?

--
Cheers,
David