From: Lance Yang <lance.yang@linux.dev>
To: Peter Zijlstra <peterz@infradead.org>, david@kernel.org
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	aneesh.kumar@kernel.org, arnd@arndb.de, baohua@kernel.org,
	baolin.wang@linux.alibaba.com, boris.ostrovsky@oracle.com,
	bp@alien8.de, dave.hansen@intel.com, dave.hansen@linux.intel.com,
	dev.jain@arm.com, hpa@zytor.com, hughd@google.com,
	ioworker0@gmail.com, jannh@google.com, jgross@suse.com,
	kvm@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	lorenzo.stoakes@oracle.com, mingo@redhat.com, npache@redhat.com,
	npiggin@gmail.com, pbonzini@redhat.com, riel@surriel.com,
	ryan.roberts@arm.com, seanjc@google.com, shy828301@gmail.com,
	tglx@linutronix.de, virtualization@lists.linux.dev,
	will@kernel.org, x86@kernel.org, ypodemsk@redhat.com,
	ziy@nvidia.com
Subject: Re: [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table
Date: Mon, 2 Feb 2026 23:52:31 +0800
Message-ID: <d6944cd8-d3b7-4b16-ab52-a61e7dc2221c@linux.dev>
In-Reply-To: <20260202150957.GD1282955@noisy.programming.kicks-ass.net>



On 2026/2/2 23:09, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 10:37:39PM +0800, Lance Yang wrote:
>>
>>
>> On 2026/2/2 21:37, Peter Zijlstra wrote:
>>> On Mon, Feb 02, 2026 at 09:07:10PM +0800, Lance Yang wrote:
>>>
>>>>>> Right, but if we can use full RCU for PT_RECLAIM, why can't we do so
>>>>>> unconditionally and not add overhead?
>>>>>
>>>>> The sync (IPI) is mainly needed for unshare (e.g. hugetlb) and collapse
>>>>> (khugepaged) paths, regardless of whether table free uses RCU, IIUC.
>>>>
>>>> In addition: We need the sync when we modify page tables (e.g. unshare,
>>>> collapse), not only when we free them. RCU can defer freeing but does
>>>> not prevent lockless walkers from seeing concurrent in-place
>>>> modifications, so we need the IPI to synchronize with those walkers
>>>> first.
>>>
>>> Currently PT_RECLAIM=y has no IPI; are you saying that is broken? If
>>> not, then why do we need this at all?
>>
>> PT_RECLAIM=y does have IPI for unshare/collapse — those paths call
>> tlb_flush_unshared_tables() (for hugetlb unshare) and collapse_huge_page()
>> (in khugepaged collapse), which already send IPIs today (broadcast to all
>> CPUs via tlb_remove_table_sync_one()).
>>
>> What PT_RECLAIM=y doesn't need IPI for is table freeing (
>> __tlb_remove_table_one() uses call_rcu() instead). But table modification
>> (unshare, collapse) still needs IPI to synchronize with lockless walkers,
>> regardless of PT_RECLAIM.
>>
>> So PT_RECLAIM=y is not broken; it already has IPI where needed. This series
>> just makes those IPIs targeted instead of broadcast. Does that clarify?
> 
> Oh bah, reading is hard. I had missed they had more table_sync_one() calls,
> rather than remove_table_one().
> 
> So you *can* replace table_sync_one() with rcu_sync(), that will provide
> the same guarantees. Its just a 'little' bit slower on the update side,
> but does not incur the read side cost.

Yep, we could replace the IPI with synchronize_rcu() on the sync side:

- Currently: TLB flush → send IPI → wait for walkers to finish
- With synchronize_rcu(): TLB flush → synchronize_rcu() → wait for grace period

Lockless walkers (e.g. GUP-fast) use local_irq_disable();
synchronize_rcu() also waits for regions with preemption/interrupts
disabled, so it should work, IIUC.

And then, the trade-off would be:
- Read side: zero cost (no per-CPU tracking)
- Write side: wait for RCU grace period (potentially slower)

For collapse/unshare, that write-side latency might be acceptable :)
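
To make the trade-off a bit more concrete, here is a rough userspace toy
model. It is not kernel code and not part of this series: every toy_*
name is hypothetical, and TOY_GP_OVERHEAD is an arbitrary stand-in for
the extra latency of a grace period. It only counts who gets interrupted
and how long the updater waits under each scheme.

```c
#include <assert.h>

#define TOY_NR_CPUS      4
#define TOY_GP_OVERHEAD  10	/* arbitrary: extra ticks a grace period costs */

/* One simulated CPU (hypothetical model, not a kernel structure). */
struct toy_cpu {
	unsigned int walk_left;	/* ticks left in an IRQ-off lockless walk */
	unsigned int ipis;	/* sync IPIs this CPU had to handle */
};

struct toy_sync_result {
	unsigned int writer_ticks;	/* how long the updater waited */
	unsigned int ipis_sent;		/* how many CPUs were interrupted */
};

/* Ticks until every CPU other than @self has left its walk. */
static unsigned int toy_ticks_until_walkers_done(const struct toy_cpu *cpus,
						 int self)
{
	unsigned int max = 0;

	assert(self >= 0 && self < TOY_NR_CPUS);
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		if (i != self && cpus[i].walk_left > max)
			max = cpus[i].walk_left;
	}
	return max;
}

/* Broadcast-IPI model: interrupt every other CPU, wait for all handlers. */
static struct toy_sync_result toy_sync_ipi(struct toy_cpu *cpus, int self)
{
	struct toy_sync_result res = { 0, 0 };

	for (int i = 0; i < TOY_NR_CPUS; i++) {
		if (i == self)
			continue;
		cpus[i].ipis++;		/* read side pays, walking or not */
		res.ipis_sent++;
	}
	/* An IRQ-off walker only takes the IPI once it re-enables IRQs. */
	res.writer_ticks = toy_ticks_until_walkers_done(cpus, self);
	return res;
}

/* Grace-period model: nobody is interrupted, but the updater waits longer. */
static struct toy_sync_result toy_sync_rcu(const struct toy_cpu *cpus, int self)
{
	struct toy_sync_result res = { 0, 0 };

	res.writer_ticks = toy_ticks_until_walkers_done(cpus, self) +
			   TOY_GP_OVERHEAD;
	return res;
}
```

With CPUs 1 and 3 in the middle of a walk, the IPI variant disturbs all
three remote CPUs while the grace-period variant disturbs none and
instead makes the updater wait TOY_GP_OVERHEAD ticks longer.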

@David, what do you think?

> 
> I really think anything here needs to better explain the various
> requirements. Because now everybody gets to pay the price for hugetlb
> shared crud, while 'nobody' will actually use that.

Right. If we go with synchronize_rcu(), the read-side cost goes away ...

Thanks,
Lance
