From: "Andy Lutomirski" <luto@kernel.org>
To: "Linus Torvalds" <torvalds@linux-foundation.org>,
"Nadav Amit" <nadav.amit@gmail.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Anton Blanchard" <anton@ozlabs.org>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
"Paul Mackerras" <paulus@ozlabs.org>,
"Randy Dunlap" <rdunlap@infradead.org>,
linux-arch <linux-arch@vger.kernel.org>,
"the arch/x86 maintainers" <x86@kernel.org>,
"Rik van Riel" <riel@surriel.com>,
"Dave Hansen" <dave.hansen@intel.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
Subject: Re: [PATCH 16/23] sched: Use lightweight hazard pointers to grab lazy mms
Date: Sun, 09 Jan 2022 11:52:52 -0800 [thread overview]
Message-ID: <355c148c-06a8-4e15-a77b-0ea2e22bf708@www.fastmail.com> (raw)
In-Reply-To: <CAHk-=wgtS7aNL=bxuwVFKiUzijc1EFiFXWTNLH-CHEgxciUVdg@mail.gmail.com>
On Sun, Jan 9, 2022, at 11:10 AM, Linus Torvalds wrote:
> On Sun, Jan 9, 2022 at 12:49 AM Nadav Amit <nadav.amit@gmail.com> wrote:
>>
>> I do not know whether it is a pure win, but there is a tradeoff.
>
> Hmm. I guess only some serious testing would tell.
>
> On x86, I'd be a bit worried about removing lazy TLB simply because
> even with ASID support there (called PCIDs by Intel for NIH reasons),
> the actual ASID space on x86 was at least originally very very
> limited.
>
> Architecturally, x86 may expose 12 bits of ASID space, but iirc at
> least the first few implementations actually only internally had one
> or two bits, and hashed the 12 bits down to that internal very limited
> hardware TLB ID space.
>
> We only use a handful of ASIDs per CPU on x86 partly for this reason
> (but also since there's no remote hardware TLB shootdown, there's no
> reason to have a bigger global ASID space, so ASIDs aren't _that_
> common).
>
> And I don't know how many non-PCID x86 systems (perhaps virtualized?)
> there might be out there.
>
> But it would be very interesting to test some "disable lazy tlb"
> patch. The main problem workloads tend to be IO, and I'm not sure how
> many of the automated performance tests would catch issues. I guess
> some threaded pipe ping-pong test (with each thread pinned to
> different cores) would show it.
My original PCID series actually did remove lazy TLB on x86. I don't remember why, but people objected. The issue isn't the limited PCID space -- IIRC it's just that MOV CR3 is slooooow. If we get rid of lazy TLB on x86, then we are writing CR3 twice on even a very short idle. That adds maybe 1k cycles, which isn't great.
>
> And I guess there is some load that triggered the original powerpc
> patch by Nick&co, and that Andy has been using..
I don't own a big enough machine. The workloads I'm aware of with the problem have massively multithreaded programs using many CPUs, and transitions into and out of lazy mode ping-pong the cacheline.
>
> Anybody willing to cook up a patch and run some benchmarks? Perhaps
> one that basically just replaces "set ->mm to NULL" with "set ->mm to
> &init_mm" - so that the lazy TLB code is still *there*, but it never
> triggers..
It would
>
> I think it's mainly 'copy_thread()' in kernel/fork.c and the 'init_mm'
> initializer in mm/init-mm.c, but there's probably other things too
> that have that knowledge of the special "tsk->mm = NULL" situation.
I think, for a little test, we would leave all the mm == NULL code alone and just change the enter-lazy logic. On top of all the cleanups in this series, that would be trivial.
>
> Linus
next prev parent reply other threads:[~2022-01-09 19:53 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-08 16:43 [PATCH 00/23] mm, sched: Rework lazy mm handling Andy Lutomirski
2022-01-08 16:43 ` [PATCH 01/23] membarrier: Document why membarrier() works Andy Lutomirski
2022-01-12 15:30 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 02/23] x86/mm: Handle unlazying membarrier core sync in the arch code Andy Lutomirski
2022-01-12 15:40 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 03/23] membarrier: Remove membarrier_arch_switch_mm() prototype in core code Andy Lutomirski
2022-01-08 16:43 ` [PATCH 04/23] membarrier: Make the post-switch-mm barrier explicit Andy Lutomirski
2022-01-12 15:52 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 05/23] membarrier, kthread: Use _ONCE accessors for task->mm Andy Lutomirski
2022-01-12 15:55 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 06/23] powerpc/membarrier: Remove special barrier on mm switch Andy Lutomirski
2022-01-10 8:42 ` Christophe Leroy
2022-01-12 15:57 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 07/23] membarrier: Rewrite sync_core_before_usermode() and improve documentation Andy Lutomirski
2022-01-12 16:11 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 08/23] membarrier: Remove redundant clear of mm->membarrier_state in exec_mmap() Andy Lutomirski
2022-01-12 16:13 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 09/23] membarrier: Fix incorrect barrier positions during exec and kthread_use_mm() Andy Lutomirski
2022-01-12 16:30 ` Mathieu Desnoyers
2022-01-12 17:08 ` Mathieu Desnoyers
2022-01-08 16:43 ` [PATCH 10/23] x86/events, x86/insn-eval: Remove incorrect active_mm references Andy Lutomirski
2022-01-08 16:43 ` [PATCH 11/23] sched/scs: Initialize shadow stack on idle thread bringup, not shutdown Andy Lutomirski
2022-01-10 22:06 ` Sami Tolvanen
2022-01-08 16:43 ` [PATCH 12/23] Rework "sched/core: Fix illegal RCU from offline CPUs" Andy Lutomirski
2022-01-08 16:43 ` [PATCH 13/23] exec: Remove unnecessary vmacache_seqnum clear in exec_mmap() Andy Lutomirski
2022-01-08 16:43 ` [PATCH 14/23] sched, exec: Factor current mm changes out from exec Andy Lutomirski
2022-01-08 16:44 ` [PATCH 15/23] kthread: Switch to __change_current_mm() Andy Lutomirski
2022-01-08 16:44 ` [PATCH 16/23] sched: Use lightweight hazard pointers to grab lazy mms Andy Lutomirski
2022-01-08 19:22 ` Linus Torvalds
2022-01-08 22:04 ` Andy Lutomirski
2022-01-09 0:53 ` Linus Torvalds
2022-01-09 3:58 ` Andy Lutomirski
2022-01-09 4:38 ` Linus Torvalds
2022-01-09 20:19 ` Andy Lutomirski
2022-01-09 20:48 ` Linus Torvalds
2022-01-09 21:51 ` Linus Torvalds
2022-01-10 0:52 ` Andy Lutomirski
2022-01-10 2:36 ` Rik van Riel
2022-01-10 3:51 ` Linus Torvalds
2022-01-10 4:56 ` Nicholas Piggin
2022-01-10 5:17 ` Nicholas Piggin
2022-01-10 17:19 ` Linus Torvalds
2022-01-11 2:24 ` Nicholas Piggin
2022-01-10 20:52 ` Andy Lutomirski
2022-01-11 3:10 ` Nicholas Piggin
2022-01-11 15:39 ` Andy Lutomirski
2022-01-11 22:48 ` Nicholas Piggin
2022-01-12 0:42 ` Nicholas Piggin
2022-01-11 10:39 ` Will Deacon
2022-01-11 15:22 ` Andy Lutomirski
2022-01-09 5:56 ` Nadav Amit
2022-01-09 6:48 ` Linus Torvalds
2022-01-09 8:49 ` Nadav Amit
2022-01-09 19:10 ` Linus Torvalds
2022-01-09 19:52 ` Andy Lutomirski [this message]
2022-01-09 20:00 ` Linus Torvalds
2022-01-09 20:34 ` Nadav Amit
2022-01-09 20:48 ` Andy Lutomirski
2022-01-09 19:22 ` Rik van Riel
2022-01-09 19:34 ` Nadav Amit
2022-01-09 19:37 ` Rik van Riel
2022-01-09 19:51 ` Nadav Amit
2022-01-09 19:54 ` Linus Torvalds
2022-01-08 16:44 ` [PATCH 17/23] x86/mm: Make use/unuse_temporary_mm() non-static Andy Lutomirski
2022-01-08 16:44 ` [PATCH 18/23] x86/mm: Allow temporary mms when IRQs are on Andy Lutomirski
2022-01-08 16:44 ` [PATCH 19/23] x86/efi: Make efi_enter/leave_mm use the temporary_mm machinery Andy Lutomirski
2022-01-10 13:13 ` Ard Biesheuvel
2022-01-08 16:44 ` [PATCH 20/23] x86/mm: Remove leave_mm() in favor of unlazy_mm_irqs_off() Andy Lutomirski
2022-01-08 16:44 ` [PATCH 21/23] x86/mm: Use unlazy_mm_irqs_off() in TLB flush IPIs Andy Lutomirski
2022-01-08 16:44 ` [PATCH 22/23] x86/mm: Optimize for_each_possible_lazymm_cpu() Andy Lutomirski
2022-01-08 16:44 ` [PATCH 23/23] x86/mm: Opt in to IRQs-off activate_mm() Andy Lutomirski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=355c148c-06a8-4e15-a77b-0ea2e22bf708@www.fastmail.com \
--to=luto@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=anton@ozlabs.org \
--cc=benh@kernel.crashing.org \
--cc=dave.hansen@intel.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=nadav.amit@gmail.com \
--cc=npiggin@gmail.com \
--cc=paulus@ozlabs.org \
--cc=peterz@infradead.org \
--cc=rdunlap@infradead.org \
--cc=riel@surriel.com \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).