From: Rik van Riel <riel@surriel.com>
To: linux-kernel@vger.kernel.org
Cc: dave.hansen@linux.intel.com, luto@kernel.org,
peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, x86@kernel.org, kernel-team@meta.com,
hpa@zytor.com, Rik van Riel <riel@surriel.com>
Subject: [PATCH 1/3] x86,tlb: update mm_cpumask lazily
Date: Fri, 8 Nov 2024 19:27:48 -0500
Message-ID: <20241109003727.3958374-2-riel@surriel.com>
In-Reply-To: <20241109003727.3958374-1-riel@surriel.com>

On busy multi-threaded workloads, there can be significant contention
on the mm_cpumask at context switch time.

Reduce that contention by updating mm_cpumask lazily, setting the CPU bit
at context switch time (if not already set), and clearing the CPU bit at
the first TLB flush sent to a CPU where the process isn't running.

When a flurry of TLB flushes for a process happens, only the first one
will be sent to CPUs where the process isn't running. The others will
only be sent to CPUs where the process is currently running.

On an AMD Milan system with 36 cores, there is a noticeable difference:

$ hackbench --groups 20 --loops 10000

Before: ~4.5s +/- 0.1s
After:  ~4.2s +/- 0.1s

Signed-off-by: Rik van Riel <riel@surriel.com>
---
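
Illustration only, not part of the patch: the lazy scheme described in
the changelog can be modelled in userspace roughly as below. This is a
minimal sketch under stated assumptions. The names struct mm_model,
loaded_mm[], mask_test() and the model_*() helpers are hypothetical
stand-ins, a single atomic word replaces the real cpumask, NULL plays
the role of init_mm, and lazy TLB mode and tlb_gen are ignored.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8	/* hypothetical; small enough to fit in one word */

/* Hypothetical stand-in for an mm_struct and its mm_cpumask. */
struct mm_model {
	atomic_ulong cpumask;
};

/* Per-CPU "currently loaded mm"; NULL stands in for init_mm. */
static struct mm_model *loaded_mm[NR_CPUS];

static bool mask_test(struct mm_model *mm, int cpu)
{
	return atomic_load(&mm->cpumask) & (1UL << cpu);
}

/*
 * Context switch: the previous mm's bit is left untouched, and the
 * next mm's bit is only written when it is not already set, so the
 * common case is a read of the shared word rather than an atomic RMW.
 */
static void model_switch_mm(int cpu, struct mm_model *next)
{
	if (next && !mask_test(next, cpu))
		atomic_fetch_or(&next->cpumask, 1UL << cpu);
	loaded_mm[cpu] = next;
}

/*
 * Flush handler on the target CPU: if the mm is no longer loaded
 * here, clear the stale bit and skip the flush. Returns true when a
 * flush was actually needed.
 */
static bool model_flush_tlb_func(int cpu, struct mm_model *mm)
{
	if (loaded_mm[cpu] != mm) {
		atomic_fetch_and(&mm->cpumask, ~(1UL << cpu));
		return false;
	}
	return true;
}

/* Deliver a flush to every CPU in the mask; returns how many were targeted. */
static int model_flush_tlb_mm(struct mm_model *mm)
{
	unsigned long mask = atomic_load(&mm->cpumask);
	int sent = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (mask & (1UL << cpu)) {
			model_flush_tlb_func(cpu, mm);
			sent++;
		}
	}
	return sent;
}
```

In this model, if CPUs 0 and 1 both run the mm and CPU 1 then switches
away, the first model_flush_tlb_mm() call still targets both CPUs and
clears CPU 1's stale bit; subsequent calls target CPU 0 only.
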
 arch/x86/mm/tlb.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86593d1b787d..f19f6378cabf 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -606,18 +606,15 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		cond_mitigation(tsk);
 
 		/*
-		 * Stop remote flushes for the previous mm.
-		 * Skip kernel threads; we never send init_mm TLB flushing IPIs,
-		 * but the bitmap manipulation can cause cache line contention.
+		 * Leave this CPU in prev's mm_cpumask. Atomic writes to
+		 * mm_cpumask can be expensive under contention. The CPU
+		 * will be removed lazily at TLB flush time.
 		 */
-		if (prev != &init_mm) {
-			VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
-						mm_cpumask(prev)));
-			cpumask_clear_cpu(cpu, mm_cpumask(prev));
-		}
+		VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
+				mm_cpumask(prev)));
 
 		/* Start receiving IPIs and then read tlb_gen (and LAM below) */
-		if (next != &init_mm)
+		if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 
@@ -761,8 +758,10 @@ static void flush_tlb_func(void *info)
 		count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
 
 		/* Can only happen on remote CPUs */
-		if (f->mm && f->mm != loaded_mm)
+		if (f->mm && f->mm != loaded_mm) {
+			cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(f->mm));
 			return;
+		}
 	}
 
 	if (unlikely(loaded_mm == &init_mm))
-- 
2.45.2

Thread overview: 18+ messages
2024-11-09 0:27 [PATCh 0/3] x86,tlb: context switch optimizations Rik van Riel
2024-11-09 0:27 ` Rik van Riel [this message]
2024-11-13 2:59 ` [tip: x86/mm] x86/mm/tlb: Update mm_cpumask lazily tip-bot2 for Rik van Riel
2024-11-09 0:27 ` [PATCH 2/3] x86,tlb: add tracepoint for TLB flush IPI to stale CPU Rik van Riel
2024-11-13 2:59 ` [tip: x86/mm] x86/mm/tlb: Add " tip-bot2 for Rik van Riel
2024-11-09 0:27 ` [PATCH 3/3] x86,tlb: put cpumask_test_cpu in prev == next under CONFIG_DEBUG_VM Rik van Riel
2024-11-13 2:59 ` [tip: x86/mm] x86/mm/tlb: Put cpumask_test_cpu() check in switch_mm_irqs_off() " tip-bot2 for Rik van Riel
2024-11-13 9:55 ` [PATCh 0/3] x86,tlb: context switch optimizations Borislav Petkov
2024-11-13 10:00 ` Ingo Molnar
2024-11-13 14:38 ` Rik van Riel
2024-11-14 11:33 ` Peter Zijlstra
2024-11-13 14:55 ` Rik van Riel
2024-11-14 9:52 ` Ingo Molnar
2024-11-14 11:36 ` Peter Zijlstra
2024-11-14 14:27 ` Rik van Riel
2024-11-14 14:40 ` Peter Zijlstra
2024-11-14 11:36 ` Peter Zijlstra
2024-11-14 11:43 ` Peter Zijlstra