Re: [PATCH v11 06/12] x86/mm: use INVLPGB for kernel TLB flushes


From: Dave Hansen <dave.hansen@intel.com>
To: Rik van Riel <riel@surriel.com>, Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de,
	dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
	nadav.amit@gmail.com, thomas.lendacky@amd.com,
	kernel-team@meta.com, linux-mm@kvack.org,
	akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
	mhklinux@outlook.com, andrew.cooper3@citrix.com,
	Manali Shukla <Manali.Shukla@amd.com>
Subject: Re: [PATCH v11 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Date: Tue, 18 Feb 2025 14:27:31 -0800
Message-ID: <5f4d58fe-4fa1-4b59-81a7-e8c8d3030d0a@intel.com>
In-Reply-To: <724d17ce3fbe07d1d9404f8f32ba518071bcfa4a.camel@surriel.com>

On 2/18/25 10:00, Rik van Riel wrote:
> On Sat, 2025-02-15 at 02:08 +0000, Yosry Ahmed wrote:
>> So I think what Dave wants (and I agree) is:
>> 	if (!broadcast_kernel_range_flush(info))
>> 		ipi_kernel_range_flush(info)
>>
>> Where ipi_kernel_range_flush() contains old_thing1() and oldthing2().

That's OK-ish. But it still smells of hacking in the new concept without
refactoring things properly.

Let's logically inline the code that we've got.  I think it actually
ends up looking something like this:

	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
		if (info->end == TLB_FLUSH_ALL) {
			invlpgb_flush_all();
		} else {
			for_each(addr)
				invlpgb_flush_addr_nosync(addr, nr);
		}
	} else {
		if (info->end == TLB_FLUSH_ALL)
			on_each_cpu(do_flush_tlb_all, NULL, 1);
		else
			on_each_cpu(do_kernel_range_flush, info, 1);
	}

Where we've got two inputs:

	1. INVLPGB support (or not)
	2. TLB_FLUSH_ALL (basically ranged or full flush)

So I think we should group by *one* of those. The above groups by
INVLPGB support and this groups by TLB_FLUSH_ALL:

	if (info->end == TLB_FLUSH_ALL) {
		if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
			invlpgb_flush_all();
		} else {
			on_each_cpu(do_flush_tlb_all, NULL, 1);
		}
	} else {
		if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
			for_each(addr)
				invlpgb_flush_addr_nosync(addr, nr);
		else
			on_each_cpu(do_kernel_range_flush, info, 1);
	}
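
The two groupings above should pick the same flush path for every
combination of the two inputs. A minimal userspace sketch of that
check follows. The enum labels and the pick_by_*() and
count_mismatches() names are hypothetical and only stand in for the
real calls; nothing here is kernel code.

```c
#include <stdbool.h>

/* Hypothetical labels for the four flush paths named in the mail. */
enum flush_action {
	INVLPGB_ALL,	/* stands in for invlpgb_flush_all() */
	INVLPGB_RANGE,	/* stands in for the invlpgb_flush_addr_nosync() loop */
	IPI_ALL,	/* stands in for on_each_cpu(do_flush_tlb_all, ...) */
	IPI_RANGE,	/* stands in for on_each_cpu(do_kernel_range_flush, ...) */
};

/* Outer branch on INVLPGB support, inner branch on full vs. ranged. */
static enum flush_action pick_by_feature(bool has_invlpgb, bool flush_all)
{
	if (has_invlpgb) {
		if (flush_all)
			return INVLPGB_ALL;
		else
			return INVLPGB_RANGE;
	} else {
		if (flush_all)
			return IPI_ALL;
		else
			return IPI_RANGE;
	}
}

/* Outer branch on full vs. ranged, inner branch on INVLPGB support. */
static enum flush_action pick_by_scope(bool has_invlpgb, bool flush_all)
{
	if (flush_all) {
		if (has_invlpgb)
			return INVLPGB_ALL;
		else
			return IPI_ALL;
	} else {
		if (has_invlpgb)
			return INVLPGB_RANGE;
		else
			return IPI_RANGE;
	}
}

/* Count the input combinations on which the two groupings disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int f = 0; f <= 1; f++)
		for (int a = 0; a <= 1; a++)
			if (pick_by_feature(f != 0, a != 0) !=
			    pick_by_scope(f != 0, a != 0))
				mismatches++;

	return mismatches;
}
```

Since the behavior is the same either way, the choice between the two
groupings is purely about which one reads better.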

So, if we create some helpers that give some consistent naming:

static void tlb_flush_all_ipi(...)
{
	on_each_cpu(do_flush_tlb_all, NULL, 1);
}

static void tlb_flush_all(...)
{
	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
		invlpgb_flush_all(...);
	else
		tlb_flush_all_ipi(...);
}

and then also create the ranged equivalents, where tlb_flush_range()
internally has the same cpu_feature_enabled() check and picks between:

	tlb_flush_range_ipi(...)
	invlpgb_flush_range(...)

Then we can have the top-level code be:

	if (info->end == TLB_FLUSH_ALL)
		tlb_flush_all(info);
	else
		tlb_flush_range(info);

That actually looks way nicer than what we have today. For bonus points,
if a third way of flushing the TLB showed up, it would slot right in:

 static void tlb_flush_all(...)
 {
	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
		invlpgb_flush_all(...);
+	else if (cpu_feature_enabled(X86_FEATURE_RAR))
+		rar_flush_all(...);
	else
		tlb_flush_all_ipi(...);
 }

That's *exactly* the way we want the code to read. At the higher level,
it's deciding based on the generic thing that *everybody* cares about:
ranged or full flush. Then, at the lower level, it's deciding how to
implement that high-level flush concept.
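
As a rough illustration of that layering, here is a userspace mock,
not the actual patch. The have_invlpgb and have_rar flags stand in for
cpu_feature_enabled(), struct flush_info is a trimmed-down stand-in
for the real flush info structure, and enum backend, last_backend and
kernel_tlb_flush() are hypothetical names that exist only so the
chosen path can be observed. RAR is wired into the full-flush helper
only, matching the snippet above.

```c
#include <stdbool.h>

#define TLB_FLUSH_ALL	(~0UL)

/* Trimmed-down stand-in for the kernel's flush info structure. */
struct flush_info {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical record of which low-level backend was used. */
enum backend {
	BACKEND_NONE,
	BACKEND_INVLPGB_ALL,
	BACKEND_INVLPGB_RANGE,
	BACKEND_RAR_ALL,
	BACKEND_IPI_ALL,
	BACKEND_IPI_RANGE,
};

/* Stand-ins for cpu_feature_enabled(X86_FEATURE_...). */
static bool have_invlpgb;
static bool have_rar;

static enum backend last_backend = BACKEND_NONE;

/* Mock backends: each one only records that it ran. */
static void invlpgb_flush_all(void)   { last_backend = BACKEND_INVLPGB_ALL; }
static void rar_flush_all(void)       { last_backend = BACKEND_RAR_ALL; }
static void tlb_flush_all_ipi(void)   { last_backend = BACKEND_IPI_ALL; }

static void invlpgb_flush_range(const struct flush_info *info)
{
	(void)info;
	last_backend = BACKEND_INVLPGB_RANGE;
}

static void tlb_flush_range_ipi(const struct flush_info *info)
{
	(void)info;
	last_backend = BACKEND_IPI_RANGE;
}

/* Lower level: decide how to implement a full flush. */
static void tlb_flush_all(const struct flush_info *info)
{
	(void)info;
	if (have_invlpgb)
		invlpgb_flush_all();
	else if (have_rar)
		rar_flush_all();
	else
		tlb_flush_all_ipi();
}

/* Lower level: decide how to implement a ranged flush. */
static void tlb_flush_range(const struct flush_info *info)
{
	if (have_invlpgb)
		invlpgb_flush_range(info);
	else
		tlb_flush_range_ipi(info);
}

/* Higher level: decide only between a full and a ranged flush. */
static enum backend kernel_tlb_flush(const struct flush_info *info)
{
	if (info->end == TLB_FLUSH_ALL)
		tlb_flush_all(info);
	else
		tlb_flush_range(info);

	return last_backend;
}
```

In this mock the top-level function never looks at a feature flag, so
adding a backend touches only the lower-level helpers.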


Thread overview: 51+ messages
2025-02-13 16:13 [PATCH v11 00/12] AMD broadcast TLB invalidation Rik van Riel
2025-02-13 16:13 ` [PATCH v11 01/12] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional Rik van Riel
2025-02-21 15:30   ` [tip: x86/mm] x86/mm: Make " tip-bot2 for Rik van Riel
2025-02-13 16:13 ` [PATCH v11 02/12] x86/mm: remove pv_ops.mmu.tlb_remove_table call Rik van Riel
2025-02-21 15:30   ` [tip: x86/mm] x86/mm: Remove " tip-bot2 for Rik van Riel
2025-02-13 16:13 ` [PATCH v11 03/12] x86/mm: consolidate full flush threshold decision Rik van Riel
2025-02-14 18:07   ` Dave Hansen
2025-02-19 11:21   ` Borislav Petkov
2025-02-13 16:13 ` [PATCH v11 04/12] x86/mm: get INVLPGB count max from CPUID Rik van Riel
2025-02-14 18:16   ` Dave Hansen
2025-02-19 11:56   ` Borislav Petkov
2025-02-19 17:52     ` Rik van Riel
2025-02-19 18:23       ` Borislav Petkov
2025-02-19 19:26       ` Dave Hansen
2025-02-13 16:13 ` [PATCH v11 05/12] x86/mm: add INVLPGB support code Rik van Riel
2025-02-14 18:22   ` Dave Hansen
2025-02-18 17:23     ` Rik van Riel
2025-02-19 12:04   ` Borislav Petkov
2025-02-19 17:42     ` Rik van Riel
2025-02-19 19:01       ` Dave Hansen
2025-02-19 19:15         ` Borislav Petkov
2025-02-20  2:49           ` Rik van Riel
2025-02-20 10:23             ` Borislav Petkov
2025-02-13 16:13 ` [PATCH v11 06/12] x86/mm: use INVLPGB for kernel TLB flushes Rik van Riel
2025-02-14 18:35   ` Dave Hansen
2025-02-14 19:40     ` Peter Zijlstra
2025-02-14 19:55       ` Dave Hansen
2025-02-15  1:25         ` Rik van Riel
2025-02-15  2:08           ` Yosry Ahmed
2025-02-18 18:00             ` Rik van Riel
2025-02-18 22:27               ` Dave Hansen [this message]
2025-02-19  1:46                 ` Yosry Ahmed
2025-02-13 16:13 ` [PATCH v11 07/12] x86/mm: use INVLPGB in flush_tlb_all Rik van Riel
2025-02-14 18:57   ` Dave Hansen
2025-02-13 16:13 ` [PATCH v11 08/12] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing Rik van Riel
2025-02-14 18:51   ` Dave Hansen
2025-02-18 19:31     ` Rik van Riel
2025-02-18 19:46       ` Dave Hansen
2025-02-18 20:06         ` Rik van Riel
2025-02-13 16:14 ` [PATCH v11 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes Rik van Riel
2025-02-14 19:53   ` Dave Hansen
2025-02-17 13:22     ` Brendan Jackman
2025-02-20 15:25     ` Rik van Riel
2025-02-13 16:14 ` [PATCH v11 10/12] x86/mm: do targeted broadcast flushing from tlbbatch code Rik van Riel
2025-02-13 16:14 ` [PATCH v11 11/12] x86/mm: enable AMD translation cache extensions Rik van Riel
2025-02-13 16:14 ` [PATCH v11 12/12] x86/mm: only invalidate final translations with INVLPGB Rik van Riel
2025-02-13 18:31 ` [PATCH v11 00/12] AMD broadcast TLB invalidation Brendan Jackman
2025-02-13 18:38   ` Brendan Jackman
2025-02-13 20:02   ` Rik van Riel
2025-02-14  9:36     ` Peter Zijlstra
2025-02-14  9:54       ` Brendan Jackman
