stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: <gregkh@linuxfoundation.org>
To: luto@kernel.org, akpm@linux-foundation.org,
	arjan@linux.intel.com, bpetkov@suse.de, dave.hansen@intel.com,
	edubezval@gmail.com, eduval@amazon.com,
	gregkh@linuxfoundation.org, mgorman@suse.de, mhocko@suse.com,
	mingo@kernel.org, nadav.amit@gmail.com, namit@vmware.com,
	peterz@infradead.org, riel@redhat.com, tglx@linutronix.de,
	torvalds@linux-foundation.org
Cc: <stable@vger.kernel.org>, <stable-commits@vger.kernel.org>
Subject: Patch "x86/mm: Refactor flush_tlb_mm_range() to merge local and remote cases" has been added to the 4.9-stable tree
Date: Thu, 21 Dec 2017 09:41:07 +0100	[thread overview]
Message-ID: <15138456677187@kroah.com> (raw)


This is a note to let you know that I've just added the patch titled

    x86/mm: Refactor flush_tlb_mm_range() to merge local and remote cases

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


>From 454bbad9793f59f5656ce5971ee473a8be736ef5 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto@kernel.org>
Date: Sun, 28 May 2017 10:00:12 -0700
Subject: x86/mm: Refactor flush_tlb_mm_range() to merge local and remote cases

From: Andy Lutomirski <luto@kernel.org>

commit 454bbad9793f59f5656ce5971ee473a8be736ef5 upstream.

The local flush path is very similar to the remote flush path.
Merge them.

This is intended to make no difference to behavior whatsoever.  It
removes some code and will make future changes to the flushing
mechanics simpler.

This patch does remove one small optimization: flush_tlb_mm_range()
now has an unconditional smp_mb() instead of using MOV to CR3 or
INVLPG as a full barrier when applicable.  I think this is okay for
a few reasons.  First, smp_mb() is quite cheap compared to the cost
of a TLB flush.  Second, this rearrangement makes a bigger
optimization available: with some work on the SMP function call
code, we could do the local and remote flushes in parallel.  Third,
I'm planning a rework of the TLB flush algorithm that will require
an atomic operation at the beginning of each flush, and that
operation will replace the smp_mb().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Eduardo Valentin <eduval@amazon.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/tlbflush.h |    1 
 arch/x86/mm/tlb.c               |  111 +++++++++++++++++-----------------------
 2 files changed, 48 insertions(+), 64 deletions(-)

--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -216,7 +216,6 @@ static inline void __flush_tlb_one(unsig
  * ..but the i386 has somewhat limited tlb flushing capabilities,
  * and page-granular flushes are available only on i486 and up.
  */
-
 struct flush_tlb_info {
 	struct mm_struct *mm;
 	unsigned long start;
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -216,22 +216,9 @@ void switch_mm_irqs_off(struct mm_struct
  * write/read ordering problems.
  */
 
-/*
- * TLB flush funcation:
- * 1) Flush the tlb entries if the cpu uses the mm that's being flushed.
- * 2) Leave the mm if we are in the lazy tlb mode.
- */
-static void flush_tlb_func(void *info)
+static void flush_tlb_func_common(const struct flush_tlb_info *f,
+				  bool local, enum tlb_flush_reason reason)
 {
-	const struct flush_tlb_info *f = info;
-
-	inc_irq_stat(irq_tlb_count);
-
-	if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.active_mm))
-		return;
-
-	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
-
 	if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
 		leave_mm(smp_processor_id());
 		return;
@@ -239,7 +226,9 @@ static void flush_tlb_func(void *info)
 
 	if (f->end == TLB_FLUSH_ALL) {
 		local_flush_tlb();
-		trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, TLB_FLUSH_ALL);
+		if (local)
+			count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+		trace_tlb_flush(reason, TLB_FLUSH_ALL);
 	} else {
 		unsigned long addr;
 		unsigned long nr_pages =
@@ -249,10 +238,32 @@ static void flush_tlb_func(void *info)
 			__flush_tlb_single(addr);
 			addr += PAGE_SIZE;
 		}
-		trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, nr_pages);
+		if (local)
+			count_vm_tlb_events(NR_TLB_LOCAL_FLUSH_ONE, nr_pages);
+		trace_tlb_flush(reason, nr_pages);
 	}
 }
 
+static void flush_tlb_func_local(void *info, enum tlb_flush_reason reason)
+{
+	const struct flush_tlb_info *f = info;
+
+	flush_tlb_func_common(f, true, reason);
+}
+
+static void flush_tlb_func_remote(void *info)
+{
+	const struct flush_tlb_info *f = info;
+
+	inc_irq_stat(irq_tlb_count);
+
+	if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.active_mm))
+		return;
+
+	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
+	flush_tlb_func_common(f, false, TLB_REMOTE_SHOOTDOWN);
+}
+
 void native_flush_tlb_others(const struct cpumask *cpumask,
 			     const struct flush_tlb_info *info)
 {
@@ -269,11 +280,11 @@ void native_flush_tlb_others(const struc
 		cpu = smp_processor_id();
 		cpumask = uv_flush_tlb_others(cpumask, info);
 		if (cpumask)
-			smp_call_function_many(cpumask, flush_tlb_func,
+			smp_call_function_many(cpumask, flush_tlb_func_remote,
 					       (void *)info, 1);
 		return;
 	}
-	smp_call_function_many(cpumask, flush_tlb_func,
+	smp_call_function_many(cpumask, flush_tlb_func_remote,
 			       (void *)info, 1);
 }
 
@@ -315,59 +326,33 @@ static unsigned long tlb_single_page_flu
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned long vmflag)
 {
-	unsigned long addr;
-	struct flush_tlb_info info;
-	/* do a global flush by default */
-	unsigned long base_pages_to_flush = TLB_FLUSH_ALL;
-
-	preempt_disable();
-	if (current->active_mm != mm) {
-		/* Synchronize with switch_mm. */
-		smp_mb();
+	int cpu;
 
-		goto out;
-	}
-
-	if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
-		leave_mm(smp_processor_id());
-
-		/* Synchronize with switch_mm. */
-		smp_mb();
+	struct flush_tlb_info info = {
+		.mm = mm,
+	};
 
-		goto out;
-	}
+	cpu = get_cpu();
 
-	if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
-		base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+	/* Synchronize with switch_mm. */
+	smp_mb();
 
-	/*
-	 * Both branches below are implicit full barriers (MOV to CR or
-	 * INVLPG) that synchronize with switch_mm.
-	 */
-	if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
-		base_pages_to_flush = TLB_FLUSH_ALL;
-		count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-		local_flush_tlb();
+	/* Should we flush just the requested range? */
+	if ((end != TLB_FLUSH_ALL) &&
+	    !(vmflag & VM_HUGETLB) &&
+	    ((end - start) >> PAGE_SHIFT) <= tlb_single_page_flush_ceiling) {
+		info.start = start;
+		info.end = end;
 	} else {
-		/* flush range by one by one 'invlpg' */
-		for (addr = start; addr < end;	addr += PAGE_SIZE) {
-			count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE);
-			__flush_tlb_single(addr);
-		}
-	}
-	trace_tlb_flush(TLB_LOCAL_MM_SHOOTDOWN, base_pages_to_flush);
-out:
-	info.mm = mm;
-	if (base_pages_to_flush == TLB_FLUSH_ALL) {
 		info.start = 0UL;
 		info.end = TLB_FLUSH_ALL;
-	} else {
-		info.start = start;
-		info.end = end;
 	}
-	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
+
+	if (mm == current->active_mm)
+		flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
+	if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
 		flush_tlb_others(mm_cpumask(mm), &info);
-	preempt_enable();
+	put_cpu();
 }
 
 


Patches currently in stable-queue which might be from luto@kernel.org are

queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch

                 reply	other threads:[~2017-12-21  8:45 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=15138456677187@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=bpetkov@suse.de \
    --cc=dave.hansen@intel.com \
    --cc=edubezval@gmail.com \
    --cc=eduval@amazon.com \
    --cc=luto@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=namit@vmware.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=stable-commits@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).