* [PATCH v5 0/3] mm: fixes of tlb_flush_pending
@ 2017-07-31 16:43 Nadav Amit
  2017-07-31 16:43 ` [PATCH v5 1/3] mm: migrate: prevent racy access to tlb_flush_pending Nadav Amit
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Nadav Amit @ 2017-07-31 16:43 UTC (permalink / raw)
  To: linux-mm
  Cc: nadav.amit, mgorman, riel, luto, Nadav Amit, Minchan Kim,
	Sergey Senozhatsky
These three patches address tlb_flush_pending issues. The first one addresses
a race when accessing tlb_flush_pending and is the important one.
The next two patches address Andrew Morton's question regarding the barriers.
They are not really related to the first one: the atomic operations
atomic_read() and atomic_inc() do not act as memory barriers, and
replacing existing barriers with smp_mb__after_atomic() did not seem
beneficial. Yet, while reviewing the memory barriers around the use of
tlb_flush_pending, a few issues were identified.
v4 -> v5:
 - Fixing embarrassing build mistake (0day)
v3 -> v4:
 - Change function names to indicate they inc/dec and not set/clear
   (Sergey)
 - Avoid additional barriers, and instead revert the patch that accessed
   mm_tlb_flush_pending without a lock (Mel)
v2 -> v3:
 - Do not init tlb_flush_pending if it is not defined (Sergey)
 - Internalize memory barriers to mm_tlb_flush_pending (Minchan) 
v1 -> v2:
 - Explain the implications of the race (Andrew)
 - Mark the patch that addresses the race as stable (Andrew)
 - Add another patch to clean the use of barriers (Andrew)
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Nadav Amit (3):
  mm: migrate: prevent racy access to tlb_flush_pending
  mm: migrate: fix barriers around tlb_flush_pending
  Revert "mm: numa: defer TLB flush for THP migration as long as
    possible"
 include/linux/mm_types.h | 45 ++++++++++++++++++++++++++++++++-------------
 kernel/fork.c            |  2 +-
 mm/debug.c               |  2 +-
 mm/huge_memory.c         |  7 +++++++
 mm/migrate.c             |  6 ------
 mm/mprotect.c            |  4 ++--
 6 files changed, 43 insertions(+), 23 deletions(-)
-- 
2.11.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH v5 1/3] mm: migrate: prevent racy access to tlb_flush_pending
  2017-07-31 16:43 [PATCH v5 0/3] mm: fixes of tlb_flush_pending Nadav Amit
@ 2017-07-31 16:43 ` Nadav Amit
  2017-08-01  6:01   ` Minchan Kim
  2017-07-31 16:43 ` [PATCH v5 2/3] mm: migrate: fix barriers around tlb_flush_pending Nadav Amit
  2017-07-31 16:43 ` [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible" Nadav Amit
  2 siblings, 1 reply; 8+ messages in thread
From: Nadav Amit @ 2017-07-31 16:43 UTC (permalink / raw)
  To: linux-mm; +Cc: nadav.amit, mgorman, riel, luto, Minchan Kim, stable, Nadav Amit
From: Nadav Amit <nadav.amit@gmail.com>
Setting and clearing mm->tlb_flush_pending can be performed by multiple
threads, since mmap_sem may only be acquired for read in
task_numa_work(). If this happens, tlb_flush_pending might be cleared
while one of the threads still changes PTEs and batches TLB flushes.
This can lead to the same race between migration and
change_protection_range() that led to the introduction of
tlb_flush_pending. The result of this race was data corruption, which
means that this patch also addresses a theoretically possible data
corruption.
An actual data corruption was not observed, yet the race was
confirmed by adding an assertion to check tlb_flush_pending is not set
by two threads, adding artificial latency in change_protection_range()
and using sysctl to reduce kernel.numa_balancing_scan_delay_ms.
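To make the interleaving described above concrete, here is a minimal
user-space sketch that replays one fixed ordering of two writers. It is not
kernel code: struct mm_bool, struct mm_counter and both replay_*() helpers
are hypothetical names invented for illustration, and C11 atomics stand in
for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the old and new tlb_flush_pending field. */
struct mm_bool    { bool pending; };
struct mm_counter { atomic_int pending; };

/*
 * Replay with the old boolean: A sets, B sets, A clears while B is
 * still changing PTEs. Returns what a migration path would observe.
 */
static bool replay_with_bool(void)
{
	struct mm_bool mm = { .pending = false };

	mm.pending = true;	/* thread A: set_tlb_flush_pending() */
	mm.pending = true;	/* thread B: set_tlb_flush_pending() */
	mm.pending = false;	/* thread A: clear_tlb_flush_pending() */

	/* B has not flushed yet, but the flag already reads false. */
	return mm.pending;
}

/* Same ordering with a counter, as in this patch. */
static bool replay_with_counter(void)
{
	struct mm_counter mm;

	atomic_init(&mm.pending, 0);
	atomic_fetch_add(&mm.pending, 1);	/* thread A: inc */
	atomic_fetch_add(&mm.pending, 1);	/* thread B: inc */
	assert(atomic_load(&mm.pending) == 2);	/* both writers counted */
	atomic_fetch_sub(&mm.pending, 1);	/* thread A: dec */

	/* B's pending flush is still visible to readers. */
	return atomic_load(&mm.pending) > 0;
}
```

Calling replay_with_bool() yields false (the pending flush of thread B is
lost), while replay_with_counter() yields true. The sketch only replays a
single ordering sequentially; it does not attempt to reproduce the race
with real threads.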
Fixes: 20841405940e ("mm: fix TLB flush race between migration, and
change_protection_range")
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
---
 include/linux/mm_types.h | 31 ++++++++++++++++++++++---------
 kernel/fork.c            |  2 +-
 mm/debug.c               |  2 +-
 mm/mprotect.c            |  4 ++--
 4 files changed, 26 insertions(+), 13 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 45cdb27791a3..f5263dd0f1bc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -493,7 +493,7 @@ struct mm_struct {
 	 * can move process memory needs to flush the TLB when moving a
 	 * PROT_NONE or PROT_NUMA mapped page.
 	 */
-	bool tlb_flush_pending;
+	atomic_t tlb_flush_pending;
 #endif
 	struct uprobes_state uprobes_state;
 #ifdef CONFIG_HUGETLB_PAGE
@@ -528,33 +528,46 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
 static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
 {
 	barrier();
-	return mm->tlb_flush_pending;
+	return atomic_read(&mm->tlb_flush_pending) > 0;
 }
-static inline void set_tlb_flush_pending(struct mm_struct *mm)
+
+static inline void init_tlb_flush_pending(struct mm_struct *mm)
 {
-	mm->tlb_flush_pending = true;
+	atomic_set(&mm->tlb_flush_pending, 0);
+}
+
+static inline void inc_tlb_flush_pending(struct mm_struct *mm)
+{
+	atomic_inc(&mm->tlb_flush_pending);
 
 	/*
-	 * Guarantee that the tlb_flush_pending store does not leak into the
+	 * Guarantee that the tlb_flush_pending increase does not leak into the
 	 * critical section updating the page tables
 	 */
 	smp_mb__before_spinlock();
 }
+
 /* Clearing is done after a TLB flush, which also provides a barrier. */
-static inline void clear_tlb_flush_pending(struct mm_struct *mm)
+static inline void dec_tlb_flush_pending(struct mm_struct *mm)
 {
 	barrier();
-	mm->tlb_flush_pending = false;
+	atomic_dec(&mm->tlb_flush_pending);
 }
 #else
 static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
 {
 	return false;
 }
-static inline void set_tlb_flush_pending(struct mm_struct *mm)
+
+static inline void init_tlb_flush_pending(struct mm_struct *mm)
 {
 }
-static inline void clear_tlb_flush_pending(struct mm_struct *mm)
+
+static inline void inc_tlb_flush_pending(struct mm_struct *mm)
+{
+}
+
+static inline void dec_tlb_flush_pending(struct mm_struct *mm)
 {
 }
 #endif
diff --git a/kernel/fork.c b/kernel/fork.c
index e53770d2bf95..840e7a7132e1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -809,7 +809,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
 	mmu_notifier_mm_init(mm);
-	clear_tlb_flush_pending(mm);
+	init_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
 #endif
diff --git a/mm/debug.c b/mm/debug.c
index db1cd26d8752..d70103bb4731 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -159,7 +159,7 @@ void dump_mm(const struct mm_struct *mm)
 		mm->numa_next_scan, mm->numa_scan_offset, mm->numa_scan_seq,
 #endif
 #if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_COMPACTION)
-		mm->tlb_flush_pending,
+		atomic_read(&mm->tlb_flush_pending),
 #endif
 		mm->def_flags, &mm->def_flags
 	);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 8edd0d576254..0c413774c1e3 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -245,7 +245,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 	BUG_ON(addr >= end);
 	pgd = pgd_offset(mm, addr);
 	flush_cache_range(vma, addr, end);
-	set_tlb_flush_pending(mm);
+	inc_tlb_flush_pending(mm);
 	do {
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd))
@@ -257,7 +257,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 	/* Only flush the TLB if we actually modified any entries: */
 	if (pages)
 		flush_tlb_range(vma, start, end);
-	clear_tlb_flush_pending(mm);
+	dec_tlb_flush_pending(mm);
 
 	return pages;
 }
-- 
2.11.0
* [PATCH v5 2/3] mm: migrate: fix barriers around tlb_flush_pending
  2017-07-31 16:43 [PATCH v5 0/3] mm: fixes of tlb_flush_pending Nadav Amit
  2017-07-31 16:43 ` [PATCH v5 1/3] mm: migrate: prevent racy access to tlb_flush_pending Nadav Amit
@ 2017-07-31 16:43 ` Nadav Amit
  2017-08-01 11:05   ` Rik van Riel
  2017-07-31 16:43 ` [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible" Nadav Amit
  2 siblings, 1 reply; 8+ messages in thread
From: Nadav Amit @ 2017-07-31 16:43 UTC (permalink / raw)
  To: linux-mm
  Cc: nadav.amit, mgorman, riel, luto, Nadav Amit, Minchan Kim,
	Sergey Senozhatsky
Reading tlb_flush_pending while the page-table lock is taken does not
require a barrier, since the lock/unlock already acts as a barrier.
Remove the barrier in mm_tlb_flush_pending() to address this issue.
However, migrate_misplaced_transhuge_page() calls mm_tlb_flush_pending()
while the page-table lock is already released, which may present a
problem on architectures with a weak memory model (PPC). To deal with this
case, a new parameter is added to mm_tlb_flush_pending() to indicate
whether it is read without the page-table lock taken, and
smp_mb__after_unlock_lock() is called in this case.
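The intended ordering around the counter (increment, then PTE changes, then
flush, then decrement) can be sketched in user space as follows. This is
only a rough analogy under stated assumptions: atomic_thread_fence() with
memory_order_seq_cst stands in for the kernel barriers and is not claimed
to be equivalent to them, and struct mm_order_sketch with its helpers is
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant mm_struct state. */
struct mm_order_sketch {
	atomic_int tlb_flush_pending;
	atomic_int pte;		/* stand-in for a single PTE */
	int flushes;		/* counts stand-in TLB flushes */
};

static void mm_order_sketch_init(struct mm_order_sketch *mm)
{
	atomic_init(&mm->tlb_flush_pending, 0);
	atomic_init(&mm->pte, 0);
	mm->flushes = 0;
}

static bool flush_pending(struct mm_order_sketch *mm)
{
	return atomic_load_explicit(&mm->tlb_flush_pending,
				    memory_order_relaxed) > 0;
}

static void inc_pending(struct mm_order_sketch *mm)
{
	atomic_fetch_add_explicit(&mm->tlb_flush_pending, 1,
				  memory_order_relaxed);
	/* Full fence: keep the increment ahead of the PTE stores. */
	atomic_thread_fence(memory_order_seq_cst);
}

static void dec_pending(struct mm_order_sketch *mm)
{
	/* A decrement must pair with an earlier increment. */
	assert(flush_pending(mm));
	/* Full fence: keep PTE stores and the flush ahead of the decrement. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_sub_explicit(&mm->tlb_flush_pending, 1,
				  memory_order_relaxed);
}

/* Returns whether the counter read as pending before the flush. */
static bool change_protection_sketch(struct mm_order_sketch *mm, int new_pte)
{
	bool seen;

	inc_pending(mm);
	atomic_store_explicit(&mm->pte, new_pte, memory_order_relaxed);
	seen = flush_pending(mm);
	mm->flushes++;		/* stand-in for flush_tlb_range() */
	dec_pending(mm);
	return seen;
}
```

The point of the sketch is the placement of the fences relative to the
counter updates, not their exact strength; the relaxed counter operations
mirror the observation that atomic_inc() and atomic_dec() do not act as
barriers by themselves.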
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
---
 include/linux/mm_types.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index f5263dd0f1bc..2956513619a7 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -522,12 +522,12 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
 /*
  * Memory barriers to keep this state in sync are graciously provided by
  * the page table locks, outside of which no page table modifications happen.
- * The barriers below prevent the compiler from re-ordering the instructions
- * around the memory barriers that are already present in the code.
+ * The barriers are used to ensure the order between tlb_flush_pending updates,
+ * which happen while the lock is not taken, and the PTE updates, which happen
+ * while the lock is taken, are serialized.
  */
 static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
 {
-	barrier();
 	return atomic_read(&mm->tlb_flush_pending) > 0;
 }
 
@@ -550,7 +550,13 @@ static inline void inc_tlb_flush_pending(struct mm_struct *mm)
 /* Clearing is done after a TLB flush, which also provides a barrier. */
 static inline void dec_tlb_flush_pending(struct mm_struct *mm)
 {
-	barrier();
+	/*
+	 * Guarantee that the tlb_flush_pending does not leak into the
+	 * critical section, since we must order the PTE change and changes to
+	 * the pending TLB flush indication. We could have relied on TLB flush
+	 * as a memory barrier, but this behavior is not clearly documented.
+	 */
+	smp_mb__before_atomic();
 	atomic_dec(&mm->tlb_flush_pending);
 }
 #else
-- 
2.11.0
* [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible"
  2017-07-31 16:43 [PATCH v5 0/3] mm: fixes of tlb_flush_pending Nadav Amit
  2017-07-31 16:43 ` [PATCH v5 1/3] mm: migrate: prevent racy access to tlb_flush_pending Nadav Amit
  2017-07-31 16:43 ` [PATCH v5 2/3] mm: migrate: fix barriers around tlb_flush_pending Nadav Amit
@ 2017-07-31 16:43 ` Nadav Amit
  2017-08-01 10:06   ` Mel Gorman
  2017-08-01 11:05   ` Rik van Riel
  2 siblings, 2 replies; 8+ messages in thread
From: Nadav Amit @ 2017-07-31 16:43 UTC (permalink / raw)
  To: linux-mm
  Cc: nadav.amit, mgorman, riel, luto, Nadav Amit, Minchan Kim,
	Sergey Senozhatsky
While deferring TLB flushes is a good practice, the reverted patch
caused pending TLB flushes to be checked while the page-table lock is
not taken. As a result, on architectures with a weak memory model (PPC),
Linux may miss a memory barrier, miss the fact that TLB flushes are
pending, and cause (in theory) memory corruption.
Since the alternative of using smp_mb__after_unlock_lock() was
considered a bit open-coded, and the performance impact is expected to
be small, the previous patch is reverted.
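As a rough illustration of the behavior this revert restores, the sketch
below checks the pending counter while a lock is held and flushes before
migration would begin. It assumes the check runs with the lock taken, as
described above. The atomic_flag spinlock is only a stand-in for the
page-table lock, and struct mm_lock_sketch, ptl_lock(), ptl_unlock() and
numa_fault_sketch() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the relevant mm_struct state. */
struct mm_lock_sketch {
	atomic_flag ptl;		/* stand-in for the page-table lock */
	atomic_int tlb_flush_pending;
	int flushes;			/* counts stand-in TLB flushes */
};

static void mm_lock_sketch_init(struct mm_lock_sketch *mm)
{
	atomic_flag_clear(&mm->ptl);
	atomic_init(&mm->tlb_flush_pending, 0);
	mm->flushes = 0;
}

static void ptl_lock(struct mm_lock_sketch *mm)
{
	/* Spin until the flag is acquired; acquire orders later reads. */
	while (atomic_flag_test_and_set_explicit(&mm->ptl,
						 memory_order_acquire))
		;
}

static void ptl_unlock(struct mm_lock_sketch *mm)
{
	atomic_flag_clear_explicit(&mm->ptl, memory_order_release);
}

/* Returns 1 if a flush was issued before migration would start. */
static int numa_fault_sketch(struct mm_lock_sketch *mm)
{
	int flushed = 0;

	ptl_lock(mm);
	/* The counter is never expected to drop below zero. */
	assert(atomic_load(&mm->tlb_flush_pending) >= 0);
	if (atomic_load(&mm->tlb_flush_pending) > 0) {
		mm->flushes++;	/* stand-in for flush_tlb_range() */
		flushed = 1;
	}
	ptl_unlock(mm);

	/* Migration of the huge page would begin here. */
	return flushed;
}
```

With the counter at zero the function returns 0 and issues no flush; after
a writer has incremented the counter it returns 1. The lock's
acquire/release pairing is what lets the reader rely on the counter value
without an extra explicit barrier in this sketch.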
This reverts commit b0943d61b8fa420180f92f64ef67662b4f6cc493.
Suggested-by: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
---
 mm/huge_memory.c | 7 +++++++
 mm/migrate.c     | 6 ------
 2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 88c6167f194d..b51d83e410eb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1496,6 +1496,13 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	}
 
 	/*
+	 * The page_table_lock above provides a memory barrier
+	 * with change_protection_range.
+	 */
+	if (mm_tlb_flush_pending(vma->vm_mm))
+		flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
+
+	/*
 	 * Migrate the THP to the requested node, returns with page unlocked
 	 * and access rights restored.
 	 */
diff --git a/mm/migrate.c b/mm/migrate.c
index 89a0a1707f4c..1f6c2f41b3cb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1935,12 +1935,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		put_page(new_page);
 		goto out_fail;
 	}
-	/*
-	 * We are not sure a pending tlb flush here is for a huge page
-	 * mapping or not. Hence use the tlb range variant
-	 */
-	if (mm_tlb_flush_pending(mm))
-		flush_tlb_range(vma, mmun_start, mmun_end);
 
 	/* Prepare a page as a migration target */
 	__SetPageLocked(new_page);
-- 
2.11.0
* Re: [PATCH v5 1/3] mm: migrate: prevent racy access to tlb_flush_pending
  2017-07-31 16:43 ` [PATCH v5 1/3] mm: migrate: prevent racy access to tlb_flush_pending Nadav Amit
@ 2017-08-01  6:01   ` Minchan Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Minchan Kim @ 2017-08-01  6:01 UTC (permalink / raw)
  To: Nadav Amit; +Cc: linux-mm, nadav.amit, mgorman, riel, luto, stable
On Mon, Jul 31, 2017 at 09:43:23AM -0700, Nadav Amit wrote:
> From: Nadav Amit <nadav.amit@gmail.com>
> 
> Setting and clearing mm->tlb_flush_pending can be performed by multiple
> threads, since mmap_sem may only be acquired for read in
> task_numa_work(). If this happens, tlb_flush_pending might be cleared
> while one of the threads still changes PTEs and batches TLB flushes.
> 
> This can lead to the same race between migration and
> change_protection_range() that led to the introduction of
> tlb_flush_pending. The result of this race was data corruption, which
> means that this patch also addresses a theoretically possible data
> corruption.
> 
> An actual data corruption was not observed, yet the race was
> confirmed by adding an assertion to check tlb_flush_pending is not set
> by two threads, adding artificial latency in change_protection_range()
> and using sysctl to reduce kernel.numa_balancing_scan_delay_ms.
> 
> Fixes: 20841405940e ("mm: fix TLB flush race between migration, and
> change_protection_range")
> 
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: stable@vger.kernel.org
> 
> Signed-off-by: Nadav Amit <namit@vmware.com>
> Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Minchan Kim <minchan@kernel.org>
* Re: [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible"
  2017-07-31 16:43 ` [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible" Nadav Amit
@ 2017-08-01 10:06   ` Mel Gorman
  2017-08-01 11:05   ` Rik van Riel
  1 sibling, 0 replies; 8+ messages in thread
From: Mel Gorman @ 2017-08-01 10:06 UTC (permalink / raw)
  To: Nadav Amit
  Cc: linux-mm, nadav.amit, riel, luto, Minchan Kim, Sergey Senozhatsky
On Mon, Jul 31, 2017 at 09:43:25AM -0700, Nadav Amit wrote:
> While deferring TLB flushes is a good practice, the reverted patch
> caused pending TLB flushes to be checked while the page-table lock is
> not taken. As a result, in architectures with weak memory model (PPC),
> Linux may miss a memory-barrier, miss the fact TLB flushes are pending,
> and cause (in theory) a memory corruption.
> 
> Since the alternative of using smp_mb__after_unlock_lock() was
> considered a bit open-coded, and the performance impact is expected to
> be small, the previous patch is reverted.
> 
> This reverts commit b0943d61b8fa420180f92f64ef67662b4f6cc493.
> 
> Suggested-by: Mel Gorman <mgorman@suse.de>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Rik van Riel <riel@redhat.com>
> Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@suse.de>
-- 
Mel Gorman
SUSE Labs
* Re: [PATCH v5 2/3] mm: migrate: fix barriers around tlb_flush_pending
  2017-07-31 16:43 ` [PATCH v5 2/3] mm: migrate: fix barriers around tlb_flush_pending Nadav Amit
@ 2017-08-01 11:05   ` Rik van Riel
  0 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2017-08-01 11:05 UTC (permalink / raw)
  To: Nadav Amit, linux-mm
  Cc: nadav.amit, mgorman, luto, Minchan Kim, Sergey Senozhatsky
On Mon, 2017-07-31 at 09:43 -0700, Nadav Amit wrote:
> Reading tlb_flush_pending while the page-table lock is taken does not
> require a barrier, since the lock/unlock already acts as a barrier.
> Removing the barrier in mm_tlb_flush_pending() to address this issue.
> 
> However, migrate_misplaced_transhuge_page() calls
> mm_tlb_flush_pending()
> while the page-table lock is already released, which may present a
> problem on architectures with weak memory model (PPC). To deal with
> this
> case, a new parameter is added to mm_tlb_flush_pending() to indicate
> if it is read without the page-table lock taken, and calling
> smp_mb__after_unlock_lock() in this case.
> 
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
> 
> Signed-off-by: Nadav Amit <namit@vmware.com>
> 
Acked-by: Rik van Riel <riel@redhat.com>
* Re: [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible"
  2017-07-31 16:43 ` [PATCH v5 3/3] Revert "mm: numa: defer TLB flush for THP migration as long as possible" Nadav Amit
  2017-08-01 10:06   ` Mel Gorman
@ 2017-08-01 11:05   ` Rik van Riel
  1 sibling, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2017-08-01 11:05 UTC (permalink / raw)
  To: Nadav Amit, linux-mm
  Cc: nadav.amit, mgorman, luto, Minchan Kim, Sergey Senozhatsky
On Mon, 2017-07-31 at 09:43 -0700, Nadav Amit wrote:
> While deferring TLB flushes is a good practice, the reverted patch
> caused pending TLB flushes to be checked while the page-table lock is
> not taken. As a result, in architectures with weak memory model
> (PPC),
> Linux may miss a memory-barrier, miss the fact TLB flushes are
> pending,
> and cause (in theory) a memory corruption.
> 
> Since the alternative of using smp_mb__after_unlock_lock() was
> considered a bit open-coded, and the performance impact is expected
> to
> be small, the previous patch is reverted.
> 
> This reverts commit b0943d61b8fa420180f92f64ef67662b4f6cc493.
> 
> Suggested-by: Mel Gorman <mgorman@suse.de>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Rik van Riel <riel@redhat.com>
> Signed-off-by: Nadav Amit <namit@vmware.com>
> 
Acked-by: Rik van Riel <riel@redhat.com>