stable.vger.kernel.org archive mirror
* [patch 1/7] mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
@ 2020-03-06  6:28 ` Andrew Morton
  2020-03-06  6:28 ` [patch 2/7] mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() Andrew Morton
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-06  6:28 UTC (permalink / raw)
  To: akpm, aquini, kirill.shutemov, linux-mm, mgorman, mhocko,
	mm-commits, stable, torvalds, vbabka, zi.yan

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa

: A user reported a bug against a distribution kernel, which is expected
: to apply to mainline kernels as well, while running a proprietary
: workload described as "memory intensive that is not swapping".  The
: workload is read/write/modifying ranges of memory and checking the
: contents.  They reported that within a few hours a bad PMD would be
: reported, followed by memory corruption where expected data was all
: zeros.  A partial report of the bad PMD looked like
: 
:   [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
:   [ 5195.341184] ------------[ cut here ]------------
:   [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
:   ....
:   [ 5195.410033] Call Trace:
:   [ 5195.410471]  [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
:   [ 5195.410716]  [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
:   [ 5195.410918]  [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
:   [ 5195.411200]  [<ffffffff81098322>] task_work_run+0x72/0x90
:   [ 5195.411246]  [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
:   [ 5195.411494]  [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
:   [ 5195.411739]  [<ffffffff815e56af>] retint_user+0x8/0x10
: 
: Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
: report was a false detection.  The bug does not trigger if automatic NUMA
: balancing or transparent huge pages are disabled.
: 
: The bug is due to a race in change_pmd_range between the pmd_trans_huge
: and pmd_none_or_clear_bad checks, done without any locks held.  During
: the pmd_trans_huge check, a parallel protection update under lock can
: have cleared the PMD and filled it with a prot_numa entry between the
: transhuge check and the pmd_none_or_clear_bad check.
: 
: While this could be fixed with heavy locking, it's only necessary to make
: a copy of the PMD on the stack during change_pmd_range and avoid races.  A
: new helper is created for this as the check is quite subtle and the
: existing similar helper is not suitable.  This passed 154 hours of
: testing (the bug usually triggers between 20 minutes and 24 hours) without
: detecting bad PMDs or corruption.  A basic test of an autonuma-intensive
: workload showed no significant change in behaviour.

Although Mel withdrew the patch in the face of an LKML comment
(https://lkml.org/lkml/2017/4/10/922), the aforementioned race window is
still open, and we have reports of the Linpack test producing bad residuals
after the bad PMD warning is observed.  In addition to that, bad
rss-counter and non-zero pgtables assertions are triggered on mm teardown
for the task hitting the bad PMD.

 host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
 ....
 host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
 host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096

The issue is observed on a v4.18-based distribution kernel, but the race
window is expected to be applicable to mainline kernels, as well.
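
For illustration, a condensed sketch of the racy interleaving (in the
style of the report above; the actual code is in the patch below):

	change_pmd_range() (no PTL)		parallel updater (PTL held)
	pmd_trans_huge(*pmd) -> false
						clears the PMD and installs a
						huge prot_numa entry under lock
	pmd_none_or_clear_bad(pmd)
	  re-reads *pmd, misreads the huge
	  entry as a "bad" PMD and clears it

The helper added below instead takes a single pmd_read_atomic() snapshot
and performs the none/trans_huge/bad checks on that one value.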

[akpm@linux-foundation.org: fix comment typo, per Rafael]
Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mprotect.c |   38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

--- a/mm/mprotect.c~mm-numa-fix-bad-pmd-by-atomically-check-for-pmd_trans_huge-when-marking-page-tables-prot_numa
+++ a/mm/mprotect.c
@@ -161,6 +161,31 @@ static unsigned long change_pte_range(st
 	return pages;
 }
 
+/*
+ * Used when setting automatic NUMA hinting protection where it is
+ * critical that a numa hinting PMD is not confused with a bad PMD.
+ */
+static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
+{
+	pmd_t pmdval = pmd_read_atomic(pmd);
+
+	/* See pmd_none_or_trans_huge_or_clear_bad for info on barrier */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	barrier();
+#endif
+
+	if (pmd_none(pmdval))
+		return 1;
+	if (pmd_trans_huge(pmdval))
+		return 0;
+	if (unlikely(pmd_bad(pmdval))) {
+		pmd_clear_bad(pmd);
+		return 1;
+	}
+
+	return 0;
+}
+
 static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pud_t *pud, unsigned long addr, unsigned long end,
 		pgprot_t newprot, int dirty_accountable, int prot_numa)
@@ -178,8 +203,17 @@ static inline unsigned long change_pmd_r
 		unsigned long this_pages;
 
 		next = pmd_addr_end(addr, end);
-		if (!is_swap_pmd(*pmd) && !pmd_trans_huge(*pmd) && !pmd_devmap(*pmd)
-				&& pmd_none_or_clear_bad(pmd))
+
+		/*
+		 * Automatic NUMA balancing walks the tables with mmap_sem
+		 * held for read. It's possible for a parallel update to occur
+		 * between pmd_trans_huge() and a pmd_none_or_clear_bad()
+		 * check, leading to a false positive and clearing.
+		 * Hence, it's necessary to atomically read the PMD value
+		 * for all the checks.
+		 */
+		if (!is_swap_pmd(*pmd) && !pmd_devmap(*pmd) &&
+		     pmd_none_or_clear_bad_unless_trans_huge(pmd))
 			goto next;
 
 		/* invoke the mmu notifier if the pmd is populated */
_

* [patch 2/7] mm: fix possible PMD dirty bit lost in set_pmd_migration_entry()
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
  2020-03-06  6:28 ` [patch 1/7] mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa Andrew Morton
@ 2020-03-06  6:28 ` Andrew Morton
  2020-03-06  6:28 ` [patch 3/7] mm: avoid data corruption on CoW fault into PFN-mapped VMA Andrew Morton
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-06  6:28 UTC (permalink / raw)
  To: aarcange, akpm, kirill.shutemov, linux-mm, mhocko, mm-commits,
	stable, torvalds, vbabka, william.kucharski, ying.huang, ziy

From: Huang Ying <ying.huang@intel.com>
Subject: mm: fix possible PMD dirty bit lost in set_pmd_migration_entry()

In set_pmd_migration_entry(), pmdp_invalidate() is used to change the PMD
atomically.  But before that, the PMD is read with an ordinary memory
read.  If the THP (transparent huge page) is written to between the PMD
read and pmdp_invalidate(), the PMD dirty bit may be lost, causing data
corruption.  The race window is quite small, but still possible in
theory, so it needs to be fixed.
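
A condensed timeline of the lost-dirty-bit window in the old code
(illustrative only; the fix is in the patch below):

	CPU0 (set_pmd_migration_entry)		CPU1
	pmdval = *pvmw->pmd  /* dirty bit clear */
						writes to the THP; hardware
						sets the dirty bit in the PMD
	pmdp_invalidate(...) /* clears the PMD;
				its dirty bit is discarded */
	pmd_dirty(pmdval)    /* stale: clean */
	  -> set_page_dirty() is skipped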

The race is fixed by using the return value of pmdp_invalidate() to get
the original content of the PMD, which is a read/modify/write atomic
operation.  So no THP write can occur in between.

The race was introduced when THP migration support was added in commit
616b8371539a ("mm: thp: enable thp migration in generic path").  But this
fix depends on commit d52605d7cb30 ("mm: do not lose dirty and accessed
bits in pmdp_invalidate()"), so it is easy to backport to v4.16 and
later.  But the race window is really small, so it may be fine not to
backport the fix at all.

Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/huge_memory.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/huge_memory.c~mm-fix-possible-pmd-dirty-bit-lost-in-set_pmd_migration_entry
+++ a/mm/huge_memory.c
@@ -3043,8 +3043,7 @@ void set_pmd_migration_entry(struct page
 		return;
 
 	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
-	pmdval = *pvmw->pmd;
-	pmdp_invalidate(vma, address, pvmw->pmd);
+	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
 	if (pmd_dirty(pmdval))
 		set_page_dirty(page);
 	entry = make_migration_entry(page, pmd_write(pmdval));
_

* [patch 3/7] mm: avoid data corruption on CoW fault into PFN-mapped VMA
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
  2020-03-06  6:28 ` [patch 1/7] mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa Andrew Morton
  2020-03-06  6:28 ` [patch 2/7] mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() Andrew Morton
@ 2020-03-06  6:28 ` Andrew Morton
  2020-03-06  6:28 ` [patch 4/7] fat: fix uninit-memory access for partial initialized inode Andrew Morton
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-06  6:28 UTC (permalink / raw)
  To: akpm, dan.j.williams, jmoyer, Justin.He, kirill.shutemov, kirill,
	linux-mm, mm-commits, stable, torvalds

From: "Kirill A. Shutemov" <kirill@shutemov.name>
Subject: mm: avoid data corruption on CoW fault into PFN-mapped VMA

Jeff Moyer has reported that one of the xfstests triggers a warning when
run on a DAX-enabled filesystem:

	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
	...
	wp_page_copy+0x98c/0xd50 (unreliable)
	do_wp_page+0xd8/0xad0
	__handle_mm_fault+0x748/0x1b90
	handle_mm_fault+0x120/0x1f0
	__do_page_fault+0x240/0xd70
	do_page_fault+0x38/0xd0
	handle_page_fault+0x10/0x30

The warning happens on a failed __copy_from_user_inatomic() which tries
to copy data into a CoW page.

This happens because of a race between MADV_DONTNEED and the CoW page fault:

	CPU0					CPU1
 handle_mm_fault()
   do_wp_page()
     wp_page_copy()
       cow_user_page()
					madvise(MADV_DONTNEED)
					  zap_page_range()
					    zap_pte_range()
					      ptep_get_and_clear_full()
					      <TLB flush>
	 __copy_from_user_inatomic()
	 sees empty PTE and fails
	 WARN_ON_ONCE(1)
	 clear_page()

The solution is to retry __copy_from_user_inatomic() under the PTL after
checking that the PTE matches orig_pte.

The second copy attempt can still fail, e.g. due to a non-readable PTE,
but there's nothing reasonable we can do about it, except clearing the
CoW page.
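
A condensed sketch of the resulting flow in cow_user_page() (simplified
from the patch below; the 'locked' bookkeeping and unlock paths are
omitted):

	if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
		/* re-validate under PTL if the page is still mapped */
		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
		if (!pte_same(*vmf->pte, vmf->orig_pte))
			return false;		/* PTE changed: retry fault */
		/* the page may have been mapped back; copy again under PTL */
		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
			clear_page(kaddr);	/* last resort: zero-fill */
	}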

Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Justin He <Justin.He@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |   35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

--- a/mm/memory.c~mm-avoid-data-corruption-on-cow-fault-into-pfn-mapped-vma
+++ a/mm/memory.c
@@ -2257,7 +2257,7 @@ static inline bool cow_user_page(struct
 	bool ret;
 	void *kaddr;
 	void __user *uaddr;
-	bool force_mkyoung;
+	bool locked = false;
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long addr = vmf->address;
@@ -2282,11 +2282,11 @@ static inline bool cow_user_page(struct
 	 * On architectures with software "accessed" bits, we would
 	 * take a double page fault, so mark it accessed here.
 	 */
-	force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
-	if (force_mkyoung) {
+	if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
 		pte_t entry;
 
 		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+		locked = true;
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 			/*
 			 * Other thread has already handled the fault
@@ -2310,18 +2310,37 @@ static inline bool cow_user_page(struct
 	 * zeroes.
 	 */
 	if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+		if (locked)
+			goto warn;
+
+		/* Re-validate under PTL if the page is still mapped */
+		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+		locked = true;
+		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
+			/* The PTE changed under us. Retry page fault. */
+			ret = false;
+			goto pte_unlock;
+		}
+
 		/*
-		 * Give a warn in case there can be some obscure
-		 * use-case
+		 * The same page can be mapped back since the last copy
+		 * attempt.  Try to copy again under the PTL.
 		 */
-		WARN_ON_ONCE(1);
-		clear_page(kaddr);
+		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
+			/*
+			 * Give a warn in case there can be some obscure
+			 * use-case
+			 */
+warn:
+			WARN_ON_ONCE(1);
+			clear_page(kaddr);
+		}
 	}
 
 	ret = true;
 
 pte_unlock:
-	if (force_mkyoung)
+	if (locked)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 	kunmap_atomic(kaddr);
 	flush_dcache_page(dst);
_

* [patch 4/7] fat: fix uninit-memory access for partial initialized inode
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (2 preceding siblings ...)
  2020-03-06  6:28 ` [patch 3/7] mm: avoid data corruption on CoW fault into PFN-mapped VMA Andrew Morton
@ 2020-03-06  6:28 ` Andrew Morton
  2020-03-06  6:28 ` [patch 6/7] mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled Andrew Morton
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-06  6:28 UTC (permalink / raw)
  To: akpm, hirofumi, linux-mm, mm-commits, stable, torvalds

From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: fat: fix uninit-memory access for partial initialized inode

When we get an error in the middle of reading an inode, some fields in
the inode might still be uninitialized, and then the evict_inode path may
access those fields via iput().

To fix this, make sure that the inode fields are initialized.

Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jp
Reported-by: syzbot+9d82b8de2992579da5d0@syzkaller.appspotmail.com
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fat/inode.c |   19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

--- a/fs/fat/inode.c~fat-fix-uninit-memory-access-for-partial-initialized-inode
+++ a/fs/fat/inode.c
@@ -750,6 +750,13 @@ static struct inode *fat_alloc_inode(str
 		return NULL;
 
 	init_rwsem(&ei->truncate_lock);
+	/* Zero these to allow iput() even on a partially initialized inode. */
+	ei->mmu_private = 0;
+	ei->i_start = 0;
+	ei->i_logstart = 0;
+	ei->i_attrs = 0;
+	ei->i_pos = 0;
+
 	return &ei->vfs_inode;
 }
 
@@ -1374,16 +1381,6 @@ out:
 	return 0;
 }
 
-static void fat_dummy_inode_init(struct inode *inode)
-{
-	/* Initialize this dummy inode to work as no-op. */
-	MSDOS_I(inode)->mmu_private = 0;
-	MSDOS_I(inode)->i_start = 0;
-	MSDOS_I(inode)->i_logstart = 0;
-	MSDOS_I(inode)->i_attrs = 0;
-	MSDOS_I(inode)->i_pos = 0;
-}
-
 static int fat_read_root(struct inode *inode)
 {
 	struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb);
@@ -1844,13 +1841,11 @@ int fat_fill_super(struct super_block *s
 	fat_inode = new_inode(sb);
 	if (!fat_inode)
 		goto out_fail;
-	fat_dummy_inode_init(fat_inode);
 	sbi->fat_inode = fat_inode;
 
 	fsinfo_inode = new_inode(sb);
 	if (!fsinfo_inode)
 		goto out_fail;
-	fat_dummy_inode_init(fsinfo_inode);
 	fsinfo_inode->i_ino = MSDOS_FSINFO_INO;
 	sbi->fsinfo_inode = fsinfo_inode;
 	insert_inode_hash(fsinfo_inode);
_

* [patch 6/7] mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (3 preceding siblings ...)
  2020-03-06  6:28 ` [patch 4/7] fat: fix uninit-memory access for partial initialized inode Andrew Morton
@ 2020-03-06  6:28 ` Andrew Morton
  2020-03-07 20:58 ` + mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch added to -mm tree Andrew Morton
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-06  6:28 UTC (permalink / raw)
  To: akpm, cai, david, gerald.schaefer, iamjoonsoo.kim, linux-mm,
	mm-commits, stable, torvalds, vbabka

From: Vlastimil Babka <vbabka@suse.cz>
Subject: mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled

Commit cd02cf1aceea ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
fixed memory hotplug with debug_pagealloc enabled, where onlining a page
goes through page freeing, which removes the direct mapping.  Some arches
don't like it when the page is not mapped in the first place, so
generic_online_page() maps it first.  This is somewhat wasteful, but
better than special-casing page freeing fast paths.

The commit however missed that having DEBUG_PAGEALLOC configured doesn't
mean it's actually enabled.  One has to test debug_pagealloc_enabled()
since 031bc5743f15 ("mm/debug-pagealloc: make debug-pagealloc boottime
configurable"), or alternatively debug_pagealloc_enabled_static() since
8e57f8acbbd1 ("mm, debug_pagealloc: don't rely on static keys too early"),
but this was not done.
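
Roughly, the distinction is as follows (a sketch, not the exact code;
debug_pagealloc is switched on with the "debug_pagealloc=on" boot option):

	#ifdef CONFIG_DEBUG_PAGEALLOC
	/* the feature is compiled in, but may still be off at runtime */
	#endif

	if (debug_pagealloc_enabled_static())	/* runtime check */
		kernel_map_pages(page, 1 << order, 1);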

As a result, a s390 kernel with DEBUG_PAGEALLOC configured but not enabled
will crash:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000001ece13400b R2:000003fff7fd000b R3:000003fff7fcc007 S:000003fff7fd7000 P:000000000000013d
Oops: 0004 ilc:2 [#1] SMP
CPU: 1 PID: 26015 Comm: chmem Kdump: loaded Tainted: GX 5.3.18-5-default #1 SLE15-SP2 (unreleased)
Krnl PSW : 0704e00180000000 0000001ecd281b9e (__kernel_map_pages+0x166/0x188)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000800 0000400b00000000 0000000000000100
0000000000000001 0000000000000000 0000000000000002 0000000000000100
0000001ece139230 0000001ecdd98d40 0000400b00000100 0000000000000000
000003ffa17e4000 001fffe0114f7d08 0000001ecd4d93ea 001fffe0114f7b20
Krnl Code: 0000001ecd281b8e: ec17ffff00d8 ahik %r1,%r7,-1
0000001ecd281b94: ec111dbc0355 risbg %r1,%r1,29,188,3
>0000001ecd281b9e: 94fb5006 ni 6(%r5),251
0000001ecd281ba2: 41505008 la %r5,8(%r5)
0000001ecd281ba6: ec51fffc6064 cgrj %r5,%r1,6,1ecd281b9e
0000001ecd281bac: 1a07 ar %r0,%r7
0000001ecd281bae: ec03ff584076 crj %r0,%r3,4,1ecd281a5e
Call Trace:
[<0000001ecd281b9e>] __kernel_map_pages+0x166/0x188
[<0000001ecd4d9516>] online_pages_range+0xf6/0x128
[<0000001ecd2a8186>] walk_system_ram_range+0x7e/0xd8
[<0000001ecda28aae>] online_pages+0x2fe/0x3f0
[<0000001ecd7d02a6>] memory_subsys_online+0x8e/0xc0
[<0000001ecd7add42>] device_online+0x5a/0xc8
[<0000001ecd7d0430>] state_store+0x88/0x118
[<0000001ecd5b9f62>] kernfs_fop_write+0xc2/0x200
[<0000001ecd5064b6>] vfs_write+0x176/0x1e0
[<0000001ecd50676a>] ksys_write+0xa2/0x100
[<0000001ecda315d4>] system_call+0xd8/0x2c8

Fix this by checking debug_pagealloc_enabled_static() before calling
kernel_map_pages().  Backports for kernels before 5.5 should use
debug_pagealloc_enabled() instead.  Also add comments.

Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.cz
Fixes: cd02cf1aceea ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h  |    4 ++++
 mm/memory_hotplug.c |    8 +++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

--- a/include/linux/mm.h~mm-hotplug-fix-page-online-with-debug_pagealloc-compiled-but-not-enabled
+++ a/include/linux/mm.h
@@ -2715,6 +2715,10 @@ static inline bool debug_pagealloc_enabl
 #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
 extern void __kernel_map_pages(struct page *page, int numpages, int enable);
 
+/*
+ * When called in DEBUG_PAGEALLOC context, the call should most likely be
+ * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
+ */
 static inline void
 kernel_map_pages(struct page *page, int numpages, int enable)
 {
--- a/mm/memory_hotplug.c~mm-hotplug-fix-page-online-with-debug_pagealloc-compiled-but-not-enabled
+++ a/mm/memory_hotplug.c
@@ -574,7 +574,13 @@ EXPORT_SYMBOL_GPL(restore_online_page_ca
 
 void generic_online_page(struct page *page, unsigned int order)
 {
-	kernel_map_pages(page, 1 << order, 1);
+	/*
+	 * Freeing the page with debug_pagealloc enabled will try to unmap it,
+	 * so we should map it first. This is better than introducing a special
+	 * case in page freeing fast path.
+	 */
+	if (debug_pagealloc_enabled_static())
+		kernel_map_pages(page, 1 << order, 1);
 	__free_pages_core(page, order);
 	totalram_pages_add(1UL << order);
 #ifdef CONFIG_HIGHMEM
_

* + mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (4 preceding siblings ...)
  2020-03-06  6:28 ` [patch 6/7] mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled Andrew Morton
@ 2020-03-07 20:58 ` Andrew Morton
  2020-03-10 23:59 ` + kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch " Andrew Morton
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-07 20:58 UTC (permalink / raw)
  To: bhe, david, mhocko, mm-commits, osalvador, richardw.yang, rppt,
	stable


The patch titled
     Subject: mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
has been added to the -mm tree.  Its filename is
     mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Baoquan He <bhe@redhat.com>
Subject: mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case

In section_deactivate(), pfn_to_page() doesn't work any more after
ms->section_mem_map is reset to NULL in the SPARSEMEM|!VMEMMAP case.
This causes hot remove to fail:

kernel BUG at mm/page_alloc.c:4806!
invalid opcode: 0000 [#1] SMP PTI
CPU: 3 PID: 8 Comm: kworker/u16:0 Tainted: G        W         5.5.0-next-20200205+ #340
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
RIP: 0010:free_pages+0x85/0xa0
Call Trace:
 __remove_pages+0x99/0xc0
 arch_remove_memory+0x23/0x4d
 try_remove_memory+0xc8/0x130
 ? walk_memory_blocks+0x72/0xa0
 __remove_memory+0xa/0x11
 acpi_memory_device_remove+0x72/0x100
 acpi_bus_trim+0x55/0x90
 acpi_device_hotplug+0x2eb/0x3d0
 acpi_hotplug_work_fn+0x1a/0x30
 process_one_work+0x1a7/0x370
 worker_thread+0x30/0x380
 ? flush_rcu_work+0x30/0x30
 kthread+0x112/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x35/0x40

Let's move the ->section_mem_map reset to after
depopulate_section_memmap() to fix it.
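
In effect, the teardown ordering becomes (condensed from the patch below):

	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
	...
	/* still needs pfn_to_page(), i.e. a valid section_mem_map,
	 * in the SPARSEMEM|!VMEMMAP case */
	depopulate_section_memmap(pfn, nr_pages, altmap);

	if (empty)
		ms->section_mem_map = (unsigned long)NULL;	/* only now */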

Link: http://lkml.kernel.org/r/20200307084229.28251-2-bhe@redhat.com
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/sparse.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/mm/sparse.c~mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case
+++ a/mm/sparse.c
@@ -734,6 +734,7 @@ static void section_deactivate(unsigned
 	struct mem_section *ms = __pfn_to_section(pfn);
 	bool section_is_early = early_section(ms);
 	struct page *memmap = NULL;
+	bool empty = false;
 	unsigned long *subsection_map = ms->usage
 		? &ms->usage->subsection_map[0] : NULL;
 
@@ -764,7 +765,8 @@ static void section_deactivate(unsigned
 	 * For 2/ and 3/ the SPARSEMEM_VMEMMAP={y,n} cases are unified
 	 */
 	bitmap_xor(subsection_map, map, subsection_map, SUBSECTIONS_PER_SECTION);
-	if (bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION)) {
+	empty = bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION);
+	if (empty) {
 		unsigned long section_nr = pfn_to_section_nr(pfn);
 
 		/*
@@ -779,13 +781,15 @@ static void section_deactivate(unsigned
 			ms->usage = NULL;
 		}
 		memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
-		ms->section_mem_map = (unsigned long)NULL;
 	}
 
 	if (section_is_early && memmap)
 		free_map_bootmem(memmap);
 	else
 		depopulate_section_memmap(pfn, nr_pages, altmap);
+
+	if (empty)
+		ms->section_mem_map = (unsigned long)NULL;
 }
 
 static struct page * __meminit section_activate(int nid, unsigned long pfn,
_

Patches currently in -mm which might be from bhe@redhat.com are

mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch
mm-hotplug-only-respect-mem=-parameter-during-boot-stage.patch


* + kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (5 preceding siblings ...)
  2020-03-07 20:58 ` + mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch added to -mm tree Andrew Morton
@ 2020-03-10 23:59 ` Andrew Morton
  2020-03-12  1:08 ` + page-flags-fix-a-crash-at-setpageerrorthp_swap.patch " Andrew Morton
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-10 23:59 UTC (permalink / raw)
  To: ast, ebiggers, gregkh, jeffv, jeyu, keescook, mcgrof, mm-commits,
	stable


The patch titled
     Subject: kmod: make request_module() return an error when autoloading is disabled
has been added to the -mm tree.  Its filename is
     kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: kmod: make request_module() return an error when autoloading is disabled

It's long been possible to disable kernel module autoloading completely by
setting /proc/sys/kernel/modprobe to the empty string.  This can be
preferable to setting it to a nonexistent file since it avoids the
overhead of an attempted execve(), avoids potential deadlocks, and avoids
the call to security_kernel_module_request() and thus on SELinux-based
systems eliminates the need to write SELinux rules to dontaudit
module_request.

However, when module autoloading is disabled in this way, request_module()
returns 0.  This is broken because callers expect 0 to mean that the
module was successfully loaded.

Apparently this was never noticed because this method of disabling module
autoloading isn't used much, and also most callers don't use the return
value of request_module() since it's always necessary to check whether the
module registered its functionality or not anyway.  But improperly
returning 0 can indeed confuse a few callers, for example get_fs_type() in
fs/filesystems.c where it causes a WARNING to be hit:

	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
		fs = __get_fs_type(name, len);
		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
	}

This is easily reproduced with:

	echo > /proc/sys/kernel/modprobe
	mount -t NONEXISTENT none /

It causes:

	request_module fs-NONEXISTENT succeeded, but still no fs?
	WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
	[...]

Arguably this warning is broken and should be removed, since the module
could have been unloaded already.  However, request_module() should also
correctly return an error when it fails.  So let's make it return -ENOENT,
which matches the error when the modprobe binary doesn't exist.
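
With that, a caller that does check the return value can treat
"autoloading disabled" like any other load failure (a hypothetical
caller, for illustration):

	err = request_module("fs-%.*s", len, name);
	if (err)
		return err;	/* now -ENOENT when modprobe_path is empty */
	/* module loaded (or already present): retry the lookup */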

Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kmod.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/kmod.c~kmod-make-request_module-return-an-error-when-autoloading-is-disabled
+++ a/kernel/kmod.c
@@ -120,7 +120,7 @@ out:
  * invoke it.
  *
  * If module auto-loading support is disabled then this function
- * becomes a no-operation.
+ * simply returns -ENOENT.
  */
 int __request_module(bool wait, const char *fmt, ...)
 {
@@ -137,7 +137,7 @@ int __request_module(bool wait, const ch
 	WARN_ON_ONCE(wait && current_is_async());
 
 	if (!modprobe_path[0])
-		return 0;
+		return -ENOENT;
 
 	va_start(args, fmt);
 	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
_

Patches currently in -mm which might be from ebiggers@google.com are

kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch


* + page-flags-fix-a-crash-at-setpageerrorthp_swap.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (6 preceding siblings ...)
  2020-03-10 23:59 ` + kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch " Andrew Morton
@ 2020-03-12  1:08 ` Andrew Morton
  2020-03-12  2:58 ` + list-prevent-compiler-reloads-inside-safe-list-iteration.patch " Andrew Morton
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-12  1:08 UTC (permalink / raw)
  To: cai, david, mm-commits, stable, ying.huang


The patch titled
     Subject: page-flags: fix a crash at SetPageError(THP_SWAP)
has been added to the -mm tree.  Its filename is
     page-flags-fix-a-crash-at-setpageerrorthp_swap.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/page-flags-fix-a-crash-at-setpageerrorthp_swap.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/page-flags-fix-a-crash-at-setpageerrorthp_swap.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Qian Cai <cai@lca.pw>
Subject: page-flags: fix a crash at SetPageError(THP_SWAP)

Commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped
out") added support for writing a THP to a swap device, but forgot to
update the page-flag policy set by the older commit df8c94d13c7e
("page-flags: define behavior of FS/IO-related flags on compound pages").
This can trigger a crash while swapping out a THP with DEBUG_VM_PGFLAGS=y:

kernel BUG at include/linux/page-flags.h:317!

page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
page:fffff3b2ec3a8000 refcount:512 mapcount:0 mapping:000000009eb0338c
index:0x7f6e58200 head:fffff3b2ec3a8000 order:9 compound_mapcount:0
compound_pincount:0
anon flags:
0x45fffe0000d8454(uptodate|lru|workingset|owner_priv_1|writeback|head|reclaim|swapbacked)

end_swap_bio_write()
  SetPageError(page)
    VM_BUG_ON_PAGE(1 && PageCompound(page))

<IRQ>
bio_endio+0x297/0x560
dec_pending+0x218/0x430 [dm_mod]
clone_endio+0xe4/0x2c0 [dm_mod]
bio_endio+0x297/0x560
blk_update_request+0x201/0x920
scsi_end_request+0x6b/0x4b0
scsi_io_completion+0x509/0x7e0
scsi_finish_command+0x1ed/0x2a0
scsi_softirq_done+0x1c9/0x1d0
__blk_mqnterrupt+0xf/0x20
</IRQ>

Fix by checking PF_NO_TAIL in those places instead.
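
For reference, the two policies (paraphrased from the comments in
include/linux/page-flags.h):

	/*
	 * PF_NO_COMPOUND: the flag is not relevant for compound pages;
	 *		   with DEBUG_VM_PGFLAGS=y any compound page trips
	 *		   VM_BUG_ON_PAGE(1 && PageCompound(page))
	 * PF_NO_TAIL:	   modifications operate on the head page; only
	 *		   true tail pages are rejected
	 */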

Link: http://lkml.kernel.org/r/20200310235846.1319-1-cai@lca.pw
Fixes: bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-flags.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/page-flags.h~page-flags-fix-a-crash-at-setpageerrorthp_swap
+++ a/include/linux/page-flags.h
@@ -311,7 +311,7 @@ static inline int TestClearPage##uname(s
 
 __PAGEFLAG(Locked, locked, PF_NO_TAIL)
 PAGEFLAG(Waiters, waiters, PF_ONLY_HEAD) __CLEARPAGEFLAG(Waiters, waiters, PF_ONLY_HEAD)
-PAGEFLAG(Error, error, PF_NO_COMPOUND) TESTCLEARFLAG(Error, error, PF_NO_COMPOUND)
+PAGEFLAG(Error, error, PF_NO_TAIL) TESTCLEARFLAG(Error, error, PF_NO_TAIL)
 PAGEFLAG(Referenced, referenced, PF_HEAD)
 	TESTCLEARFLAG(Referenced, referenced, PF_HEAD)
 	__SETPAGEFLAG(Referenced, referenced, PF_HEAD)
_

Patches currently in -mm which might be from cai@lca.pw are

page-flags-fix-a-crash-at-setpageerrorthp_swap.patch
mm-disable-kcsan-for-kmemleak.patch
mm-swapfile-fix-data-races-in-try_to_unuse.patch
kasan-detect-negative-size-in-memory-operation-function-fix.patch
mm-vmscan-fix-data-races-at-kswapd_classzone_idx.patch
percpu_counter-fix-a-data-race-at-vm_committed_as.patch
mm-frontswap-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races-v2.patch
mm-swap_state-mark-various-intentional-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races-v2.patch
mm-page_counter-fix-various-data-races-at-memsw.patch
mm-memcontrol-fix-a-data-race-in-scan-count.patch
mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
mm-mempool-fix-a-data-race-in-mempool_free.patch
mm-util-annotate-an-data-race-at-vm_committed_as.patch
mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
mm-annotate-a-data-race-in-page_zonenum.patch
mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch


* + list-prevent-compiler-reloads-inside-safe-list-iteration.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (7 preceding siblings ...)
  2020-03-12  1:08 ` + page-flags-fix-a-crash-at-setpageerrorthp_swap.patch " Andrew Morton
@ 2020-03-12  2:58 ` Andrew Morton
  2020-03-14 14:13   ` Paul E. McKenney
  2020-03-12 22:29 ` + fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch " Andrew Morton
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2020-03-12  2:58 UTC (permalink / raw)
  To: chris, David.Laight, elver, mark.rutland, mm-commits, paulmck,
	rdunlap, stable


The patch titled
     Subject: lib/list: prevent compiler reloads inside 'safe' list iteration
has been added to the -mm tree.  Its filename is
     list-prevent-compiler-reloads-inside-safe-list-iteration.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/list-prevent-compiler-reloads-inside-safe-list-iteration.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/list-prevent-compiler-reloads-inside-safe-list-iteration.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Wilson <chris@chris-wilson.co.uk>
Subject: lib/list: prevent compiler reloads inside 'safe' list iteration

Instruct the compiler to read the next element in the list iteration
once, and forbid it from reloading that value from the stale element
later.  This is important as, during the course of the safe iteration,
the stale element may be poisoned (unbeknownst to the compiler).

This helps prevent kcsan warnings over 'unsafe' conduct in releasing the
list elements during list_for_each_entry_safe() and friends.
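
A condensed sketch of the hazard (illustrative):

	list_for_each_entry_safe(pos, n, head, member)
		kfree(pos);	/* may poison pos->member.next */

	/* 'n' is computed as list_next_entry(pos, member), i.e. from
	 * pos->member.next.  Without READ_ONCE() the compiler is free to
	 * reload that pointer from the now-stale element instead of
	 * keeping the value it read before the body ran. */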

Link: http://lkml.kernel.org/r/20200310092119.14965-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/list.h |   50 +++++++++++++++++++++++++++++------------
 1 file changed, 36 insertions(+), 14 deletions(-)

--- a/include/linux/list.h~list-prevent-compiler-reloads-inside-safe-list-iteration
+++ a/include/linux/list.h
@@ -537,6 +537,17 @@ static inline void list_splice_tail_init
 	list_entry((pos)->member.next, typeof(*(pos)), member)
 
 /**
+ * list_next_entry_safe - get the next element in list [once]
+ * @pos:	the type * to cursor
+ * @member:	the name of the list_head within the struct.
+ *
+ * Like list_next_entry() but prevents the compiler from reloading the
+ * next element.
+ */
+#define list_next_entry_safe(pos, member) \
+	list_entry(READ_ONCE((pos)->member.next), typeof(*(pos)), member)
+
+/**
  * list_prev_entry - get the prev element in list
  * @pos:	the type * to cursor
  * @member:	the name of the list_head within the struct.
@@ -545,6 +556,17 @@ static inline void list_splice_tail_init
 	list_entry((pos)->member.prev, typeof(*(pos)), member)
 
 /**
+ * list_prev_entry_safe - get the prev element in list [once]
+ * @pos:	the type * to cursor
+ * @member:	the name of the list_head within the struct.
+ *
+ * Like list_prev_entry() but prevents the compiler from reloading the
+ * previous element.
+ */
+#define list_prev_entry_safe(pos, member) \
+	list_entry(READ_ONCE((pos)->member.prev), typeof(*(pos)), member)
+
+/**
  * list_for_each	-	iterate over a list
  * @pos:	the &struct list_head to use as a loop cursor.
  * @head:	the head for your list.
@@ -686,9 +708,9 @@ static inline void list_splice_tail_init
  */
 #define list_for_each_entry_safe(pos, n, head, member)			\
 	for (pos = list_first_entry(head, typeof(*pos), member),	\
-		n = list_next_entry(pos, member);			\
+		n = list_next_entry_safe(pos, member);			\
 	     &pos->member != (head); 					\
-	     pos = n, n = list_next_entry(n, member))
+	     pos = n, n = list_next_entry_safe(n, member))
 
 /**
  * list_for_each_entry_safe_continue - continue list iteration safe against removal
@@ -700,11 +722,11 @@ static inline void list_splice_tail_init
  * Iterate over list of given type, continuing after current point,
  * safe against removal of list entry.
  */
-#define list_for_each_entry_safe_continue(pos, n, head, member) 		\
-	for (pos = list_next_entry(pos, member), 				\
-		n = list_next_entry(pos, member);				\
-	     &pos->member != (head);						\
-	     pos = n, n = list_next_entry(n, member))
+#define list_for_each_entry_safe_continue(pos, n, head, member) 	\
+	for (pos = list_next_entry(pos, member), 			\
+		n = list_next_entry_safe(pos, member);			\
+	     &pos->member != (head);					\
+	     pos = n, n = list_next_entry_safe(n, member))
 
 /**
  * list_for_each_entry_safe_from - iterate over list from current point safe against removal
@@ -716,10 +738,10 @@ static inline void list_splice_tail_init
  * Iterate over list of given type from current point, safe against
  * removal of list entry.
  */
-#define list_for_each_entry_safe_from(pos, n, head, member) 			\
-	for (n = list_next_entry(pos, member);					\
-	     &pos->member != (head);						\
-	     pos = n, n = list_next_entry(n, member))
+#define list_for_each_entry_safe_from(pos, n, head, member) 		\
+	for (n = list_next_entry_safe(pos, member);			\
+	     &pos->member != (head);					\
+	     pos = n, n = list_next_entry_safe(n, member))
 
 /**
  * list_for_each_entry_safe_reverse - iterate backwards over list safe against removal
@@ -733,9 +755,9 @@ static inline void list_splice_tail_init
  */
 #define list_for_each_entry_safe_reverse(pos, n, head, member)		\
 	for (pos = list_last_entry(head, typeof(*pos), member),		\
-		n = list_prev_entry(pos, member);			\
+		n = list_prev_entry_safe(pos, member);			\
 	     &pos->member != (head); 					\
-	     pos = n, n = list_prev_entry(n, member))
+	     pos = n, n = list_prev_entry_safe(n, member))
 
 /**
  * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
@@ -750,7 +772,7 @@ static inline void list_splice_tail_init
  * completing the current iteration of the loop body.
  */
 #define list_safe_reset_next(pos, n, member)				\
-	n = list_next_entry(pos, member)
+	n = list_next_entry_safe(pos, member)
 
 /*
  * Double linked lists with a single pointer list head.
_

Patches currently in -mm which might be from chris@chris-wilson.co.uk are

list-prevent-compiler-reloads-inside-safe-list-iteration.patch


* + fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (8 preceding siblings ...)
  2020-03-12  2:58 ` + list-prevent-compiler-reloads-inside-safe-list-iteration.patch " Andrew Morton
@ 2020-03-12 22:29 ` Andrew Morton
  2020-03-12 22:35 ` + mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch " Andrew Morton
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-12 22:29 UTC (permalink / raw)
  To: ast, ebiggers, gregkh, jeffv, jeyu, keescook, mcgrof, mm-commits,
	neilb, stable


The patch titled
     Subject: fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
has been added to the -mm tree.  Its filename is
     fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers <ebiggers@google.com>
Subject: fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()

After request_module(), nothing stops the module from being unloaded
until someone takes a reference to it via try_get_module().

The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
running 'rmmod' concurrently.

Since WARN_ONCE() is for kernel bugs only, not for user-reachable
situations, downgrade this warning to pr_warn_once().
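
The benign sequence, as a two-CPU style timeline (illustrative):

	get_fs_type()				userspace
	request_module("fs-foo") -> 0
						rmmod foo
	fs = __get_fs_type(name, len) -> NULL

Nothing in that sequence is a kernel bug, so pr_warn_once() is the
appropriate response.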

Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Cc: <stable@vger.kernel.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/filesystems.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/filesystems.c~fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once
+++ a/fs/filesystems.c
@@ -272,7 +272,9 @@ struct file_system_type *get_fs_type(con
 	fs = __get_fs_type(name, len);
 	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
 		fs = __get_fs_type(name, len);
-		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
+		if (!fs)
+			pr_warn_once("request_module fs-%.*s succeeded, but still no fs?\n",
+				     len, name);
 	}
 
 	if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
_

Patches currently in -mm which might be from ebiggers@google.com are

kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch
fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch
docs-admin-guide-document-the-kernelmodprobe-sysctl.patch
selftests-kmod-test-disabling-module-autoloading.patch


* + mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (9 preceding siblings ...)
  2020-03-12 22:29 ` + fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch " Andrew Morton
@ 2020-03-12 22:35 ` Andrew Morton
  2020-03-12 22:35 ` + mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch " Andrew Morton
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-12 22:35 UTC (permalink / raw)
  To: chris, guro, hannes, mhocko, mm-commits, natechancellor, stable,
	tj


The patch titled
     Subject: mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
has been added to the -mm tree.  Its filename is
     mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: mm, memcg: fix corruption on 64-bit divisor in memory.high throttling

Commit 0e4b01df8659 had a bunch of fixups to use the right division
method.  However, it seems that after all that it still wasn't right:
div_u64() takes a 32-bit divisor.

The headroom is still large (2^32 pages), so on mundane systems you won't
hit this, but this should definitely be fixed.
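
For reference, the two helpers differ only in the divisor type (their
declarations live in include/linux/math64.h):

	u64 div_u64(u64 dividend, u32 divisor);
	u64 div64_u64(u64 dividend, u64 divisor);

Passing the unsigned long clamped_high to div_u64() silently truncates
the divisor to 32 bits once it exceeds 2^32.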

Link: http://lkml.kernel.org/r/80780887060514967d414b3cd91f9a316a16ab98.1584036142.git.chris@chrisdown.name
Fixes: 0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over memory.high")
Signed-off-by: Chris Down <chris@chrisdown.name>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: <stable@vger.kernel.org>	[5.4.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memcontrol.c~mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling
+++ a/mm/memcontrol.c
@@ -2350,7 +2350,7 @@ void mem_cgroup_handle_over_high(void)
 	 */
 	clamped_high = max(high, 1UL);
 
-	overage = div_u64((u64)(usage - high) << MEMCG_DELAY_PRECISION_SHIFT,
+	overage = div64_u64((u64)(usage - high) << MEMCG_DELAY_PRECISION_SHIFT,
 			  clamped_high);
 
 	penalty_jiffies = ((u64)overage * overage * HZ)
_

Patches currently in -mm which might be from chris@chrisdown.name are

mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch
mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch


* + mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (10 preceding siblings ...)
  2020-03-12 22:35 ` + mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch " Andrew Morton
@ 2020-03-12 22:35 ` Andrew Morton
  2020-03-13  0:26 ` + mm-do-not-allow-madv_pageout-for-cow-pages.patch " Andrew Morton
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-12 22:35 UTC (permalink / raw)
  To: chris, guro, hannes, mhocko, mm-commits, natechancellor, stable,
	tj


The patch titled
     Subject: mm, memcg: throttle allocators based on ancestral memory.high
has been added to the -mm tree.  Its filename is
     mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Chris Down <chris@chrisdown.name>
Subject: mm, memcg: throttle allocators based on ancestral memory.high

Prior to this commit, we only directly check the affected cgroup's
memory.high against its usage.  However, it's possible that we are being
reclaimed as a result of hitting an ancestor memory.high and should be
penalised based on that, instead.

This patch changes memory.high overage throttling to use the largest
overage in its ancestors when considering how many penalty jiffies to
charge.  This makes sure that we penalise poorly behaving cgroups in the
same way regardless of at what level of the hierarchy memory.high was
breached.
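
A worked example with a hypothetical hierarchy:

	parent   memory.high=100M  usage=150M	-> overage 50%
	`- child memory.high=max   usage=20M	-> overage  0%

Previously, a task in child charging pages was penalised based only on
child's own (zero) overage; with this patch, the walk up the hierarchy
picks the parent's 50% as max_overage and throttles the task accordingly.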

Link: http://lkml.kernel.org/r/8cd132f84bd7e16cdb8fde3378cdbf05ba00d387.1584036142.git.chris@chrisdown.name
Fixes: 0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over memory.high")
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>	[5.4.x+]
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   93 ++++++++++++++++++++++++++++------------------
 1 file changed, 58 insertions(+), 35 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh
+++ a/mm/memcontrol.c
@@ -2308,28 +2308,41 @@ static void high_work_func(struct work_s
  #define MEMCG_DELAY_SCALING_SHIFT 14
 
 /*
- * Scheduled by try_charge() to be executed from the userland return path
- * and reclaims memory over the high limit.
+ * Get the number of jiffies that we should penalise a mischievous cgroup which
+ * is exceeding its memory.high by checking both it and its ancestors.
  */
-void mem_cgroup_handle_over_high(void)
+static unsigned long calculate_high_delay(struct mem_cgroup *memcg,
+					  unsigned int nr_pages)
 {
-	unsigned long usage, high, clamped_high;
-	unsigned long pflags;
-	unsigned long penalty_jiffies, overage;
-	unsigned int nr_pages = current->memcg_nr_pages_over_high;
-	struct mem_cgroup *memcg;
+	unsigned long penalty_jiffies;
+	u64 max_overage = 0;
 
-	if (likely(!nr_pages))
-		return;
+	do {
+		unsigned long usage, high;
+		u64 overage;
+
+		usage = page_counter_read(&memcg->memory);
+		high = READ_ONCE(memcg->high);
+
+		/*
+		 * Prevent division by 0 in overage calculation by acting as if
+		 * it was a threshold of 1 page
+		 */
+		high = max(high, 1UL);
+
+		overage = usage - high;
+		overage <<= MEMCG_DELAY_PRECISION_SHIFT;
+		overage = div64_u64(overage, high);
+
+		if (overage > max_overage)
+			max_overage = overage;
+	} while ((memcg = parent_mem_cgroup(memcg)) &&
+		 !mem_cgroup_is_root(memcg));
 
-	memcg = get_mem_cgroup_from_mm(current->mm);
-	reclaim_high(memcg, nr_pages, GFP_KERNEL);
-	current->memcg_nr_pages_over_high = 0;
+	if (!max_overage)
+		return 0;
 
 	/*
-	 * memory.high is breached and reclaim is unable to keep up. Throttle
-	 * allocators proactively to slow down excessive growth.
-	 *
 	 * We use overage compared to memory.high to calculate the number of
 	 * jiffies to sleep (penalty_jiffies). Ideally this value should be
 	 * fairly lenient on small overages, and increasingly harsh when the
@@ -2337,24 +2350,9 @@ void mem_cgroup_handle_over_high(void)
 	 * its crazy behaviour, so we exponentially increase the delay based on
 	 * overage amount.
 	 */
-
-	usage = page_counter_read(&memcg->memory);
-	high = READ_ONCE(memcg->high);
-
-	if (usage <= high)
-		goto out;
-
-	/*
-	 * Prevent division by 0 in overage calculation by acting as if it was a
-	 * threshold of 1 page
-	 */
-	clamped_high = max(high, 1UL);
-
-	overage = div64_u64((u64)(usage - high) << MEMCG_DELAY_PRECISION_SHIFT,
-			  clamped_high);
-
-	penalty_jiffies = ((u64)overage * overage * HZ)
-		>> (MEMCG_DELAY_PRECISION_SHIFT + MEMCG_DELAY_SCALING_SHIFT);
+	penalty_jiffies = max_overage * max_overage * HZ;
+	penalty_jiffies >>= MEMCG_DELAY_PRECISION_SHIFT;
+	penalty_jiffies >>= MEMCG_DELAY_SCALING_SHIFT;
 
 	/*
 	 * Factor in the task's own contribution to the overage, such that four
@@ -2371,7 +2369,32 @@ void mem_cgroup_handle_over_high(void)
 	 * application moving forwards and also permit diagnostics, albeit
 	 * extremely slowly.
 	 */
-	penalty_jiffies = min(penalty_jiffies, MEMCG_MAX_HIGH_DELAY_JIFFIES);
+	return min(penalty_jiffies, MEMCG_MAX_HIGH_DELAY_JIFFIES);
+}
+
+/*
+ * Scheduled by try_charge() to be executed from the userland return path
+ * and reclaims memory over the high limit.
+ */
+void mem_cgroup_handle_over_high(void)
+{
+	unsigned long penalty_jiffies;
+	unsigned long pflags;
+	unsigned int nr_pages = current->memcg_nr_pages_over_high;
+	struct mem_cgroup *memcg;
+
+	if (likely(!nr_pages))
+		return;
+
+	memcg = get_mem_cgroup_from_mm(current->mm);
+	reclaim_high(memcg, nr_pages, GFP_KERNEL);
+	current->memcg_nr_pages_over_high = 0;
+
+	/*
+	 * memory.high is breached and reclaim is unable to keep up. Throttle
+	 * allocators proactively to slow down excessive growth.
+	 */
+	penalty_jiffies = calculate_high_delay(memcg, nr_pages);
 
 	/*
 	 * Don't sleep if the amount of jiffies this memcg owes us is so low
_

Patches currently in -mm which might be from chris@chrisdown.name are

mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch
mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch


^ permalink raw reply	[flat|nested] 16+ messages in thread

* + mm-do-not-allow-madv_pageout-for-cow-pages.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (11 preceding siblings ...)
  2020-03-12 22:35 ` + mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch " Andrew Morton
@ 2020-03-13  0:26 ` Andrew Morton
  2020-03-13  3:25 ` + selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch " Andrew Morton
  2020-03-20 23:48 ` + mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch " Andrew Morton
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-13  0:26 UTC (permalink / raw)
  To: dancol, dave.hansen, jannh, joel, mhocko, minchan, mm-commits,
	stable, vbabka


The patch titled
     Subject: mm: do not allow MADV_PAGEOUT for CoW pages
has been added to the -mm tree.  Its filename is
     mm-do-not-allow-madv_pageout-for-cow-pages.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-do-not-allow-madv_pageout-for-cow-pages.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-do-not-allow-madv_pageout-for-cow-pages.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: mm: do not allow MADV_PAGEOUT for CoW pages

Jann has brought up a very interesting point [1].  While shared pages
are excluded from MADV_PAGEOUT normally, CoW pages can be easily
reclaimed that way.  This can lead to all sorts of hard-to-debug
problems, e.g. the performance problems outlined by Daniel [2].

There are runtime environments where substantial memory is shared among
security domains via CoW, and an easy way to reclaim that memory, which
MADV_{COLD,PAGEOUT} offers, can lead either to performance degradation
for the parent process, which might be more privileged, or even open
side-channel attacks.

The feasibility of the latter is not really clear to me, but there is
no real reason for the exposure at this stage.  There seems to be no
real use case that depends on reclaiming CoW memory via madvise, so it
is much easier to simply disallow it, and that is what this patch does.
Put simply, MADV_{PAGEOUT,COLD} can operate only on exclusively owned
memory, which is a straightforward semantic.

[1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
[2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com
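
For context, a minimal user-space sketch of the pattern this patch
restricts (illustrative only; the mapping size is arbitrary, and the
fallback #define covers libcs that do not expose MADV_PAGEOUT yet):

  #include <sys/mman.h>
  #include <string.h>
  #include <unistd.h>

  #ifndef MADV_PAGEOUT
  #define MADV_PAGEOUT 21         /* from the uapi mman headers */
  #endif

  int main(void)
  {
          size_t len = 16 * 4096;
          char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          if (p == MAP_FAILED)
                  return 1;
          memset(p, 1, len);      /* exclusively owned: still eligible */

          if (fork() == 0) {
                  /* After fork() these pages are CoW-shared with the
                   * parent.  With this patch the child's MADV_PAGEOUT
                   * skips them (page_mapcount != 1) instead of paging
                   * the parent's working set out from under it. */
                  madvise(p, len, MADV_PAGEOUT);
                  _exit(0);
          }
          return 0;
  }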

Link: http://lkml.kernel.org/r/20200312082248.GS23944@dhcp22.suse.cz
Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/madvise.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- a/mm/madvise.c~mm-do-not-allow-madv_pageout-for-cow-pages
+++ a/mm/madvise.c
@@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_r
 		}
 
 		page = pmd_page(orig_pmd);
+
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			goto huge_unlock;
+
 		if (next - addr != HPAGE_PMD_SIZE) {
 			int err;
 
-			if (page_mapcount(page) != 1)
-				goto huge_unlock;
-
 			get_page(page);
 			spin_unlock(ptl);
 			lock_page(page);
@@ -426,6 +428,10 @@ regular_page:
 			continue;
 		}
 
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			continue;
+
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
 
 		if (pte_young(ptent)) {
_

Patches currently in -mm which might be from mhocko@suse.com are

mm-do-not-allow-madv_pageout-for-cow-pages.patch


^ permalink raw reply	[flat|nested] 16+ messages in thread

* + selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (12 preceding siblings ...)
  2020-03-13  0:26 ` + mm-do-not-allow-madv_pageout-for-cow-pages.patch " Andrew Morton
@ 2020-03-13  3:25 ` Andrew Morton
  2020-03-20 23:48 ` + mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch " Andrew Morton
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-13  3:25 UTC (permalink / raw)
  To: christophe.leroy, leonardo, mm-commits, mpe, shuah, stable


The patch titled
     Subject: selftests/vm: fix map_hugetlb length used for testing read and write
has been added to the -mm tree.  Its filename is
     selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Christophe Leroy <christophe.leroy@c-s.fr>
Subject: selftests/vm: fix map_hugetlb length used for testing read and write

Commit fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page
size in map_hugetlb") added the possibility of changing the size of the
memory mapped for the test, but left the read and write tests using the
default value.  This goes unnoticed when the mapped length is at least
the default one, but segfaults otherwise.

Fix read_bytes() and write_bytes() by giving them the real length.

Also fix the call to munmap() to use the real length.
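
A sketch of the pre-fix failure mode (assuming the test's default LENGTH
of 256MB; variable names as in the test source):

  /* a smaller size chosen on the command line, e.g. one 2MB huge page */
  addr = mmap(NULL, length, PROT_READ | PROT_WRITE,
              MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
  write_bytes(addr);      /* loop still runs i = 0 .. LENGTH (256MB), */
                          /* storing far beyond the 2MB mapping: SIGSEGV */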

Link: http://lkml.kernel.org/r/9a404a13c871c4bd0ba9ede68f69a1225180dd7e.1580978385.git.christophe.leroy@c-s.fr
Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/map_hugetlb.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/tools/testing/selftests/vm/map_hugetlb.c~selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write
+++ a/tools/testing/selftests/vm/map_hugetlb.c
@@ -45,20 +45,20 @@ static void check_bytes(char *addr)
 	printf("First hex is %x\n", *((unsigned int *)addr));
 }
 
-static void write_bytes(char *addr)
+static void write_bytes(char *addr, size_t length)
 {
 	unsigned long i;
 
-	for (i = 0; i < LENGTH; i++)
+	for (i = 0; i < length; i++)
 		*(addr + i) = (char)i;
 }
 
-static int read_bytes(char *addr)
+static int read_bytes(char *addr, size_t length)
 {
 	unsigned long i;
 
 	check_bytes(addr);
-	for (i = 0; i < LENGTH; i++)
+	for (i = 0; i < length; i++)
 		if (*(addr + i) != (char)i) {
 			printf("Mismatch at %lu\n", i);
 			return 1;
@@ -96,11 +96,11 @@ int main(int argc, char **argv)
 
 	printf("Returned address is %p\n", addr);
 	check_bytes(addr);
-	write_bytes(addr);
-	ret = read_bytes(addr);
+	write_bytes(addr, length);
+	ret = read_bytes(addr, length);
 
 	/* munmap() length of MAP_HUGETLB memory must be hugepage aligned */
-	if (munmap(addr, LENGTH)) {
+	if (munmap(addr, length)) {
 		perror("munmap");
 		exit(1);
 	}
_

Patches currently in -mm which might be from christophe.leroy@c-s.fr are

selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: + list-prevent-compiler-reloads-inside-safe-list-iteration.patch added to -mm tree
  2020-03-12  2:58 ` + list-prevent-compiler-reloads-inside-safe-list-iteration.patch " Andrew Morton
@ 2020-03-14 14:13   ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2020-03-14 14:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: chris, David.Laight, elver, mark.rutland, mm-commits, rdunlap,
	stable

On Wed, Mar 11, 2020 at 07:58:09PM -0700, Andrew Morton wrote:
> 
> The patch titled
>      Subject: lib/list: prevent compiler reloads inside 'safe' list iteration
> has been added to the -mm tree.  Its filename is
>      list-prevent-compiler-reloads-inside-safe-list-iteration.patch
> 
> This patch should soon appear at
>     http://ozlabs.org/~akpm/mmots/broken-out/list-prevent-compiler-reloads-inside-safe-list-iteration.patch
> and later at
>     http://ozlabs.org/~akpm/mmotm/broken-out/list-prevent-compiler-reloads-inside-safe-list-iteration.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Chris Wilson <chris@chris-wilson.co.uk>
> Subject: lib/list: prevent compiler reloads inside 'safe' list iteration
> 
> Instruct the compiler to read the next element in the list iteration
> once, and forbid it from reloading that value from the stale element
> later.  This is important as, during the course of the safe iteration,
> the stale element may be poisoned (unbeknownst to the compiler).
> 
> This helps prevent kcsan warnings over 'unsafe' conduct in releasing the
> list elements during list_for_each_entry_safe() and friends.
> 
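> A sketch of the usage pattern this protects (illustrative only;
> 'struct item' and the 'items' list head are hypothetical):
> 
>   struct item {
>           int v;
>           struct list_head link;
>   };
>   struct item *cur, *tmp;
> 
>   list_for_each_entry_safe(cur, tmp, &items, link) {
>           list_del(&cur->link);   /* poisons cur->link.next */
>           kfree(cur);
>           /* if the compiler re-read cur->link.next here to compute
>            * the next 'tmp', it would load freed/poisoned memory;
>            * the READ_ONCE() below forbids that reload */
>   }
> 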
> Link: http://lkml.kernel.org/r/20200310092119.14965-1-chris@chris-wilson.co.uk
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: David Laight <David.Laight@ACULAB.COM>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marco Elver <elver@google.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

One possible concern is that we are overloading the suffix "_safe()",
but that looks better to me than yet another explosion of this API.
It would be good to have uses, of course.

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
> 
>  include/linux/list.h |   50 +++++++++++++++++++++++++++++------------
>  1 file changed, 36 insertions(+), 14 deletions(-)
> 
> --- a/include/linux/list.h~list-prevent-compiler-reloads-inside-safe-list-iteration
> +++ a/include/linux/list.h
> @@ -537,6 +537,17 @@ static inline void list_splice_tail_init
>  	list_entry((pos)->member.next, typeof(*(pos)), member)
>  
>  /**
> + * list_next_entry_safe - get the next element in list [once]
> + * @pos:	the type * to cursor
> + * @member:	the name of the list_head within the struct.
> + *
> + * Like list_next_entry() but prevents the compiler from reloading the
> + * next element.
> + */
> +#define list_next_entry_safe(pos, member) \
> +	list_entry(READ_ONCE((pos)->member.next), typeof(*(pos)), member)
> +
> +/**
>   * list_prev_entry - get the prev element in list
>   * @pos:	the type * to cursor
>   * @member:	the name of the list_head within the struct.
> @@ -545,6 +556,17 @@ static inline void list_splice_tail_init
>  	list_entry((pos)->member.prev, typeof(*(pos)), member)
>  
>  /**
> + * list_prev_entry_safe - get the prev element in list [once]
> + * @pos:	the type * to cursor
> + * @member:	the name of the list_head within the struct.
> + *
> + * Like list_prev_entry() but prevents the compiler from reloading the
> + * previous element.
> + */
> +#define list_prev_entry_safe(pos, member) \
> +	list_entry(READ_ONCE((pos)->member.prev), typeof(*(pos)), member)
> +
> +/**
>   * list_for_each	-	iterate over a list
>   * @pos:	the &struct list_head to use as a loop cursor.
>   * @head:	the head for your list.
> @@ -686,9 +708,9 @@ static inline void list_splice_tail_init
>   */
>  #define list_for_each_entry_safe(pos, n, head, member)			\
>  	for (pos = list_first_entry(head, typeof(*pos), member),	\
> -		n = list_next_entry(pos, member);			\
> +		n = list_next_entry_safe(pos, member);			\
>  	     &pos->member != (head); 					\
> -	     pos = n, n = list_next_entry(n, member))
> +	     pos = n, n = list_next_entry_safe(n, member))
>  
>  /**
>   * list_for_each_entry_safe_continue - continue list iteration safe against removal
> @@ -700,11 +722,11 @@ static inline void list_splice_tail_init
>   * Iterate over list of given type, continuing after current point,
>   * safe against removal of list entry.
>   */
> -#define list_for_each_entry_safe_continue(pos, n, head, member) 		\
> -	for (pos = list_next_entry(pos, member), 				\
> -		n = list_next_entry(pos, member);				\
> -	     &pos->member != (head);						\
> -	     pos = n, n = list_next_entry(n, member))
> +#define list_for_each_entry_safe_continue(pos, n, head, member) 	\
> +	for (pos = list_next_entry(pos, member), 			\
> +		n = list_next_entry_safe(pos, member);			\
> +	     &pos->member != (head);					\
> +	     pos = n, n = list_next_entry_safe(n, member))
>  
>  /**
>   * list_for_each_entry_safe_from - iterate over list from current point safe against removal
> @@ -716,10 +738,10 @@ static inline void list_splice_tail_init
>   * Iterate over list of given type from current point, safe against
>   * removal of list entry.
>   */
> -#define list_for_each_entry_safe_from(pos, n, head, member) 			\
> -	for (n = list_next_entry(pos, member);					\
> -	     &pos->member != (head);						\
> -	     pos = n, n = list_next_entry(n, member))
> +#define list_for_each_entry_safe_from(pos, n, head, member) 		\
> +	for (n = list_next_entry_safe(pos, member);			\
> +	     &pos->member != (head);					\
> +	     pos = n, n = list_next_entry_safe(n, member))
>  
>  /**
>   * list_for_each_entry_safe_reverse - iterate backwards over list safe against removal
> @@ -733,9 +755,9 @@ static inline void list_splice_tail_init
>   */
>  #define list_for_each_entry_safe_reverse(pos, n, head, member)		\
>  	for (pos = list_last_entry(head, typeof(*pos), member),		\
> -		n = list_prev_entry(pos, member);			\
> +		n = list_prev_entry_safe(pos, member);			\
>  	     &pos->member != (head); 					\
> -	     pos = n, n = list_prev_entry(n, member))
> +	     pos = n, n = list_prev_entry_safe(n, member))
>  
>  /**
>   * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
> @@ -750,7 +772,7 @@ static inline void list_splice_tail_init
>   * completing the current iteration of the loop body.
>   */
>  #define list_safe_reset_next(pos, n, member)				\
> -	n = list_next_entry(pos, member)
> +	n = list_next_entry_safe(pos, member)
>  
>  /*
>   * Double linked lists with a single pointer list head.
> _
> 
> Patches currently in -mm which might be from chris@chris-wilson.co.uk are
> 
> list-prevent-compiler-reloads-inside-safe-list-iteration.patch
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* + mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch added to -mm tree
       [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
                   ` (13 preceding siblings ...)
  2020-03-13  3:25 ` + selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch " Andrew Morton
@ 2020-03-20 23:48 ` Andrew Morton
  14 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2020-03-20 23:48 UTC (permalink / raw)
  To: bharata, cl, iamjoonsoo.kim, ktkhai, mgorman, mhocko, mm-commits,
	mpe, nathanl, penberg, puvichakravarthy, rientjes, sachinp,
	srikar, stable, vbabka


The patch titled
     Subject: mm, slub: prevent kmalloc_node crashes and memory leaks
has been added to the -mm tree.  Its filename is
     mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Vlastimil Babka <vbabka@suse.cz>
Subject: mm, slub: prevent kmalloc_node crashes and memory leaks

Sachin reports [1] a crash in SLUB __slab_alloc():

BUG: Kernel NULL pointer dereference on read at 0x000073b0
Faulting instruction address: 0xc0000000003d55f4
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1
NIP:  c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000
REGS: c0000008b37836d0 TRAP: 0300   Not tainted  (5.6.0-rc2-next-20200218-autotest)
MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004844  XER: 00000000
CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1
GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500
GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620
GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000
GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000
GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002
GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122
GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8
GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180
NIP [c0000000003d55f4] ___slab_alloc+0x1f4/0x760
LR [c0000000003d5b94] __slab_alloc+0x34/0x60
Call Trace:
[c0000008b3783960] [c0000000003d5734] ___slab_alloc+0x334/0x760 (unreliable)
[c0000008b3783a40] [c0000000003d5b94] __slab_alloc+0x34/0x60
[c0000008b3783a70] [c0000000003d6fa0] __kmalloc_node+0x110/0x490
[c0000008b3783af0] [c0000000003443d8] kvmalloc_node+0x58/0x110
[c0000008b3783b30] [c0000000003fee38] mem_cgroup_css_online+0x108/0x270
[c0000008b3783b90] [c000000000235aa8] online_css+0x48/0xd0
[c0000008b3783bc0] [c00000000023eaec] cgroup_apply_control_enable+0x2ec/0x4d0
[c0000008b3783ca0] [c000000000242318] cgroup_mkdir+0x228/0x5f0
[c0000008b3783d10] [c00000000051e170] kernfs_iop_mkdir+0x90/0xf0
[c0000008b3783d50] [c00000000043dc00] vfs_mkdir+0x110/0x230
[c0000008b3783da0] [c000000000441c90] do_mkdirat+0xb0/0x1a0
[c0000008b3783e20] [c00000000000b278] system_call+0x5c/0x68

This is a PowerPC platform with the following NUMA topology:

available: 2 nodes (0-1)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 1 size: 35247 MB
node 1 free: 30907 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

possible numa nodes: 0-31

This only happens with an mmotm patch "mm/memcontrol.c: allocate
shrinker_map on appropriate NUMA node" [2] which effectively calls
kmalloc_node() for each possible node.  SLUB however only allocates
kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on
node_to_mem_node() to return such a valid node for other nodes since
commit a561ce00b09e ("slub: fall back to node_to_mem_node() node if
allocating on memoryless node").  This is however not true in this
configuration, where the _node_numa_mem_ array is not initialized for
nodes 0 and 2-31; it thus contains zeroes, and get_partial() ends up
accessing a non-allocated kmem_cache_node.

A related issue was reported by Bharata (originally by Ramachandran)
[3], where a similar PowerPC configuration, but with a mainline kernel
without patch [2], ends up allocating large numbers of pages in the
kmalloc-1k and kmalloc-512 caches.  This seems to have the same
underlying issue of node_to_mem_node() not behaving as expected, and
might also lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4].

This patch should fix both issues by not relying on node_to_mem_node()
anymore and instead simply falling back to NUMA_NO_NODE when
kmalloc_node(node) is attempted for a node that's not online or has no
usable memory.  The "usable memory" condition is also changed from
node_present_pages() to the N_NORMAL_MEMORY node state, as that is
exactly the condition that SLUB uses to allocate kmem_cache_node
structures.  The check in get_partial() is removed completely, as the
checks in ___slab_alloc() are now sufficient to prevent get_partial()
from being reached with an invalid node.
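
A rough sketch of the pre-patch failure path on the topology above
(paraphrasing the description, not verbatim kernel code paths):

  kmalloc_node(size, GFP_KERNEL, 2);  /* node 2: possible, but offline */
    /* get_partial():
     *   node_present_pages(2) == 0
     *   searchnode = node_to_mem_node(2) == 0  (_node_numa_mem_ never
     *                                           initialized, so zero)
     *   get_node(s, 0) == NULL  (no kmem_cache_node on memoryless node 0)
     *   get_partial_node() then dereferences NULL -> the oops above
     */

After the patch, ___slab_alloc() instead sees !node_state(2,
N_NORMAL_MEMORY), falls back to NUMA_NO_NODE and satisfies the
allocation from the local node.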

[1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/
[2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/
[3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/

Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.cz
Fixes: a561ce00b09e ("slub: fall back to node_to_mem_node() node if allocating on memoryless node")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.ibm.com>
Debugged-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |   26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

--- a/mm/slub.c~mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks
+++ a/mm/slub.c
@@ -1973,8 +1973,6 @@ static void *get_partial(struct kmem_cac
 
 	if (node == NUMA_NO_NODE)
 		searchnode = numa_mem_id();
-	else if (!node_present_pages(node))
-		searchnode = node_to_mem_node(node);
 
 	object = get_partial_node(s, get_node(s, searchnode), c, flags);
 	if (object || node != NUMA_NO_NODE)
@@ -2563,17 +2561,27 @@ static void *___slab_alloc(struct kmem_c
 	struct page *page;
 
 	page = c->page;
-	if (!page)
+	if (!page) {
+		/*
+		 * if the node is not online or has no normal memory, just
+		 * ignore the node constraint
+		 */
+		if (unlikely(node != NUMA_NO_NODE &&
+			     !node_state(node, N_NORMAL_MEMORY)))
+			node = NUMA_NO_NODE;
 		goto new_slab;
+	}
 redo:
 
 	if (unlikely(!node_match(page, node))) {
-		int searchnode = node;
-
-		if (node != NUMA_NO_NODE && !node_present_pages(node))
-			searchnode = node_to_mem_node(node);
-
-		if (unlikely(!node_match(page, searchnode))) {
+		/*
+		 * same as above but node_match() being false already
+		 * implies node != NUMA_NO_NODE
+		 */
+		if (!node_state(node, N_NORMAL_MEMORY)) {
+			node = NUMA_NO_NODE;
+			goto redo;
+		} else {
 			stat(s, ALLOC_NODE_MISMATCH);
 			deactivate_slab(s, page, c->freelist, c);
 			goto new_slab;
_

Patches currently in -mm which might be from vbabka@suse.cz are

mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch
revert-topology-add-support-for-node_to_mem_node-to-determine-the-fallback-node.patch
mm-compaction-fully-assume-capture-is-not-null-in-compact_zone_order.patch
mm-hugetlb-remove-unnecessary-memory-fetch-in-pageheadhuge.patch


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-03-20 23:48 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20200305222751.6d781a3f2802d79510941e4e@linux-foundation.org>
2020-03-06  6:28 ` [patch 1/7] mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa Andrew Morton
2020-03-06  6:28 ` [patch 2/7] mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() Andrew Morton
2020-03-06  6:28 ` [patch 3/7] mm: avoid data corruption on CoW fault into PFN-mapped VMA Andrew Morton
2020-03-06  6:28 ` [patch 4/7] fat: fix uninit-memory access for partial initialized inode Andrew Morton
2020-03-06  6:28 ` [patch 6/7] mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled Andrew Morton
2020-03-07 20:58 ` + mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case.patch added to -mm tree Andrew Morton
2020-03-10 23:59 ` + kmod-make-request_module-return-an-error-when-autoloading-is-disabled.patch " Andrew Morton
2020-03-12  1:08 ` + page-flags-fix-a-crash-at-setpageerrorthp_swap.patch " Andrew Morton
2020-03-12  2:58 ` + list-prevent-compiler-reloads-inside-safe-list-iteration.patch " Andrew Morton
2020-03-14 14:13   ` Paul E. McKenney
2020-03-12 22:29 ` + fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once.patch " Andrew Morton
2020-03-12 22:35 ` + mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling.patch " Andrew Morton
2020-03-12 22:35 ` + mm-memcg-throttle-allocators-based-on-ancestral-memoryhigh.patch " Andrew Morton
2020-03-13  0:26 ` + mm-do-not-allow-madv_pageout-for-cow-pages.patch " Andrew Morton
2020-03-13  3:25 ` + selftests-vm-fix-map_hugetlb-length-used-for-testing-read-and-write.patch " Andrew Morton
2020-03-20 23:48 ` + mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks.patch " Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).