All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Mel Gorman <mgorman@techsingularity.net>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Yang Shi <shy828301@gmail.com>, Brian Foster <bfoster@redhat.com>,
	Dan Streetman <ddstreet@ieee.org>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Oleksandr Natalenko <oleksandr@natalenko.name>,
	Seth Jennings <sjenning@redhat.com>,
	Vitaly Wool <vitaly.wool@konsulko.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH 6.0 19/20] mm/huge_memory: do not clobber swp_entry_t during THP split
Date: Mon, 24 Oct 2022 13:31:21 +0200	[thread overview]
Message-ID: <20221024112935.173960536@linuxfoundation.org> (raw)
In-Reply-To: <20221024112934.415391158@linuxfoundation.org>

From: Mel Gorman <mgorman@techsingularity.net>

commit 71e2d666ef85d51834d658830f823560c402b8b6 upstream.

The following has been observed when running stressng mmap since commit
b653db77350c ("mm: Clear page->private when splitting or migrating a page")

   watchdog: BUG: soft lockup - CPU#75 stuck for 26s! [stress-ng:9546]
   CPU: 75 PID: 9546 Comm: stress-ng Tainted: G            E      6.0.0-revert-b653db77-fix+ #29 0357d79b60fb09775f678e4f3f64ef0579ad1374
   Hardware name: SGI.COM C2112-4GP3/X10DRT-P-Series, BIOS 2.0a 05/09/2016
   RIP: 0010:xas_descend+0x28/0x80
   Code: cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 75 08 <48> 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02 89 c2
   RSP: 0018:ffffbbf02a2236a8 EFLAGS: 00000246
   RAX: ffff9cab7d6a0002 RBX: ffffe04b0af88040 RCX: 0000000000000002
   RDX: 0000000000000030 RSI: ffff9cab60509b60 RDI: ffffbbf02a2236c0
   RBP: 0000000000000000 R08: ffff9cab60509b60 R09: ffffbbf02a2236c0
   R10: 0000000000000001 R11: ffffbbf02a223698 R12: 0000000000000000
   R13: ffff9cab4e28da80 R14: 0000000000039c01 R15: ffff9cab4e28da88
   FS:  00007fab89b85e40(0000) GS:ffff9cea3fcc0000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00007fab84e00000 CR3: 00000040b73a4003 CR4: 00000000003706e0
   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   Call Trace:
    <TASK>
    xas_load+0x3a/0x50
    __filemap_get_folio+0x80/0x370
    ? put_swap_page+0x163/0x360
    pagecache_get_page+0x13/0x90
    __try_to_reclaim_swap+0x50/0x190
    scan_swap_map_slots+0x31e/0x670
    get_swap_pages+0x226/0x3c0
    folio_alloc_swap+0x1cc/0x240
    add_to_swap+0x14/0x70
    shrink_page_list+0x968/0xbc0
    reclaim_page_list+0x70/0xf0
    reclaim_pages+0xdd/0x120
    madvise_cold_or_pageout_pte_range+0x814/0xf30
    walk_pgd_range+0x637/0xa30
    __walk_page_range+0x142/0x170
    walk_page_range+0x146/0x170
    madvise_pageout+0xb7/0x280
    ? asm_common_interrupt+0x22/0x40
    madvise_vma_behavior+0x3b7/0xac0
    ? find_vma+0x4a/0x70
    ? find_vma+0x64/0x70
    ? madvise_vma_anon_name+0x40/0x40
    madvise_walk_vmas+0xa6/0x130
    do_madvise+0x2f4/0x360
    __x64_sys_madvise+0x26/0x30
    do_syscall_64+0x5b/0x80
    ? do_syscall_64+0x67/0x80
    ? syscall_exit_to_user_mode+0x17/0x40
    ? do_syscall_64+0x67/0x80
    ? syscall_exit_to_user_mode+0x17/0x40
    ? do_syscall_64+0x67/0x80
    ? do_syscall_64+0x67/0x80
    ? common_interrupt+0x8b/0xa0
    entry_SYSCALL_64_after_hwframe+0x63/0xcd

The problem can be reproduced with the mmtests config
config-workload-stressng-mmap.  It does not always happen and when it
triggers is variable but it has happened on multiple machines.

The intent of commit b653db77350c patch was to avoid the case where
PG_private is clear but folio->private is not-NULL.  However, THP tail
pages uses page->private for "swp_entry_t if folio_test_swapcache()" as
stated in the documentation for struct folio.  This patch only clobbers
page->private for tail pages if the head page was not in swapcache and
warns once if page->private had an unexpected value.

Link: https://lkml.kernel.org/r/20221019134156.zjyyn5aownakvztf@techsingularity.net
Fixes: b653db77350c ("mm: Clear page->private when splitting or migrating a page")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/huge_memory.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2445,7 +2445,16 @@ static void __split_huge_page_tail(struc
 			page_tail);
 	page_tail->mapping = head->mapping;
 	page_tail->index = head->index + tail;
-	page_tail->private = 0;
+
+	/*
+	 * page->private should not be set in tail pages with the exception
+	 * of swap cache pages that store the swp_entry_t in tail pages.
+	 * Fix up and warn once if private is unexpectedly set.
+	 */
+	if (!folio_test_swapcache(page_folio(head))) {
+		VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, head);
+		page_tail->private = 0;
+	}
 
 	/* Page flags must be visible before we make the page non-compound. */
 	smp_wmb();



  parent reply	other threads:[~2022-10-24 11:32 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-24 11:31 [PATCH 6.0 00/20] 6.0.4-rc1 review Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 01/20] drm/i915/bios: Validate fp_timing terminator presence Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 02/20] drm/i915/bios: Use hardcoded fp_timing size for generating LFP data pointers Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 03/20] pinctrl: amd: change dev_warn to dev_dbg for additional feature support Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 04/20] thermal: intel_powerclamp: Use first online CPU as control_cpu Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 05/20] io_uring/net: fail zc send when unsupported by socket Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 06/20] HID: playstation: stop DualSense output work on remove Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 07/20] HID: playstation: add initial DualSense Edge controller support Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 08/20] net: flag sockets supporting msghdr originated zerocopy Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 09/20] drm/amd/pm: fulfill SMU13.0.7 cstate control interface Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 10/20] drm/amd/pm: add SMU IP v13.0.4 IF version define to V7 Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 11/20] drm/amd/pm: disable cstate feature for gpu reset scenario Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 12/20] drm/amd/pm: fulfill SMU13.0.0 cstate control interface Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 13/20] drm/amd/pm: update SMU IP v13.0.4 driver interface version Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 14/20] dm clone: Fix typo in block_device format specifier Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 15/20] efi: efivars: Fix variable writes without query_variable_store() Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 16/20] efi: ssdt: Dont free memory if ACPI table was loaded successfully Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 17/20] gcov: support GCC 12.1 and newer compilers Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 18/20] io-wq: Fix memory leak in worker creation Greg Kroah-Hartman
2022-10-24 11:31 ` Greg Kroah-Hartman [this message]
2022-10-25 15:11   ` [PATCH 6.0 19/20] mm/huge_memory: do not clobber swp_entry_t during THP split Hugh Dickins
2022-10-25 15:58     ` Greg Kroah-Hartman
2022-10-30  3:33       ` Hugh Dickins
2022-10-31  6:44         ` Greg Kroah-Hartman
2022-10-24 11:31 ` [PATCH 6.0 20/20] fbdev/core: Remove remove_conflicting_pci_framebuffers() Greg Kroah-Hartman
2022-11-01  8:42   ` Boris V.
2022-11-01 10:34     ` Thomas Zimmermann
2022-11-01 11:34       ` Boris V.
2022-10-24 15:47 ` [PATCH 6.0 00/20] 6.0.4-rc1 review Luna Jernberg
2022-10-24 19:10 ` Rudi Heitbaum
2022-10-24 19:21 ` Jon Hunter
2022-10-24 19:28 ` Florian Fainelli
2022-10-24 20:48 ` Shuah Khan
2022-10-24 21:55 ` Ron Economos
2022-10-25  0:20 ` Slade Watkins
2022-10-25  7:51 ` Fenil Jain
2022-10-25  9:08 ` Bagas Sanjaya
2022-10-25 12:33 ` Naresh Kamboju
2022-10-25 15:32 ` Guenter Roeck
2022-10-25 22:43 ` Justin Forbes
2022-10-26  6:36 ` Ernst Herzberg
2022-10-26  6:59   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221024112935.173960536@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=bfoster@redhat.com \
    --cc=ddstreet@ieee.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=oleksandr@natalenko.name \
    --cc=shy828301@gmail.com \
    --cc=sjenning@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=vitaly.wool@konsulko.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.