From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org, zhengjun.xing@linux.intel.com,
ying.huang@intel.com, willy@infradead.org,
stable@vger.kernel.org, shy828301@gmail.com, rientjes@google.com,
riel@surriel.com, nathan@kernel.org, feng.tang@intel.com,
fengwei.yin@intel.com, akpm@linux-foundation.org
Subject: [merged mm-stable] mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch removed from -mm tree
Date: Wed, 18 Jan 2023 17:15:37 -0800
Message-ID: <20230119011537.87567C433D2@smtp.kernel.org>
The quilt patch titled
Subject: mm/thp: check and bail out if page in deferred queue already
has been removed from the -mm tree. Its filename was
mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Yin Fengwei <fengwei.yin@intel.com>
Subject: mm/thp: check and bail out if page in deferred queue already
Date: Fri, 23 Dec 2022 21:52:07 +0800
A kernel build regression with LLVM was reported here:
https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/ against
commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
boundaries"), and commit f35b5d7d676e was reverted.
It turned out that the regression is related to ld.lld calling
madvise(MADV_DONTNEED) with a len parameter that is not PMD_SIZE aligned.
trace-bpfcc captured:
531607 531732 ld.lld do_madvise.part.0 start: 0x7feca9000000, len: 0x7fb000, behavior: 0x4
531607 531793 ld.lld do_madvise.part.0 start: 0x7fec86a00000, len: 0x7fb000, behavior: 0x4
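To make the alignment problem concrete, the arithmetic on the traced arguments can be sketched as below. This is only an illustration: it assumes 4 KiB base pages and a 2 MiB PMD_SIZE (the usual x86-64 configuration, not stated in the trace), and the THP_* constants and thp_* helpers are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages and 2 MiB PMD-sized THPs (typical x86-64). */
#define THP_PAGE_SIZE UINT64_C(0x1000)
#define THP_PMD_SIZE  UINT64_C(0x200000)

/* Returns 1 if v is a multiple of the assumed PMD size, 0 otherwise. */
static int thp_pmd_aligned(uint64_t v)
{
	return (v & (THP_PMD_SIZE - 1)) == 0;
}

/* Whole PMD-sized regions covered by a range that starts PMD aligned. */
static uint64_t thp_whole_regions(uint64_t start, uint64_t len)
{
	assert(thp_pmd_aligned(start)); /* sketch only handles an aligned start */
	return len / THP_PMD_SIZE;
}

/* Base pages of the trailing, only partially covered PMD-sized region. */
static uint64_t thp_tail_subpages(uint64_t start, uint64_t len)
{
	assert(thp_pmd_aligned(start)); /* sketch only handles an aligned start */
	return (len % THP_PMD_SIZE) / THP_PAGE_SIZE;
}
```

Under those assumptions, both traced start addresses are PMD aligned while len 0x7fb000 is not: each call covers three whole 2 MiB regions plus 507 of the 512 base pages of a fourth. If that fourth region is backed by a THP, it cannot be unmapped as a whole.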
If the underlying physical page is a THP, madvise(MADV_DONTNEED) can
significantly increase split_queue_lock contention. perf showed the
following data:
    14.85%     0.00%  ld.lld  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
           11.52%
                entry_SYSCALL_64_after_hwframe
                do_syscall_64
                __x64_sys_madvise
                do_madvise.part.0
                zap_page_range
                unmap_single_vma
                unmap_page_range
                page_remove_rmap
                deferred_split_huge_page
                __lock_text_start
                native_queued_spin_lock_slowpath
If a THP can't be removed from rmap as a whole, the partially unmapped THP
is removed from rmap one sub-page at a time. Even if the THP head page has
already been added to the deferred queue, split_queue_lock is still
acquired just to check whether the head page is in the queue. Thus,
contention on split_queue_lock rises.
Before acquiring split_queue_lock, check and bail out early if the THP
head page is in the queue already. The check without holding
split_queue_lock could race with deferred_split_scan, but that doesn't
impact correctness here.
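The check-before-lock pattern described above can be sketched in userspace roughly as follows. This is a simplified analogue, not the kernel code: struct list_node, struct split_queue, struct huge_page, defer_split() and the lock_acquisitions counter are all hypothetical names introduced for illustration, and a C11 atomic_flag spinlock stands in for split_queue_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	assert(list_is_empty(n)); /* a node must not be queued twice */
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-in for the deferred split queue and its lock. */
struct split_queue {
	atomic_flag lock;
	struct list_node head;
	unsigned long len;
	unsigned long lock_acquisitions; /* illustration only */
};

struct huge_page { struct list_node deferred; };

static void queue_init(struct split_queue *q)
{
	atomic_flag_clear(&q->lock);
	list_init(&q->head);
	q->len = 0;
	q->lock_acquisitions = 0;
}

static void defer_split(struct split_queue *q, struct huge_page *p)
{
	/*
	 * Unlocked fast path: if the page is already queued, skip the lock.
	 * This plain read is only safe in a single-threaded demo; concurrent
	 * code would need an atomic or otherwise race-tolerant read here.
	 */
	if (!list_is_empty(&p->deferred))
		return;

	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the lock is free */
	q->lock_acquisitions++;
	/* Re-check under the lock before queueing. */
	if (list_is_empty(&p->deferred)) {
		list_add_tail_node(&p->deferred, &q->head);
		q->len++;
	}
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}
```

Calling defer_split() repeatedly on the same page, as happens once per sub-page of a partially unmapped THP, takes the lock only on the first call in this sketch. Keeping the re-check under the lock means a stale unlocked read can at worst cause one extra lock acquisition, never a double insertion.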
Test result of building kernel with ld.lld:
commit 7b5a0b664ebe (parent commit of f35b5d7d676e):
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
6:07.99 real, 26367.77 user, 5063.35 sys
commit f35b5d7d676e:
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
7:22.15 real, 26235.03 user, 12504.55 sys
commit f35b5d7d676e with this patch applied:
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
6:08.49 real, 26520.15 user, 5047.91 sys
Link: https://lkml.kernel.org/r/20221223135207.2275317-1-fengwei.yin@intel.com
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/huge_memory.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/huge_memory.c~mm-thp-check-and-bail-out-if-page-in-deferred-queue-already
+++ a/mm/huge_memory.c
@@ -2835,6 +2835,9 @@ void deferred_split_huge_page(struct pag
 	if (PageSwapCache(page))
 		return;
 
+	if (!list_empty(page_deferred_list(page)))
+		return;
+
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
 	if (list_empty(page_deferred_list(page))) {
 		count_vm_event(THP_DEFERRED_SPLIT_PAGE);
_
Patches currently in -mm which might be from fengwei.yin@intel.com are