* [merged mm-stable] mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch removed from -mm tree
@ 2023-01-19 1:15 Andrew Morton
2023-01-19 1:31 ` Yang Shi
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2023-01-19 1:15 UTC (permalink / raw)
To: mm-commits, zhengjun.xing, ying.huang, willy, stable, shy828301,
rientjes, riel, nathan, feng.tang, fengwei.yin, akpm
The quilt patch titled
Subject: mm/thp: check and bail out if page in deferred queue already
has been removed from the -mm tree. Its filename was
mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Yin Fengwei <fengwei.yin@intel.com>
Subject: mm/thp: check and bail out if page in deferred queue already
Date: Fri, 23 Dec 2022 21:52:07 +0800
A kernel build regression with LLVM was reported here:
https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/ against
commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
boundaries"), and commit f35b5d7d676e was reverted.
It turned out that the regression is related to ld.lld calling
madvise(MADV_DONTNEED) with a len parameter that is not PMD_SIZE aligned.
trace-bpfcc captured:
531607 531732 ld.lld do_madvise.part.0 start: 0x7feca9000000, len: 0x7fb000, behavior: 0x4
531607 531793 ld.lld do_madvise.part.0 start: 0x7fec86a00000, len: 0x7fb000, behavior: 0x4
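To make the alignment issue in the trace above concrete, here is a minimal sketch that splits a madvise() range into whole PMD-sized chunks and a partial tail. It assumes x86-64 values (4 KiB base pages, 2 MiB PMD_SIZE); the SKETCH_* macros and helper names are hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values: 4 KiB base pages, 2 MiB PMD-mapped THP. */
#define SKETCH_PAGE_SIZE 0x1000ULL
#define SKETCH_PMD_SIZE  0x200000ULL

/* Hypothetical helper: nonzero if v is a multiple of the PMD size. */
static int pmd_aligned(uint64_t v)
{
	return (v & (SKETCH_PMD_SIZE - 1)) == 0;
}

/* Number of THP-sized chunks fully covered by [start, start + len). */
static uint64_t whole_pmds(uint64_t start, uint64_t len)
{
	assert(pmd_aligned(start)); /* the sketch only handles aligned starts */
	return len / SKETCH_PMD_SIZE;
}

/* Number of base pages covered in the last, partially covered chunk. */
static uint64_t tail_subpages(uint64_t start, uint64_t len)
{
	assert(pmd_aligned(start));
	return (len % SKETCH_PMD_SIZE) / SKETCH_PAGE_SIZE;
}
```

Under those assumptions, the traced call (start 0x7feca9000000, len 0x7fb000) covers 3 whole PMD ranges plus 507 of the 512 base pages of a fourth, so that last THP cannot be unmapped as a whole.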
If the underlying physical page is a THP, madvise(MADV_DONTNEED) can
significantly increase split_queue_lock contention. perf showed the
following data:
14.85% 0.00% ld.lld [kernel.kallsyms] [k]
entry_SYSCALL_64_after_hwframe
11.52%
entry_SYSCALL_64_after_hwframe
do_syscall_64
__x64_sys_madvise
do_madvise.part.0
zap_page_range
unmap_single_vma
unmap_page_range
page_remove_rmap
deferred_split_huge_page
__lock_text_start
native_queued_spin_lock_slowpath
If a THP can't be removed from rmap as a whole, it is removed partially,
sub-page by sub-page. Even if the THP head page has already been added to
the deferred queue, split_queue_lock is still acquired just to check
whether the head page is in the queue. This raises contention on
split_queue_lock.
Before acquiring split_queue_lock, check whether the THP head page is
already in the queue and bail out early if so. The check without holding
split_queue_lock could race with deferred_split_scan, but that doesn't
affect correctness here.
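The check-then-lock idea can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct node, struct queue, queue_once() and the lock_acquisitions counter are hypothetical names, a C11 atomic_flag spinlock stands in for split_queue_lock, and the unlocked read is a plain load, so real concurrent code would need an atomic or READ_ONCE-style load there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled loosely on list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static int node_unqueued(const struct node *n) { return n->next == n; }

struct queue {
	atomic_flag lock;                /* stand-in for split_queue_lock */
	struct node head;
	unsigned long len;
	unsigned long lock_acquisitions; /* instrumentation for this sketch only */
};

static void queue_init(struct queue *q)
{
	atomic_flag_clear(&q->lock);
	node_init(&q->head);
	q->len = 0;
	q->lock_acquisitions = 0;
}

/* Queue item at most once; calls for an already queued item skip the lock. */
static void queue_once(struct queue *q, struct node *item)
{
	assert(q != NULL && item != NULL);

	/* Unlocked fast path: bail out if the item looks queued already. */
	if (!node_unqueued(item))
		return;

	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the lock is free */
	q->lock_acquisitions++;

	/* Re-check under the lock before modifying the list. */
	if (node_unqueued(item)) {
		item->prev = q->head.prev;
		item->next = &q->head;
		q->head.prev->next = item;
		q->head.prev = item;
		q->len++;
	}
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}
```

Calling queue_once() repeatedly for the same item, as happens once per sub-page in the scenario above, takes the lock only on the first call in this sketch; the re-check under the lock is what keeps the queue consistent when two callers pass the unlocked test at the same time.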
Test results of building the kernel with ld.lld:
commit 7b5a0b664ebe (parent commit of f35b5d7d676e):
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
6:07.99 real, 26367.77 user, 5063.35 sys
commit f35b5d7d676e:
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
7:22.15 real, 26235.03 user, 12504.55 sys
commit f35b5d7d676e with the fixing patch:
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
6:08.49 real, 26520.15 user, 5047.91 sys
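The size of the regression can be read off the three timings above with a little arithmetic. The helpers below are a hypothetical sketch for that calculation only; they are not part of the patch.

```c
#include <assert.h>

/* Convert the m:ss.cc "real" format printed by time(1) into seconds. */
static double elapsed_seconds(unsigned minutes, double seconds)
{
	assert(seconds >= 0.0 && seconds < 60.0);
	return minutes * 60.0 + seconds;
}

/* Ratio of a measured time to the baseline time. */
static double slowdown(double measured, double baseline)
{
	assert(baseline > 0.0);
	return measured / baseline;
}
```

With the figures above, sys time on f35b5d7d676e is about 2.47 times the baseline (12504.55 vs 5063.35) and real time is about 20% longer (7:22.15 vs 6:07.99); with the fix, sys time is back to roughly the baseline (5047.91 vs 5063.35).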
Link: https://lkml.kernel.org/r/20221223135207.2275317-1-fengwei.yin@intel.com
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/huge_memory.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/huge_memory.c~mm-thp-check-and-bail-out-if-page-in-deferred-queue-already
+++ a/mm/huge_memory.c
@@ -2835,6 +2835,9 @@ void deferred_split_huge_page(struct pag
if (PageSwapCache(page))
return;
+ if (!list_empty(page_deferred_list(page)))
+ return;
+
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
if (list_empty(page_deferred_list(page))) {
count_vm_event(THP_DEFERRED_SPLIT_PAGE);
_
Patches currently in -mm which might be from fengwei.yin@intel.com are
* Re: [merged mm-stable] mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch removed from -mm tree
2023-01-19 1:15 [merged mm-stable] mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch removed from -mm tree Andrew Morton
@ 2023-01-19 1:31 ` Yang Shi
2023-01-19 3:37 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Yang Shi @ 2023-01-19 1:31 UTC (permalink / raw)
To: Andrew Morton
Cc: mm-commits, zhengjun.xing, ying.huang, willy, stable, rientjes,
riel, nathan, feng.tang, fengwei.yin
On Wed, Jan 18, 2023 at 5:15 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
>
> The quilt patch titled
> Subject: mm/thp: check and bail out if page in deferred queue already
> has been removed from the -mm tree. Its filename was
> mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch
>
> This patch was dropped because it was merged into the mm-stable branch
> of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> ------------------------------------------------------
> From: Yin Fengwei <fengwei.yin@intel.com>
> Subject: mm/thp: check and bail out if page in deferred queue already
> Date: Fri, 23 Dec 2022 21:52:07 +0800
>
> Kernel build regression with LLVM was reported here:
> https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/ with
> commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
> boundaries"). And the commit f35b5d7d676e was reverted.
>
> It turned out the regression is related with madvise(MADV_DONTNEED)
> was used by ld.lld. But with none PMD_SIZE aligned parameter len.
> trace-bpfcc captured:
> 531607 531732 ld.lld do_madvise.part.0 start: 0x7feca9000000, len: 0x7fb000, behavior: 0x4
> 531607 531793 ld.lld do_madvise.part.0 start: 0x7fec86a00000, len: 0x7fb000, behavior: 0x4
This just reminds me that we should reinstate Rik's commit?
>
> If the underneath physical page is THP, the madvise(MADV_DONTNEED) can
> trigger split_queue_lock contention raised significantly. perf showed
> following data:
> 14.85% 0.00% ld.lld [kernel.kallsyms] [k]
> entry_SYSCALL_64_after_hwframe
> 11.52%
> entry_SYSCALL_64_after_hwframe
> do_syscall_64
> __x64_sys_madvise
> do_madvise.part.0
> zap_page_range
> unmap_single_vma
> unmap_page_range
> page_remove_rmap
> deferred_split_huge_page
> __lock_text_start
> native_queued_spin_lock_slowpath
>
> If THP can't be removed from rmap as whole THP, partial THP will be
> removed from rmap by removing sub-pages from rmap. Even the THP head page
> is added to deferred queue already, the split_queue_lock will be acquired
> and check whether the THP head page is in the queue already. Thus, the
> contention of split_queue_lock is raised.
>
> Before acquire split_queue_lock, check and bail out early if the THP
> head page is in the queue already. The checking without holding
> split_queue_lock could race with deferred_split_scan, but it doesn't
> impact the correctness here.
>
> Test result of building kernel with ld.lld:
> commit 7b5a0b664ebe (parent commit of f35b5d7d676e):
> time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
> 6:07.99 real, 26367.77 user, 5063.35 sys
>
> commit f35b5d7d676e:
> time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
> 7:22.15 real, 26235.03 user, 12504.55 sys
>
> commit f35b5d7d676e with the fixing patch:
> time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
> 6:08.49 real, 26520.15 user, 5047.91 sys
>
> Link: https://lkml.kernel.org/r/20221223135207.2275317-1-fengwei.yin@intel.com
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Tested-by: Nathan Chancellor <nathan@kernel.org>
> Acked-by: David Rientjes <rientjes@google.com>
> Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> mm/huge_memory.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> --- a/mm/huge_memory.c~mm-thp-check-and-bail-out-if-page-in-deferred-queue-already
> +++ a/mm/huge_memory.c
> @@ -2835,6 +2835,9 @@ void deferred_split_huge_page(struct pag
> if (PageSwapCache(page))
> return;
>
> + if (!list_empty(page_deferred_list(page)))
> + return;
> +
> spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> if (list_empty(page_deferred_list(page))) {
> count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> _
>
> Patches currently in -mm which might be from fengwei.yin@intel.com are
>
>
* Re: [merged mm-stable] mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch removed from -mm tree
2023-01-19 1:31 ` Yang Shi
@ 2023-01-19 3:37 ` Andrew Morton
0 siblings, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2023-01-19 3:37 UTC (permalink / raw)
To: Yang Shi
Cc: mm-commits, zhengjun.xing, ying.huang, willy, stable, rientjes,
riel, nathan, feng.tang, fengwei.yin
On Wed, 18 Jan 2023 17:31:48 -0800 Yang Shi <shy828301@gmail.com> wrote:
> On Wed, Jan 18, 2023 at 5:15 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> >
> > The quilt patch titled
> > Subject: mm/thp: check and bail out if page in deferred queue already
> > has been removed from the -mm tree. Its filename was
> > mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch
> >
> > This patch was dropped because it was merged into the mm-stable branch
> > of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> >
> > ------------------------------------------------------
> > From: Yin Fengwei <fengwei.yin@intel.com>
> > Subject: mm/thp: check and bail out if page in deferred queue already
> > Date: Fri, 23 Dec 2022 21:52:07 +0800
> >
> > Kernel build regression with LLVM was reported here:
> > https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/ with
> > commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
> > boundaries"). And the commit f35b5d7d676e was reverted.
> >
> > It turned out the regression is related with madvise(MADV_DONTNEED)
> > was used by ld.lld. But with none PMD_SIZE aligned parameter len.
> > trace-bpfcc captured:
> > 531607 531732 ld.lld do_madvise.part.0 start: 0x7feca9000000, len: 0x7fb000, behavior: 0x4
> > 531607 531793 ld.lld do_madvise.part.0 start: 0x7fec86a00000, len: 0x7fb000, behavior: 0x4
>
> This just reminds me that we should reinstate Rik's commit?
OK, I did that.
The changelog doesn't mention any performance testing results?
From: Rik van Riel <riel@surriel.com>
Subject: mm: align larger anonymous mappings on THP boundaries
Date: Tue, 9 Aug 2022 14:24:57 -0400
Align larger anonymous memory mappings on THP boundaries by going through
thp_get_unmapped_area if THPs are enabled for the current process.
With this patch, larger anonymous mappings are now THP aligned. When a
malloc library allocates a 2MB or larger arena, that arena can now be
mapped with THPs right from the start, which can result in better TLB hit
rates and execution time.
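As a rough illustration of what "THP aligned" means for a mapping address, the sketch below rounds an address up to the next 2 MiB boundary. It assumes a 2 MiB PMD size (x86-64); thp_align_up() is a hypothetical helper and does not reproduce how thp_get_unmapped_area picks an address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed THP size on x86-64: one PMD entry maps 2 MiB. */
#define SKETCH_THP_SIZE 0x200000ULL

/* Round addr up to the next multiple of align (align must be a power of 2). */
static uint64_t thp_align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

For example, thp_align_up(0x7feca8e01000, SKETCH_THP_SIZE) yields 0x7feca9000000, while an address that is already 2 MiB aligned is returned unchanged.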
Link: https://lkml.kernel.org/r/20220809142457.4751229f@imladris.surriel.com
Signed-off-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
--- a/mm/mmap.c~mm-align-larger-anonymous-mappings-on-thp-boundaries
+++ a/mm/mmap.c
@@ -1782,6 +1782,9 @@ get_unmapped_area(struct file *file, uns
*/
pgoff = 0;
get_area = shmem_get_unmapped_area;
+ } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+ /* Ensures that larger anonymous mappings are THP aligned. */
+ get_area = thp_get_unmapped_area;
}
addr = get_area(file, addr, len, pgoff, flags);
_