Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Christoph Lameter (Ampere)" <cl@linux.com>
To: Yin Fengwei <fengwei.yin@intel.com>
Cc: Yang Shi <shy828301@gmail.com>,
	kernel test robot <oliver.sang@intel.com>,
	 Rik van Riel <riel@surriel.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	 Linux Memory Management List <linux-mm@kvack.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Matthew Wilcox <willy@infradead.org>,
	ying.huang@intel.com,  feng.tang@intel.com
Subject: Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression
Date: Wed, 20 Dec 2023 07:42:17 -0800 (PST)	[thread overview]
Message-ID: <edb35574-e8be-adc8-a756-96bcbab2f0af@linux.com> (raw)
In-Reply-To: <ec48d168-284b-4376-97a7-090273a3ae5e@intel.com>

On Wed, 20 Dec 2023, Yin Fengwei wrote:

>> Interesting, wasn't the same regression seen last time? And I'm a
>> little bit confused about how pthread got regressed. I didn't see the
>> pthread benchmark do any intensive memory alloc/free operations. Do
>> the pthread APIs do any intensive memory operations? I saw the
>> benchmark does allocate memory for thread stack, but it should be just
>> 8K per thread, so it should not trigger what this patch does. With
>> 1024 threads, the thread stacks may get merged into one single VMA (8M
>> total), but it may do so even though the patch is not applied.
> stress-ng.pthread test code is strange here:
>
> https://github.com/ColinIanKing/stress-ng/blob/master/stress-pthread.c#L573
>
> Even it allocates its own stack, but that attr is not passed
> to pthread_create. So it's still glibc to allocate stack for
> pthread which is 8M size. This is why this patch can impact
> the stress-ng.pthread testing.

Hmmm... The use of calloc()  for 8M triggers an mmap I guess.

Why is that memory slower if we align the adress to a 2M boundary? Because 
THP can act faster and creates more overhead?

> while this time, the hotspot is in (pmd_lock from do_madvise I suppose):
>   - 55.02% zap_pmd_range.isra.0
>      - 53.42% __split_huge_pmd
>         - 51.74% _raw_spin_lock
>            - 51.73% native_queued_spin_lock_slowpath
>               + 3.03% asm_sysvec_call_function
>         - 1.67% __split_huge_pmd_locked
>            - 0.87% pmdp_invalidate
>               + 0.86% flush_tlb_mm_range
>      - 1.60% zap_pte_range
>         - 1.04% page_remove_rmap
>              0.55% __mod_lruvec_page_state

Ok so we have 2M mappings and they are split because of some action on 4K 
segments? Guess because of the guard pages?

>> More time spent in madvise and munmap. but I'm not sure whether this
>> is caused by tearing down the address space when exiting the test. If
>> so it should not count in the regression.
> It's not for the whole address space tearing down. It's for pthread
> stack tearing down when pthread exit (can be treated as address space
> tearing down? I suppose so).
>
> https://github.com/lattera/glibc/blob/master/nptl/allocatestack.c#L384
> https://github.com/lattera/glibc/blob/master/nptl/pthread_create.c#L576
>
> Another thing is whether it's worthy to make stack use THP? It may be
> useful for some apps which need large stack size?

No can do since a calloc is used to allocate the stack. How can the kernel 
distinguish the allocation?

next prev parent reply	other threads:[~2023-12-20 15:42 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-19 15:41 [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression kernel test robot
2023-12-20  5:27 ` Yang Shi
2023-12-20  8:29   ` Yin Fengwei
2023-12-20 15:42     ` Christoph Lameter (Ampere) [this message]
2023-12-20 20:14       ` Yang Shi
2023-12-20 20:09     ` Yang Shi
2023-12-21  0:26       ` Yang Shi
2023-12-21  0:58         ` Yin Fengwei
2023-12-21  1:02           ` Yin Fengwei
2023-12-21  4:49           ` Matthew Wilcox
2023-12-21  4:58             ` Yin Fengwei
2023-12-21 18:07             ` Yang Shi
2023-12-21 18:14               ` Matthew Wilcox
2023-12-22  1:06                 ` Yin, Fengwei
2023-12-22  2:23                   ` Huang, Ying
2023-12-21 13:39           ` Yin, Fengwei
2023-12-21 18:11             ` Yang Shi
2023-12-22  1:13               ` Yin, Fengwei
2024-01-04  1:32                 ` Yang Shi
2024-01-04  8:18                   ` Yin Fengwei
2024-01-04  8:39                     ` Oliver Sang
2024-01-05  9:29                       ` Oliver Sang
2024-01-05 14:52                         ` Yin, Fengwei
2024-01-05 18:49                         ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=edb35574-e8be-adc8-a756-96bcbab2f0af@linux.com \
    --to=cl@linux.com \
    --cc=akpm@linux-foundation.org \
    --cc=feng.tang@intel.com \
    --cc=fengwei.yin@intel.com \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    --cc=riel@surriel.com \
    --cc=shy828301@gmail.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.