From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Jann Horn <jannh@google.com>
Cc: kernel test robot <oliver.sang@intel.com>,
	Dev Jain <dev.jain@arm.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <baohua@kernel.org>, Pedro Falcato <pfalcato@suse.de>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Bang Li <libang.li@antgroup.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	bibo mao <maobibo@loongson.cn>, Hugh Dickins <hughd@google.com>,
	Ingo Molnar <mingo@kernel.org>, Lance Yang <ioworker0@gmail.com>,
	Liam Howlett <liam.howlett@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Peter Xu <peterx@redhat.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Yang Shi <yang@os.amperecomputing.com>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org
Subject: Re: [linus:master] [mm] f822a9a81a: stress-ng.bigheap.realloc_calls_per_sec 37.3% regression
Date: Thu, 7 Aug 2025 20:01:51 +0200
Message-ID: <cbc2e23d-69ab-4820-9942-c7abf2066ff7@redhat.com>
In-Reply-To: <e4f5faea-ccec-4cc7-83de-1a3c7013b81b@lucifer.local>

On 07.08.25 19:51, Lorenzo Stoakes wrote:
> On Thu, Aug 07, 2025 at 07:46:39PM +0200, Jann Horn wrote:
>> On Thu, Aug 7, 2025 at 7:41 PM Lorenzo Stoakes
>> <lorenzo.stoakes@oracle.com> wrote:
>>> On Thu, Aug 07, 2025 at 07:37:38PM +0200, Jann Horn wrote:
>>>> On Thu, Aug 7, 2025 at 10:28 AM Lorenzo Stoakes
>>>> <lorenzo.stoakes@oracle.com> wrote:
>>>>> On Thu, Aug 07, 2025 at 04:17:09PM +0800, kernel test robot wrote:
>>>>>> 94dab12d86cf77ff f822a9a81a31311d67f260aea96
>>>>>> ---------------- ---------------------------
>>>>>>           %stddev     %change         %stddev
>>>>>>               \          |                \
>>>>>>       13777 ± 37%     +45.0%      19979 ± 27%  numa-vmstat.node1.nr_slab_reclaimable
>>>>>>      367205            +2.3%     375703        vmstat.system.in
>>>>>>       55106 ± 37%     +45.1%      79971 ± 27%  numa-meminfo.node1.KReclaimable
>>>>>>       55106 ± 37%     +45.1%      79971 ± 27%  numa-meminfo.node1.SReclaimable
>>>>>>      559381           -37.3%     350757        stress-ng.bigheap.realloc_calls_per_sec
>>>>>>       11468            +1.2%      11603        stress-ng.time.system_time
>>>>>>      296.25            +4.5%     309.70        stress-ng.time.user_time
>>>>>>        0.81 ±187%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>        9.36 ±165%    -100.0%       0.00        perf-sched.sch_delay.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>        0.81 ±187%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>        9.36 ±165%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>        5.50 ± 17%    +390.9%      27.00 ± 56%  perf-c2c.DRAM.local
>>>>>>      388.50 ± 10%    +114.7%     834.17 ± 33%  perf-c2c.DRAM.remote
>>>>>>        1214 ± 13%    +107.3%       2517 ± 31%  perf-c2c.HITM.local
>>>>>>      135.00 ± 19%    +130.9%     311.67 ± 32%  perf-c2c.HITM.remote
>>>>>>        1349 ± 13%    +109.6%       2829 ± 31%  perf-c2c.HITM.total
>>>>>
>>>>> Yeah this also looks pretty consistent too...
>>>>
>>>> FWIW, HITM has different meanings depending on exactly which
>>>> microarchitecture that test happened on; the message says it is from
>>>> Sapphire Rapids, which is a successor of Ice Lake, so HITM is less
>>>> meaningful than if it came from a pre-IceLake system (see
>>>> https://lore.kernel.org/all/CAG48ez3RmV6SsVw9oyTXxQXHp3rqtKDk2qwJWo9TGvXCq7Xr-w@mail.gmail.com/).
>>>>
>>>> To me those numbers mainly look like you're accessing a lot more
>>>> cache-cold data. (On pre-IceLake they would indicate cacheline
>>>> bouncing, but I guess here they probably don't.) And that makes sense,
>>>> since before the patch, this path was just moving PTEs around without
>>>> looking at the associated pages/folios; basically more or less like a
>>>> memcpy() on x86-64. But after the patch, for every 8 bytes that you
>>>> copy, you have to load a cacheline from the vmemmap to get the page.
>>>
>>> Yup this is representative of what my investigation is showing.
>>>
>>> I've narrowed it down but want to wait to report until I'm sure...
>>>
>>> But yeah we're doing a _lot_ more work.
>>>
>>> I'm leaning towards disabling except for arm64 atm tbh, seems mremap is
>>> especially sensitive to this (I found issues with this with my abortive mremap
>>> anon merging stuff too, but really expected it there...)
>>
>> Another approach would be to always read and write PTEs in
>> contpte-sized chunks here, without caring whether they're actually
>> contiguous or whatever, or something along those lines.
> 
> Not sure I love that, you'd have to figure out offset without cont pte batch and
> can it vary? And we're doing this on non-arm64 arches for what reason?
> 
> And would it solve anything really? We'd still be looking at folio, yes less
> than now, but uselessly for arches that don't benefit?
> 
> The basis of this series was (and I did explicitly ask) that it wouldn't harm
> other arches.

We'd need some hint to detect "this is either small" or "this is 
unbatchable".

Sure, we could use pte_batch_hint(), but I'm curious if x86 would also 
benefit with larger folios (e.g., 64K, 128K) with this patch.
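
To illustrate the kind of gating meant here, below is a rough user-space
toy model, not kernel code. Only pte_batch_hint() is a real kernel helper;
every name in the sketch (batch_hint_fn, hint_none, hint_16,
count_folio_lookups) is hypothetical, and the 16-entry block size is just
an assumption for the example. It counts how often a walk over N entries
would have to touch the page/folio metadata, with and without skipping
that lookup when the hint says "1".

```c
#include <stddef.h>

/*
 * Toy user-space model, NOT kernel code. All names here are hypothetical.
 * A "hint" callback stands in for pte_batch_hint(): it returns how many
 * entries can be handled as one batch starting at index idx, and 1 means
 * "no cheap hint available".
 */
typedef unsigned int (*batch_hint_fn)(size_t idx);

/* Models an arch without a hint: always 1. */
static unsigned int hint_none(size_t idx)
{
	(void)idx;
	return 1;
}

/* Models 16-entry contiguous blocks: entries left until the block ends. */
static unsigned int hint_16(size_t idx)
{
	return 16 - (unsigned int)(idx % 16);
}

/*
 * Walk nr entries and count how often we would have to look at the folio.
 * With gate != 0, a hint of 1 skips the folio lookup and moves one entry.
 */
static size_t count_folio_lookups(size_t nr, batch_hint_fn hint, int gate)
{
	size_t lookups = 0;
	size_t i = 0;

	while (i < nr) {
		unsigned int step = hint(i);

		if (gate && step == 1) {
			i++;		/* plain single-entry move, no folio access */
			continue;
		}
		lookups++;		/* would load the page/folio metadata here */
		i += step;
	}
	return lookups;
}
```

Calling count_folio_lookups(512, hint_none, 0) models the ungated walk
(one lookup per entry), while passing gate = 1 with hint_none models an
arch that never reports a batch and so never pays for the lookup. The
model deliberately ignores the open question above, namely whether large
folios without a hint would still profit from batching.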

-- 
Cheers,

David / dhildenb


