Linux-mm Archive on lore.kernel.org
From: Mike Rapoport <rppt@kernel.org>
To: Li Zhe <lizhe.67@bytedance.com>
Cc: akpm@linux-foundation.org, arnd@arndb.de, bp@alien8.de,
	dave.hansen@linux.intel.com, david@kernel.org,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mingo@redhat.com, tglx@kernel.org,
	x86@kernel.org
Subject: Re: [PATCH 0/4] mm: speed up ZONE_DEVICE memmap initialization
Date: Wed, 20 May 2026 09:20:18 +0300
Message-ID: <ag1Sog3zDcJuOaaE@kernel.org>
In-Reply-To: <20260518085700.69849-1-lizhe.67@bytedance.com>

On Mon, May 18, 2026 at 04:57:00PM +0800, Li Zhe wrote:
> On Mon, 18 May 2026 09:23:33 +0300, rppt@kernel.org wrote:
> > On Fri, May 15, 2026 at 04:20:41PM +0800, Li Zhe wrote:
> > > 
> > > Performance
> > > ===========
> > > nd_pmem rebind, 100 GB fsdax namespace, map=dev
> > >   Base(v7.1-rc3):
> > >     First binding: 1486 ms
> > >     Average of subsequent rebinds: 273.52 ms
> > >   Full series:
> > >     First binding: 1272 ms
> > >     Average of subsequent rebinds: 104.59 ms
> > > 
> > > dax_pmem rebind, 100 GB devdax namespace, align=2097152
> > >   Base(v7.1-rc3):
> > >     First binding: 1515 ms
> > >     Average of subsequent rebinds: 313.45 ms
> > >   Full series:
> > >     First binding: 1286 ms
> > >     Average of subsequent rebinds: 116.93 ms
> > 
> > This is a really good improvement!
> > 
> > It would also be interesting to see how the template approach would improve
> > "normal" memory map initialization.
> 
> I also experimented with this approach earlier. Unfortunately, in the
> normal memory map initialization path, functions such as
> deferred_free_pages() are invoked shortly after struct page
> initialization, and they both read and write members of the
> struct page.
> 
> Non-temporal stores via MOVNTI are primarily beneficial for streaming
> write operations, where the cache lines written are not expected to be
> reused by the CPU in the near future. In this case, however, data
> written using MOVNTI is immediately accessed again through regular load
> and store instructions. This results in an access pattern that resembles
> a write-then-reuse workload rather than a pure streaming store.
> 
> Consequently, non-temporal stores do not deliver the expected reduction
> in cache pollution, and using MOVNTI provides no measurable performance
> benefit for this particular workload.
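
To make the write-then-reuse pattern described above concrete, here is a
minimal userspace sketch, not code from the series. fill_streaming() and
sum_words() are hypothetical names. On x86-64 the fill uses the
_mm_stream_si64() intrinsic, which compiles to MOVNTI; on other
architectures it falls back to ordinary stores. Reading the buffer back
right after the fill is the case where the streaming hint is not expected
to help.

```c
#include <stddef.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si64(), _mm_sfence() */
#endif

/*
 * Fill n 64-bit words with val. On x86-64 this uses non-temporal
 * stores (MOVNTI); elsewhere it falls back to ordinary stores.
 */
static void fill_streaming(long long *dst, size_t n, long long val)
{
	size_t i;

	for (i = 0; i < n; i++) {
#if defined(__x86_64__)
		_mm_stream_si64(&dst[i], val);
#else
		dst[i] = val;
#endif
	}
#if defined(__x86_64__)
	/* NT stores are weakly ordered; fence before anyone relies on them. */
	_mm_sfence();
#endif
}

/* Immediate reuse: every word just written is loaded again. */
static long long sum_words(const long long *src, size_t n)
{
	long long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += src[i];
	return sum;
}
```

Calling sum_words() directly after fill_streaming() on the same buffer
mirrors the init-then-free sequence: the data bypasses the cache on the
way out and then has to be fetched back in.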

We can split initialization and freeing into separate loops if there is an
overall benefit, but this needs to be verified on other major architectures
as well.
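
As a rough illustration of such a split, here is a userspace sketch under
stated assumptions, not the kernel code. struct fake_page, init_one() and
free_one() are hypothetical stand-ins; the real path goes through
deferred_free_pages() and its callers. The fused variant re-reads each
entry right after writing it, while the split variant finishes all the
writes before the first re-read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct fake_page {
	uint64_t flags;
	int32_t refcount;
	int32_t freed;
};

/* Write-only initialization of one entry. */
static void init_one(struct fake_page *p, uint64_t flags)
{
	p->flags = flags;
	p->refcount = 1;
	p->freed = 0;
}

/* "Freeing" both reads and writes the entry. */
static void free_one(struct fake_page *p)
{
	assert(p->refcount == 1);
	p->refcount = 0;
	p->freed = 1;
}

/* Fused: every entry is touched again right after it is written. */
static size_t init_and_free_fused(struct fake_page *map, size_t n,
				  uint64_t flags)
{
	size_t i;

	for (i = 0; i < n; i++) {
		init_one(&map[i], flags);
		free_one(&map[i]);
	}
	return n;
}

/* Split: a pure write pass first, then a separate read/write pass. */
static size_t init_then_free_split(struct fake_page *map, size_t n,
				   uint64_t flags)
{
	size_t i;

	for (i = 0; i < n; i++)
		init_one(&map[i], flags);
	for (i = 0; i < n; i++)
		free_one(&map[i]);
	return n;
}
```

Both variants leave the array in the same final state; only the access
pattern differs, which is what would need measuring per architecture.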
 
> That said, a template-based approach can still accelerate initialization.
> Based on measurements from this patchset, it should improve performance
> on the generic path by roughly 10%. I would appreciate feedback on
> whether such an optimization is still considered useful.

Improving the memory map initialization by 10% is valuable.
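
For readers unfamiliar with the idea, a template-based init could look
roughly like the userspace sketch below. This is an assumption about the
general technique, not the code from the patchset: struct fake_page and
both function names are hypothetical. The invariant fields are built once
into a template, each entry is filled with a single copy, and only the
per-entry field is patched afterwards.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct fake_page {
	uint64_t flags;
	uint64_t pgmap;		/* same for every page in the range */
	uint64_t index;		/* differs per page */
	int32_t refcount;
	int32_t mapcount;
};

/* Baseline: store every field separately for every entry. */
static void init_field_by_field(struct fake_page *map, size_t n,
				uint64_t flags, uint64_t pgmap)
{
	size_t i;

	for (i = 0; i < n; i++) {
		map[i].flags = flags;
		map[i].pgmap = pgmap;
		map[i].index = i;
		map[i].refcount = 1;
		map[i].mapcount = -1;
	}
}

/* Template: build the invariant part once, copy it, patch what varies. */
static void init_from_template(struct fake_page *map, size_t n,
			       uint64_t flags, uint64_t pgmap)
{
	struct fake_page tmpl;
	size_t i;

	memset(&tmpl, 0, sizeof(tmpl));
	tmpl.flags = flags;
	tmpl.pgmap = pgmap;
	tmpl.refcount = 1;
	tmpl.mapcount = -1;

	for (i = 0; i < n; i++) {
		memcpy(&map[i], &tmpl, sizeof(tmpl));
		map[i].index = i;
	}
}
```

Both functions produce identical arrays, so the template variant can be
checked against the baseline field by field.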
 
> Thanks,
> Zhe

-- 
Sincerely yours,
Mike.



Thread overview: 17+ messages
2026-05-15  8:20 [PATCH 0/4] mm: speed up ZONE_DEVICE memmap initialization Li Zhe
2026-05-15  8:20 ` [PATCH 1/4] mm: factor zone-device page init helpers out of __init_zone_device_page Li Zhe
2026-05-18  6:32   ` Mike Rapoport
2026-05-18  9:11     ` Li Zhe
2026-05-15  8:20 ` [PATCH 2/4] mm: add a template-based fast path for zone-device page init Li Zhe
2026-05-18  6:51   ` Mike Rapoport
2026-05-18  9:54     ` Li Zhe
2026-05-18 11:42       ` Mike Rapoport
2026-05-15  8:20 ` [PATCH 3/4] mm: extend the template fast path to zone-device compound tails Li Zhe
2026-05-15  8:20 ` [PATCH 4/4] mm: use arch store helpers in zone-device template copies Li Zhe
2026-05-18  0:32   ` Alistair Popple
2026-05-18  6:42     ` Li Zhe
2026-05-19  3:09     ` Balbir Singh
2026-05-18  6:23 ` [PATCH 0/4] mm: speed up ZONE_DEVICE memmap initialization Mike Rapoport
2026-05-18  8:57   ` Li Zhe
2026-05-20  6:20     ` Mike Rapoport [this message]
2026-05-20 11:57       ` Li Zhe
