From: David Laight <david.laight.linux@gmail.com>
To: Gregory Price <gourry@gourry.net>
Cc: Li Zhe <lizhe.67@bytedance.com>,
	akpm@linux-foundation.org, ankur.a.arora@oracle.com,
	dan.j.williams@intel.com, dave@stgolabs.net, david@kernel.org,
	fvdl@google.com, joao.m.martins@oracle.com,
	jonathan.cameron@huawei.com, linux-cxl@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mhocko@suse.com, mjguzik@gmail.com, muchun.song@linux.dev,
	osalvador@suse.de, raghavendra.kt@amd.com,
	wangzhou1@hisilicon.com, zhanjie9@hisilicon.com
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism
Date: Tue, 20 Jan 2026 19:30:27 +0000
Message-ID: <20260120193027.3d160211@pumpkin>
In-Reply-To: <aW_G66HeWLbyiPHs@gourry-fedora-PF4VCD3F>

On Tue, 20 Jan 2026 13:18:19 -0500
Gregory Price <gourry@gourry.net> wrote:

> On Tue, Jan 20, 2026 at 06:39:48PM +0800, Li Zhe wrote:
> > On Tue, 20 Jan 2026 09:47:44 +0000, david.laight.linux@gmail.com wrote:
> >   
> > > On Tue, 20 Jan 2026 14:27:06 +0800
> > > "Li Zhe" <lizhe.67@bytedance.com> wrote:
> > >   
> > > > In light of the preceding discussion, we appear to have reached the
> > > > following understanding:
> > > > 
> > > > (1) At present we prefer to mitigate slow application startup (e.g.,
> > > > VM creation) by zeroing huge pages at the moment they are freed
> > > > (init_on_free). The principal benefit is that user space gains the
> > > > performance improvement without deploying any additional user space
> > > > daemon.  
> > > 
> > > Am I missing something?
> > > If userspace does:
> > > $ program_a; program_b
> > > and pages used by program_a are zeroed when it exits you get the delay
> > > for zeroing all the pages it used before program_b starts.
> > > OTOH if the zeroing is deferred program_b only needs to zero the pages
> > > it needs to start (and there may be some lurking).  
> > 
> > Under the init_on_free approach, improving the speed of zeroing may
> > indeed prove necessary.
> > 
> > However, I believe we should first reach consensus on adopting
> > “init_on_free” as the solution to slow application startup before
> > turning to performance tuning.
> >   
> 
> His point was init_on_free may not actually reduce any delays on serial
> applications, and can actually introduce additional delays.
> 
> Example
> -------
> program_a:  alloc_hugepages(10);
>             exit();
> 
> program_b:  alloc_hugepages(5);
>             exit();
> 
> /* Run programs in serial */
> sh:  program_a && program_b
> 
> in zero_on_alloc():
> 	program_a eats zero(10) cost on startup
> 	program_b eats zero(5) cost on startup
> 	Overall zero(15) cost to start program_b
> 
> in zero_on_free()
> 	program_a eats zero(10) cost on startup

Do you get that cost? Won't all the unused memory already be zeros?

> 	program_a eats zero(10) cost on exit
> 	program_b eats zero(0) cost on startup
> 	Overall zero(20) cost to start program_b
> 
> zero_on_free is worse by zero(5)
> -------
> 
> This is a trivial example, but it's unclear zero_on_free actually
> provides a benefit.  You have to know ahead of time what the runtime
> behavior, pre-zeroed count, and allocation pattern (0->10->5->...) would
> be to determine whether there's an actual reduction in startup time.
> 
> But just trivially, starting from the base case of no pages being
> zeroed, you're just injecting an additional zero(X) cost if program_a()
> consumes more hugepages than program_b().
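
The accounting in the quoted example can be written down as a small model.
This is only a sketch under stated assumptions: the function name, the
"prezeroed" parameter and the idea of counting zeroed pages as the cost unit
are all made up for illustration, and nothing here reflects the patch series
itself. It counts the huge pages zeroed synchronously up to the point where
the last program has its pages.

```python
def inline_zero_cost(allocs, policy, prezeroed=0):
    """Huge pages zeroed synchronously until the last program has its pages.

    allocs:    pages allocated by each program, run one after another
    policy:    "alloc" (zero at allocation) or "free" (zero at exit)
    prezeroed: free pages already zero at the start (hypothetical knob,
               only meaningful under "free")
    """
    if policy == "alloc":
        # Every allocation pays for its own zeroing.
        return sum(allocs)
    if policy != "free":
        raise ValueError(policy)

    zeroed_free = prezeroed
    cost = 0
    for i, n in enumerate(allocs):
        reused = min(n, zeroed_free)   # pages that are already zero
        zeroed_free -= reused
        cost += n - reused             # shortfall is zeroed at allocation
        if i < len(allocs) - 1:
            cost += n                  # exit zeroes everything the program held
            zeroed_free += n
    return cost


print(inline_zero_cost([10, 5], "alloc"))               # 15
print(inline_zero_cost([10, 5], "free"))                # 20
print(inline_zero_cost([10, 5], "free", prezeroed=10))  # 10
```

The first two results match the zero(15) and zero(20) figures quoted above.
The third covers the case where the free pool is already zero when program_a
starts: in this model zero-on-free then costs zero(10) against zero(15), so
the comparison depends entirely on the starting state.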

I'd consider a different test:
	for c in $(jot 1000 1); do program_a; done

Regardless of whether you zero on alloc or on free, all of the zeroing is inline.
Move it to a low-priority thread (one that uses a non-aggressive loop) and
there is a reasonable chance that pre-zeroed pages will be available.
(Most DMA is far too aggressive...)
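
A toy model of that loop, as a rough sketch only. It assumes a fixed pool of
huge pages, a program that allocates the same number of pages on every run,
and a background thread that manages to zero a fixed number of dirty free
pages while each run executes. All names and parameters are hypothetical;
real behaviour depends on scheduling, memory bandwidth and how long each run
lasts.

```python
def inline_pages_zeroed(runs, pool_pages, pages_per_run, bg_pages_per_run):
    """Pages that had to be zeroed synchronously at allocation time.

    bg_pages_per_run is the (assumed) number of dirty free pages the
    low-priority thread zeroes while one run of the program executes.
    """
    if pages_per_run > pool_pages:
        raise ValueError("program needs more pages than the pool holds")

    zeroed_free = 0
    dirty_free = pool_pages      # start from the worst case: nothing zeroed
    inline = 0
    for _ in range(runs):
        # Allocation prefers pages the background thread already zeroed.
        reused = min(pages_per_run, zeroed_free)
        zeroed_free -= reused
        shortfall = pages_per_run - reused
        dirty_free -= shortfall
        inline += shortfall      # these are zeroed in the allocating task

        # Background thread makes progress while the program runs.
        done = min(bg_pages_per_run, dirty_free)
        dirty_free -= done
        zeroed_free += done

        # Program exits; its pages go back to the pool dirty.
        dirty_free += pages_per_run
    return inline


print(inline_pages_zeroed(1000, 20, 10, 0))   # 10000: no background zeroing
print(inline_pages_zeroed(1000, 20, 10, 5))   # 5005: thread half keeps up
print(inline_pages_zeroed(1000, 20, 10, 10))  # 10: only the first run pays
```

With no background thread, every page is zeroed inline whichever side of the
alloc/free boundary the zeroing sits on. Once the thread keeps up with the
allocation rate, only the first run pays.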

Zeroing on free might also be a waste of time.
The memory may next be used to read data from a disk file, which overwrites it anyway.

	David

> 
> Long way of saying the shift from alloc to free seems heuristic-y and
> you need stronger analysis / better data to show this change is actually
> beneficial in the general case.
> 
> ~Gregory


