From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
akpm@linux-foundation.org, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
mjguzik@gmail.com, luto@kernel.org, peterz@infradead.org,
tglx@linutronix.de, willy@infradead.org, raghavendra.kt@amd.com,
chleroy@kernel.org, ioworker0@gmail.com,
boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH v10 8/8] mm: folio_zero_user: cache neighbouring pages
Date: Tue, 23 Dec 2025 11:11:32 +0100
Message-ID: <4e3ff85e-2c8c-489e-92b4-088189eed63b@kernel.org>
In-Reply-To: <87jyyjv5zy.fsf@oracle.com>

On 12/18/25 22:23, Ankur Arora wrote:
>
> Ankur Arora <ankur.a.arora@oracle.com> writes:
>
>> David Hildenbrand (Red Hat) <david@kernel.org> writes:
>>
>>> On 12/15/25 21:49, Ankur Arora wrote:
>>>> folio_zero_user() does straight zeroing without caring about
>>>> temporal locality for caches.
>>>>
>>>> This replaced commit c6ddfb6c5890 ("mm, clear_huge_page: move order
>>>> algorithm into a separate function") where we cleared a page at a
>>>> time converging to the faulting page from the left and the right.
>>>>
>>>> To retain limited temporal locality, split the clearing in three
>>>> parts: the faulting page and its immediate neighbourhood, and, the
>>>> remaining regions on the left and the right. The local neighbourhood
>>>> will be cleared last.
>>>>
>>>> Do this only when zeroing small folios (< MAX_ORDER_NR_PAGES) since
>>>> there isn't much expectation of cache locality for large folios.
>>>>
>>>> Performance
>>>> ===
>>>>
>>>> AMD Genoa (EPYC 9J14, cpus=2 sockets * 96 cores * 2 threads,
>>>> memory=2.2 TB, L1d= 16K/thread, L2=512K/thread, L3=2MB/thread)
>>>>
>>>> anon-w-seq (vm-scalability):
>>>>                         stime                   utime
>>>>   page-at-a-time        1654.63 ( +- 3.84% )    811.00 ( +- 3.84% )
>>>>   contiguous clearing   1602.86 ( +- 3.00% )    970.75 ( +- 4.68% )
>>>>   neighbourhood-last    1630.32 ( +- 2.73% )    886.37 ( +- 5.19% )
>>>>
>>>> Both stime and utime respond in expected ways. stime drops for both
>>>> contiguous clearing (-3.14%) and neighbourhood-last (-1.46%)
>>>> approaches. However, utime increases for both contiguous clearing
>>>> (+19.7%) and neighbourhood-last (+9.28%).
>>>> In part this is because anon-w-seq runs with 384 processes zeroing
>>>> anonymously mapped memory which they then access sequentially. As
>>>> such this is likely an uncommon pattern where the memory bandwidth
>>>> is saturated while also being cache limited because we access the
>>>> entire region.
>>>>
>>>> Kernel make workload (make -j 12 bzImage):
>>>>                         stime                  utime
>>>>   page-at-a-time        138.16 ( +- 0.31% )    1015.11 ( +- 0.05% )
>>>>   contiguous clearing   133.42 ( +- 0.90% )    1013.49 ( +- 0.05% )
>>>>   neighbourhood-last    131.20 ( +- 0.76% )    1011.36 ( +- 0.07% )
>>>>
>>>> For make the utime stays relatively flat with an up to 4.9% improvement
>>>> in the stime.
>>>
>>> Nice evaluation!
>>>
>>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>>> Reviewed-by: Raghavendra K T <raghavendra.kt@amd.com>
>>>> Tested-by: Raghavendra K T <raghavendra.kt@amd.com>
>>>> ---
>>>> mm/memory.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
>>>> 1 file changed, 42 insertions(+), 2 deletions(-)
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index 974c48db6089..d22348b95227 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -7268,13 +7268,53 @@ static void clear_contig_highpages(struct page *page, unsigned long addr,
>>>>   * @addr_hint: The address accessed by the user or the base address.
>>>>   *
>>>>   * Uses architectural support to clear page ranges.
>>>> + *
>>>> + * Clearing of small folios (< MAX_ORDER_NR_PAGES) is split in three parts:
>>>> + * pages in the immediate locality of the faulting page, and its left, right
>>>> + * regions; the local neighbourhood is cleared last in order to keep cache
>>>> + * lines of the faulting region hot.
>>>> + *
>>>> + * For larger folios we assume that there is no expectation of cache locality
>>>> + * and just do a straight zero.
>>>
>>> Just wondering: why not do the same thing here as well? Probably shouldn't hurt
>>> and would get rid of some code?
>>
>> That's a good point. With only a three way split, there's no reason to
>> treat large folios specially.
>
> A bit more on this: this change makes sense but I'll retain the current
> split between patches-7, 8.
>
> Where patch-7, is used to justify using contiguous clearing (and the
> choice of value for PROCESS_PAGES_NON_PREEMPT_BATCH), unit based on
> preemption model etc and patch-8, for the neighbourhood optimization.
>
>>>> */
>>>> void folio_zero_user(struct folio *folio, unsigned long addr_hint)
>>>> {
>>>> unsigned long base_addr = ALIGN_DOWN(addr_hint, folio_size(folio));
>>>
>>> While at it you could turn that const as well.
>>
>> Ack.
>>
>>>> + const long fault_idx = (addr_hint - base_addr) / PAGE_SIZE;
>>>> + const struct range pg = DEFINE_RANGE(0, folio_nr_pages(folio) - 1);
>>>> + const int width = 2; /* number of pages cleared last on either side */
>>>
>>> Is "width" really the right terminology? (the way you describe it, it's more
>>> like diameter?)
>>
>> I like diameter. Will make that a define.
>
> I'll make that radius since that's how I'm using it.

All makes sense to me.

--
Cheers
David
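
For readers following the thread, the three-way split with a "radius"
discussed above can be sketched in plain user-space C. This is only an
illustration under stated assumptions, not the code from the patch: the
names pg_range, clamp_long, range_len, split_three_way, zero_pages_sketch
and SKETCH_PAGE_SIZE are all hypothetical, memset() stands in for the
kernel's page-clearing helpers, and the order in which the left and right
regions are cleared is an arbitrary choice here. The only property taken
from the thread is that the neighbourhood of the faulting page is cleared
last.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096L	/* assumed page size, illustration only */

/* Inclusive range of page indices; treated as empty when start > end. */
struct pg_range {
	long start;
	long end;
};

static long clamp_long(long v, long lo, long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Number of pages in an inclusive range, 0 if the range is empty. */
static long range_len(struct pg_range r)
{
	return r.end >= r.start ? r.end - r.start + 1 : 0;
}

/*
 * Split the page indices [0, nr_pages - 1] into three regions:
 *   r[0]: pages to the right of the neighbourhood
 *   r[1]: pages to the left of the neighbourhood
 *   r[2]: the faulting page plus up to 'radius' pages on either side
 * Regions are stored in the order in which they get cleared.
 */
static void split_three_way(long nr_pages, long fault_idx, long radius,
			    struct pg_range r[3])
{
	const long last = nr_pages - 1;

	assert(nr_pages > 0 && radius >= 0);
	assert(fault_idx >= 0 && fault_idx <= last);

	r[2].start = clamp_long(fault_idx - radius, 0, last);
	r[2].end = clamp_long(fault_idx + radius, 0, last);

	r[1].start = 0;
	r[1].end = r[2].start - 1;	/* empty if the neighbourhood starts at 0 */

	r[0].start = r[2].end + 1;	/* empty if the neighbourhood ends at 'last' */
	r[0].end = last;
}

/* Zero 'nr_pages' pages at 'buf', leaving the neighbourhood for last. */
static void zero_pages_sketch(unsigned char *buf, long nr_pages,
			      long fault_idx, long radius)
{
	struct pg_range r[3];

	split_three_way(nr_pages, fault_idx, radius, r);

	for (int i = 0; i < 3; i++) {
		const long n = range_len(r[i]);

		if (n > 0)
			memset(buf + r[i].start * SKETCH_PAGE_SIZE, 0,
			       (size_t)n * SKETCH_PAGE_SIZE);
	}
}
```

With a radius of 2, a fault at page 0 of a 512-page folio yields an empty
left region, a neighbourhood of pages 0..2, and a right region of pages
3..511, so the three regions always cover every page exactly once.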