From: Dev Jain <dev.jain@arm.com>
To: Lance Yang <lance.yang@linux.dev>
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	anshuman.khandual@arm.com, axelrasmussen@google.com,
	baohua@kernel.org, baolin.wang@linux.alibaba.com, bhe@redhat.com,
	chrisl@kernel.org, david@kernel.org, harry.yoo@oracle.com,
	hughd@google.com, jannh@google.com, kas@kernel.org,
	kasong@tencent.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, ljs@kernel.org, mhocko@suse.com,
	nphamcs@gmail.com, pfalcato@suse.de, riel@surriel.com,
	rppt@kernel.org, ryan.roberts@arm.com, shikemeng@huaweicloud.com,
	surenb@google.com, vbabka@kernel.org, weixugc@google.com,
	willy@infradead.org, youngjun.park@lge.com, yuanchu@google.com,
	yuzhao@google.com, ziy@nvidia.com
Subject: Re: [PATCH 0/9] mm/rmap: Optimize anonymous large folio unmapping
Date: Wed, 11 Mar 2026 13:41:35 +0530
Message-ID: <3fb1c538-e666-4bb3-b3f4-6f631f4db325@arm.com>
In-Reply-To: <20260310125940.39707-1-lance.yang@linux.dev>



On 10/03/26 6:29 pm, Lance Yang wrote:
> 
> On Tue, Mar 10, 2026 at 01:00:04PM +0530, Dev Jain wrote:
>> Speed up unmapping of anonymous large folios by clearing the ptes, and
>> setting swap ptes, in one go.
>>
>> The following benchmark (stolen from Barry at [1]) is used to measure the
>> time taken to swapout 256M worth of memory backed by 64K large folios:
>>
>> #define _GNU_SOURCE
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <sys/mman.h>
>> #include <string.h>
>> #include <time.h>
>> #include <unistd.h>
>> #include <errno.h>
>>
>> #define SIZE_MB 256
>> #define SIZE_BYTES (SIZE_MB * 1024 * 1024)
>>
>> int main() {
>>     void *addr = mmap(NULL, SIZE_BYTES, PROT_READ | PROT_WRITE,
>>                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>     if (addr == MAP_FAILED) {
>>         perror("mmap failed");
>>         return 1;
>>     }
>>
>>     memset(addr, 0, SIZE_BYTES);
>>
>>     struct timespec start, end;
>>     clock_gettime(CLOCK_MONOTONIC, &start);
>>
>>     if (madvise(addr, SIZE_BYTES, MADV_PAGEOUT) != 0) {
>>         perror("madvise(MADV_PAGEOUT) failed");
>>         munmap(addr, SIZE_BYTES);
>>         return 1;
>>     }
>>
>>     clock_gettime(CLOCK_MONOTONIC, &end);
>>
>>     long duration_ns = (end.tv_sec - start.tv_sec) * 1e9 +
>>                        (end.tv_nsec - start.tv_nsec);
>>     printf("madvise(MADV_PAGEOUT) took %ld ns (%.3f ms)\n",
>>            duration_ns, duration_ns / 1e6);
>>
>>     munmap(addr, SIZE_BYTES);
>>     return 0;
>> }
>>
>> On arm64, only showing one of the middle values in the distribution:
>>
>> without patch:
>> madvise(MADV_PAGEOUT) took 52192959 ns (52.193 ms)
>>
>> with patch:
>> madvise(MADV_PAGEOUT) took 26676625 ns (26.677 ms)
> 
> Good numbers! Just tested on x86 KVM with THP=never, no performance
> regression observed.

Thanks Lance!

I'll still try to collect both no-regression and performance-improvement
numbers on x86 myself and post them in the next version.
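
For reference, the arm64 figures quoted above (52192959 ns without the
patch, 26676625 ns with it) work out to roughly a 1.96x speedup. Below is
a minimal sketch of how repeated runs of the benchmark could be reduced to
a middle value and a ratio. The helper names median_ns() and speedup() are
hypothetical and are not part of the posted benchmark or the series; the
choice of the lower-middle sample for an even number of runs is likewise
an assumption.

```c
#include <stdlib.h>

/* qsort comparator for long values (ascending). */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: sorts the samples in place and returns the
 * median. For an even count this picks the lower of the two middle
 * samples (an assumption, not something the thread specifies).
 */
static long median_ns(long *samples, size_t n)
{
    qsort(samples, n, sizeof(*samples), cmp_long);
    return samples[(n - 1) / 2];
}

/* Hypothetical helper: ratio of baseline time to patched time. */
static double speedup(long base_ns, long patched_ns)
{
    return (double)base_ns / (double)patched_ns;
}
```

Feeding each configuration's per-run durations through median_ns() and
then passing the two medians to speedup() gives a single comparable figure
per architecture.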

> 
> Cheers,
> Lance
> 


