From: Dev Jain <dev.jain@arm.com>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Cc: akpm@linux-foundation.org, axelrasmussen@google.com,
	yuanchu@google.com, david@kernel.org, hughd@google.com,
	chrisl@kernel.org, kasong@tencent.com, weixugc@google.com,
	Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, riel@surriel.com,
	harry.yoo@oracle.com, jannh@google.com, pfalcato@suse.de,
	baolin.wang@linux.alibaba.com, shikemeng@huaweicloud.com,
	nphamcs@gmail.com, bhe@redhat.com, baohua@kernel.org,
	youngjun.park@lge.com, ziy@nvidia.com, kas@kernel.org,
	willy@infradead.org, yuzhao@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, ryan.roberts@arm.com,
	anshuman.khandual@arm.com
Subject: Re: [PATCH 0/9] mm/rmap: Optimize anonymous large folio unmapping
Date: Tue, 10 Mar 2026 14:58:47 +0530
Message-ID: <0a5279ba-7964-49ea-a84b-c80b5e53c359@arm.com>
In-Reply-To: <032ab9a6-a49f-41c7-81fe-1c6fbaf94a79@lucifer.local>



On 10/03/26 1:32 pm, Lorenzo Stoakes (Oracle) wrote:
> On Tue, Mar 10, 2026 at 01:00:04PM +0530, Dev Jain wrote:
>> Speed up unmapping of anonymous large folios by clearing the ptes, and
>> setting swap ptes, in one go.
>>
>> The following benchmark (stolen from Barry at [1]) is used to measure the
>> time taken to swapout 256M worth of memory backed by 64K large folios:
>>
>>  #define _GNU_SOURCE
>>  #include <stdio.h>
>>  #include <stdlib.h>
>>  #include <sys/mman.h>
>>  #include <string.h>
>>  #include <time.h>
>>  #include <unistd.h>
>>  #include <errno.h>
>>
>>  #define SIZE_MB 256
>>  #define SIZE_BYTES (SIZE_MB * 1024 * 1024)
>>
>>  int main() {
>>      void *addr = mmap(NULL, SIZE_BYTES, PROT_READ | PROT_WRITE,
>>                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>      if (addr == MAP_FAILED) {
>>          perror("mmap failed");
>>          return 1;
>>      }
>>
>>      memset(addr, 0, SIZE_BYTES);
>>
>>      struct timespec start, end;
>>      clock_gettime(CLOCK_MONOTONIC, &start);
>>
>>      if (madvise(addr, SIZE_BYTES, MADV_PAGEOUT) != 0) {
>>          perror("madvise(MADV_PAGEOUT) failed");
>>          munmap(addr, SIZE_BYTES);
>>          return 1;
>>      }
>>
>>      clock_gettime(CLOCK_MONOTONIC, &end);
>>
>>      long duration_ns = (end.tv_sec - start.tv_sec) * 1000000000L +
>>                         (end.tv_nsec - start.tv_nsec);
>>      printf("madvise(MADV_PAGEOUT) took %ld ns (%.3f ms)\n",
>>             duration_ns, duration_ns / 1e6);
>>
>>      munmap(addr, SIZE_BYTES);
>>      return 0;
>>  }
>>
>> On arm64, only showing one of the middle values in the distribution:
>>
> 
> This doesn't seem very statistically valid.
> 
> How about you give median, stddev etc.? Variance matters too.

Okay.
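
For the record, a minimal sketch of how the per-run timings could be
summarized, assuming the benchmark is run N times and each duration is
stored as a double. The helper names (median_ns, mean_ns,
sample_variance_ns) are made up for illustration and are not part of the
series or of the original benchmark:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator for doubles, ascending order. */
static int cmp_double(const void *a, const void *b)
{
	double x = *(const double *)a, y = *(const double *)b;

	return (x > y) - (x < y);
}

/* Median of n samples. Sorts the array in place; n must be > 0. */
static double median_ns(double *samples, size_t n)
{
	assert(n > 0);
	qsort(samples, n, sizeof(*samples), cmp_double);
	if (n % 2)
		return samples[n / 2];
	return (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
}

/* Arithmetic mean of n samples; n must be > 0. */
static double mean_ns(const double *samples, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* Sample variance (n - 1 denominator); n must be > 1. */
static double sample_variance_ns(const double *samples, size_t n)
{
	double m, acc = 0.0;

	assert(n > 1);
	m = mean_ns(samples, n);
	for (size_t i = 0; i < n; i++)
		acc += (samples[i] - m) * (samples[i] - m);
	return acc / (double)(n - 1);
}
```

The standard deviation is then sqrt() of the sample variance (from
<math.h>, linked with -lm).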

> 
>> without patch:
>> madvise(MADV_PAGEOUT) took 52192959 ns (52.193 ms)
>>
>> with patch:
>> madvise(MADV_PAGEOUT) took 26676625 ns (26.677 ms)
> 
> You have a habit of only giving data on arm64, and not mentioning whether you've
> tested on any other arch/setup.

I did do an x86 build, but forgot to mention it.
I didn't collect numbers there because I assumed this patchset is generic
and has nothing to do with the arm64 cont bit, but arguably I should have.

> 
> I've commented on this before so I'm a bit disappointed you've done the exact
> same thing here again. Especially since you've previously introduced regressions
> this way.
> 
> Please can you test this on (hardware!) x86-64 _at least_ as well and confirm
> you aren't regressing anything for 4K pages?

Lemme go and manage that :)

> 
>>
>>
>> [1] https://lore.kernel.org/all/20250513084620.58231-1-21cnbao@gmail.com/
>>
>> ---
>> Based on mm-unstable bb420884e9e0. mm-selftests pass.
>>
>> Dev Jain (9):
>>   mm/rmap: make nr_pages signed in try_to_unmap_one
>>   mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one
>>   mm/rmap: refactor lazyfree unmap commit path to
>>     commit_ttu_lazyfree_folio()
>>   mm/memory: Batch set uffd-wp markers during zapping
>>   mm/rmap: batch unmap folios belonging to uffd-wp VMAs
>>   mm/swapfile: Make folio_dup_swap batchable
>>   mm/swapfile: Make folio_put_swap batchable
>>   mm/rmap: introduce folio_try_share_anon_rmap_ptes
>>   mm/rmap: enable batch unmapping of anonymous folios
>>
>>  include/linux/mm_inline.h  |  37 +++--
>>  include/linux/page-flags.h |  11 ++
>>  include/linux/rmap.h       |  38 ++++-
>>  mm/internal.h              |  26 ++++
>>  mm/memory.c                |  26 +---
>>  mm/mprotect.c              |  17 ---
>>  mm/rmap.c                  | 274 ++++++++++++++++++++++++-------------
>>  mm/shmem.c                 |   8 +-
>>  mm/swap.h                  |  10 +-
>>  mm/swapfile.c              |  25 ++--
>>  10 files changed, 298 insertions(+), 174 deletions(-)
>>
>> --
>> 2.34.1
>>
