From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Andrea Arcangeli <aarcange@redhat.com>,
David Hildenbrand <david@redhat.com>,
linuxppc-dev@lists.ozlabs.org,
Anshuman Khandual <anshuman.khandual@arm.com>,
Dave Chinner <david@fromorbit.com>,
Mel Gorman <mgorman@techsingularity.net>,
Peter Xu <peterx@redhat.com>,
linux-mm@kvack.org, Hugh Dickins <hughd@google.com>,
Nadav Amit <namit@vmware.com>,
Nicholas Piggin <npiggin@gmail.com>,
Mike Rapoport <rppt@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>
Subject: [PATCH v2 0/7] mm/autonuma: replace savedwrite infrastructure
Date: Tue, 8 Nov 2022 18:46:45 +0100
Message-ID: <20221108174652.198904-1-david@redhat.com>
This series is based on mm-unstable.
As discussed in my talk at LPC, we can reuse the same mechanism for
deciding whether to map a pte writable when upgrading permissions via
mprotect() -- e.g., PROT_READ -> PROT_READ|PROT_WRITE -- to replace the
savedwrite infrastructure used for NUMA hinting faults (e.g., PROT_NONE
-> PROT_READ|PROT_WRITE).
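For readers who want to see the mprotect() permission upgrade from user space, here is a minimal sketch. It assumes a private anonymous mapping that is populated before it is downgraded. The function name demo_mprotect_upgrade() is made up for illustration and is not part of this series. User space cannot directly observe whether the final write takes a write fault, so the sketch only shows the sequence of calls that the optimization targets.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not from the series): exercise a
 * PROT_READ -> PROT_READ|PROT_WRITE upgrade on a populated private
 * anonymous page. Returns 0 on success, -1 on any failure.
 */
int demo_mprotect_upgrade(void)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	size_t len;
	char *mem;
	int ret = -1;

	if (pagesize <= 0)
		return -1;
	len = (size_t)pagesize;

	mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return -1;

	/* Touch the page so that it is actually populated. */
	memset(mem, 0xff, len);

	/* Downgrade to read-only ... */
	if (mprotect(mem, len, PROT_READ))
		goto out;

	/* ... and upgrade again: this is the case discussed above. */
	if (mprotect(mem, len, PROT_READ | PROT_WRITE))
		goto out;

	/* The write must succeed either way, with or without a fault. */
	mem[0] = 0x2a;
	ret = (mem[0] == 0x2a) ? 0 : -1;
out:
	munmap(mem, len);
	return ret;
}
```

Calling demo_mprotect_upgrade() from main() and checking for a return value of 0 is enough to run it. Whether the last write faults would have to be measured separately, e.g., by counting page faults.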
Instead of maintaining previous write permissions for a pte/pmd, we
re-determine if the pte/pmd can be writable. The big benefit is that we
get common logic for deciding whether we can map a pte/pmd writable on
protection changes.
For private mappings, there should be no difference -- from
what I understand, that is what autonuma benchmarks care about.
I ran autonumabench for v1 on a system with 2 NUMA nodes, 96 GiB each via:
perf stat --null --repeat 10
The numa01 benchmark is quite noisy in my environment and I have not
managed to reduce the noise so far.
numa01:
mm-unstable: 146.88 +- 6.54 seconds time elapsed ( +- 4.45% )
mm-unstable++: 147.45 +- 13.39 seconds time elapsed ( +- 9.08% )
numa02:
mm-unstable: 16.0300 +- 0.0624 seconds time elapsed ( +- 0.39% )
mm-unstable++: 16.1281 +- 0.0945 seconds time elapsed ( +- 0.59% )
It is worth noting that for shared writable mappings that require
writenotify, we will only avoid write faults if the pte/pmd is dirty
(inherited from the older mprotect logic). If we ever care about optimizing
that further, we'd need a different mechanism to identify whether the FS
still needs to get notified on the next write access.
In any case, such an optimization would then not be autonuma-specific;
mprotect() permission upgrades would similarly benefit from it.
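The rules described in this letter can be summarized as a toy model. This is a sketch under stated assumptions and not the kernel implementation: struct model_entry, its fields and model_can_map_writable() are hypothetical names invented for illustration. The private-mapping rule is taken from the title of patch #1 (exclusive anon pages). Treating shared mappings that do not require writenotify as simply writable is my simplification. The real checks also consider more state (e.g., softdirty and uffd-wp tracking), which the model omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified snapshot of the state consulted per entry. */
struct model_entry {
	bool vma_writable;      /* the mapping permits writes at all */
	bool vma_shared;        /* shared (true) or private (false) mapping */
	bool needs_writenotify; /* FS wants a real fault on the next write */
	bool dirty;             /* the entry is already marked dirty */
	bool anon_exclusive;    /* anon page exclusive to this process */
};

/*
 * Toy model (an assumption, not kernel code): may the entry be mapped
 * writable right away when protection changes, avoiding a later write
 * fault?
 */
bool model_can_map_writable(const struct model_entry *e)
{
	assert(e != NULL);

	/* Never writable if the mapping does not allow writes. */
	if (!e->vma_writable)
		return false;

	/* Private mapping: only exclusive anon pages qualify. */
	if (!e->vma_shared)
		return e->anon_exclusive;

	/* Shared + writenotify: only if the entry is already dirty. */
	if (e->needs_writenotify)
		return e->dirty;

	/* Shared without writenotify: modeled as plainly writable. */
	return true;
}
```

In this model, a clean entry in a shared mapping that requires writenotify still takes a write fault on the next write, which is the limitation noted above.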
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
v1 -> v2:
* "mm/mprotect: factor out check whether manual PTE write upgrades are
required"
-> Added
* "mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite"
-> Simplify and don't optimize for failed migration
-> Update patch description
RFC -> v1:
* "mm/mprotect: allow clean exclusive anon pages to be writable"
-> Move comment change to patch #2
* "mm/mprotect: minor can_change_pte_writable() cleanups"
-> Adjust comments
* "mm/huge_memory: try avoiding write faults when changing PMD protection"
-> Fix wrong check
* "selftests/vm: anon_cow: add mprotect() optimization tests"
-> Add basic tests for the mprotect() optimization
David Hildenbrand (6):
  mm/mprotect: minor can_change_pte_writable() cleanups
  mm/huge_memory: try avoiding write faults when changing PMD protection
  mm/mprotect: factor out check whether manual PTE write upgrades are
    required
  mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite
  mm: remove unused savedwrite infrastructure
  selftests/vm: anon_cow: add mprotect() optimization tests
Nadav Amit (1):
  mm/mprotect: allow clean exclusive anon pages to be writable
arch/powerpc/include/asm/book3s/64/pgtable.h | 80 +-------------------
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 +-
include/linux/mm.h | 18 ++++-
include/linux/pgtable.h | 24 ------
mm/debug_vm_pgtable.c | 32 --------
mm/huge_memory.c | 64 ++++++++++++----
mm/ksm.c | 9 +--
mm/memory.c | 16 +++-
mm/mprotect.c | 50 ++++++------
tools/testing/selftests/vm/anon_cow.c | 49 +++++++++++-
10 files changed, 158 insertions(+), 186 deletions(-)
--
2.38.1