linux-csky.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>,
	Matthew Wilcox <willy@infradead.org>, Guo Ren <guoren@kernel.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	"David S . Miller" <davem@davemloft.net>,
	Andreas Larsson <andreas@gaisler.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Nicolas Pitre <nico@fluxnic.net>,
	Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	David Hildenbrand <david@redhat.com>,
	Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
	Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
	Dave Young <dyoung@redhat.com>, Tony Luck <tony.luck@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Dave Martin <Dave.Martin@arm.com>,
	James Morse <james.morse@arm.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Jann Horn <jannh@google.com>, Pedro Falcato <pfalcato@suse.de>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-csky@vger.kernel.org,
	linux-mips@vger.kernel.org, linux-s390@vger.kernel.org,
	sparclinux@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-cxl@vger.kernel.org, linux-mm@kvack.org,
	ntfs3@lists.linux.dev, kexec@lists.infradead.org,
	kasan-dev@googlegroups.com, Jason Gunthorpe <jgg@nvidia.com>
Subject: [PATCH 00/16] expand mmap_prepare functionality, port more users
Date: Mon,  8 Sep 2025 12:10:31 +0100	[thread overview]
Message-ID: <cover.1757329751.git.lorenzo.stoakes@oracle.com> (raw)

Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
callback"), The f_op->mmap hook has been deprecated in favour of
f_op->mmap_prepare.

This was introduced in order to make it possible for us to eventually
eliminate the f_op->mmap hook which is highly problematic as it allows
drivers and filesystems raw access to a VMA which is not yet correctly
initialised.

This hook also introduces complexity for the memory mapping operation, as
we must correctly unwind what we do should an error arises.

Overall this interface being so open has caused significant problems for
us, including security issues, it is important for us to simply eliminate
this as a source of problems.

Therefore this series continues what was established by extending the
functionality further to permit more drivers and filesystems to use
mmap_prepare.

After updating some areas that can simply use mmap_prepare as-is, and
performing some housekeeping, we then introduce two new hooks:

f_op->mmap_complete - this is invoked at the point of the VMA having been
correctly inserted, though with the VMA write lock still held. mmap_prepare
must also be specified.

This expands the use of mmap_prepare to those callers which need to
prepopulate mappings, as well as any which does genuinely require access to
the VMA.

It's simple - we will let the caller access the VMA, but only once it's
established. At this point unwinding issues is simple - we just unmap the
VMA.

The VMA is also then correctly initialised at this stage so there can be no
issues arising from a not-fully initialised VMA at this point.

The other newly added hook is:

f_op->mmap_abort - this is only valid in conjunction with mmap_prepare and
mmap_complete. This is called should an error arise between mmap_prepare
and mmap_complete (not as a result of mmap_prepare but rather some other
part of the mapping logic).

This is required in case mmap_prepare wishes to establish state or locks
which need to be cleaned up on completion. If we did not provide this, then
this could not be permitted as this cleanup would otherwise not occur
should the mapping fail between the two calls.

We then add split remap_pfn_range*() functions which allow for PFN remap (a
typical mapping prepopulation operation) split between a prepare/complete
step, as well as io_mremap_pfn_range_prepare, complete for a similar
purpose.

From there we update various mm-adjacent logic to use this functionality as
a first set of changes, as well as resctl and cramfs filesystems to round
off the non-stacked filesystem instances.


REVIEWER NOTE:
~~~~~~~~~~~~~~

I considered putting the complete, abort callbacks in vm_ops, however this
won't work because then we would be unable to adjust helpers like
generic_file_mmap_prepare() (which provides vm_ops) to provide the correct
complete, abort callbacks.

Conceptually it also makes more sense to have these in f_op as they are
one-off operations performed at mmap time to establish the VMA, rather than
a property of the VMA itself.

Lorenzo Stoakes (16):
  mm/shmem: update shmem to use mmap_prepare
  device/dax: update devdax to use mmap_prepare
  mm: add vma_desc_size(), vma_desc_pages() helpers
  relay: update relay to use mmap_prepare
  mm/vma: rename mmap internal functions to avoid confusion
  mm: introduce the f_op->mmap_complete, mmap_abort hooks
  doc: update porting, vfs documentation for mmap_[complete, abort]
  mm: add remap_pfn_range_prepare(), remap_pfn_range_complete()
  mm: introduce io_remap_pfn_range_prepare, complete
  mm/hugetlb: update hugetlbfs to use mmap_prepare, mmap_complete
  mm: update mem char driver to use mmap_prepare, mmap_complete
  mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort
  mm: update cramfs to use mmap_prepare, mmap_complete
  fs/proc: add proc_mmap_[prepare, complete] hooks for procfs
  fs/proc: update vmcore to use .proc_mmap_[prepare, complete]
  kcov: update kcov to use mmap_prepare, mmap_complete

 Documentation/filesystems/porting.rst |   9 ++
 Documentation/filesystems/vfs.rst     |  35 +++++++
 arch/csky/include/asm/pgtable.h       |   5 +
 arch/mips/alchemy/common/setup.c      |  28 +++++-
 arch/mips/include/asm/pgtable.h       |  10 ++
 arch/s390/kernel/crash_dump.c         |   6 +-
 arch/sparc/include/asm/pgtable_32.h   |  29 +++++-
 arch/sparc/include/asm/pgtable_64.h   |  29 +++++-
 drivers/char/mem.c                    |  80 ++++++++-------
 drivers/dax/device.c                  |  32 +++---
 fs/cramfs/inode.c                     | 134 ++++++++++++++++++--------
 fs/hugetlbfs/inode.c                  |  86 +++++++++--------
 fs/ntfs3/file.c                       |   2 +-
 fs/proc/inode.c                       |  13 ++-
 fs/proc/vmcore.c                      |  53 +++++++---
 fs/resctrl/pseudo_lock.c              |  56 ++++++++---
 include/linux/fs.h                    |   4 +
 include/linux/mm.h                    |  53 +++++++++-
 include/linux/mm_types.h              |   5 +
 include/linux/proc_fs.h               |   5 +
 include/linux/shmem_fs.h              |   3 +-
 include/linux/vmalloc.h               |  10 +-
 kernel/kcov.c                         |  40 +++++---
 kernel/relay.c                        |  32 +++---
 mm/memory.c                           | 128 +++++++++++++++---------
 mm/secretmem.c                        |   2 +-
 mm/shmem.c                            |  49 +++++++---
 mm/util.c                             |  18 +++-
 mm/vma.c                              |  96 +++++++++++++++---
 mm/vmalloc.c                          |  16 ++-
 tools/testing/vma/vma_internal.h      |  31 +++++-
 31 files changed, 810 insertions(+), 289 deletions(-)

--
2.51.0

             reply	other threads:[~2025-09-08 11:11 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-08 11:10 Lorenzo Stoakes [this message]
2025-09-08 11:10 ` [PATCH 01/16] mm/shmem: update shmem to use mmap_prepare Lorenzo Stoakes
2025-09-08 14:59   ` David Hildenbrand
2025-09-08 15:28     ` Lorenzo Stoakes
2025-09-09  3:19   ` Baolin Wang
2025-09-09  9:08     ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 02/16] device/dax: update devdax " Lorenzo Stoakes
2025-09-08 15:03   ` David Hildenbrand
2025-09-08 15:28     ` Lorenzo Stoakes
2025-09-08 15:31       ` David Hildenbrand
2025-09-08 11:10 ` [PATCH 03/16] mm: add vma_desc_size(), vma_desc_pages() helpers Lorenzo Stoakes
2025-09-08 12:51   ` Jason Gunthorpe
2025-09-08 13:12     ` Lorenzo Stoakes
2025-09-08 13:32       ` Jason Gunthorpe
2025-09-08 14:09         ` Lorenzo Stoakes
2025-09-08 14:20           ` Jason Gunthorpe
2025-09-08 14:47             ` Lorenzo Stoakes
2025-09-08 15:07               ` David Hildenbrand
2025-09-08 15:35                 ` Lorenzo Stoakes
2025-09-08 17:30                   ` David Hildenbrand
2025-09-09  9:21                     ` Lorenzo Stoakes
2025-09-08 15:16               ` Jason Gunthorpe
2025-09-08 15:24                 ` David Hildenbrand
2025-09-08 15:33                   ` Jason Gunthorpe
2025-09-08 15:46                     ` David Hildenbrand
2025-09-08 15:50                       ` David Hildenbrand
2025-09-08 15:56                         ` Jason Gunthorpe
2025-09-08 17:36                           ` David Hildenbrand
2025-09-08 20:24                             ` Lorenzo Stoakes
2025-09-08 15:33                   ` Lorenzo Stoakes
2025-09-08 15:10   ` David Hildenbrand
2025-09-08 11:10 ` [PATCH 04/16] relay: update relay to use mmap_prepare Lorenzo Stoakes
2025-09-08 15:15   ` David Hildenbrand
2025-09-08 15:29     ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 05/16] mm/vma: rename mmap internal functions to avoid confusion Lorenzo Stoakes
2025-09-08 15:19   ` David Hildenbrand
2025-09-08 15:31     ` Lorenzo Stoakes
2025-09-08 17:38       ` David Hildenbrand
2025-09-09  9:04         ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 06/16] mm: introduce the f_op->mmap_complete, mmap_abort hooks Lorenzo Stoakes
2025-09-08 12:55   ` Jason Gunthorpe
2025-09-08 13:19     ` Lorenzo Stoakes
2025-09-08 15:27   ` David Hildenbrand
2025-09-09  9:13     ` Lorenzo Stoakes
2025-09-09  9:26       ` David Hildenbrand
2025-09-09  9:37         ` Lorenzo Stoakes
2025-09-09 16:43           ` Suren Baghdasaryan
2025-09-09 17:36             ` Lorenzo Stoakes
2025-09-09 16:44   ` Suren Baghdasaryan
2025-09-08 11:10 ` [PATCH 07/16] doc: update porting, vfs documentation for mmap_[complete, abort] Lorenzo Stoakes
2025-09-08 23:17   ` Randy Dunlap
2025-09-09  9:02     ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 08/16] mm: add remap_pfn_range_prepare(), remap_pfn_range_complete() Lorenzo Stoakes
2025-09-08 13:00   ` Jason Gunthorpe
2025-09-08 13:27     ` Lorenzo Stoakes
2025-09-08 13:35       ` Jason Gunthorpe
2025-09-08 14:18         ` Lorenzo Stoakes
2025-09-08 16:03           ` Jason Gunthorpe
2025-09-08 16:07             ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 09/16] mm: introduce io_remap_pfn_range_prepare, complete Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 10/16] mm/hugetlb: update hugetlbfs to use mmap_prepare, mmap_complete Lorenzo Stoakes
2025-09-08 13:11   ` Jason Gunthorpe
2025-09-08 13:37     ` Lorenzo Stoakes
2025-09-08 13:52       ` Jason Gunthorpe
2025-09-08 14:19         ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 11/16] mm: update mem char driver " Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 12/16] mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort Lorenzo Stoakes
2025-09-08 13:24   ` Jason Gunthorpe
2025-09-08 13:40     ` Lorenzo Stoakes
2025-09-08 14:27     ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 13/16] mm: update cramfs to use mmap_prepare, mmap_complete Lorenzo Stoakes
2025-09-08 13:27   ` Jason Gunthorpe
2025-09-08 13:44     ` Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 14/16] fs/proc: add proc_mmap_[prepare, complete] hooks for procfs Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 15/16] fs/proc: update vmcore to use .proc_mmap_[prepare, complete] Lorenzo Stoakes
2025-09-08 11:10 ` [PATCH 16/16] kcov: update kcov to use mmap_prepare, mmap_complete Lorenzo Stoakes
2025-09-08 13:30   ` Jason Gunthorpe
2025-09-08 13:47     ` Lorenzo Stoakes
2025-09-08 13:27 ` [PATCH 00/16] expand mmap_prepare functionality, port more users Jan Kara
2025-09-08 14:48   ` Lorenzo Stoakes
2025-09-08 15:04     ` Jason Gunthorpe
2025-09-08 15:15       ` Lorenzo Stoakes
2025-09-09  8:31 ` Alexander Gordeev
2025-09-09  8:59   ` Lorenzo Stoakes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1757329751.git.lorenzo.stoakes@oracle.com \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Dave.Martin@arm.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=almaz.alexandrovich@paragon-software.com \
    --cc=andreas@gaisler.com \
    --cc=andreyknvl@gmail.com \
    --cc=arnd@arndb.de \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bhe@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=brauner@kernel.org \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=davem@davemloft.net \
    --cc=david@redhat.com \
    --cc=dvyukov@google.com \
    --cc=dyoung@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=guoren@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=james.morse@arm.com \
    --cc=jannh@google.com \
    --cc=jgg@nvidia.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=nico@fluxnic.net \
    --cc=ntfs3@lists.linux.dev \
    --cc=nvdimm@lists.linux.dev \
    --cc=osalvador@suse.de \
    --cc=pfalcato@suse.de \
    --cc=reinette.chatre@intel.com \
    --cc=rppt@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=surenb@google.com \
    --cc=svens@linux.ibm.com \
    --cc=tony.luck@intel.com \
    --cc=tsbogend@alpha.franken.de \
    --cc=urezki@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).