linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 00/37] mm: remove nth_page()
@ 2025-09-01 15:03 David Hildenbrand
From: David Hildenbrand @ 2025-09-01 15:03 UTC
  To: linux-kernel
  Cc: David Hildenbrand, Andrew Morton, Linus Torvalds, Jason Gunthorpe,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jens Axboe, Marek Szyprowski,
	Robin Murphy, John Hubbard, Peter Xu, Alexander Potapenko,
	Marco Elver, Dmitry Vyukov, Brendan Jackman, Johannes Weiner,
	Zi Yan, Dennis Zhou, Tejun Heo, Christoph Lameter, Muchun Song,
	Oscar Salvador, x86, linux-arm-kernel, linux-mips, linux-s390,
	linux-crypto, linux-ide, intel-gfx, dri-devel, linux-mmc,
	linux-arm-kernel, linux-scsi, kvm, virtualization, linux-mm,
	io-uring, iommu, kasan-dev, wireguard, netdev, linux-kselftest,
	linux-riscv, Albert Ou, Alexander Gordeev, Alexandre Ghiti,
	Alexandru Elisei, Alex Dubov, Alex Williamson, Andreas Larsson,
	Bart Van Assche, Borislav Petkov, Brett Creeley, Catalin Marinas,
	Christian Borntraeger, Christophe Leroy, Damien Le Moal,
	Dave Hansen, David Airlie, David S. Miller, Doug Gilbert,
	Heiko Carstens, Herbert Xu, Huacai Chen, Ingo Molnar,
	James E.J. Bottomley, Jani Nikula, Jason A. Donenfeld,
	Jason Gunthorpe, Jesper Nilsson, Joonas Lahtinen, Kevin Tian,
	Lars Persson, Madhavan Srinivasan, Martin K. Petersen,
	Maxim Levitsky, Michael Ellerman, Nicholas Piggin, Niklas Cassel,
	Palmer Dabbelt, Paul Walmsley, Pavel Begunkov, Rodrigo Vivi,
	SeongJae Park, Shameer Kolothum, Shuah Khan, Simona Vetter,
	Sven Schnelle, Thomas Bogendoerfer, Thomas Gleixner,
	Tvrtko Ursulin, Ulf Hansson, Vasily Gorbik, WANG Xuerui, Wei Yang,
	Will Deacon, Yishai Hadas

This is based on mm-unstable.

I will only CC non-MM folks on the cover letter and the respective patch
to not flood too many inboxes (the lists receive all patches).

--

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is that
on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.
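
To make the sizes concrete, here is a small sketch of the arithmetic. It is
not code from this series: the TOY_* names are hypothetical, and the values
(4 KiB pages, 128 MiB memory sections, order 10 as the largest buddy
allocation) are assumed example numbers that differ between architectures
and configs.

```c
#include <assert.h>

/* Assumed example values, not taken from the series. */
#define TOY_PAGE_SHIFT        12U	/* 4 KiB pages */
#define TOY_SECTION_SIZE_BITS 27U	/* 128 MiB memory sections */
#define TOY_PFN_SECTION_SHIFT (TOY_SECTION_SIZE_BITS - TOY_PAGE_SHIFT)

/* Number of memory sections touched by 2^order pages starting at start_pfn. */
static unsigned long toy_sections_spanned(unsigned long start_pfn,
					  unsigned int order)
{
	const unsigned long last_pfn = start_pfn + (1UL << order) - 1;

	return (last_pfn >> TOY_PFN_SECTION_SHIFT) -
	       (start_pfn >> TOY_PFN_SECTION_SHIFT) + 1;
}

static void toy_demo(void)
{
	/* Assumed largest buddy allocation (order 10, 4 MiB): one section. */
	assert(toy_sections_spanned(0, 10) == 1);
	/* A 2 MiB folio (order 9), naturally aligned: one section. */
	assert(toy_sections_spanned(512, 9) == 1);
	/* A 1 GiB folio (order 18), e.g. gigantic hugetlb: eight sections. */
	assert(toy_sections_spanned(0, 18) == 8);
}
```

With these assumed values, only the gigantic folio spans more than one
memory section, which is the case where the per-section memmap matters.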

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to then go from (++pfn)->page, which is rather nasty.
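
The difference can be illustrated with a minimal userspace sketch. This is
a toy model, not kernel code: struct toy_page, the toy_*() helpers, and the
tiny section size are all hypothetical. The toy stores the PFN directly in
each page as a simplification, and it places the two per-section memmaps
in one backing array with a gap between them, so that stepping past the end
of a section's memmap is well defined in C.

```c
#include <assert.h>

#define TOY_PAGES_PER_SECTION 4UL
#define TOY_NR_SECTIONS       2UL
#define TOY_INVALID_PFN       (~0UL)

struct toy_page {
	unsigned long pfn;	/* toy simplification: PFN stored directly */
};

/* Section 0 uses entries 0..3, entries 4..7 are a gap, section 1 uses 8..11. */
static struct toy_page toy_backing[3 * TOY_PAGES_PER_SECTION];
static struct toy_page *const toy_section_memmap[TOY_NR_SECTIONS] = {
	&toy_backing[0],
	&toy_backing[2 * TOY_PAGES_PER_SECTION],
};

static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	return toy_section_memmap[pfn / TOY_PAGES_PER_SECTION] +
	       pfn % TOY_PAGES_PER_SECTION;
}

static unsigned long toy_page_to_pfn(const struct toy_page *page)
{
	return page->pfn;
}

/* Go page -> pfn, add n, then go pfn -> page again. */
static struct toy_page *toy_nth_page(struct toy_page *page, unsigned long n)
{
	return toy_pfn_to_page(toy_page_to_pfn(page) + n);
}

static void toy_init(void)
{
	unsigned long i;

	for (i = 0; i < 3 * TOY_PAGES_PER_SECTION; i++)
		toy_backing[i].pfn = TOY_INVALID_PFN;
	for (i = 0; i < TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION; i++)
		toy_pfn_to_page(i)->pfn = i;
}

static void toy_demo(void)
{
	struct toy_page *last;

	toy_init();
	last = toy_pfn_to_page(TOY_PAGES_PER_SECTION - 1);

	/* Within a section, plain pointer arithmetic is fine. */
	assert(toy_pfn_to_page(1) + 1 == toy_pfn_to_page(2));
	/* Across the section boundary, "page + 1" lands in the gap ... */
	assert(toy_page_to_pfn(last + 1) == TOY_INVALID_PFN);
	/* ... while the pfn round trip finds the first page of section 1. */
	assert(toy_page_to_pfn(toy_nth_page(last, 1)) == TOY_PAGES_PER_SECTION);
}
```

The round trip in toy_nth_page() is what makes the helper correct on every
step, and it is also why it is more expensive than "page + n".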

Likely, many people have no idea when nth_page() is required and when
it might be dropped.

We refer to such problematic PFN ranges as "non-contiguous pages".
If we only deal with "contiguous pages", there is no need for nth_page().
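
A check for "contiguous pages" can be sketched in userspace as follows.
This is a toy model under stated assumptions, not the implementation from
this series: struct toy_page, toy_pfn_to_page(), toy_range_contiguous() and
the memmap layout are hypothetical. The idea is that pages within one
section are always contiguous, so it is enough to test the first page of
every further section the range enters.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGES_PER_SECTION 4UL

struct toy_page {
	unsigned long flags;	/* placeholder, unused in this sketch */
};

/*
 * Three per-section memmaps in one backing array: sections 0 and 1 happen
 * to be adjacent (entries 0..7), section 2 follows after a gap (12..15).
 */
static struct toy_page toy_backing[4 * TOY_PAGES_PER_SECTION];
static struct toy_page *const toy_section_memmap[] = {
	&toy_backing[0],
	&toy_backing[1 * TOY_PAGES_PER_SECTION],
	&toy_backing[3 * TOY_PAGES_PER_SECTION],
};

static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	return toy_section_memmap[pfn / TOY_PAGES_PER_SECTION] +
	       pfn % TOY_PAGES_PER_SECTION;
}

/* True if "page + n" reaches every page of [start_pfn, start_pfn + nr_pages). */
static bool toy_range_contiguous(unsigned long start_pfn,
				 unsigned long nr_pages)
{
	const struct toy_page *start = toy_pfn_to_page(start_pfn);
	const unsigned long end_pfn = start_pfn + nr_pages;
	unsigned long pfn;

	/* Round up to the next section boundary, then check one pfn per section. */
	pfn = (start_pfn + TOY_PAGES_PER_SECTION - 1) /
	      TOY_PAGES_PER_SECTION * TOY_PAGES_PER_SECTION;
	for (; pfn < end_pfn; pfn += TOY_PAGES_PER_SECTION)
		if ((unsigned long)(toy_pfn_to_page(pfn) - start) !=
		    pfn - start_pfn)
			return false;
	return true;
}

static void toy_demo(void)
{
	assert(toy_range_contiguous(8, 4));	/* stays within section 2 */
	assert(toy_range_contiguous(0, 8));	/* sections 0+1 are adjacent */
	assert(!toy_range_contiguous(4, 8));	/* gap before section 2 */
}
```

Note that a range crossing a section boundary is not automatically
"non-contiguous" in this model: it only is when the two memmaps do not
happen to be adjacent.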

Besides that "obvious" folio case, we might end up using nth_page()
within CMA allocations (again, could span memory sections), and in
one corner case (kfence) when processing memblock allocations (again,
could span memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity-check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired by the discussion at [1] between Linus, Jason
and me, so kudos to them.

[1] https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u

v1 -> v2:
* "fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()"
 -> Add comment for loop and remove comment of function regarding
    copy_page_to_iter().
* Various smaller patch description tweaks I am not going to list for my
  sanity
* "mips: mm: convert __flush_dcache_pages() to
  __flush_dcache_folio_pages()"
 -> Fix flush_dcache_page()
 -> Drop "extern"
* "mm/gup: remove record_subpages()"
 -> Added
* "mm/hugetlb: check for unreasonable folio sizes when registering hstate"
 -> Refine comment
* "mm/cma: refuse handing out non-contiguous page ranges"
 -> Add comment above loop
* "mm/page_alloc: reject unreasonable folio/compound page sizes in
   alloc_contig_range_noprof()"
 -> Added comment above check
* "mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock()"
 -> Refined comment

RFC -> v1:
* "wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel
   config"
 -> Mention that it was never really relevant for the test
* "mm/mm_init: make memmap_init_compound() look more like
   prep_compound_page()"
 -> Mention the setup of page links
* "mm: limit folio/compound page sizes in problematic kernel configs"
 -> Improve comment for PUD handling, mentioning hugetlb and dax
* "mm: simplify folio_page() and folio_page_idx()"
 -> Call variable "n"
* "mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()"
 -> Keep __init_single_page() and refer to the usage of
    memblock_reserved_mark_noinit()
* "fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()"
* "fs: hugetlbfs: remove nth_page() usage within folio in
   adjust_range_hwpoison()"
 -> Separate nth_page() removal from cleanups
 -> Further improve cleanups
* "io_uring/zcrx: remove nth_page() usage within folio"
 -> Keep the io_copy_cache for now and limit to nth_page() removal
* "mm/gup: drop nth_page() usage within folio when recording subpages"
 -> Cleanup record_subpages a bit
* "mm/cma: refuse handing out non-contiguous page ranges"
 -> Replace another instance of "pfn_to_page(pfn)" where we already have
    the page
* "scatterlist: disallow non-contigous page ranges in a single SG entry"
 -> We have to EXPORT the symbol. I thought about moving it to mm_inline.h,
    but I really don't want to include that in include/linux/scatterlist.h
* "ata: libata-eh: drop nth_page() usage within SG entry"
* "mspro_block: drop nth_page() usage within SG entry"
* "memstick: drop nth_page() usage within SG entry"
* "mmc: drop nth_page() usage within SG entry"
 -> Keep PAGE_SHIFT
* "scsi: scsi_lib: drop nth_page() usage within SG entry"
* "scsi: sg: drop nth_page() usage within SG entry"
 -> Split patches, Keep PAGE_SHIFT
* "crypto: remove nth_page() usage within SG entry"
 -> Keep PAGE_SHIFT
* "kfence: drop nth_page() usage"
 -> Keep modifying i and use "start_pfn" only instead

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: x86@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Cc: linux-ide@vger.kernel.org
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-arm-kernel@axis.com
Cc: linux-scsi@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux.dev
Cc: linux-mm@kvack.org
Cc: io-uring@vger.kernel.org
Cc: iommu@lists.linux.dev
Cc: kasan-dev@googlegroups.com
Cc: wireguard@lists.zx2c4.com
Cc: netdev@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-riscv@lists.infradead.org

David Hildenbrand (37):
  mm: stop making SPARSEMEM_VMEMMAP user-selectable
  arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP"
  s390/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP"
  x86/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP"
  wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu
    kernel config
  mm/page_alloc: reject unreasonable folio/compound page sizes in
    alloc_contig_range_noprof()
  mm/memremap: reject unreasonable folio/compound page sizes in
    memremap_pages()
  mm/hugetlb: check for unreasonable folio sizes when registering hstate
  mm/mm_init: make memmap_init_compound() look more like
    prep_compound_page()
  mm: sanity-check maximum folio size in folio_set_order()
  mm: limit folio/compound page sizes in problematic kernel configs
  mm: simplify folio_page() and folio_page_idx()
  mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  mm/mm/percpu-km: drop nth_page() usage within single allocation
  fs: hugetlbfs: remove nth_page() usage within folio in
    adjust_range_hwpoison()
  fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()
  mm/pagewalk: drop nth_page() usage within folio in folio_walk_start()
  mm/gup: drop nth_page() usage within folio when recording subpages
  mm/gup: remove record_subpages()
  io_uring/zcrx: remove nth_page() usage within folio
  mips: mm: convert __flush_dcache_pages() to
    __flush_dcache_folio_pages()
  mm/cma: refuse handing out non-contiguous page ranges
  dma-remap: drop nth_page() in dma_common_contiguous_remap()
  scatterlist: disallow non-contigous page ranges in a single SG entry
  ata: libata-sff: drop nth_page() usage within SG entry
  drm/i915/gem: drop nth_page() usage within SG entry
  mspro_block: drop nth_page() usage within SG entry
  memstick: drop nth_page() usage within SG entry
  mmc: drop nth_page() usage within SG entry
  scsi: scsi_lib: drop nth_page() usage within SG entry
  scsi: sg: drop nth_page() usage within SG entry
  vfio/pci: drop nth_page() usage within SG entry
  crypto: remove nth_page() usage within SG entry
  mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock()
  kfence: drop nth_page() usage
  block: update comment of "struct bio_vec" regarding nth_page()
  mm: remove nth_page()

 arch/arm64/Kconfig                            |  1 -
 arch/mips/include/asm/cacheflush.h            | 11 +++--
 arch/mips/mm/cache.c                          |  8 ++--
 arch/s390/Kconfig                             |  1 -
 arch/x86/Kconfig                              |  1 -
 crypto/ahash.c                                |  4 +-
 crypto/scompress.c                            |  8 ++--
 drivers/ata/libata-sff.c                      |  6 +--
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  2 +-
 drivers/memstick/core/mspro_block.c           |  3 +-
 drivers/memstick/host/jmb38x_ms.c             |  3 +-
 drivers/memstick/host/tifm_ms.c               |  3 +-
 drivers/mmc/host/tifm_sd.c                    |  4 +-
 drivers/mmc/host/usdhi6rol0.c                 |  4 +-
 drivers/scsi/scsi_lib.c                       |  3 +-
 drivers/scsi/sg.c                             |  3 +-
 drivers/vfio/pci/pds/lm.c                     |  3 +-
 drivers/vfio/pci/virtio/migrate.c             |  3 +-
 fs/hugetlbfs/inode.c                          | 36 +++++---------
 include/crypto/scatterwalk.h                  |  4 +-
 include/linux/bvec.h                          |  7 +--
 include/linux/mm.h                            | 48 +++++++++++++++----
 include/linux/page-flags.h                    |  5 +-
 include/linux/scatterlist.h                   |  3 +-
 io_uring/zcrx.c                               |  4 +-
 kernel/dma/remap.c                            |  2 +-
 mm/Kconfig                                    |  3 +-
 mm/cma.c                                      | 39 +++++++++------
 mm/gup.c                                      | 36 +++++++-------
 mm/hugetlb.c                                  | 22 +++++----
 mm/internal.h                                 |  1 +
 mm/kfence/core.c                              | 12 +++--
 mm/memremap.c                                 |  3 ++
 mm/mm_init.c                                  | 15 +++---
 mm/page_alloc.c                               | 10 +++-
 mm/pagewalk.c                                 |  2 +-
 mm/percpu-km.c                                |  2 +-
 mm/util.c                                     | 36 ++++++++++++++
 tools/testing/scatterlist/linux/mm.h          |  1 -
 .../selftests/wireguard/qemu/kernel.config    |  1 -
 40 files changed, 217 insertions(+), 146 deletions(-)


base-commit: b73c6f2b5712809f5f386780ac46d1d78c31b2e6
-- 
2.50.1


Thread overview: 40+ messages
2025-09-01 15:03 [PATCH v2 00/37] mm: remove nth_page() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 01/37] mm: stop making SPARSEMEM_VMEMMAP user-selectable David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 02/37] arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 03/37] s390/Kconfig: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 04/37] x86/Kconfig: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 05/37] wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 06/37] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 07/37] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 08/37] mm/hugetlb: check for unreasonable folio sizes when registering hstate David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 09/37] mm/mm_init: make memmap_init_compound() look more like prep_compound_page() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 10/37] mm: sanity-check maximum folio size in folio_set_order() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 11/37] mm: limit folio/compound page sizes in problematic kernel configs David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 12/37] mm: simplify folio_page() and folio_page_idx() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 13/37] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 14/37] mm/mm/percpu-km: drop nth_page() usage within single allocation David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 15/37] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 16/37] fs: hugetlbfs: cleanup " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 17/37] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 18/37] mm/gup: drop nth_page() usage within folio when recording subpages David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 19/37] mm/gup: remove record_subpages() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 20/37] io_uring/zcrx: remove nth_page() usage within folio David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 21/37] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 22/37] mm/cma: refuse handing out non-contiguous page ranges David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 23/37] dma-remap: drop nth_page() in dma_common_contiguous_remap() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 24/37] scatterlist: disallow non-contigous page ranges in a single SG entry David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 25/37] ata: libata-sff: drop nth_page() usage within " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 26/37] drm/i915/gem: " David Hildenbrand
2025-09-02  9:22   ` Tvrtko Ursulin
2025-09-02  9:42     ` David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 27/37] mspro_block: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 28/37] memstick: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 29/37] mmc: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 30/37] scsi: scsi_lib: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 31/37] scsi: sg: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 32/37] vfio/pci: " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 33/37] crypto: remove " David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 34/37] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 35/37] kfence: drop nth_page() usage David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 36/37] block: update comment of "struct bio_vec" regarding nth_page() David Hildenbrand
2025-09-01 15:03 ` [PATCH v2 37/37] mm: remove nth_page() David Hildenbrand
