From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Andrew Morton, Linus Torvalds, Jason Gunthorpe,
	Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jens Axboe, Marek Szyprowski,
	Robin Murphy, John Hubbard, Peter Xu, Alexander Potapenko,
	Marco Elver, Dmitry Vyukov, Brendan Jackman, Johannes Weiner,
	Zi Yan, Dennis Zhou, Tejun Heo, Christoph Lameter, Muchun Song,
	Oscar Salvador, x86@kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-mips@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-crypto@vger.kernel.org, linux-ide@vger.kernel.org,
	intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-mmc@vger.kernel.org, linux-arm-kernel@axis.com,
	linux-scsi@vger.kernel.org, kvm@vger.kernel.org,
	virtualization@lists.linux.dev, linux-mm@kvack.org,
	io-uring@vger.kernel.org, iommu@lists.linux.dev,
	kasan-dev@googlegroups.com, wireguard@lists.zx2c4.com,
	netdev@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-riscv@lists.infradead.org, Albert Ou, Alexander Gordeev,
	Alexandre Ghiti, Alex Dubov, Alex Williamson, Andreas Larsson,
	Borislav Petkov, Brett Creeley, Catalin Marinas,
	Christian Borntraeger, Christophe Leroy, Damien Le Moal,
	Dave Hansen, David Airlie, "David S. Miller", Doug Gilbert,
	Heiko Carstens, Herbert Xu, Huacai Chen, Ingo Molnar,
	"James E.J. Bottomley", Jani Nikula, "Jason A. Donenfeld",
	Jason Gunthorpe, Jesper Nilsson, Joonas Lahtinen, Kevin Tian,
	Lars Persson, Madhavan Srinivasan, "Martin K. Petersen",
Petersen" , Maxim Levitsky , Michael Ellerman , Nicholas Piggin , Niklas Cassel , Palmer Dabbelt , Paul Walmsley , Rodrigo Vivi , Shameer Kolothum , Shuah Khan , Simona Vetter , Sven Schnelle , Thomas Bogendoerfer , Thomas Gleixner , Tvrtko Ursulin , Ulf Hansson , Vasily Gorbik , WANG Xuerui , Will Deacon , Yishai Hadas Subject: [PATCH RFC 00/35] mm: remove nth_page() Date: Thu, 21 Aug 2025 22:06:26 +0200 Message-ID: <20250821200701.1329277-1-david@redhat.com> X-Mailer: git-send-email 2.50.1 Precedence: bulk X-Mailing-List: linux-s390@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit This is based on mm-unstable and was cross-compiled heavily. I should probably have already dropped the RFC label but I want to hear first if I ignored some corner case (SG entries?) and I need to do at least a bit more testing. I will only CC non-MM folks on the cover letter and the respective patch to not flood too many inboxes (the lists receive all patches). --- As discussed recently with Linus, nth_page() is just nasty and we would like to remove it. To recap, the reason we currently need nth_page() within a folio is because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the memmap is allocated per memory section. While buddy allocations cannot cross memory section boundaries, hugetlb and dax folios can. So crossing a memory section means that "page++" could do the wrong thing. Instead, nth_page() on these problematic configs always goes from page->pfn, to the go from (++pfn)->page, which is rather nasty. Likely, many people have no idea when nth_page() is required and when it might be dropped. We refer to such problematic PFN ranges and "non-contiguous pages". If we only deal with "contiguous pages", there is not need for nth_page(). Besides that "obvious" folio case, we might end up using nth_page() within CMA allocations (again, could span memory sections), and in one corner case (kfence) when processing memblock allocations (again, could span memory sections). So let's handle all that, add sanity checks, and remove nth_page(). Patch #1 -> #5 : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups Patch #6 -> #12 : disallow folios to have non-contiguous pages Patch #13 -> #20 : remove nth_page() usage within folios Patch #21 : disallow CMA allocations of non-contiguous pages Patch #22 -> #31 : sanity+check + remove nth_page() usage within SG entry Patch #32 : sanity-check + remove nth_page() usage in unpin_user_page_range_dirty_lock() Patch #33 : remove nth_page() in kfence Patch #34 : adjust stale comment regarding nth_page Patch #35 : mm: remove nth_page() A lot of this is inspired from the discussion at [1] between Linus, Jason and me, so cudos to them. [1] https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u Cc: Andrew Morton Cc: Linus Torvalds Cc: Jason Gunthorpe Cc: Lorenzo Stoakes Cc: "Liam R. 
Howlett" Cc: Vlastimil Babka Cc: Mike Rapoport Cc: Suren Baghdasaryan Cc: Michal Hocko Cc: Jens Axboe Cc: Marek Szyprowski Cc: Robin Murphy Cc: John Hubbard Cc: Peter Xu Cc: Alexander Potapenko Cc: Marco Elver Cc: Dmitry Vyukov Cc: Brendan Jackman Cc: Johannes Weiner Cc: Zi Yan Cc: Dennis Zhou Cc: Tejun Heo Cc: Christoph Lameter Cc: Muchun Song Cc: Oscar Salvador Cc: x86@kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-ide@vger.kernel.org Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-mmc@vger.kernel.org Cc: linux-arm-kernel@axis.com Cc: linux-scsi@vger.kernel.org Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux.dev Cc: linux-mm@kvack.org Cc: io-uring@vger.kernel.org Cc: iommu@lists.linux.dev Cc: kasan-dev@googlegroups.com Cc: wireguard@lists.zx2c4.com Cc: netdev@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: linux-riscv@lists.infradead.org David Hildenbrand (35): mm: stop making SPARSEMEM_VMEMMAP user-selectable arm64: Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" s390/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" x86/Kconfig: drop superfluous "select SPARSEMEM_VMEMMAP" wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() mm/hugetlb: check for unreasonable folio sizes when registering hstate mm/mm_init: make memmap_init_compound() look more like prep_compound_page() mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() mm: sanity-check maximum folio size in folio_set_order() mm: limit folio/compound page sizes in problematic kernel configs mm: simplify folio_page() and folio_page_idx() mm/mm/percpu-km: drop nth_page() usage within single allocation fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() mm/gup: drop nth_page() usage within folio when recording subpages io_uring/zcrx: remove "struct io_copy_cache" and one nth_page() usage io_uring/zcrx: remove nth_page() usage within folio mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() mm/cma: refuse handing out non-contiguous page ranges dma-remap: drop nth_page() in dma_common_contiguous_remap() scatterlist: disallow non-contigous page ranges in a single SG entry ata: libata-eh: drop nth_page() usage within SG entry drm/i915/gem: drop nth_page() usage within SG entry mspro_block: drop nth_page() usage within SG entry memstick: drop nth_page() usage within SG entry mmc: drop nth_page() usage within SG entry scsi: core: drop nth_page() usage within SG entry vfio/pci: drop nth_page() usage within SG entry crypto: remove nth_page() usage within SG entry mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() kfence: drop nth_page() usage block: update comment of "struct bio_vec" regarding nth_page() mm: remove nth_page() arch/arm64/Kconfig | 1 - arch/mips/include/asm/cacheflush.h | 11 +++-- arch/mips/mm/cache.c | 8 ++-- arch/s390/Kconfig | 1 - arch/x86/Kconfig | 1 - crypto/ahash.c | 4 +- crypto/scompress.c | 8 ++-- drivers/ata/libata-sff.c | 6 +-- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- drivers/memstick/core/mspro_block.c | 3 +- drivers/memstick/host/jmb38x_ms.c | 3 +- drivers/memstick/host/tifm_ms.c | 3 +- drivers/mmc/host/tifm_sd.c | 4 +- 
 drivers/mmc/host/usdhi6rol0.c              |  4 +-
 drivers/scsi/scsi_lib.c                    |  3 +-
 drivers/scsi/sg.c                          |  3 +-
 drivers/vfio/pci/pds/lm.c                  |  3 +-
 drivers/vfio/pci/virtio/migrate.c          |  3 +-
 fs/hugetlbfs/inode.c                       | 25 ++++------
 include/crypto/scatterwalk.h               |  4 +-
 include/linux/bvec.h                       |  7 +--
 include/linux/mm.h                         | 48 +++++++++++++++----
 include/linux/page-flags.h                 |  5 +-
 include/linux/scatterlist.h                |  4 +-
 io_uring/zcrx.c                            | 34 ++++---------
 kernel/dma/remap.c                         |  2 +-
 mm/Kconfig                                 |  3 +-
 mm/cma.c                                   | 36 +++++++++-----
 mm/gup.c                                   | 13 +++--
 mm/hugetlb.c                               | 23 ++++-----
 mm/internal.h                              |  1 +
 mm/kfence/core.c                           | 17 ++++---
 mm/memremap.c                              |  3 ++
 mm/mm_init.c                               | 13 ++---
 mm/page_alloc.c                            |  5 +-
 mm/pagewalk.c                              |  2 +-
 mm/percpu-km.c                             |  2 +-
 mm/util.c                                  | 33 +++++++++++++
 tools/testing/scatterlist/linux/mm.h       |  1 -
 .../selftests/wireguard/qemu/kernel.config |  1 -
 40 files changed, 203 insertions(+), 150 deletions(-)


base-commit: c0e3b3f33ba7b767368de4afabaf7c1ddfdc3872
-- 
2.50.1