From: Alexandru Elisei <alexandru.elisei@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Alexander Potapenko <glider@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Brendan Jackman <jackmanb@google.com>,
	Christoph Lameter <cl@gentwo.org>,
	Dennis Zhou <dennis@kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	iommu@lists.linux.dev, io-uring@vger.kernel.org,
	Jason Gunthorpe <jgg@nvidia.com>, Jens Axboe <axboe@kernel.dk>,
	Johannes Weiner <hannes@cmpxchg.org>,
	John Hubbard <jhubbard@nvidia.com>,
	kasan-dev@googlegroups.com, kvm@vger.kernel.org,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-arm-kernel@axis.com, linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, linux-ide@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-mmc@vger.kernel.org, linux-mm@kvack.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Marco Elver <elver@google.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	netdev@vger.kernel.org, Oscar Salvador <osalvador@suse.de>,
	Peter Xu <peterx@redhat.com>, Robin Murphy <robin.murphy@arm.com>,
	Suren Baghdasaryan <surenb@google.com>, Tejun Heo <tj@kernel.org>,
	virtualization@lists.linux.dev, Vlastimil Babka <vbabka@suse.cz>,
	wireguard@lists.zx2c4.com, x86@kernel.org,
	Zi Yan <ziy@nvidia.com>
Subject: Re: [PATCH RFC 21/35] mm/cma: refuse handing out non-contiguous page ranges
Date: Tue, 26 Aug 2025 11:45:58 +0100
Message-ID: <aK2QZnzS1ErHK5tP@raptor>
In-Reply-To: <20250821200701.1329277-22-david@redhat.com>

Hi David,

On Thu, Aug 21, 2025 at 10:06:47PM +0200, David Hildenbrand wrote:
> Let's disallow handing out PFN ranges with non-contiguous pages, so we
> can remove the nth-page usage in __cma_alloc(), and so any callers don't
> have to worry about that either when wanting to blindly iterate pages.
> 
> This is really only a problem in configs with SPARSEMEM but without
> SPARSEMEM_VMEMMAP, and only when we would cross memory sections in some
> cases.
> 
> Will this cause harm? Probably not, because it's mostly 32bit that does
> not support SPARSEMEM_VMEMMAP. If this ever becomes a problem we could
> look into allocating the memmap for the memory sections spanned by a
> single CMA region in one go from memblock.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  include/linux/mm.h |  6 ++++++
>  mm/cma.c           | 36 +++++++++++++++++++++++-------------
>  mm/util.c          | 33 +++++++++++++++++++++++++++++++++
>  3 files changed, 62 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ef360b72cb05c..f59ad1f9fc792 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -209,9 +209,15 @@ extern unsigned long sysctl_user_reserve_kbytes;
>  extern unsigned long sysctl_admin_reserve_kbytes;
>  
>  #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> +bool page_range_contiguous(const struct page *page, unsigned long nr_pages);
>  #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
>  #else
>  #define nth_page(page,n) ((page) + (n))
> +static inline bool page_range_contiguous(const struct page *page,
> +		unsigned long nr_pages)
> +{
> +	return true;
> +}
>  #endif
>  
>  /* to align the pointer to the (next) page boundary */
> diff --git a/mm/cma.c b/mm/cma.c
> index 2ffa4befb99ab..1119fa2830008 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -780,10 +780,8 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>  				unsigned long count, unsigned int align,
>  				struct page **pagep, gfp_t gfp)
>  {
> -	unsigned long mask, offset;
> -	unsigned long pfn = -1;
> -	unsigned long start = 0;
>  	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
> +	unsigned long start, pfn, mask, offset;
>  	int ret = -EBUSY;
>  	struct page *page = NULL;
>  
> @@ -795,7 +793,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>  	if (bitmap_count > bitmap_maxno)
>  		goto out;
>  
> -	for (;;) {
> +	for (start = 0; ; start = bitmap_no + mask + 1) {
>  		spin_lock_irq(&cma->lock);
>  		/*
>  		 * If the request is larger than the available number
> @@ -812,6 +810,22 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>  			spin_unlock_irq(&cma->lock);
>  			break;
>  		}
> +
> +		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
> +		page = pfn_to_page(pfn);
> +
> +		/*
> +		 * Do not hand out page ranges that are not contiguous, so
> +		 * callers can just iterate the pages without having to worry
> +		 * about these corner cases.
> +		 */
> +		if (!page_range_contiguous(page, count)) {
> +			spin_unlock_irq(&cma->lock);
> +			pr_warn_ratelimited("%s: %s: skipping incompatible area [0x%lx-0x%lx]",
> +					    __func__, cma->name, pfn, pfn + count - 1);
> +			continue;
> +		}
> +
>  		bitmap_set(cmr->bitmap, bitmap_no, bitmap_count);
>  		cma->available_count -= count;
>  		/*
> @@ -821,29 +835,25 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>  		 */
>  		spin_unlock_irq(&cma->lock);
>  
> -		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
>  		mutex_lock(&cma->alloc_mutex);
>  		ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
>  		mutex_unlock(&cma->alloc_mutex);
> -		if (ret == 0) {
> -			page = pfn_to_page(pfn);
> +		if (!ret)
>  			break;
> -		}
>  
>  		cma_clear_bitmap(cma, cmr, pfn, count);
>  		if (ret != -EBUSY)
>  			break;
>  
>  		pr_debug("%s(): memory range at pfn 0x%lx %p is busy, retrying\n",
> -			 __func__, pfn, pfn_to_page(pfn));
> +			 __func__, pfn, page);
>  
>  		trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn),

Nitpick: I think you already have the page here.
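
i.e., if I'm reading it right, the call could presumably just be:

		trace_cma_alloc_busy_retry(cma->name, pfn, page, count, align);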

>  					   count, align);
> -		/* try again with a bit different memory target */
> -		start = bitmap_no + mask + 1;
>  	}
>  out:
> -	*pagep = page;
> +	if (!ret)
> +		*pagep = page;
>  	return ret;
>  }
>  
> @@ -882,7 +892,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>  	 */
>  	if (page) {
>  		for (i = 0; i < count; i++)
> -			page_kasan_tag_reset(nth_page(page, i));
> +			page_kasan_tag_reset(page + i);

I had a look at it; I'm not very familiar with CMA, but the changes look
equivalent to what was there before. Not sure that's worth a Reviewed-by tag,
but here it is in case you want to add it:

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

Just so I can better understand the problem being fixed: I guess you can have
two consecutive PFNs with non-consecutive struct pages when the two PFNs fall
into adjacent memory sections, since each section's memmap is allocated
separately, is that correct?
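
To make sure I'm picturing it right, here is a rough sketch of the case I have
in mind, assuming SPARSEMEM without SPARSEMEM_VMEMMAP and a pfn that is the
last PFN of a section (the variable names are just for illustration):

	/*
	 * Each memory section has its own memmap allocation, so the
	 * struct pages of two neighbouring sections are not guaranteed
	 * to be adjacent in virtual memory.
	 */
	struct page *p = pfn_to_page(pfn);	/* last page of section N */
	struct page *q = pfn_to_page(pfn + 1);	/* first page of section N + 1 */

	/* 'q == p + 1' may not hold; only nth_page(p, 1) is guaranteed to be q. */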

Thanks,
Alex

>  	}
>  
>  	if (ret && !(gfp & __GFP_NOWARN)) {
> diff --git a/mm/util.c b/mm/util.c
> index d235b74f7aff7..0bf349b19b652 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -1280,4 +1280,37 @@ unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte,
>  {
>  	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, 0);
>  }
> +
> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> +/**
> + * page_range_contiguous - test whether the page range is contiguous
> + * @page: the start of the page range.
> + * @nr_pages: the number of pages in the range.
> + *
> + * Test whether the page range is contiguous, such that they can be iterated
> + * naively, corresponding to iterating a contiguous PFN range.
> + *
> + * This function should primarily only be used for debug checks, or when
> + * working with page ranges that are not naturally contiguous (e.g., pages
> + * within a folio are).
> + *
> + * Returns true if contiguous, otherwise false.
> + */
> +bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
> +{
> +	const unsigned long start_pfn = page_to_pfn(page);
> +	const unsigned long end_pfn = start_pfn + nr_pages;
> +	unsigned long pfn;
> +
> +	/*
> +	 * The memmap is allocated per memory section. We need to check
> +	 * each involved memory section once.
> +	 */
> +	for (pfn = ALIGN(start_pfn, PAGES_PER_SECTION);
> +	     pfn < end_pfn; pfn += PAGES_PER_SECTION)
> +		if (unlikely(page + (pfn - start_pfn) != pfn_to_page(pfn)))
> +			return false;
> +	return true;
> +}
> +#endif
>  #endif /* CONFIG_MMU */
> -- 
> 2.50.1
> 
> 
