Date: Mon, 16 Jun 2025 17:00:58 -0700
To: 
mm-commits@vger.kernel.org,zhang.lyra@gmail.com,willy@infradead.org,will@kernel.org,m.szyprowski@samsung.com,lorenzo.stoakes@oracle.com,john@groves.net,jhubbard@nvidia.com,jgg@nvidia.com,hch@lst.de,gerald.schaefer@linux.ibm.com,debug@rivosinc.com,david@redhat.com,dan.j.williams@intel.com,bjorn@rivosinc.com,bjorn@kernel.org,balbirs@nvidia.com,apopple@nvidia.com,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-gup-remove-pxx_devmap-usage-from-get_user_pages.patch added to mm-new branch
Message-Id: <20250617000058.CCF0FC4CEEA@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org


The patch titled
     Subject: mm/gup: remove pXX_devmap usage from get_user_pages()
has been added to the -mm mm-new branch.  Its filename is
     mm-gup-remove-pxx_devmap-usage-from-get_user_pages.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-gup-remove-pxx_devmap-usage-from-get_user_pages.patch

This patch will later appear in the mm-new branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews.  Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Alistair Popple
Subject: mm/gup: remove pXX_devmap usage from get_user_pages()
Date: Mon, 16 Jun 2025 21:58:07 +1000

GUP uses pXX_devmap() calls to see if it needs to get a reference on the
associated pgmap data structure to ensure the pages won't go away.
However, it is the driver's responsibility to ensure that pages which are
mapped (i.e. discoverable by GUP) are not offlined or removed from the
memmap, so there is no need to hold a reference on the pgmap data
structure to ensure this.

Furthermore, mappings with PFN_DEV are no longer created, so this is
effectively dead code anyway and can be removed.

Link: https://lkml.kernel.org/r/e6f00c4b64843dbc0494c5cae9cb861cf7fcd8b6.1750075065.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple
Reviewed-by: Jason Gunthorpe
Reviewed-by: Dan Williams
Cc: Balbir Singh
Cc: Björn Töpel
Cc: Björn Töpel
Cc: Christoph Hellwig
Cc: Chunyan Zhang
Cc: David Hildenbrand
Cc: Deepak Gupta
Cc: Gerald Schaefer
Cc: Inki Dae
Cc: John Groves
Cc: John Hubbard
Cc: Lorenzo Stoakes
Cc: Matthew Wilcox (Oracle)
Cc: Will Deacon
Signed-off-by: Andrew Morton
---

 include/linux/huge_mm.h |    3
 mm/gup.c                |  160 +-------------------------------------
 mm/huge_memory.c        |   40 ---------
 3 files changed, 5 insertions(+), 198 deletions(-)

--- a/include/linux/huge_mm.h~mm-gup-remove-pxx_devmap-usage-from-get_user_pages
+++ a/include/linux/huge_mm.h
@@ -473,9 +473,6 @@ static inline bool folio_test_pmd_mappab
 	return folio_order(folio) >= HPAGE_PMD_ORDER;
 }
 
-struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, int flags, struct dev_pagemap **pgmap);
-
 vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf);
 
 extern struct folio *huge_zero_folio;
--- a/mm/gup.c~mm-gup-remove-pxx_devmap-usage-from-get_user_pages
+++ a/mm/gup.c
@@ -679,31 +679,9 @@ static struct page *follow_huge_pud(stru
 		return NULL;
 
 	pfn += (addr & ~PUD_MASK) >> PAGE_SHIFT;
-
-	if (IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) &&
-	    pud_devmap(pud)) {
-		/*
-		 * device mapped pages can only be returned if the caller
-		 * will manage the page reference count.
-		 *
-		 * At least one of FOLL_GET | FOLL_PIN must be set, so
-		 * assert that here:
-		 */
-		if (!(flags & (FOLL_GET | FOLL_PIN)))
-			return ERR_PTR(-EEXIST);
-
-		if (flags & FOLL_TOUCH)
-			touch_pud(vma, addr, pudp, flags & FOLL_WRITE);
-
-		ctx->pgmap = get_dev_pagemap(pfn, ctx->pgmap);
-		if (!ctx->pgmap)
-			return ERR_PTR(-EFAULT);
-	}
-
 	page = pfn_to_page(pfn);
 
-	if (!pud_devmap(pud) && !pud_write(pud) &&
-	    gup_must_unshare(vma, flags, page))
+	if (!pud_write(pud) && gup_must_unshare(vma, flags, page))
 		return ERR_PTR(-EMLINK);
 
 	ret = try_grab_folio(page_folio(page), 1, flags);
@@ -857,8 +835,7 @@ static struct page *follow_page_pte(stru
 	page = vm_normal_page(vma, address, pte);
 
 	/*
-	 * We only care about anon pages in can_follow_write_pte() and don't
-	 * have to worry about pte_devmap() because they are never anon.
+	 * We only care about anon pages in can_follow_write_pte().
 	 */
 	if ((flags & FOLL_WRITE) &&
 	    !can_follow_write_pte(pte, page, vma, flags)) {
@@ -866,18 +843,7 @@ static struct page *follow_page_pte(stru
 		goto out;
 	}
 
-	if (!page && pte_devmap(pte) && (flags & (FOLL_GET | FOLL_PIN))) {
-		/*
-		 * Only return device mapping pages in the FOLL_GET or FOLL_PIN
-		 * case since they are only valid while holding the pgmap
-		 * reference.
-		 */
-		*pgmap = get_dev_pagemap(pte_pfn(pte), *pgmap);
-		if (*pgmap)
-			page = pte_page(pte);
-		else
-			goto no_page;
-	} else if (unlikely(!page)) {
+	if (unlikely(!page)) {
 		if (flags & FOLL_DUMP) {
 			/* Avoid special (like zero) pages in core dumps */
 			page = ERR_PTR(-EFAULT);
@@ -959,14 +925,6 @@ static struct page *follow_pmd_mask(stru
 		return no_page_table(vma, flags, address);
 	if (!pmd_present(pmdval))
 		return no_page_table(vma, flags, address);
-	if (pmd_devmap(pmdval)) {
-		ptl = pmd_lock(mm, pmd);
-		page = follow_devmap_pmd(vma, address, pmd, flags, &ctx->pgmap);
-		spin_unlock(ptl);
-		if (page)
-			return page;
-		return no_page_table(vma, flags, address);
-	}
 	if (likely(!pmd_leaf(pmdval)))
 		return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
 
@@ -2896,7 +2854,7 @@ static int gup_fast_pte_range(pmd_t pmd,
 		int *nr)
 {
 	struct dev_pagemap *pgmap = NULL;
-	int nr_start = *nr, ret = 0;
+	int ret = 0;
 	pte_t *ptep, *ptem;
 
 	ptem = ptep = pte_offset_map(&pmd, addr);
@@ -2920,16 +2878,7 @@ static int gup_fast_pte_range(pmd_t pmd,
 		if (!pte_access_permitted(pte, flags & FOLL_WRITE))
 			goto pte_unmap;
 
-		if (pte_devmap(pte)) {
-			if (unlikely(flags & FOLL_LONGTERM))
-				goto pte_unmap;
-
-			pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
-			if (unlikely(!pgmap)) {
-				gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages);
-				goto pte_unmap;
-			}
-		} else if (pte_special(pte))
+		if (pte_special(pte))
 			goto pte_unmap;
 
 		/* If it's not marked as special it must have a valid memmap. */
@@ -3001,91 +2950,6 @@ static int gup_fast_pte_range(pmd_t pmd,
 }
 #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
 
-#if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
-static int gup_fast_devmap_leaf(unsigned long pfn, unsigned long addr,
-	unsigned long end, unsigned int flags, struct page **pages, int *nr)
-{
-	int nr_start = *nr;
-	struct dev_pagemap *pgmap = NULL;
-
-	do {
-		struct folio *folio;
-		struct page *page = pfn_to_page(pfn);
-
-		pgmap = get_dev_pagemap(pfn, pgmap);
-		if (unlikely(!pgmap)) {
-			gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages);
-			break;
-		}
-
-		folio = try_grab_folio_fast(page, 1, flags);
-		if (!folio) {
-			gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages);
-			break;
-		}
-		folio_set_referenced(folio);
-		pages[*nr] = page;
-		(*nr)++;
-		pfn++;
-	} while (addr += PAGE_SIZE, addr != end);
-
-	put_dev_pagemap(pgmap);
-	return addr == end;
-}
-
-static int gup_fast_devmap_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages,
-		int *nr)
-{
-	unsigned long fault_pfn;
-	int nr_start = *nr;
-
-	fault_pfn = pmd_pfn(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-	if (!gup_fast_devmap_leaf(fault_pfn, addr, end, flags, pages, nr))
-		return 0;
-
-	if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
-		gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages);
-		return 0;
-	}
-	return 1;
-}
-
-static int gup_fast_devmap_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages,
-		int *nr)
-{
-	unsigned long fault_pfn;
-	int nr_start = *nr;
-
-	fault_pfn = pud_pfn(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-	if (!gup_fast_devmap_leaf(fault_pfn, addr, end, flags, pages, nr))
-		return 0;
-
-	if (unlikely(pud_val(orig) != pud_val(*pudp))) {
-		gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages);
-		return 0;
-	}
-	return 1;
-}
-#else
-static int gup_fast_devmap_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages,
-		int *nr)
-{
-	BUILD_BUG();
-	return 0;
-}
-
-static int gup_fast_devmap_pud_leaf(pud_t pud, pud_t *pudp, unsigned long addr,
-		unsigned long end, unsigned int flags, struct page **pages,
-		int *nr)
-{
-	BUILD_BUG();
-	return 0;
-}
-#endif
-
 static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		unsigned long end, unsigned int flags, struct page **pages,
 		int *nr)
@@ -3100,13 +2964,6 @@ static int gup_fast_pmd_leaf(pmd_t orig,
 	if (pmd_special(orig))
 		return 0;
 
-	if (pmd_devmap(orig)) {
-		if (unlikely(flags & FOLL_LONGTERM))
-			return 0;
-		return gup_fast_devmap_pmd_leaf(orig, pmdp, addr, end, flags,
-						pages, nr);
-	}
-
 	page = pmd_page(orig);
 	refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr);
 
@@ -3147,13 +3004,6 @@ static int gup_fast_pud_leaf(pud_t orig,
 	if (pud_special(orig))
 		return 0;
 
-	if (pud_devmap(orig)) {
-		if (unlikely(flags & FOLL_LONGTERM))
-			return 0;
-		return gup_fast_devmap_pud_leaf(orig, pudp, addr, end, flags,
-						pages, nr);
-	}
-
 	page = pud_page(orig);
 	refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr);
 
--- a/mm/huge_memory.c~mm-gup-remove-pxx_devmap-usage-from-get_user_pages
+++ a/mm/huge_memory.c
@@ -1672,46 +1672,6 @@ void touch_pmd(struct vm_area_struct *vm
 		update_mmu_cache_pmd(vma, addr, pmd);
 }
 
-struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, int flags, struct dev_pagemap **pgmap)
-{
-	unsigned long pfn = pmd_pfn(*pmd);
-	struct mm_struct *mm = vma->vm_mm;
-	struct page *page;
-	int ret;
-
-	assert_spin_locked(pmd_lockptr(mm, pmd));
-
-	if (flags & FOLL_WRITE && !pmd_write(*pmd))
-		return NULL;
-
-	if (pmd_present(*pmd) && pmd_devmap(*pmd))
-		/* pass */;
-	else
-		return NULL;
-
-	if (flags & FOLL_TOUCH)
-		touch_pmd(vma, addr, pmd, flags & FOLL_WRITE);
-
-	/*
-	 * device mapped pages can only be returned if the
-	 * caller will manage the page reference count.
-	 */
-	if (!(flags & (FOLL_GET | FOLL_PIN)))
-		return ERR_PTR(-EEXIST);
-
-	pfn += (addr & ~PMD_MASK) >> PAGE_SHIFT;
-	*pgmap = get_dev_pagemap(pfn, *pgmap);
-	if (!*pgmap)
-		return ERR_PTR(-EFAULT);
-	page = pfn_to_page(pfn);
-	ret = try_grab_folio(page_folio(page), 1, flags);
-	if (ret)
-		page = ERR_PTR(ret);
-
-	return page;
-}
-
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		  pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
 		  struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
_

Patches currently in -mm which might be from apopple@nvidia.com are

mm-convert-pxd_devmap-checks-to-vma_is_dax.patch
mm-filter-zone-device-pages-returned-from-folio_walk_start.patch
mm-convert-vmf_insert_mixed-from-using-pte_devmap-to-pte_special.patch
mm-remove-remaining-uses-of-pfn_dev.patch
mm-gup-remove-pxx_devmap-usage-from-get_user_pages.patch
mm-huge_memory-remove-pxd_devmap-usage-from-insert_pxd_pfn.patch
mm-remove-redundant-pxd_devmap-calls.patch
mm-khugepaged-remove-redundant-pmd_devmap-check.patch
powerpc-remove-checks-for-devmap-pages-and-pmds-puds.patch
fs-dax-remove-fs_dax_limited-config-option.patch
mm-remove-devmap-related-functions-and-page-table-bits.patch
mm-remove-pfn_map-pfn_special-pfn_sg_chain-and-pfn_sg_last.patch
mm-remove-callers-of-pfn_t-functionality.patch
mm-memremap-remove-unused-devmap_managed_key.patch
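
Illustrative note (not part of the patch): the gup_fast_pte_range() hunk above reduces the per-PTE decision to two checks, access permission and pte_special(). The userspace sketch below models only that decision. Every model_* name, the struct layout, and the flag value are hypothetical stand-ins; the kernel's pte_access_permitted() and pte_special() are architecture-specific and do more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PTE; the real pte_t is architecture-specific. */
struct model_pte {
	bool present;
	bool writable;
	bool special;	/* models pte_special(): no usable struct page */
};

#define MODEL_FOLL_WRITE 0x1u	/* illustrative value, not the kernel's flag */

/* Simplified model of pte_access_permitted(pte, write). */
static bool model_access_permitted(struct model_pte pte, bool write)
{
	return pte.present && (!write || pte.writable);
}

/*
 * Returns true if the fast path may go on to grab the folio, false if it
 * must bail out (the "goto pte_unmap" cases in the hunk).
 */
static bool model_gup_fast_pte_ok(struct model_pte pte, unsigned int flags)
{
	if (!model_access_permitted(pte, flags & MODEL_FOLL_WRITE))
		return false;
	/* With the pte_devmap() branch gone, only pte_special() is checked. */
	if (pte.special)
		return false;
	/* Not special, so the entry is expected to have a valid memmap. */
	return true;
}

/* Small self-check exercising each branch of the model. */
static void model_selftest(void)
{
	struct model_pte normal = { .present = true, .writable = true };
	struct model_pte readonly = { .present = true };
	struct model_pte special = { .present = true, .writable = true,
				     .special = true };

	assert(model_gup_fast_pte_ok(normal, MODEL_FOLL_WRITE));
	assert(model_gup_fast_pte_ok(readonly, 0));
	assert(!model_gup_fast_pte_ok(readonly, MODEL_FOLL_WRITE));
	assert(!model_gup_fast_pte_ok(special, 0));
}
```

Calling model_selftest() from a test harness runs the four cases. The model has no FOLL_LONGTERM or pgmap handling because the removed branch was the only place in this hunk that dealt with them.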