From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 12 May 2026 12:36:43 -0700
To: 
mm-commits@vger.kernel.org,stable@vger.kernel.org,pfalcato@suse.de,osalvador@suse.de,muchun.song@linux.dev,liam@infradead.org,david@kernel.org,25181214217@stu.xidian.edu.cn,ljs@kernel.org,akpm@linux-foundation.org
From: Andrew Morton
Subject: + revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare.patch added to mm-hotfixes-unstable branch
Message-Id: <20260512193643.BE149C2BCC7@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
List-Id: 
List-Subscribe: 
List-Unsubscribe: 

The patch titled
     Subject: Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare"
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare.patch

This patch will later appear in the mm-hotfixes-unstable branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via various branches at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days

------------------------------------------------------
From: Lorenzo Stoakes
Subject: Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare"
Date: Tue, 12 May 2026 17:06:43 +0100

This reverts commit ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use
mmap_prepare").
The patch incorrectly handled hugetlb VMA lock allocation at the
mmap_prepare stage: a failed allocation occurring after mmap_prepare has
been called can result in the lock leaking.  There is no risk of a merge
causing a similar issue, as VMA_DONTEXPAND_BIT is set for hugetlb
mappings.

As a first step in addressing this, simply revert the change so the
approach can be reworked once the underlying issues have been corrected.
We retain the VMA flags changes as best we can, accounting for the fact
that we were previously working with a VMA descriptor, and propagate
like-for-like changes.

Note that we invoke vma_set_flags() and do not call vma_start_write() as
vm_flags_set() does.  This is OK, as it is done in an .mmap hook where the
VMA is not yet linked into the tree, so nobody else can access it.

Link: https://lore.kernel.org/20260512160643.266960-1-ljs@kernel.org
Fixes: ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare")
Signed-off-by: Lorenzo Stoakes
Reported-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Closes: https://lore.kernel.org/linux-mm/20260425070700.562229-1-25181214217@stu.xidian.edu.cn/
Cc: David Hildenbrand
Cc: Liam R. Howlett
Cc: Muchun Song
Cc: Oscar Salvador
Cc: Pedro Falcato
Cc: 
Signed-off-by: Andrew Morton
---

 fs/hugetlbfs/inode.c           | 46 +++++---------------
 include/linux/hugetlb.h        |  8 ---
 include/linux/hugetlb_inline.h | 14 ------
 mm/hugetlb.c                   | 71 ++++++++++++-------------
 4 files changed, 45 insertions(+), 94 deletions(-)

--- a/fs/hugetlbfs/inode.c~revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare +++ a/fs/hugetlbfs/inode.c @@ -96,15 +96,8 @@ static const struct fs_parameter_spec hu #define PGOFF_LOFFT_MAX \ (((1UL << (PAGE_SHIFT + 1)) - 1) << (BITS_PER_LONG - (PAGE_SHIFT + 1))) -static int hugetlb_file_mmap_prepare_success(const struct vm_area_struct *vma) +static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma) { - /* Unfortunate we have to reassign vma->vm_private_data.
*/ - return hugetlb_vma_lock_alloc((struct vm_area_struct *)vma); -} - -static int hugetlbfs_file_mmap_prepare(struct vm_area_desc *desc) -{ - struct file *file = desc->file; struct inode *inode = file_inode(file); loff_t len, vma_len; int ret; @@ -119,8 +112,8 @@ static int hugetlbfs_file_mmap_prepare(s * way when do_mmap unwinds (may be important on powerpc * and ia64). */ - vma_desc_set_flags(desc, VMA_HUGETLB_BIT, VMA_DONTEXPAND_BIT); - desc->vm_ops = &hugetlb_vm_ops; + vma_set_flags(vma, VMA_HUGETLB_BIT, VMA_DONTEXPAND_BIT); + vma->vm_ops = &hugetlb_vm_ops; /* * page based offset in vm_pgoff could be sufficiently large to @@ -129,16 +122,16 @@ static int hugetlbfs_file_mmap_prepare(s * sizeof(unsigned long). So, only check in those instances. */ if (sizeof(unsigned long) == sizeof(loff_t)) { - if (desc->pgoff & PGOFF_LOFFT_MAX) + if (vma->vm_pgoff & PGOFF_LOFFT_MAX) return -EINVAL; } /* must be huge page aligned */ - if (desc->pgoff & (~huge_page_mask(h) >> PAGE_SHIFT)) + if (vma->vm_pgoff & (~huge_page_mask(h) >> PAGE_SHIFT)) return -EINVAL; - vma_len = (loff_t)vma_desc_size(desc); - len = vma_len + ((loff_t)desc->pgoff << PAGE_SHIFT); + vma_len = (loff_t)(vma->vm_end - vma->vm_start); + len = vma_len + ((loff_t)vma->vm_pgoff << PAGE_SHIFT); /* check for overflow */ if (len < vma_len) return -EINVAL; @@ -148,7 +141,7 @@ static int hugetlbfs_file_mmap_prepare(s ret = -ENOMEM; - vma_flags = desc->vma_flags; + vma_flags = vma->flags; /* * for SHM_HUGETLB, the pages are reserved in the shmget() call so skip * reserving here. 
Note: only for SHM hugetlbfs file, the inode @@ -158,30 +151,17 @@ static int hugetlbfs_file_mmap_prepare(s vma_flags_set(&vma_flags, VMA_NORESERVE_BIT); if (hugetlb_reserve_pages(inode, - desc->pgoff >> huge_page_order(h), - len >> huge_page_shift(h), desc, - vma_flags) < 0) + vma->vm_pgoff >> huge_page_order(h), + len >> huge_page_shift(h), vma, + vma_flags) < 0) goto out; ret = 0; - if (vma_desc_test(desc, VMA_WRITE_BIT) && inode->i_size < len) + if (vma_test(vma, VMA_WRITE_BIT) && inode->i_size < len) i_size_write(inode, len); out: inode_unlock(inode); - if (!ret) { - /* Allocate the VMA lock after we set it up. */ - desc->action.success_hook = hugetlb_file_mmap_prepare_success; - /* - * We cannot permit the rmap finding this VMA in the time - * between the VMA being inserted into the VMA tree and the - * completion/success hook being invoked. - * - * This is because we establish a per-VMA hugetlb lock which can - * be raced by rmap. - */ - desc->action.hide_from_rmap_until_complete = true; - } return ret; } @@ -1227,7 +1207,7 @@ static void init_once(void *foo) static const struct file_operations hugetlbfs_file_operations = { .read_iter = hugetlbfs_read_iter, - .mmap_prepare = hugetlbfs_file_mmap_prepare, + .mmap = hugetlbfs_file_mmap, .fsync = noop_fsync, .get_unmapped_area = hugetlb_get_unmapped_area, .llseek = default_llseek, --- a/include/linux/hugetlb.h~revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare +++ a/include/linux/hugetlb.h @@ -148,7 +148,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_ struct folio **foliop); #endif /* CONFIG_USERFAULTFD */ long hugetlb_reserve_pages(struct inode *inode, long from, long to, - struct vm_area_desc *desc, vma_flags_t vma_flags); + struct vm_area_struct *vma, vma_flags_t vma_flags); long hugetlb_unreserve_pages(struct inode *inode, long start, long end, long freed); bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list); @@ -276,7 +276,6 @@ long hugetlb_change_protection(struct vm void 
hugetlb_unshare_all_pmds(struct vm_area_struct *vma); void fixup_hugetlb_reservations(struct vm_area_struct *vma); void hugetlb_split(struct vm_area_struct *vma, unsigned long addr); -int hugetlb_vma_lock_alloc(struct vm_area_struct *vma); unsigned int arch_hugetlb_cma_order(void); @@ -469,11 +468,6 @@ static inline void fixup_hugetlb_reserva static inline void hugetlb_split(struct vm_area_struct *vma, unsigned long addr) {} -static inline int hugetlb_vma_lock_alloc(struct vm_area_struct *vma) -{ - return 0; -} - #endif /* !CONFIG_HUGETLB_PAGE */ #ifndef pgd_write --- a/include/linux/hugetlb_inline.h~revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare +++ a/include/linux/hugetlb_inline.h @@ -6,23 +6,13 @@ #ifdef CONFIG_HUGETLB_PAGE -static inline bool is_vm_hugetlb_flags(vm_flags_t vm_flags) -{ - return !!(vm_flags & VM_HUGETLB); -} - static inline bool is_vma_hugetlb_flags(const vma_flags_t *flags) { - return vma_flags_test_any(flags, VMA_HUGETLB_BIT); + return vma_flags_test(flags, VMA_HUGETLB_BIT); } #else -static inline bool is_vm_hugetlb_flags(vm_flags_t vm_flags) -{ - return false; -} - static inline bool is_vma_hugetlb_flags(const vma_flags_t *flags) { return false; @@ -32,7 +22,7 @@ static inline bool is_vma_hugetlb_flags( static inline bool is_vm_hugetlb_page(const struct vm_area_struct *vma) { - return is_vm_hugetlb_flags(vma->vm_flags); + return is_vma_hugetlb_flags(&vma->flags); } #endif --- a/mm/hugetlb.c~revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare +++ a/mm/hugetlb.c @@ -116,6 +116,7 @@ struct mutex *hugetlb_fault_mutex_table /* Forward declaration */ static int hugetlb_acct_memory(struct hstate *h, long delta); static void hugetlb_vma_lock_free(struct vm_area_struct *vma); +static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma); static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma); static void hugetlb_unshare_pmds(struct vm_area_struct *vma, unsigned long start, unsigned long end, bool take_locks); @@ 
-413,21 +414,17 @@ static void hugetlb_vma_lock_free(struct } } -/* - * vma specific semaphore used for pmd sharing and fault/truncation - * synchronization - */ -int hugetlb_vma_lock_alloc(struct vm_area_struct *vma) +static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma) { struct hugetlb_vma_lock *vma_lock; /* Only establish in (flags) sharable vmas */ if (!vma || !(vma->vm_flags & VM_MAYSHARE)) - return 0; + return; /* Should never get here with non-NULL vm_private_data */ if (vma->vm_private_data) - return -EINVAL; + return; vma_lock = kmalloc_obj(*vma_lock); if (!vma_lock) { @@ -442,15 +439,13 @@ int hugetlb_vma_lock_alloc(struct vm_are * allocation failure. */ pr_warn_once("HugeTLB: unable to allocate vma specific lock\n"); - return -EINVAL; + return; } kref_init(&vma_lock->refs); init_rwsem(&vma_lock->rw_sema); vma_lock->vma = vma; vma->vm_private_data = vma_lock; - - return 0; } /* Helper that removes a struct file_region from the resv_map cache and returns @@ -1147,28 +1142,20 @@ static struct resv_map *vma_resv_map(str } } -static void set_vma_resv_flags(struct vm_area_struct *vma, unsigned long flags) +static void set_vma_resv_map(struct vm_area_struct *vma, struct resv_map *map) { VM_WARN_ON_ONCE_VMA(!is_vm_hugetlb_page(vma), vma); - VM_WARN_ON_ONCE_VMA(vma->vm_flags & VM_MAYSHARE, vma); + VM_WARN_ON_ONCE_VMA(vma_test(vma, VMA_MAYSHARE_BIT), vma); - set_vma_private_data(vma, get_vma_private_data(vma) | flags); + set_vma_private_data(vma, (unsigned long)map); } -static void set_vma_desc_resv_map(struct vm_area_desc *desc, struct resv_map *map) -{ - VM_WARN_ON_ONCE(!is_vma_hugetlb_flags(&desc->vma_flags)); - VM_WARN_ON_ONCE(vma_desc_test(desc, VMA_MAYSHARE_BIT)); - - desc->private_data = map; -} - -static void set_vma_desc_resv_flags(struct vm_area_desc *desc, unsigned long flags) +static void set_vma_resv_flags(struct vm_area_struct *vma, unsigned long flags) { - VM_WARN_ON_ONCE(!is_vma_hugetlb_flags(&desc->vma_flags)); - 
VM_WARN_ON_ONCE(vma_desc_test(desc, VMA_MAYSHARE_BIT)); + VM_WARN_ON_ONCE_VMA(!is_vm_hugetlb_page(vma), vma); + VM_WARN_ON_ONCE_VMA(vma_test(vma, VMA_MAYSHARE_BIT), vma); - desc->private_data = (void *)((unsigned long)desc->private_data | flags); + set_vma_private_data(vma, get_vma_private_data(vma) | flags); } static int is_vma_resv_set(struct vm_area_struct *vma, unsigned long flag) @@ -1178,13 +1165,6 @@ static int is_vma_resv_set(struct vm_are return (get_vma_private_data(vma) & flag) != 0; } -static bool is_vma_desc_resv_set(struct vm_area_desc *desc, unsigned long flag) -{ - VM_WARN_ON_ONCE(!is_vma_hugetlb_flags(&desc->vma_flags)); - - return ((unsigned long)desc->private_data) & flag; -} - bool __vma_private_lock(struct vm_area_struct *vma) { return !(vma->vm_flags & VM_MAYSHARE) && @@ -6554,7 +6534,7 @@ next: long hugetlb_reserve_pages(struct inode *inode, long from, long to, - struct vm_area_desc *desc, + struct vm_area_struct *vma, vma_flags_t vma_flags) { long chg = -1, add = -1, spool_resv, gbl_resv; @@ -6572,6 +6552,12 @@ long hugetlb_reserve_pages(struct inode } /* + * vma specific semaphore used for pmd sharing and fault/truncation + * synchronization + */ + hugetlb_vma_lock_alloc(vma); + + /* * Only apply hugepage reservation if asked. At fault time, an * attempt will be made for VM_NORESERVE to allocate a page * without using reserves @@ -6583,9 +6569,9 @@ long hugetlb_reserve_pages(struct inode * Shared mappings base their reservation on the number of pages that * are already allocated on behalf of the file. Private mappings need * to reserve the full area even if read-only as mprotect() may be - * called to make the mapping read-write. Assume !desc is a shm mapping + * called to make the mapping read-write. 
Assume !vma is a shm mapping */ - if (!desc || vma_desc_test(desc, VMA_MAYSHARE_BIT)) { + if (!vma || vma_test(vma, VMA_MAYSHARE_BIT)) { /* * resv_map can not be NULL as hugetlb_reserve_pages is only * called for inodes for which resv_maps were created (see @@ -6604,8 +6590,8 @@ long hugetlb_reserve_pages(struct inode chg = to - from; - set_vma_desc_resv_map(desc, resv_map); - set_vma_desc_resv_flags(desc, HPAGE_RESV_OWNER); + set_vma_resv_map(vma, resv_map); + set_vma_resv_flags(vma, HPAGE_RESV_OWNER); } if (chg < 0) { @@ -6619,7 +6605,7 @@ long hugetlb_reserve_pages(struct inode if (err < 0) goto out_err; - if (desc && !vma_desc_test(desc, VMA_MAYSHARE_BIT) && h_cg) { + if (vma && !vma_test(vma, VMA_MAYSHARE_BIT) && h_cg) { /* For private mappings, the hugetlb_cgroup uncharge info hangs * of the resv_map. */ @@ -6656,7 +6642,7 @@ long hugetlb_reserve_pages(struct inode * consumed reservations are stored in the map. Hence, nothing * else has to be done for private mappings here */ - if (!desc || vma_desc_test(desc, VMA_MAYSHARE_BIT)) { + if (!vma || vma_test(vma, VMA_MAYSHARE_BIT)) { add = region_add(resv_map, from, to, regions_needed, h, h_cg); if (unlikely(add < 0)) { @@ -6720,15 +6706,16 @@ out_uncharge_cgroup: hugetlb_cgroup_uncharge_cgroup_rsvd(hstate_index(h), chg * pages_per_huge_page(h), h_cg); out_err: - if (!desc || vma_desc_test(desc, VMA_MAYSHARE_BIT)) + hugetlb_vma_lock_free(vma); + if (!vma || vma_test(vma, VMA_MAYSHARE_BIT)) /* Only call region_abort if the region_chg succeeded but the * region_add failed or didn't run. 
*/ if (chg >= 0 && add < 0) region_abort(resv_map, from, to, regions_needed); - if (desc && is_vma_desc_resv_set(desc, HPAGE_RESV_OWNER)) { + if (vma && is_vma_resv_set(vma, HPAGE_RESV_OWNER)) { kref_put(&resv_map->refs, resv_map_release); - set_vma_desc_resv_map(desc, NULL); + set_vma_resv_map(vma, NULL); } return err; } _ Patches currently in -mm which might be from ljs@kernel.org are revert-mm-hugetlbfs-update-hugetlbfs-to-use-mmap_prepare.patch
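[Editorial aside, not part of the patch above: the unwind-ordering problem the
revert addresses can be sketched in plain C.  Everything below is a
hypothetical simplification -- struct vma, lock_alloc(), lock_free(),
reserve_pages() and fail_later are illustrative stand-ins, not kernel APIs.]

```c
/*
 * Illustrative sketch only -- NOT kernel code.  It shows the ordering the
 * revert restores: the per-VMA lock is allocated inside the reservation
 * routine, whose single error path also frees it, rather than in a
 * post-insertion success hook where a later failure would leak it.
 */
#include <stdlib.h>
#include <stdbool.h>

struct vma {
	void *private_data;	/* stand-in for vma->vm_private_data */
};

/* stand-in for hugetlb_vma_lock_alloc(): best effort, may silently fail */
static void lock_alloc(struct vma *v)
{
	if (!v->private_data)
		v->private_data = malloc(sizeof(int));
}

/* stand-in for hugetlb_vma_lock_free() */
static void lock_free(struct vma *v)
{
	free(v->private_data);
	v->private_data = NULL;
}

/*
 * Post-revert shape of the reservation path: allocate the lock up front,
 * and unwind it on *any* later failure via the one out_err path.
 */
static long reserve_pages(struct vma *v, bool fail_later)
{
	lock_alloc(v);

	if (fail_later)		/* e.g. a later reservation/quota failure */
		goto out_err;

	return 0;

out_err:
	lock_free(v);		/* nothing leaks: single unwind path */
	return -1;
}
```

Keeping allocation and cleanup in the same function means one error label
covers the lock, which is why the revert moves hugetlb_vma_lock_alloc()
back into hugetlb_reserve_pages() instead of running it from a success hook
after the VMA is already set up.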