From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7092BC63703 for ; Wed, 30 Nov 2022 22:51:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229476AbiK3Wv4 (ORCPT ); Wed, 30 Nov 2022 17:51:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229636AbiK3Wvk (ORCPT ); Wed, 30 Nov 2022 17:51:40 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B945D8DFED; Wed, 30 Nov 2022 14:51:39 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7B386B81D12; Wed, 30 Nov 2022 22:51:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2F125C433B5; Wed, 30 Nov 2022 22:51:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1669848697; bh=RHKDo60t7ddHZ7Rgx+lnsiwytiyEbA8DGSZ7lDUuxGA=; h=Date:To:From:Subject:From; b=wKkTD5irUweNMK+pLDRiXbZKVhhnC5nzFclmO4MDTqURO8RvjMPC0qnIdtkjdHRBh XX9Qs+dW7IbEh34ppmc8QGiyj0d2qqo/fWVUhveIUqP6ll4vTDUnQPj592pTKrgMSm Q/JzAyDbyP/luTdcAzJM49pgeebqmq3ruHerNfzg= Date: Wed, 30 Nov 2022 14:51:36 -0800 To: mm-commits@vger.kernel.org, willy@infradead.org, vbabka@suse.cz, stable@vger.kernel.org, riel@surriel.com, peterx@redhat.com, naoya.horiguchi@linux.dev, nadav.amit@gmail.com, harperchen1110@gmail.com, david@redhat.com, axelrasmussen@google.com, almasrymina@google.com, mike.kravetz@oracle.com, akpm@linux-foundation.org From: Andrew Morton Subject: [merged mm-hotfixes-stable] hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing.patch removed from -mm tree Message-Id: <20221130225137.2F125C433B5@smtp.kernel.org> Precedence: bulk Reply-To: linux-kernel@vger.kernel.org List-ID: X-Mailing-List: mm-commits@vger.kernel.org The quilt patch titled Subject: hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing has been removed from the -mm tree. Its filename was hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Mike Kravetz Subject: hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing Date: Mon, 14 Nov 2022 15:55:06 -0800 madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page tables associated with the address range. For hugetlb vmas, zap_page_range will call __unmap_hugepage_range_final. However, __unmap_hugepage_range_final assumes the passed vma is about to be removed and deletes the vma_lock to prevent pmd sharing as the vma is on the way out. In the case of madvise(MADV_DONTNEED) the vma remains, but the missing vma_lock prevents pmd sharing and could potentially lead to issues with truncation/fault races. This issue was originally reported here [1] as a BUG triggered in page_try_dup_anon_rmap. Prior to the introduction of the hugetlb vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to prevent pmd sharing. Subsequent faults on this vma were confused as VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was not set in new pages added to the page table. This resulted in pages that appeared anonymous in a VM_SHARED vma and triggered the BUG. Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap call from unmap_vmas(). This is used to indicate the 'final' unmapping of a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and the vm_lock is not deleted. [1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20221114235507.294320-3-mike.kravetz@oracle.com Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings") Signed-off-by: Mike Kravetz Reported-by: Wei Chen Cc: Axel Rasmussen Cc: David Hildenbrand Cc: Matthew Wilcox Cc: Mina Almasry Cc: Nadav Amit Cc: Naoya Horiguchi Cc: Peter Xu Cc: Rik van Riel Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- include/linux/mm.h | 2 ++ mm/hugetlb.c | 27 ++++++++++++++++----------- mm/memory.c | 2 +- 3 files changed, 19 insertions(+), 12 deletions(-) --- a/include/linux/mm.h~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing +++ a/include/linux/mm.h @@ -1868,6 +1868,8 @@ struct zap_details { * default, the flag is not set. */ #define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0)) +/* Set in unmap_vmas() to indicate a final unmap call. Only used by hugetlb */ +#define ZAP_FLAG_UNMAP ((__force zap_flags_t) BIT(1)) #ifdef CONFIG_MMU extern bool can_do_mlock(void); --- a/mm/hugetlb.c~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing +++ a/mm/hugetlb.c @@ -5206,17 +5206,22 @@ void __unmap_hugepage_range_final(struct __unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags); - /* - * Unlock and free the vma lock before releasing i_mmap_rwsem. When - * the vma_lock is freed, this makes the vma ineligible for pmd - * sharing. And, i_mmap_rwsem is required to set up pmd sharing. - * This is important as page tables for this unmapped range will - * be asynchrously deleted. If the page tables are shared, there - * will be issues when accessed by someone else. - */ - __hugetlb_vma_unlock_write_free(vma); - - i_mmap_unlock_write(vma->vm_file->f_mapping); + if (zap_flags & ZAP_FLAG_UNMAP) { /* final unmap */ + /* + * Unlock and free the vma lock before releasing i_mmap_rwsem. + * When the vma_lock is freed, this makes the vma ineligible + * for pmd sharing. And, i_mmap_rwsem is required to set up + * pmd sharing. This is important as page tables for this + * unmapped range will be asynchrously deleted. If the page + * tables are shared, there will be issues when accessed by + * someone else. + */ + __hugetlb_vma_unlock_write_free(vma); + i_mmap_unlock_write(vma->vm_file->f_mapping); + } else { + i_mmap_unlock_write(vma->vm_file->f_mapping); + hugetlb_vma_unlock_write(vma); + } } void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, --- a/mm/memory.c~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing +++ a/mm/memory.c @@ -1711,7 +1711,7 @@ void unmap_vmas(struct mmu_gather *tlb, { struct mmu_notifier_range range; struct zap_details details = { - .zap_flags = ZAP_FLAG_DROP_MARKER, + .zap_flags = ZAP_FLAG_DROP_MARKER | ZAP_FLAG_UNMAP, /* Careful - we need to zap private pages too! */ .even_cows = true, }; _ Patches currently in -mm which might be from mike.kravetz@oracle.com are selftests-vm-update-hugetlb-madvise.patch hugetlb-remove-duplicate-mmu-notifications.patch