From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0D66C433FE for ; Sun, 6 Mar 2022 22:09:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234308AbiCFWKA (ORCPT ); Sun, 6 Mar 2022 17:10:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234317AbiCFWJs (ORCPT ); Sun, 6 Mar 2022 17:09:48 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D125222BE8; Sun, 6 Mar 2022 14:08:55 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6550F6100F; Sun, 6 Mar 2022 22:08:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BAD2FC340EC; Sun, 6 Mar 2022 22:08:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1646604534; bh=EN3v7kAtGDRHVT+xbUNTo+5FjMxpSfXu6QUSGlHTGnk=; h=Date:To:From:Subject:From; b=09/PaImFInWHDMRgAdcCtu+M/GF8WIL+ypAktVmoNldgp+S4iow/IUepOaVYRAqwZ BIJVCzFckW59guI/bTu7EX1HUWBM992Ij2kFjfQLFELWOEvUdEs9Stq1U/Fvg98tCR tN3znObkGqDJgmvbeDCevTjqif4SmXL43iVgqQ6E= Date: Sun, 06 Mar 2022 14:08:54 -0800 To: mm-commits@vger.kernel.org, zealci@zte.com.cn, yang.yang29@zte.com.cn, willy@infradead.org, wang.yong12@zte.com.cn, stable@vger.kernel.org, songliubraving@fb.com, mike.kravetz@oracle.com, kirill@shutemov.name, cgel.zte@gmail.com, hughd@google.com, akpm@linux-foundation.org From: Andrew Morton Subject: [merged] memfd-fix-f_seal_write-after-shmem-huge-page-allocated.patch removed from -mm tree Message-Id: <20220306220854.BAD2FC340EC@smtp.kernel.org> Precedence: bulk Reply-To: linux-kernel@vger.kernel.org List-ID: X-Mailing-List: mm-commits@vger.kernel.org The patch titled Subject: memfd: fix F_SEAL_WRITE after shmem huge page allocated has been removed from the -mm tree. Its filename was memfd-fix-f_seal_write-after-shmem-huge-page-allocated.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Hugh Dickins Subject: memfd: fix F_SEAL_WRITE after shmem huge page allocated Wangyong reports: after enabling tmpfs filesystem to support transparent hugepage with the following command: echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled the docker program tries to add F_SEAL_WRITE through the following command, but it fails unexpectedly with errno EBUSY: fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1. That is because memfd_tag_pins() and memfd_wait_for_pins() were never updated for shmem huge pages: checking page_mapcount() against page_count() is hopeless on THP subpages - they need to check total_mapcount() against page_count() on THP heads only. Make memfd_tag_pins() (compared > 1) as strict as memfd_wait_for_pins() (compared != 1): either can be justified, but given the non-atomic total_mapcount() calculation, it is better now to be strict. Bear in mind that total_mapcount() itself scans all of the THP subpages, when choosing to take an XA_CHECK_SCHED latency break. Also fix the unlikely xa_is_value() case in memfd_wait_for_pins(): if a page has been swapped out since memfd_tag_pins(), then its refcount must have fallen, and so it can safely be untagged. Link: https://lkml.kernel.org/r/a4f79248-df75-2c8c-3df-ba3317ccb5da@google.com Signed-off-by: Hugh Dickins Reported-by: Zeal Robot Reported-by: wangyong Cc: Mike Kravetz Cc: Matthew Wilcox (Oracle) Cc: CGEL ZTE Cc: Kirill A. Shutemov Cc: Song Liu Cc: Yang Yang Cc: Signed-off-by: Andrew Morton --- mm/memfd.c | 40 ++++++++++++++++++++++++++++------------ 1 file changed, 28 insertions(+), 12 deletions(-) --- a/mm/memfd.c~memfd-fix-f_seal_write-after-shmem-huge-page-allocated +++ a/mm/memfd.c @@ -31,20 +31,28 @@ static void memfd_tag_pins(struct xa_state *xas) { struct page *page; - unsigned int tagged = 0; + int latency = 0; + int cache_count; lru_add_drain(); xas_lock_irq(xas); xas_for_each(xas, page, ULONG_MAX) { - if (xa_is_value(page)) - continue; - page = find_subpage(page, xas->xa_index); - if (page_count(page) - page_mapcount(page) > 1) + cache_count = 1; + if (!xa_is_value(page) && + PageTransHuge(page) && !PageHuge(page)) + cache_count = HPAGE_PMD_NR; + + if (!xa_is_value(page) && + page_count(page) - total_mapcount(page) != cache_count) xas_set_mark(xas, MEMFD_TAG_PINNED); + if (cache_count != 1) + xas_set(xas, page->index + cache_count); - if (++tagged % XA_CHECK_SCHED) + latency += cache_count; + if (latency < XA_CHECK_SCHED) continue; + latency = 0; xas_pause(xas); xas_unlock_irq(xas); @@ -73,7 +81,8 @@ static int memfd_wait_for_pins(struct ad error = 0; for (scan = 0; scan <= LAST_SCAN; scan++) { - unsigned int tagged = 0; + int latency = 0; + int cache_count; if (!xas_marked(&xas, MEMFD_TAG_PINNED)) break; @@ -87,10 +96,14 @@ static int memfd_wait_for_pins(struct ad xas_lock_irq(&xas); xas_for_each_marked(&xas, page, ULONG_MAX, MEMFD_TAG_PINNED) { bool clear = true; - if (xa_is_value(page)) - continue; - page = find_subpage(page, xas.xa_index); - if (page_count(page) - page_mapcount(page) != 1) { + + cache_count = 1; + if (!xa_is_value(page) && + PageTransHuge(page) && !PageHuge(page)) + cache_count = HPAGE_PMD_NR; + + if (!xa_is_value(page) && cache_count != + page_count(page) - total_mapcount(page)) { /* * On the last scan, we clean up all those tags * we inserted; but make a note that we still @@ -103,8 +116,11 @@ static int memfd_wait_for_pins(struct ad } if (clear) xas_clear_mark(&xas, MEMFD_TAG_PINNED); - if (++tagged % XA_CHECK_SCHED) + + latency += cache_count; + if (latency < XA_CHECK_SCHED) continue; + latency = 0; xas_pause(&xas); xas_unlock_irq(&xas); _ Patches currently in -mm which might be from hughd@google.com are tmpfs-support-for-file-creation-time-fix.patch mm-_install_special_mapping-apply-vm_locked_clear_mask.patch mm-thp-refix-__split_huge_pmd_locked-for-migration-pmd.patch mm-thp-clearpagedoublemap-in-first-page_add_file_rmap.patch mm-thp-fix-nr_file_mapped-accounting-in-page__file_rmap.patch