From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Hugh Dickins <hughd@google.com>,
Zeal Robot <zealci@zte.com.cn>, wangyong <wang.yong12@zte.com.cn>,
Mike Kravetz <mike.kravetz@oracle.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
CGEL ZTE <cgel.zte@gmail.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Song Liu <songliubraving@fb.com>,
Yang Yang <yang.yang29@zte.com.cn>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.19 48/51] memfd: fix F_SEAL_WRITE after shmem huge page allocated
Date: Mon, 7 Mar 2022 10:19:23 +0100
Message-ID: <20220307091638.357126249@linuxfoundation.org>
In-Reply-To: <20220307091636.988950823@linuxfoundation.org>

From: Hugh Dickins <hughd@google.com>

commit f2b277c4d1c63a85127e8aa2588e9cc3bd21cb99 upstream.

Wangyong reports: after enabling tmpfs filesystem to support transparent
hugepage with the following command:

  echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled

the docker program tries to add F_SEAL_WRITE through the following
command, but it fails unexpectedly with errno EBUSY:

  fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1.

That is because memfd_tag_pins() and memfd_wait_for_pins() were never
updated for shmem huge pages: checking page_mapcount() against
page_count() is hopeless on THP subpages - they need to check
total_mapcount() against page_count() on THP heads only.

Make memfd_tag_pins() (compared > 1) as strict as memfd_wait_for_pins()
(compared != 1): either can be justified, but given the non-atomic
total_mapcount() calculation, it is better now to be strict.  Bear in
mind that total_mapcount() itself scans all of the THP subpages, when
choosing to take an XA_CHECK_SCHED latency break.

Also fix the unlikely xa_is_value() case in memfd_wait_for_pins(): if a
page has been swapped out since memfd_tag_pins(), then its refcount must
have fallen, and so it can safely be untagged.

Link: https://lkml.kernel.org/r/a4f79248-df75-2c8c-3df-ba3317ccb5da@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reported-by: wangyong <wang.yong12@zte.com.cn>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: CGEL ZTE <cgel.zte@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
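Note for testers (not part of the commit): the failing call in the report
can be approximated from userspace with the sketch below. The helper name
try_seal_write() and the memfd name are made up for illustration. It
assumes glibc 2.27 or later for memfd_create(), and it assumes that the
write lands in a shmem huge page when shmem_enabled is set to "always";
it has not been run against this tree.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: create a sealable memfd,
 * populate it through pwrite() so that no mapping is left behind, then
 * try to add F_SEAL_WRITE.
 *
 * Returns 0 on success, or the errno of the step that failed.
 * On success *seals receives the F_GET_SEALS result.
 */
static int try_seal_write(off_t len, int *seals)
{
	char buf[4096] = { 0 };
	int err = 0;
	int fd;

	fd = memfd_create("seal-write-test", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return errno;

	/* Size the file first so that a huge page can back the write. */
	if (ftruncate(fd, len) < 0 || pwrite(fd, buf, sizeof(buf), 0) < 0) {
		err = errno;
		goto out;
	}

	/* This is the call that the report saw failing with EBUSY. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0) {
		err = errno;
		goto out;
	}

	*seals = fcntl(fd, F_GET_SEALS);
	if (*seals < 0)
		err = errno;
out:
	close(fd);
	return err;
}
```

Calling try_seal_write(4 << 20, &seals) is expected to return 0 on a kernel
with this patch; per the report above, an unpatched kernel with shmem huge
pages enabled may return EBUSY instead.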
mm/memfd.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
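
Reading aid (not part of the commit): the arithmetic behind the old and new
pin checks can be modelled in plain userspace C as below. All names are
hypothetical, and MODEL_HPAGE_PMD_NR = 512 assumes 4 KiB base pages with
2 MiB PMD-sized huge pages; the kernel takes the real value from
HPAGE_PMD_NR. The model only mirrors the comparisons in the diff that
follows; it says nothing about locking or the non-atomic mapcount scan.

```c
#include <stdbool.h>

/* Assumed value: 2 MiB huge page / 4 KiB base page. */
#define MODEL_HPAGE_PMD_NR 512

/* cache_count in the patch: 1 for a small page, HPAGE_PMD_NR for a THP head. */
static int model_cache_count(bool is_thp_head)
{
	return is_thp_head ? MODEL_HPAGE_PMD_NR : 1;
}

/* Old memfd_tag_pins() test: anything beyond one unmapped reference is a pin. */
static bool old_looks_pinned(int page_count, int mapcount)
{
	return page_count - mapcount > 1;
}

/* New test, used by both functions: unmapped references must equal cache_count. */
static bool new_looks_pinned(bool is_thp_head, int page_count, int total_mapcount)
{
	return model_cache_count(is_thp_head) != page_count - total_mapcount;
}
```

With these definitions, an idle THP head whose refcount is 512 and whose
total mapcount is 0 is flagged by old_looks_pinned() but not by
new_looks_pinned(), which matches the spurious EBUSY described in the
commit message.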

--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -34,26 +34,35 @@ static void memfd_tag_pins(struct addres
 	void __rcu **slot;
 	pgoff_t start;
 	struct page *page;
-	unsigned int tagged = 0;
+	int latency = 0;
+	int cache_count;
 
 	lru_add_drain();
 	start = 0;
 
 	xa_lock_irq(&mapping->i_pages);
 	radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
+		cache_count = 1;
 		page = radix_tree_deref_slot_protected(slot, &mapping->i_pages.xa_lock);
-		if (!page || radix_tree_exception(page)) {
+		if (!page || radix_tree_exception(page) || PageTail(page)) {
 			if (radix_tree_deref_retry(page)) {
 				slot = radix_tree_iter_retry(&iter);
 				continue;
 			}
-		} else if (page_count(page) - page_mapcount(page) > 1) {
-			radix_tree_tag_set(&mapping->i_pages, iter.index,
-					   MEMFD_TAG_PINNED);
+		} else {
+			if (PageTransHuge(page) && !PageHuge(page))
+				cache_count = HPAGE_PMD_NR;
+			if (cache_count !=
+			    page_count(page) - total_mapcount(page)) {
+				radix_tree_tag_set(&mapping->i_pages,
+						iter.index, MEMFD_TAG_PINNED);
+			}
 		}
 
-		if (++tagged % 1024)
+		latency += cache_count;
+		if (latency < 1024)
 			continue;
+		latency = 0;
 
 		slot = radix_tree_iter_resume(slot, &iter);
 		xa_unlock_irq(&mapping->i_pages);
@@ -79,6 +88,7 @@ static int memfd_wait_for_pins(struct ad
 	pgoff_t start;
 	struct page *page;
 	int error, scan;
+	int cache_count;
 
 	memfd_tag_pins(mapping);
 
@@ -107,8 +117,12 @@ static int memfd_wait_for_pins(struct ad
 				page = NULL;
 			}
 
-			if (page &&
-			    page_count(page) - page_mapcount(page) != 1) {
+			cache_count = 1;
+			if (page && PageTransHuge(page) && !PageHuge(page))
+				cache_count = HPAGE_PMD_NR;
+
+			if (page && cache_count !=
+			    page_count(page) - total_mapcount(page)) {
 				if (scan < LAST_SCAN)
 					goto continue_resched;
 