From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
    Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
    Davidlohr Bueso <dave@stgolabs.net>,
    Joonsoo Kim <iamjoonsoo.kim@lge.com>,
    Michal Hocko <mhocko@kernel.org>,
    "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Linus Torvalds <torvalds@linux-foundation.org>,
    Sasha Levin <sashal@kernel.org>,
    linux-mm@kvack.org
Subject: [PATCH AUTOSEL 5.1 012/186] hugetlbfs: on restore reserve error path retain subpool reservation
Date: Sat, 1 Jun 2019 09:13:48 -0400
Message-ID: <20190601131653.24205-12-sashal@kernel.org>
In-Reply-To: <20190601131653.24205-1-sashal@kernel.org>

From: Mike Kravetz <mike.kravetz@oracle.com>

[ Upstream commit 0919e1b69ab459e06df45d3ba6658d281962db80 ]

When a huge page is allocated, PagePrivate() is set if the allocation
consumed a reservation. When freeing a huge page, PagePrivate is checked.
If set, it indicates the reservation should be restored. PagePrivate
being set at free huge page time mostly happens on error paths.
When huge page reservations are created, a check is made to determine if
the mapping is associated with an explicitly mounted filesystem. If so,
pages are also reserved within the filesystem. The default action when
freeing a huge page is to decrement the usage count in any associated
explicitly mounted filesystem. However, if the reservation is to be
restored, the reservation/use count within the filesystem should not be
decremented. Otherwise, a subsequent page allocation and free for the same
mapping location will cause the filesystem usage to go 'negative'.

Filesystem      Size  Used Avail Use% Mounted on
nodev           4.0G -4.0M  4.1G    - /opt/hugepool

To fix, when freeing a huge page do not adjust filesystem usage if
PagePrivate() is set to indicate the reservation should be restored.
I did not cc stable as the problem has been around since reserves were
added to hugetlbfs and nobody has noticed.
Link: http://lkml.kernel.org/r/20190328234704.27083-2-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
mm/hugetlb.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5baf1f00ad427..5b4f00be325d7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1258,12 +1258,23 @@ void free_huge_page(struct page *page)
 	ClearPagePrivate(page);
 
 	/*
-	 * A return code of zero implies that the subpool will be under its
-	 * minimum size if the reservation is not restored after page is free.
-	 * Therefore, force restore_reserve operation.
+	 * If PagePrivate() was set on page, page allocation consumed a
+	 * reservation.  If the page was associated with a subpool, there
+	 * would have been a page reserved in the subpool before allocation
+	 * via hugepage_subpool_get_pages().  Since we are 'restoring' the
+	 * reservtion, do not call hugepage_subpool_put_pages() as this will
+	 * remove the reserved page from the subpool.
 	 */
-	if (hugepage_subpool_put_pages(spool, 1) == 0)
-		restore_reserve = true;
+	if (!restore_reserve) {
+		/*
+		 * A return code of zero implies that the subpool will be
+		 * under its minimum size if the reservation is not restored
+		 * after page is free.  Therefore, force restore_reserve
+		 * operation.
+		 */
+		if (hugepage_subpool_put_pages(spool, 1) == 0)
+			restore_reserve = true;
+	}
 
 	spin_lock(&hugetlb_lock);
 	clear_page_huge_active(page);
--
2.20.1