From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Andrey Ryabinin <aryabinin@virtuozzo.com>,
Shakeel Butt <shakeelb@google.com>, Michal Hocko <mhocko@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Tejun Heo <tj@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.14 058/101] mm/vmscan: wake up flushers for legacy cgroups too
Date: Tue, 27 Mar 2018 18:27:30 +0200
Message-ID: <20180327162753.545466903@linuxfoundation.org>
In-Reply-To: <20180327162749.993880276@linuxfoundation.org>
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
commit 1c610d5f93c709df56787f50b3576704ac271826 upstream.
Commit 726d061fbd36 ("mm: vmscan: kick flushers when we encounter dirty
pages on the LRU") added flusher invocation to shrink_inactive_list()
when many dirty pages on the LRU are encountered.
However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit, bbef938429f5 ("mm: vmscan: remove old
flusher wakeup from direct reclaim path"), removed the only source of
flusher wakeups in the legacy memory cgroup reclaim path.
This leads to a premature OOM kill if there are too many dirty pages in the cgroup:
# mkdir /sys/fs/cgroup/memory/test
# echo $$ > /sys/fs/cgroup/memory/test/tasks
# echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
# dd if=/dev/zero of=tmp_file bs=1M count=100
Killed
dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Call Trace:
dump_stack+0x46/0x65
dump_header+0x6b/0x2ac
oom_kill_process+0x21c/0x4a0
out_of_memory+0x2a5/0x4b0
mem_cgroup_out_of_memory+0x3b/0x60
mem_cgroup_oom_synchronize+0x2ed/0x330
pagefault_out_of_memory+0x24/0x54
__do_page_fault+0x521/0x540
page_fault+0x45/0x50
Task in /test killed as a result of limit of /test
memory: usage 51200kB, limit 51200kB, failcnt 73
memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Wake up flushers in legacy cgroup reclaim too.
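
For illustration only, the control-flow change can be modelled in
userspace C. This is a sketch, not kernel code: the function names and
the legacy_memcg flag are hypothetical, and it assumes that before the
patch the wakeup sat inside the block that legacy cgroup reclaim skips,
as the description above implies.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the flusher wakeup decision.
 * legacy_memcg stands in for "this is legacy (v1) cgroup reclaim";
 * the two counters mirror stat.nr_unqueued_dirty and nr_taken.
 */

/* Before the patch: the wakeup was only reachable for non-legacy reclaim. */
static inline bool wakes_flushers_before_patch(bool legacy_memcg,
					       unsigned long nr_unqueued_dirty,
					       unsigned long nr_taken)
{
	/* Assumption: legacy reclaim skipped the block holding the wakeup. */
	if (legacy_memcg)
		return false;
	return nr_unqueued_dirty == nr_taken;
}

/* After the patch: the check runs regardless of the reclaim type. */
static inline bool wakes_flushers_after_patch(bool legacy_memcg,
					      unsigned long nr_unqueued_dirty,
					      unsigned long nr_taken)
{
	(void)legacy_memcg;	/* no longer affects the decision */
	return nr_unqueued_dirty == nr_taken;
}
```

In this model, legacy reclaim that isolates only unqueued dirty pages
never wakes the flushers before the patch and always does afterwards,
which is presumably the situation the dd reproduction runs into.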
Link: http://lkml.kernel.org/r/20180315164553.17856-1-aryabinin@virtuozzo.com
Fixes: bbef938429f5 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/vmscan.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1847,6 +1847,20 @@ shrink_inactive_list(unsigned long nr_to
 		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
 	/*
+	 * If dirty pages are scanned that are not queued for IO, it
+	 * implies that flushers are not doing their job. This can
+	 * happen when memory pressure pushes dirty pages to the end of
+	 * the LRU before the dirty limits are breached and the dirty
+	 * data has expired. It can also happen when the proportion of
+	 * dirty pages grows not through writes but through memory
+	 * pressure reclaiming all the clean cache. And in some cases,
+	 * the flushers simply cannot keep up with the allocation
+	 * rate. Nudge the flusher threads in case they are asleep.
+	 */
+	if (stat.nr_unqueued_dirty == nr_taken)
+		wakeup_flusher_threads(0, WB_REASON_VMSCAN);
+
+	/*
 	 * Legacy memcg will stall in page writeback so avoid forcibly
 	 * stalling here.
 	 */
@@ -1858,22 +1872,9 @@ shrink_inactive_list(unsigned long nr_to
 		if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
 			set_bit(PGDAT_CONGESTED, &pgdat->flags);
 
-		/*
-		 * If dirty pages are scanned that are not queued for IO, it
-		 * implies that flushers are not doing their job. This can
-		 * happen when memory pressure pushes dirty pages to the end of
-		 * the LRU before the dirty limits are breached and the dirty
-		 * data has expired. It can also happen when the proportion of
-		 * dirty pages grows not through writes but through memory
-		 * pressure reclaiming all the clean cache. And in some cases,
-		 * the flushers simply cannot keep up with the allocation
-		 * rate. Nudge the flusher threads in case they are asleep, but
-		 * also allow kswapd to start writing pages during reclaim.
-		 */
-		if (stat.nr_unqueued_dirty == nr_taken) {
-			wakeup_flusher_threads(0, WB_REASON_VMSCAN);
+		/* Allow kswapd to start writing pages during reclaim. */
+		if (stat.nr_unqueued_dirty == nr_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
-		}
 
 		/*
 		 * If kswapd scans pages marked marked for immediate