From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
alan@lxorguk.ukuu.org.uk, Mel Gorman <mgorman@suse.de>,
David Rientjes <rientjes@google.com>,
Luigi Semenzato <semenzato@google.com>,
Dan Magenheimer <dan.magenheimer@oracle.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Sonny Rao <sonnyrao@google.com>, Minchan Kim <minchan@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
CAI Qian <caiqian@redhat.com>
Subject: [ 48/54] mm: vmscan: check for fatal signals iff the process was throttled
Date: Fri, 30 Nov 2012 10:56:19 -0800 [thread overview]
Message-ID: <20121130185212.740469578@linuxfoundation.org> (raw)
In-Reply-To: <20121130185207.894301294@linuxfoundation.org>
3.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mel Gorman <mgorman@suse.de>
commit 50694c28f1e1dbea18272980d265742a5027fb63 upstream.
Commit 5515061d22f0 ("mm: throttle direct reclaimers if PF_MEMALLOC
reserves are low and swap is backed by network storage") introduced a
check for fatal signals after a process gets throttled for network
storage. The intention was that if a process was throttled and got
killed that it should not trigger the OOM killer. As pointed out by
Minchan Kim and David Rientjes, this check is in the wrong place and too
broad. If a system is in am OOM situation and a process is exiting, it
can loop in __alloc_pages_slowpath() and calling direct reclaim in a
loop. As the fatal signal is pending it returns 1 as if it is making
forward progress and can effectively deadlock.
This patch moves the fatal_signal_pending() check after throttling to
throttle_direct_reclaim() where it belongs. If the process is killed
while throttled, it will return immediately without direct reclaim
except now it will have TIF_MEMDIE set and will use the PFMEMALLOC
reserves.
Minchan pointed out that it may be better to direct reclaim before
returning to avoid using the reserves because there may be pages that
can easily reclaim that would avoid using the reserves. However, we do
no such targetted reclaim and there is no guarantee that suitable pages
are available. As it is expected that this throttling happens when
swap-over-NFS is used there is a possibility that the process will
instead swap which may allocate network buffers from the PFMEMALLOC
reserves. Hence, in the swap-over-nfs case where a process can be
throtted and be killed it can use the reserves to exit or it can
potentially use reserves to swap a few pages and then exit. This patch
takes the option of using the reserves if necessary to allow the process
exit quickly.
If this patch passes review it should be considered a -stable candidate
for 3.6.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/vmscan.c | 37 +++++++++++++++++++++++++++----------
1 file changed, 27 insertions(+), 10 deletions(-)
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2176,9 +2176,12 @@ static bool pfmemalloc_watermark_ok(pg_d
* Throttle direct reclaimers if backing storage is backed by the network
* and the PFMEMALLOC reserve for the preferred node is getting dangerously
* depleted. kswapd will continue to make progress and wake the processes
- * when the low watermark is reached
+ * when the low watermark is reached.
+ *
+ * Returns true if a fatal signal was delivered during throttling. If this
+ * happens, the page allocator should not consider triggering the OOM killer.
*/
-static void throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
+static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
nodemask_t *nodemask)
{
struct zone *zone;
@@ -2193,13 +2196,20 @@ static void throttle_direct_reclaim(gfp_
* processes to block on log_wait_commit().
*/
if (current->flags & PF_KTHREAD)
- return;
+ goto out;
+
+ /*
+ * If a fatal signal is pending, this process should not throttle.
+ * It should return quickly so it can exit and free its memory
+ */
+ if (fatal_signal_pending(current))
+ goto out;
/* Check if the pfmemalloc reserves are ok */
first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
pgdat = zone->zone_pgdat;
if (pfmemalloc_watermark_ok(pgdat))
- return;
+ goto out;
/* Account for the throttling */
count_vm_event(PGSCAN_DIRECT_THROTTLE);
@@ -2215,12 +2225,20 @@ static void throttle_direct_reclaim(gfp_
if (!(gfp_mask & __GFP_FS)) {
wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
pfmemalloc_watermark_ok(pgdat), HZ);
- return;
+
+ goto check_pending;
}
/* Throttle until kswapd wakes the process */
wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
pfmemalloc_watermark_ok(pgdat));
+
+check_pending:
+ if (fatal_signal_pending(current))
+ return true;
+
+out:
+ return false;
}
unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
@@ -2242,13 +2260,12 @@ unsigned long try_to_free_pages(struct z
.gfp_mask = sc.gfp_mask,
};
- throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
-
/*
- * Do not enter reclaim if fatal signal is pending. 1 is returned so
- * that the page allocator does not consider triggering OOM
+ * Do not enter reclaim if fatal signal was delivered while throttled.
+ * 1 is returned so that the page allocator does not OOM kill at this
+ * point.
*/
- if (fatal_signal_pending(current))
+ if (throttle_direct_reclaim(gfp_mask, zonelist, nodemask))
return 1;
trace_mm_vmscan_direct_reclaim_begin(order,
next prev parent reply other threads:[~2012-11-30 19:07 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-30 18:55 [ 00/54] 3.6.9-stable review Greg Kroah-Hartman
2012-11-30 18:55 ` [ 01/54] wireless: add back sysfs directory Greg Kroah-Hartman
2012-11-30 18:55 ` [ 02/54] x86-32: Fix invalid stack address while in softirq Greg Kroah-Hartman
2012-11-30 18:55 ` [ 03/54] x86, efi: Fix processor-specific memcpy() build error Greg Kroah-Hartman
2012-11-30 18:55 ` [ 04/54] x86, microcode, AMD: Add support for family 16h processors Greg Kroah-Hartman
2012-11-30 18:55 ` [ 05/54] x86-32: Export kernel_stack_pointer() for modules Greg Kroah-Hartman
2012-11-30 18:55 ` [ 06/54] iwlwifi: dont WARN when a non empty queue is disabled Greg Kroah-Hartman
2012-11-30 18:55 ` [ 07/54] iwlwifi: fix monitor mode FCS flag Greg Kroah-Hartman
2012-11-30 18:55 ` [ 08/54] rtlwifi: rtl8192cu: Add new USB ID Greg Kroah-Hartman
2012-11-30 18:55 ` [ 09/54] mwifiex: report error to MMC core if we cannot suspend Greg Kroah-Hartman
2012-11-30 18:55 ` [ 10/54] mwifiex: fix system hang issue in cmd timeout error case Greg Kroah-Hartman
2012-11-30 18:55 ` [ 11/54] SCSI: isci: copy fis 0x34 response into proper buffer Greg Kroah-Hartman
2012-11-30 18:55 ` [ 12/54] drm/radeon: add new SI pci id Greg Kroah-Hartman
2012-11-30 18:55 ` [ 13/54] ALSA: ua101, usx2y: fix broken MIDI output Greg Kroah-Hartman
2012-11-30 18:55 ` [ 14/54] ALSA: hda - Cirrus: Correctly clear line_out_pins when moving to speaker Greg Kroah-Hartman
2012-11-30 18:55 ` [ 15/54] PARISC: fix virtual aliasing issue in get_shared_area() Greg Kroah-Hartman
2012-11-30 18:55 ` [ 16/54] PARISC: fix user-triggerable panic on parisc Greg Kroah-Hartman
2012-11-30 18:55 ` [ 17/54] mtd: slram: invalid checking of absolute end address Greg Kroah-Hartman
2012-11-30 18:55 ` [ 18/54] mtd: ofpart: Fix incorrect NULL check in parse_ofoldpart_partitions() Greg Kroah-Hartman
2012-11-30 18:55 ` [ 19/54] jffs2: Fix lock acquisition order bug in jffs2_write_begin Greg Kroah-Hartman
2012-11-30 18:55 ` [ 20/54] md: Reassigned the parameters if read_seqretry returned true in func md_is_badblock Greg Kroah-Hartman
2012-11-30 18:55 ` [ 21/54] md: Avoid write invalid address if read_seqretry returned true Greg Kroah-Hartman
2012-11-30 18:55 ` [ 22/54] md/raid10: decrement correct pending counter when writing to replacement Greg Kroah-Hartman
2012-11-30 18:55 ` [ 23/54] block: Dont access request after it might be freed Greg Kroah-Hartman
2012-11-30 18:55 ` [ 24/54] dm: fix deadlock with request based dm and queue request_fn recursion Greg Kroah-Hartman
2012-11-30 18:55 ` [ 25/54] futex: avoid wake_futex() for a PI futex_q Greg Kroah-Hartman
2012-11-30 18:55 ` [ 26/54] mac80211: deinitialize ibss-internals after emptiness check Greg Kroah-Hartman
2012-11-30 18:55 ` [ 27/54] radeon: add AGPMode 1 quirk for RV250 Greg Kroah-Hartman
2012-11-30 18:55 ` [ 28/54] can: peak_usb: fix hwtstamp assignment Greg Kroah-Hartman
2012-11-30 18:56 ` [ 29/54] can: bcm: initialize ifindex for timeouts without previous frame reception Greg Kroah-Hartman
2012-11-30 18:56 ` [ 30/54] jbd: Fix lock ordering bug in journal_unmap_buffer() Greg Kroah-Hartman
2012-11-30 18:56 ` [ 31/54] sparc64: not any error from do_sigaltstack() should fail rt_sigreturn() Greg Kroah-Hartman
2012-11-30 18:56 ` [ 32/54] PM / QoS: fix wrong error-checking condition Greg Kroah-Hartman
2012-11-30 18:56 ` [ 33/54] writeback: put unused inodes to LRU after writeback completion Greg Kroah-Hartman
2012-11-30 18:56 ` [ 34/54] ALSA: hda - Add new codec ALC283 ALC290 support Greg Kroah-Hartman
2012-11-30 18:56 ` [ 35/54] ALSA: hda - Fix missing beep on ASUS X43U notebook Greg Kroah-Hartman
2012-11-30 18:56 ` [ 36/54] ALSA: hda - Add support for Realtek ALC292 Greg Kroah-Hartman
2012-11-30 18:56 ` [ 37/54] bas_gigaset: fix pre_reset handling Greg Kroah-Hartman
2012-11-30 18:56 ` [ 38/54] pstore/ram: Fix printk format warning Greg Kroah-Hartman
2012-11-30 18:56 ` [ 39/54] HID: add quirk for Freescale i.MX28 ROM recovery Greg Kroah-Hartman
2012-11-30 18:56 ` [ 40/54] KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-2012-4461) Greg Kroah-Hartman
2012-11-30 18:56 ` [ 41/54] ixgbe: add support for X540-AT1 Greg Kroah-Hartman
2012-11-30 18:56 ` [ 42/54] sata_svw: check DMA start bit before reset Greg Kroah-Hartman
2012-11-30 18:56 ` [ 43/54] get_dvb_firmware: fix download site for tda10046 firmware Greg Kroah-Hartman
2012-11-30 18:56 ` [ 44/54] NFC: pn533: Fix use after free Greg Kroah-Hartman
2012-11-30 18:56 ` [ 45/54] NFC: Fix pn533 target mode memory leak Greg Kroah-Hartman
2012-11-30 18:56 ` [ 46/54] NFC: pn533: Fix mem leak in pn533_in_dep_link_up Greg Kroah-Hartman
2012-11-30 18:56 ` [ 47/54] NFC: Fix nfc_llcp_local chained list insertion Greg Kroah-Hartman
2012-11-30 18:56 ` Greg Kroah-Hartman [this message]
2012-11-30 18:56 ` [ 49/54] watchdog: using u64 in get_sample_period() Greg Kroah-Hartman
2012-11-30 18:56 ` [ 50/54] MPI: Fix compilation on MIPS with GCC 4.4 and newer Greg Kroah-Hartman
2012-11-30 18:56 ` [ 51/54] ext4: remove erroneous ext4_superblock_csum_set() in update_backups() Greg Kroah-Hartman
2012-11-30 18:56 ` [ 52/54] powerpc/eeh: Lock module while handling EEH event Greg Kroah-Hartman
2012-11-30 18:56 ` [ 53/54] mmc: sdhci-s3c: fix the wrong number of max bus clocks Greg Kroah-Hartman
2012-11-30 18:56 ` [ 54/54] md/raid10: close race that lose writes lost when replacement completes Greg Kroah-Hartman
2012-12-01 14:58 ` [ 00/54] 3.6.9-stable review Satoru Takeuchi
2012-12-01 16:50 ` Greg Kroah-Hartman
2012-12-02 2:11 ` Shuah Khan
2012-12-02 17:01 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121130185212.740469578@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=caiqian@redhat.com \
--cc=dan.magenheimer@oracle.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=minchan@kernel.org \
--cc=rientjes@google.com \
--cc=semenzato@google.com \
--cc=sonnyrao@google.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.