public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Vladimir Davydov <vdavydov@virtuozzo.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.7 19/41] mm: memcontrol: fix swap counter leak on swapout from offline cgroup
Date: Sun, 14 Aug 2016 22:38:45 +0200	[thread overview]
Message-ID: <20160814202532.981270885@linuxfoundation.org> (raw)
In-Reply-To: <20160814202531.818402015@linuxfoundation.org>

4.7-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Davydov <vdavydov@virtuozzo.com>

commit 1f47b61fb4077936465dcde872a4e5cc4fe708da upstream.

An offline memory cgroup might have anonymous memory or shmem left
charged to it and no swap.  Since only swap entries pin the id of an
offline cgroup, such a cgroup will have no id and so an attempt to
swapout its anon/shmem will not store memory cgroup info in the swap
cgroup map.  As a result, memcg->swap or memcg->memsw will never get
uncharged from it and any of its ascendants.

Fix this by always charging swapout to the first ancestor cgroup that
hasn't released its id yet.

[hannes@cmpxchg.org: add comment to mem_cgroup_swapout]
[vdavydov@virtuozzo.com: use WARN_ON_ONCE() in mem_cgroup_id_get_online()]
  Link: http://lkml.kernel.org/r/20160803123445.GJ13263@esperanza
Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Link: http://lkml.kernel.org/r/5336daa5c9a32e776067773d9da655d2dc126491.1470219853.git.vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/memcontrol.c |   44 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 6 deletions(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4088,6 +4088,24 @@ static void mem_cgroup_id_get(struct mem
 	atomic_inc(&memcg->id.ref);
 }
 
+static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
+{
+	while (!atomic_inc_not_zero(&memcg->id.ref)) {
+		/*
+		 * The root cgroup cannot be destroyed, so it's refcount must
+		 * always be >= 1.
+		 */
+		if (WARN_ON_ONCE(memcg == root_mem_cgroup)) {
+			VM_BUG_ON(1);
+			break;
+		}
+		memcg = parent_mem_cgroup(memcg);
+		if (!memcg)
+			memcg = root_mem_cgroup;
+	}
+	return memcg;
+}
+
 static void mem_cgroup_id_put(struct mem_cgroup *memcg)
 {
 	if (atomic_dec_and_test(&memcg->id.ref)) {
@@ -5805,7 +5823,7 @@ subsys_initcall(mem_cgroup_init);
  */
 void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 {
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *swap_memcg;
 	unsigned short oldid;
 
 	VM_BUG_ON_PAGE(PageLRU(page), page);
@@ -5820,16 +5838,27 @@ void mem_cgroup_swapout(struct page *pag
 	if (!memcg)
 		return;
 
-	mem_cgroup_id_get(memcg);
-	oldid = swap_cgroup_record(entry, mem_cgroup_id(memcg));
+	/*
+	 * In case the memcg owning these pages has been offlined and doesn't
+	 * have an ID allocated to it anymore, charge the closest online
+	 * ancestor for the swap instead and transfer the memory+swap charge.
+	 */
+	swap_memcg = mem_cgroup_id_get_online(memcg);
+	oldid = swap_cgroup_record(entry, mem_cgroup_id(swap_memcg));
 	VM_BUG_ON_PAGE(oldid, page);
-	mem_cgroup_swap_statistics(memcg, true);
+	mem_cgroup_swap_statistics(swap_memcg, true);
 
 	page->mem_cgroup = NULL;
 
 	if (!mem_cgroup_is_root(memcg))
 		page_counter_uncharge(&memcg->memory, 1);
 
+	if (memcg != swap_memcg) {
+		if (!mem_cgroup_is_root(swap_memcg))
+			page_counter_charge(&swap_memcg->memsw, 1);
+		page_counter_uncharge(&memcg->memsw, 1);
+	}
+
 	/*
 	 * Interrupts should be disabled here because the caller holds the
 	 * mapping->tree_lock lock which is taken with interrupts-off. It is
@@ -5868,11 +5897,14 @@ int mem_cgroup_try_charge_swap(struct pa
 	if (!memcg)
 		return 0;
 
+	memcg = mem_cgroup_id_get_online(memcg);
+
 	if (!mem_cgroup_is_root(memcg) &&
-	    !page_counter_try_charge(&memcg->swap, 1, &counter))
+	    !page_counter_try_charge(&memcg->swap, 1, &counter)) {
+		mem_cgroup_id_put(memcg);
 		return -ENOMEM;
+	}
 
-	mem_cgroup_id_get(memcg);
 	oldid = swap_cgroup_record(entry, mem_cgroup_id(memcg));
 	VM_BUG_ON_PAGE(oldid, page);
 	mem_cgroup_swap_statistics(memcg, true);

  parent reply	other threads:[~2016-08-14 20:52 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20160814204653uscas1p136630e2fd59612e56b31ed2096f71df2@uscas1p1.samsung.com>
2016-08-14 20:38 ` [PATCH 4.7 00/41] 4.7.1-stable review Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 01/41] ext4: verify extent header depth Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 02/41] vfs: ioctl: prevent double-fetch in dedupe ioctl Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 03/41] vfs: fix deadlock in file_remove_privs() on overlayfs Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 04/41] udp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skb Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 05/41] net/bonding: Enforce active-backup policy for IPoIB bonds Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 06/41] bridge: Fix incorrect re-injection of LLDP packets Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 07/41] net: ipv6: Always leave anycast and multicast groups on link down Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 08/41] sctp: fix BH handling on socket backlog Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 09/41] net/irda: fix NULL pointer dereference on memory allocation failure Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 10/41] net/sctp: terminate rhashtable walk correctly Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 11/41] qed: Fix setting/clearing bit in completion bitmap Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 12/41] macsec: ensure rx_sa is set when validation is disabled Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 13/41] tcp: consider recv buf for the initial window scale Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 14/41] arm: oabi compat: add missing access checks Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 15/41] KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 16/41] IB/hfi1: Disable by default Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 17/41] apparmor: fix ref count leak when profile sha1 hash is read Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 18/41] random: strengthen input validation for RNDADDTOENTCNT Greg Kroah-Hartman
2016-08-14 20:38   ` Greg Kroah-Hartman [this message]
2016-08-14 20:38   ` [PATCH 4.7 20/41] mm: memcontrol: fix memcg id ref counter on swap charge move Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 21/41] x86/syscalls/64: Add compat_sys_keyctl for 32-bit userspace Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 22/41] block: fix use-after-free in seq file Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 23/41] sysv, ipc: fix security-layer leaking Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 24/41] radix-tree: account nodes to memcg only if explicitly requested Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 25/41] x86/microcode: Fix suspend to RAM with builtin microcode Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 26/41] x86/power/64: Fix hibernation return address corruption Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 27/41] fuse: fsync() did not return IO errors Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 28/41] fuse: fuse_flush must check mapping->flags for errors Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 29/41] fuse: fix wrong assignment of ->flags in fuse_send_init() Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 30/41] Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 31/41] fs/dcache.c: avoid soft-lockup in dput() Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 32/41] Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency" Greg Kroah-Hartman
2016-08-14 20:38   ` [PATCH 4.7 33/41] crypto: gcm - Filter out async ghash if necessary Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 34/41] crypto: scatterwalk - Fix test in scatterwalk_done Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 35/41] serial: mvebu-uart: free the IRQ in ->shutdown() Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 36/41] ext4: check for extents that wrap around Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 37/41] ext4: fix deadlock during page writeback Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 38/41] ext4: dont call ext4_should_journal_data() on the journal inode Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 39/41] ext4: validate s_reserved_gdt_blocks on mount Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 40/41] ext4: short-cut orphan cleanup on error Greg Kroah-Hartman
2016-08-14 20:39   ` [PATCH 4.7 41/41] ext4: fix reference counting bug on block allocation error Greg Kroah-Hartman
     [not found]   ` <57b11e62.eeb8c20a.9231a.76ad@mx.google.com>
2016-08-15  7:56     ` [PATCH 4.7 00/41] 4.7.1-stable review Greg Kroah-Hartman
2016-08-15 21:31       ` Kevin Hilman
2016-08-15 13:03   ` Guenter Roeck
2016-08-15 13:46     ` Greg Kroah-Hartman
2016-08-16  4:03   ` Shuah Khan
2016-08-16 10:48     ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160814202532.981270885@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vdavydov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox