From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Miles Chen <miles.chen@mediatek.com>,
Qian Cai <cai@lca.pw>, Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.4 36/78] mm/memcontrol.c: fix use after free in mem_cgroup_iter()
Date: Thu, 22 Aug 2019 10:18:40 -0700 [thread overview]
Message-ID: <20190822171833.088975095@linuxfoundation.org> (raw)
In-Reply-To: <20190822171832.012773482@linuxfoundation.org>
From: Miles Chen <miles.chen@mediatek.com>
commit 54a83d6bcbf8f4700013766b974bf9190d40b689 upstream.
This patch is sent to report an use after free in mem_cgroup_iter()
after merging commit be2657752e9e ("mm: memcg: fix use after free in
mem_cgroup_iter()").
I work with android kernel tree (4.9 & 4.14), and commit be2657752e9e
("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged
to the trees. However, I can still observe use after free issues
addressed in the commit be2657752e9e. (on low-end devices, a few times
this month)
backtrace:
css_tryget <- crash here
mem_cgroup_iter
shrink_node
shrink_zones
do_try_to_free_pages
try_to_free_pages
__perform_reclaim
__alloc_pages_direct_reclaim
__alloc_pages_slowpath
__alloc_pages_nodemask
To debug, I poisoned mem_cgroup before freeing it:
static void __mem_cgroup_free(struct mem_cgroup *memcg)
for_each_node(node)
free_mem_cgroup_per_node_info(memcg, node);
free_percpu(memcg->stat);
+ /* poison memcg before freeing it */
+ memset(memcg, 0x78, sizeof(struct mem_cgroup));
kfree(memcg);
}
The coredump shows the position=0xdbbc2a00 is freed.
(gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
$13 = {position = 0xdbbc2a00, generation = 0x2efd}
0xdbbc2a00: 0xdbbc2e00 0x00000000 0xdbbc2800 0x00000100
0xdbbc2a10: 0x00000200 0x78787878 0x00026218 0x00000000
0xdbbc2a20: 0xdcad6000 0x00000001 0x78787800 0x00000000
0xdbbc2a30: 0x78780000 0x00000000 0x0068fb84 0x78787878
0xdbbc2a40: 0x78787878 0x78787878 0x78787878 0xe3fa5cc0
0xdbbc2a50: 0x78787878 0x78787878 0x00000000 0x00000000
0xdbbc2a60: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a70: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a80: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2a90: 0x00000001 0x00000000 0x00000000 0x00100000
0xdbbc2aa0: 0x00000001 0xdbbc2ac8 0x00000000 0x00000000
0xdbbc2ab0: 0x00000000 0x00000000 0x00000000 0x00000000
0xdbbc2ac0: 0x00000000 0x00000000 0xe5b02618 0x00001000
0xdbbc2ad0: 0x00000000 0x78787878 0x78787878 0x78787878
0xdbbc2ae0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2af0: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b00: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b10: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b20: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b30: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b40: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b50: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b60: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b70: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2b80: 0x78787878 0x78787878 0x00000000 0x78787878
0xdbbc2b90: 0x78787878 0x78787878 0x78787878 0x78787878
0xdbbc2ba0: 0x78787878 0x78787878 0x78787878 0x78787878
In the reclaim path, try_to_free_pages() does not setup
sc.target_mem_cgroup and sc is passed to do_try_to_free_pages(), ...,
shrink_node().
In mem_cgroup_iter(), root is set to root_mem_cgroup because
sc->target_mem_cgroup is NULL. It is possible to assign a memcg to
root_mem_cgroup.nodeinfo.iter in mem_cgroup_iter().
try_to_free_pages
struct scan_control sc = {...}, target_mem_cgroup is 0x0;
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup *root = sc->target_mem_cgroup;
memcg = mem_cgroup_iter(root, NULL, &reclaim);
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
css = css_next_descendant_pre(css, &root->css);
memcg = mem_cgroup_from_css(css);
cmpxchg(&iter->position, pos, memcg);
My device uses memcg non-hierarchical mode. When we release a memcg:
invalidate_reclaim_iterators() reaches only dead_memcg and its parents.
If non-hierarchical mode is used, invalidate_reclaim_iterators() never
reaches root_mem_cgroup.
static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
{
struct mem_cgroup *memcg = dead_memcg;
for (; memcg; memcg = parent_mem_cgroup(memcg)
...
}
So the use after free scenario looks like:
CPU1 CPU2
try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
css = css_next_descendant_pre(css, &root->css);
memcg = mem_cgroup_from_css(css);
cmpxchg(&iter->position, pos, memcg);
invalidate_reclaim_iterators(memcg);
...
__mem_cgroup_free()
kfree(memcg);
try_to_free_pages
do_try_to_free_pages
shrink_zones
shrink_node
mem_cgroup_iter()
if (!root)
root = root_mem_cgroup;
...
mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
iter = &mz->iter[reclaim->priority];
pos = READ_ONCE(iter->position);
css_tryget(&pos->css) <- use after free
To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter
in invalidate_reclaim_iterators().
[cai@lca.pw: fix -Wparentheses compilation warning]
Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/memcontrol.c | 41 ++++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 11 deletions(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -988,28 +988,47 @@ void mem_cgroup_iter_break(struct mem_cg
css_put(&prev->css);
}
-static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
+static void __invalidate_reclaim_iterators(struct mem_cgroup *from,
+ struct mem_cgroup *dead_memcg)
{
- struct mem_cgroup *memcg = dead_memcg;
struct mem_cgroup_reclaim_iter *iter;
struct mem_cgroup_per_zone *mz;
int nid, zid;
int i;
- for (; memcg; memcg = parent_mem_cgroup(memcg)) {
- for_each_node(nid) {
- for (zid = 0; zid < MAX_NR_ZONES; zid++) {
- mz = &memcg->nodeinfo[nid]->zoneinfo[zid];
- for (i = 0; i <= DEF_PRIORITY; i++) {
- iter = &mz->iter[i];
- cmpxchg(&iter->position,
- dead_memcg, NULL);
- }
+ for_each_node(nid) {
+ for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+ mz = &from->nodeinfo[nid]->zoneinfo[zid];
+ for (i = 0; i <= DEF_PRIORITY; i++) {
+ iter = &mz->iter[i];
+ cmpxchg(&iter->position,
+ dead_memcg, NULL);
}
}
}
}
+static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
+{
+ struct mem_cgroup *memcg = dead_memcg;
+ struct mem_cgroup *last;
+
+ do {
+ __invalidate_reclaim_iterators(memcg, dead_memcg);
+ last = memcg;
+ } while ((memcg = parent_mem_cgroup(memcg)));
+
+ /*
+ * When cgruop1 non-hierarchy mode is used,
+ * parent_mem_cgroup() does not walk all the way up to the
+ * cgroup root (root_mem_cgroup). So we have to handle
+ * dead_memcg from cgroup root separately.
+ */
+ if (last != root_mem_cgroup)
+ __invalidate_reclaim_iterators(root_mem_cgroup,
+ dead_memcg);
+}
+
/*
* Iteration constructs for visiting all cgroups (under a tree). If
* loops are exited prematurely (break), mem_cgroup_iter_break() must
next prev parent reply other threads:[~2019-08-22 17:45 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-22 17:18 [PATCH 4.4 00/78] 4.4.190-stable review Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 01/78] usb: iowarrior: fix deadlock on disconnect Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 02/78] sound: fix a memory leak bug Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 03/78] x86/mm: Check for pfn instead of page in vmalloc_sync_one() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 04/78] x86/mm: Sync also unmappings in vmalloc_sync_all() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 05/78] mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 06/78] perf db-export: Fix thread__exec_comm() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 07/78] usb: yurex: Fix use-after-free in yurex_delete Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 08/78] can: peak_usb: fix potential double kfree_skb() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 09/78] netfilter: nfnetlink: avoid deadlock due to synchronous request_module Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 10/78] iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 11/78] mac80211: dont warn about CW params when not using them Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 12/78] hwmon: (nct6775) Fix register address and added missed tolerance for nct6106 Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 13/78] cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 14/78] s390/qdio: add sanity checks to the fast-requeue path Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 15/78] ALSA: compress: Fix regression on compressed capture streams Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 16/78] ALSA: compress: Prevent bypasses of set_params Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 17/78] ALSA: compress: Be more restrictive about when a drain is allowed Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 18/78] perf probe: Avoid calling freeing routine multiple times for same pointer Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 19/78] ARM: davinci: fix sleep.S build error on ARMv4 Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 20/78] scsi: megaraid_sas: fix panic on loading firmware crashdump Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 21/78] scsi: ibmvfc: fix WARN_ON during event pool release Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 22/78] tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 23/78] perf/core: Fix creating kernel counters for PMUs that override event->cpu Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 24/78] can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 25/78] can: peak_usb: pcan_usb_fd: " Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 26/78] hwmon: (nct7802) Fix wrong detection of in4 presence Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 27/78] ALSA: firewire: fix a memory leak bug Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 28/78] mac80211: dont WARN on short WMM parameters from AP Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 29/78] SMB3: Fix deadlock in validate negotiate hits reconnect Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 30/78] smb3: send CAP_DFS capability during session setup Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 31/78] mwifiex: fix 802.11n/WPA detection Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 32/78] scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 33/78] sh: kernel: hw_breakpoint: Fix missing break in switch statement Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 34/78] usb: gadget: f_midi: fail if set_alt fails to allocate requests Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 35/78] USB: gadget: f_midi: fixing a possible double-free in f_midi Greg Kroah-Hartman
2019-08-22 17:18 ` Greg Kroah-Hartman [this message]
2019-08-22 17:18 ` [PATCH 4.4 37/78] ALSA: hda - Fix a memory leak bug Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 38/78] HID: holtek: test for sanity of intfdata Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 39/78] HID: hiddev: avoid opening a disconnected device Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 40/78] HID: hiddev: do cleanup in failure of opening a device Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 41/78] Input: kbtab - sanity check for endpoint type Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 42/78] Input: iforce - add sanity checks Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 43/78] net: usb: pegasus: fix improper read if get_registers() fail Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 44/78] xen/pciback: remove set but not used variable old_state Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 45/78] irqchip/irq-imx-gpcv2: Forward irq type to parent Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 46/78] perf header: Fix divide by zero error if f_header.attr_size==0 Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 47/78] perf header: Fix use of unitialized value warning Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 48/78] libata: zpodd: Fix small read overflow in zpodd_get_mech_type() Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 49/78] scsi: hpsa: correct scsi command status issue after reset Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 50/78] ata: libahci: do not complain in case of deferred probe Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 51/78] kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 52/78] IB/core: Add mitigation for Spectre V1 Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 53/78] ocfs2: remove set but not used variable last_hash Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 54/78] asm-generic: fix -Wtype-limits compiler warnings Greg Kroah-Hartman
2019-08-22 17:18 ` [PATCH 4.4 55/78] staging: comedi: dt3000: Fix signed integer overflow divider * base Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 56/78] staging: comedi: dt3000: Fix rounding up of timer divisor Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 57/78] USB: core: Fix races in character device registration and deregistraion Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 58/78] usb: cdc-acm: make sure a refcount is taken early enough Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 59/78] USB: serial: option: add D-Link DWM-222 device ID Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 60/78] USB: serial: option: Add support for ZTE MF871A Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 61/78] USB: serial: option: add the BroadMobi BM818 card Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 62/78] USB: serial: option: Add Motorola modem UARTs Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 63/78] Backport minimal compiler_attributes.h to support GCC 9 Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 64/78] include/linux/module.h: copy __init/__exit attrs to init/cleanup_module Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 65/78] arm64: compat: Allow single-byte watchpoints on all addresses Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 66/78] Input: psmouse - fix build error of multiple definition Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 67/78] asm-generic: default BUG_ON(x) to if(x)BUG() Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 68/78] scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 69/78] RDMA: Directly cast the sockaddr union to sockaddr Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 70/78] IB/mlx5: Make coding style more consistent Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 71/78] x86/vdso: Remove direct HPET access through the vDSO Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 72/78] iommu/amd: Move iommu_init_pci() to .init section Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 73/78] x86/boot: Disable the address-of-packed-member compiler warning Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 74/78] net/packet: fix race in tpacket_snd() Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 75/78] xen/netback: Reset nr_frags before freeing skb Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 76/78] net/mlx5e: Only support tx/rx pause setting for port owner Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 77/78] sctp: fix the transport error_count check Greg Kroah-Hartman
2019-08-22 17:19 ` [PATCH 4.4 78/78] bonding: Add vlan tx offload to hw_enc_features Greg Kroah-Hartman
2019-08-22 21:17 ` [PATCH 4.4 00/78] 4.4.190-stable review kernelci.org bot
2019-08-22 22:40 ` Kevin Hilman
2019-08-23 8:31 ` Naresh Kamboju
2019-08-23 16:05 ` Kevin Hilman
2019-08-23 2:05 ` Jon Hunter
2019-08-23 17:38 ` Greg Kroah-Hartman
2019-08-23 8:26 ` Naresh Kamboju
2019-08-23 17:38 ` Greg Kroah-Hartman
2019-08-23 14:27 ` Guenter Roeck
2019-08-23 17:39 ` Greg Kroah-Hartman
2019-08-24 18:03 ` shuah
2019-08-24 18:14 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190822171833.088975095@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=cai@lca.pw \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhocko@suse.com \
--cc=miles.chen@mediatek.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).