public inbox for patches@lists.linux.dev
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, "GONG, Ruiqi" <gongruiqi1@huawei.com>,
	Michal Hocko <mhocko@suse.com>,
	GONG
Subject: [PATCH 5.4 07/84] memcg: add refcnt for pcpu stock to avoid UAF problem in drain_all_stock()
Date: Tue, 27 Feb 2024 14:26:34 +0100	[thread overview]
Message-ID: <20240227131553.109209736@linuxfoundation.org> (raw)
In-Reply-To: <20240227131552.864701583@linuxfoundation.org>

5.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "GONG, Ruiqi" <gongruiqi1@huawei.com>

commit 1a3e1f40962c445b997151a542314f3c6097f8c3 upstream.

NOTE: This is a partial backport since we only need the refcnt between
memcg and stock to fix the problem stated below, and in this way
multiple versions use the same code and align with each other.

There was a kernel panic happened on an in-house environment running
3.10, and the same problem was reproduced on 4.19:

general protection fault: 0000 [#1] SMP PTI
CPU: 1 PID: 2085 Comm: bash Kdump: loaded Tainted: G             L    4.19.90+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010 drain_all_stock+0xad/0x140
Code: 00 00 4d 85 ff 74 2c 45 85 c9 74 27 4d 39 fc 74 42 41 80 bc 24 28 04 00 00 00 74 17 49 8b 04 24 49 8b 17 48 8b 88 90 02 00 00 <48> 39 8a 90 02 00 00 74 02 eb 86 48 63 88 3c 01 00 00 39 8a 3c 01
RSP: 0018:ffffa7efc5813d70 EFLAGS: 00010202
RAX: ffff8cb185548800 RBX: ffff8cb89f420160 RCX: ffff8cb1867b6000
RDX: babababababababa RSI: 0000000000000001 RDI: 0000000000231876
RBP: 0000000000000000 R08: 0000000000000415 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8cb186f89040
R13: 0000000000020160 R14: 0000000000000001 R15: ffff8cb186b27040
FS:  00007f4a308d3740(0000) GS:ffff8cb89f440000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe4d634a68 CR3: 000000010b022000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 mem_cgroup_force_empty_write+0x31/0xb0
 cgroup_file_write+0x60/0x140
 ? __check_object_size+0x136/0x147
 kernfs_fop_write+0x10e/0x190
 __vfs_write+0x37/0x1b0
 ? selinux_file_permission+0xe8/0x130
 ? security_file_permission+0x2e/0xb0
 vfs_write+0xb6/0x1a0
 ksys_write+0x57/0xd0
 do_syscall_64+0x63/0x250
 ? async_page_fault+0x8/0x30
 entry_SYSCALL_64_after_hwframe+0x5c/0xc1
Modules linked in: ...

It is found that in case of stock->nr_pages == 0, the memcg on
stock->cached could be freed due to its refcnt decreased to 0, which
made stock->cached become a dangling pointer. It could cause a UAF
problem in drain_all_stock() in the following concurrent scenario. Note
that drain_all_stock() doesn't disable irq but only preemption.

CPU1                             CPU2
==============================================================================
stock->cached = memcgA (freed)
                                 drain_all_stock(memcgB)
                                  rcu_read_lock()
                                  memcg = CPU1's stock->cached (memcgA)
                                  (interrupted)
refill_stock(memcgC)
 drain_stock(memcgA)
 stock->cached = memcgC
 stock->nr_pages += xxx (> 0)
                                  stock->nr_pages > 0
                                  mem_cgroup_is_descendant(memcgA, memcgB) [UAF]
                                  rcu_read_unlock()

This problem is, unintentionally, fixed at 5.9, where commit
1a3e1f40962c ("mm: memcontrol: decouple reference counting from page
accounting") adds memcg refcnt for stock. Therefore affected LTS
versions include 4.19 and 5.4.

For 4.19, memcg's css offline process doesn't call drain_all_stock(). so
it's easier for the released memcg to be left on the stock. For 5.4,
although mem_cgroup_css_offline() does call drain_all_stock(), but the
flushing could be skipped when stock->nr_pages happens to be 0, and
besides the async draining could be delayed and take place after the UAF
problem has happened.

Fix this problem by adding (and decreasing) memcg's refcnt when memcg is
put onto (and removed from) stock, just like how commit 1a3e1f40962c
("mm: memcontrol: decouple reference counting from page accounting")
does. After all, "being on the stock" is a kind of reference with
regards to memcg. As such, it's guaranteed that a css on stock would not
be freed.

It's good to mention that refill_stock() is executed in an irq-disabled
context, so the drain_stock() patched with css_put() would not actually
free memcgA until the end of refill_stock(), since css_put() is an RCU
free and it's still in grace period. For CPU2, the access to CPU1's
stock->cached is protected by rcu_read_lock(), so in this case it gets
either NULL from stock->cached or a memcgA that is still good.

Cc: stable@vger.kernel.org      # 4.19 5.4
Fixes: cdec2e4265df ("memcg: coalesce charging via percpu storage")
Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memcontrol.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2214,6 +2214,9 @@ static void drain_stock(struct memcg_sto
 {
 	struct mem_cgroup *old = stock->cached;
 
+	if (!old)
+		return;
+
 	if (stock->nr_pages) {
 		page_counter_uncharge(&old->memory, stock->nr_pages);
 		if (do_memsw_account())
@@ -2221,6 +2224,8 @@ static void drain_stock(struct memcg_sto
 		css_put_many(&old->css, stock->nr_pages);
 		stock->nr_pages = 0;
 	}
+
+	css_put(&old->css);
 	stock->cached = NULL;
 }
 
@@ -2256,6 +2261,7 @@ static void refill_stock(struct mem_cgro
 	stock = this_cpu_ptr(&memcg_stock);
 	if (stock->cached != memcg) { /* reset if necessary */
 		drain_stock(stock);
+		css_get(&memcg->css);
 		stock->cached = memcg;
 	}
 	stock->nr_pages += nr_pages;



  parent reply	other threads:[~2024-02-27 14:22 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-27 13:26 [PATCH 5.4 00/84] 5.4.270-rc1 review Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 01/84] KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 02/84] KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 03/84] net/sched: Retire CBQ qdisc Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 04/84] net/sched: Retire ATM qdisc Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 05/84] net/sched: Retire dsmark qdisc Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 06/84] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset Greg Kroah-Hartman
2024-02-27 13:26 ` Greg Kroah-Hartman [this message]
2024-02-27 13:26 ` [PATCH 5.4 08/84] nilfs2: replace WARN_ONs for invalid DAT metadata block requests Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 09/84] userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 10/84] sched/rt: Fix sysctl_sched_rr_timeslice intial value Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 11/84] sched/rt: Disallow writing invalid values to sched_rt_period_us Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 12/84] scsi: target: core: Add TMF to tmr_list handling Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 13/84] dmaengine: shdma: increase size of dev_id Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 14/84] dmaengine: fsl-qdma: increase size of irq_name Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 15/84] wifi: cfg80211: fix missing interfaces when dumping Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 16/84] wifi: mac80211: fix race condition on enabling fast-xmit Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 17/84] fbdev: savage: Error out if pixclock equals zero Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 18/84] fbdev: sis: " Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 19/84] ahci: asm1166: correct count of reported ports Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 20/84] ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 21/84] ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 22/84] ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal() Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 23/84] regulator: pwm-regulator: Add validity checks in continuous .get_voltage Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 24/84] nvmet-tcp: fix nvme tcp ida memory leak Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 25/84] ASoC: sunxi: sun4i-spdif: Add support for Allwinner H616 Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 26/84] netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 27/84] nvmet-fc: abort command when there is no binding Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 28/84] hwmon: (coretemp) Enlarge per package core count limit Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 29/84] scsi: lpfc: Use unsigned type for num_sge Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 30/84] firewire: core: send bus reset promptly on gap count error Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 31/84] virtio-blk: Ensure no requests in virtqueues before deleting vqs Greg Kroah-Hartman
2024-02-27 13:26 ` [PATCH 5.4 32/84] s390/qeth: Fix potential loss of L3-IP@ in case of network issues Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 33/84] pmdomain: renesas: r8a77980-sysc: CR7 must be always on Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 34/84] tcp: factor out __tcp_close() helper Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 35/84] tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 36/84] tcp: add annotations around sk->sk_shutdown accesses Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 37/84] pinctrl: pinctrl-rockchip: Fix a bunch of kerneldoc misdemeanours Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 38/84] pinctrl: rockchip: Fix refcount leak in rockchip_pinctrl_parse_groups Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 39/84] driver core: Set deferred_probe_timeout to a longer default if CONFIG_MODULES is set Greg Kroah-Hartman
2024-02-27 19:38   ` John Stultz
2024-02-28  7:07     ` Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 40/84] spi: mt7621: Fix an error message in mt7621_spi_probe() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 41/84] net: bridge: clear bridges private skb space on xmit Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 42/84] selftests/bpf: Avoid running unprivileged tests with alignment requirements Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 43/84] ALSA: hda/realtek - Enable micmute LED on and HP system Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 44/84] Revert "drm/sun4i: dsi: Change the start delay calculation" Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 45/84] drm/amdgpu: Check for valid number of registers to read Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 46/84] x86/alternatives: Disable KASAN in apply_alternatives() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 47/84] dm-integrity: dont modify bios immutable bio_vec in integrity_metadata() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 48/84] iomap: Set all uptodate bits for an Uptodate page Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 49/84] drm/amdgpu: Fix type of second parameter in trans_msg() callback Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 50/84] arm64: dts: qcom: msm8916: Fix typo in pronto remoteproc node Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 51/84] PCI: tegra: Fix reporting GPIO error value Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 52/84] PCI: tegra: Fix OF node reference leak Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 53/84] IB/hfi1: Fix sdma.h tx->num_descs off-by-one error Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 54/84] dm-crypt: dont modify the data when using authenticated encryption Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 55/84] gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 56/84] PCI/MSI: Prevent MSI hardware interrupt number truncation Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 57/84] l2tp: pass correct message length to ip6_append_data Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 58/84] ARM: ep93xx: Add terminator to gpiod_lookup_table Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 59/84] usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 60/84] usb: cdns3: fix memory double free when handle zero packet Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 61/84] usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 62/84] usb: roles: dont get/set_role() when usb_role_switch is unregistered Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 63/84] IB/hfi1: Fix a memleak in init_credit_return Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 64/84] RDMA/bnxt_re: Return error for SRQ resize Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 65/84] RDMA/srpt: Make debug output more detailed Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 66/84] RDMA/srpt: fix function pointer cast warnings Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 67/84] scripts/bpf: teach bpf_helpers_doc.py to dump BPF helper definitions Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 68/84] bpf, scripts: Correct GPL license name Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 69/84] scsi: jazz_esp: Only build if SCSI core is builtin Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 70/84] nouveau: fix function cast warnings Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 71/84] ipv4: properly combine dev_base_seq and ipv4.dev_addr_genid Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 72/84] ipv6: properly combine dev_base_seq and ipv6.dev_addr_genid Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 73/84] afs: Increase buffer size in afs_update_volume_status() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 74/84] ipv6: sr: fix possible use-after-free and null-ptr-deref Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 75/84] packet: move from strlcpy with unused retval to strscpy Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 76/84] s390: use the correct count for __iowrite64_copy() Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 77/84] tls: rx: jump to a more appropriate label Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 78/84] tls: rx: drop pointless else after goto Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 79/84] tls: stop recv() if initial process_rx_list gave us non-DATA Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 80/84] netfilter: nf_tables: set dormant flag on hook register failure Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 81/84] drm/syncobj: make lockdep complain on WAIT_FOR_SUBMIT v3 Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 82/84] drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is set Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 83/84] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio Greg Kroah-Hartman
2024-02-27 13:27 ` [PATCH 5.4 84/84] scripts/bpf: Fix xdp_md forward declaration typo Greg Kroah-Hartman
2024-02-27 18:26 ` [PATCH 5.4 00/84] 5.4.270-rc1 review Florian Fainelli
2024-02-28 13:38 ` Jon Hunter
2024-02-28 16:57 ` Shuah Khan
2024-02-28 17:20 ` Naresh Kamboju
2024-02-28 18:18 ` Harshit Mogalapalli
2024-02-29 10:57 ` Shreeya Patel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240227131553.109209736@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=gongruiqi1@huawei.com \
    --cc=mhocko@suse.com \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox