From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Shakeel Butt <shakeel.butt@linux.dev>,
Peilin Ye <yepeilin@google.com>,
Alexei Starovoitov <ast@kernel.org>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.6 024/101] bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()
Date: Wed, 17 Sep 2025 14:34:07 +0200 [thread overview]
Message-ID: <20250917123337.442184862@linuxfoundation.org> (raw)
In-Reply-To: <20250917123336.863698492@linuxfoundation.org>
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peilin Ye <yepeilin@google.com>
[ Upstream commit 6d78b4473cdb08b74662355a9e8510bde09c511e ]
Currently, calling bpf_map_kmalloc_node() from __bpf_async_init() can
cause various locking issues; see the following stack trace (edited for
style) as one example:
...
[10.011566] do_raw_spin_lock.cold
[10.011570] try_to_wake_up (5) double-acquiring the same
[10.011575] kick_pool rq_lock, causing a hardlockup
[10.011579] __queue_work
[10.011582] queue_work_on
[10.011585] kernfs_notify
[10.011589] cgroup_file_notify
[10.011593] try_charge_memcg (4) memcg accounting raises an
[10.011597] obj_cgroup_charge_pages MEMCG_MAX event
[10.011599] obj_cgroup_charge_account
[10.011600] __memcg_slab_post_alloc_hook
[10.011603] __kmalloc_node_noprof
...
[10.011611] bpf_map_kmalloc_node
[10.011612] __bpf_async_init
[10.011615] bpf_timer_init (3) BPF calls bpf_timer_init()
[10.011617] bpf_prog_xxxxxxxxxxxxxxxx_fcg_runnable
[10.011619] bpf__sched_ext_ops_runnable
[10.011620] enqueue_task_scx (2) BPF runs with rq_lock held
[10.011622] enqueue_task
[10.011626] ttwu_do_activate
[10.011629] sched_ttwu_pending (1) grabs rq_lock
...
The above was reproduced on bpf-next (b338cf849ec8) by modifying
./tools/sched_ext/scx_flatcg.bpf.c to call bpf_timer_init() during
ops.runnable(), and hacking the memcg accounting code a bit to make
a bpf_timer_init() call more likely to raise an MEMCG_MAX event.
We have also run into other similar variants (both internally and on
bpf-next), including double-acquiring cgroup_file_kn_lock, the same
worker_pool::lock, etc.
As suggested by Shakeel, fix this by using __GFP_HIGH instead of
GFP_ATOMIC in __bpf_async_init(), so that e.g. if try_charge_memcg()
raises an MEMCG_MAX event, we call __memcg_memory_event() with
@allow_spinning=false and avoid calling cgroup_file_notify() there.
Depends on mm patch
"memcg: skip cgroup_file_notify if spinning is not allowed":
https://lore.kernel.org/bpf/20250905201606.66198-1-shakeel.butt@linux.dev/
v0 approach s/bpf_map_kmalloc_node/bpf_mem_alloc/
https://lore.kernel.org/bpf/20250905061919.439648-1-yepeilin@google.com/
v1 approach:
https://lore.kernel.org/bpf/20250905234547.862249-1-yepeilin@google.com/
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/20250909095222.2121438-1-yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/bpf/helpers.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4b20a72ab8cff..90c281e1379ee 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1204,8 +1204,11 @@ static int __bpf_async_init(struct bpf_async_kern *async, struct bpf_map *map, u
goto out;
}
- /* allocate hrtimer via map_kmalloc to use memcg accounting */
- cb = bpf_map_kmalloc_node(map, size, GFP_ATOMIC, map->numa_node);
+ /* Allocate via bpf_map_kmalloc_node() for memcg accounting. Until
+ * kmalloc_nolock() is available, avoid locking issues by using
+ * __GFP_HIGH (GFP_ATOMIC & ~__GFP_RECLAIM).
+ */
+ cb = bpf_map_kmalloc_node(map, size, __GFP_HIGH, map->numa_node);
if (!cb) {
ret = -ENOMEM;
goto out;
--
2.51.0
next prev parent reply other threads:[~2025-09-17 12:57 UTC|newest]
Thread overview: 112+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-17 12:33 [PATCH 6.6 000/101] 6.6.107-rc1 review Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 001/101] kunit: kasan_test: disable fortify string checker on kasan_strings() test Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 002/101] mm: introduce and use {pgd,p4d}_populate_kernel() Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 003/101] kasan: fix GCC mem-intrinsic prefix with sw tags Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 004/101] nfsd: Fix a regression in nfsd_setattr() Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 005/101] NFSD: nfsd_unlink() clobbers non-zero status returned from fh_fill_pre_attrs() Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 006/101] media: i2c: imx214: Fix link frequency validation Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 007/101] net: Fix null-ptr-deref by sock_lock_init_class_and_name() and rmmod Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 008/101] ima: limit the number of ToMToU integrity violations Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 009/101] flexfiles/pNFS: fix NULL checks on result of ff_layout_choose_ds_for_read Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 010/101] SUNRPC: call xs_sock_process_cmsg for all cmsg Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 011/101] NFSv4: Dont clear capabilities that wont be reset Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 012/101] NFSv4: Clear the NFS_CAP_FS_LOCATIONS flag if it is not set Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 013/101] NFSv4: Clear the NFS_CAP_XATTR flag if not supported by the server Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 014/101] tracing: Fix tracing_marker may trigger page fault during preempt_disable Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 015/101] ftrace/samples: Fix function size computation Greg Kroah-Hartman
2025-09-17 12:33 ` [PATCH 6.6 016/101] fs/nfs/io: make nfs_start_io_*() killable Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 017/101] NFS: Serialise O_DIRECT i/o and truncate() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 018/101] NFSv4.2: Serialise O_DIRECT i/o and fallocate() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 019/101] NFSv4.2: Serialise O_DIRECT i/o and clone range Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 020/101] NFSv4.2: Serialise O_DIRECT i/o and copy range Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 021/101] NFSv4/flexfiles: Fix layout merge mirror check Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 022/101] tracing: Silence warning when chunk allocation fails in trace_pid_write Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 023/101] s390/cpum_cf: Deny all sampling events by counter PMU Greg Kroah-Hartman
2025-09-17 12:34 ` Greg Kroah-Hartman [this message]
2025-09-17 12:34 ` [PATCH 6.6 025/101] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 026/101] proc: fix type confusion in pde_set_flags() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 027/101] rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 028/101] rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 029/101] rcu-tasks: Maintain real-time response in rcu_tasks_postscan() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 030/101] KVM: SVM: Set synthesized TSA CPUID flags Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 031/101] EDAC/altera: Delete an inappropriate dma_free_coherent() call Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 032/101] Revert "SUNRPC: Dont allow waiting for exiting tasks" Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 033/101] compiler-clang.h: define __SANITIZE_*__ macros only when undefined Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 034/101] mptcp: sockopt: make sync_socket_options propagate SOCK_KEEPOPEN Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 035/101] ocfs2: fix recursive semaphore deadlock in fiemap call Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 036/101] i2c: i801: Hide Intel Birch Stream SoC TCO WDT Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 037/101] net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 038/101] mtd: rawnand: stm32_fmc2: avoid overlapping mappings on ECC buffer Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 039/101] mtd: rawnand: stm32_fmc2: fix ECC overwrite Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 040/101] fuse: check if copy_file_range() returns larger than requested size Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 041/101] fuse: prevent overflow in copy_file_range return value Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 042/101] mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 043/101] mm/damon/core: set quota->charged_from to jiffies at first charge window Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 044/101] drm/mediatek: fix potential OF node use-after-free Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 045/101] drm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any time Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 046/101] drm/amdgpu/vcn4: Fix IB parsing with multiple engine info packages Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 047/101] mtd: nand: raw: atmel: Fix comment in timings preparation Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 048/101] mtd: nand: raw: atmel: Respect tAR, tCLR in read setup timing Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 049/101] libceph: fix invalid accesses to ceph_connection_v1_info Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 050/101] mm/damon/sysfs: fix use-after-free in state_show() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 051/101] mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 052/101] mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 053/101] btrfs: use readahead_expand() on compressed extents Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 054/101] btrfs: fix corruption reading compressed range when block size is smaller than page size Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 055/101] mm/khugepaged: convert hpage_collapse_scan_pmd() to use folios Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 056/101] mm/khugepaged: fix the address passed to notifier on testing young Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 057/101] cifs: fix pagecache leak when do writepages Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 058/101] kernfs: Fix UAF in polling when open file is released Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 059/101] Input: iqs7222 - avoid enabling unused interrupts Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 060/101] Input: i8042 - add TUXEDO InfinityBook Pro Gen10 AMD to i8042 quirk table Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 061/101] Revert "net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups" Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 062/101] tty: hvc_console: Call hvc_kick in hvc_write unconditionally Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 063/101] serial: sc16is7xx: fix bug in flow control levels init Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 064/101] dt-bindings: serial: brcm,bcm7271-uart: Constrain clocks Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 065/101] USB: serial: option: add Telit Cinterion FN990A w/audio compositions Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 066/101] USB: serial: option: add Telit Cinterion LE910C4-WWX new compositions Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 067/101] Disable SLUB_TINY for build testing Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 068/101] net: fec: Fix possible NPD in fec_enet_phy_reset_after_clk_enable() Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 069/101] net: bridge: Bounce invalid boolopts Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 070/101] tunnels: reset the GSO metadata before reusing the skb Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 071/101] docs: networking: can: change bcm_msg_head frames member to support flexible array Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 072/101] igb: fix link test skipping when interface is admin down Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 073/101] i40e: fix IRQ freeing in i40e_vsi_request_irq_msix error path Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 074/101] can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 075/101] can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails Greg Kroah-Hartman
2025-09-17 12:34 ` [PATCH 6.6 076/101] can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 077/101] net: hsr: Add support for MC filtering at the slave device Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 078/101] net: hsr: Add VLAN CTAG filter support Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 079/101] hsr: use rtnl lock when iterating over ports Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 080/101] hsr: use hsr_for_each_port_rtnl in hsr_port_get_hsr Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 081/101] dmaengine: idxd: Remove improper idxd_free Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 082/101] dmaengine: idxd: Fix refcount underflow on module unload Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 083/101] dmaengine: idxd: Fix double free in idxd_setup_wqs() Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 084/101] dmaengine: ti: edma: Fix memory allocation size for queue_priority_map Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 085/101] regulator: sy7636a: fix lifecycle of power good gpio Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 086/101] hrtimer: Remove unused function Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 087/101] hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active() Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 088/101] hrtimers: Unconditionally update target CPU base after offline timer migration Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 089/101] RISC-V: Remove unnecessary include from compat.h Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 090/101] xhci: fix memory leak regression when freeing xhci vdev devices depth first Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 091/101] USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 092/101] usb: gadget: midi2: Fix missing UMP group attributes initialization Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 093/101] usb: gadget: midi2: Fix MIDI2 IN EP max packet size Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 094/101] dmaengine: qcom: bam_dma: Fix DT error handling for num-channels/ees Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 095/101] dmaengine: dw: dmamux: Fix device reference leak in rzn1_dmamux_route_allocate Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 096/101] phy: tegra: xusb: fix device and OF node leak at probe Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 097/101] phy: ti-pipe3: fix device leak at unbind Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 098/101] ksmbd: fix null pointer dereference in alloc_preauth_hash() Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 099/101] net: mdiobus: release reset_gpio in mdiobus_unregister_device() Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 100/101] drm/amdgpu: fix a memory leak in fence cleanup when unloading Greg Kroah-Hartman
2025-09-17 14:35 ` Deucher, Alexander
2025-09-17 14:44 ` Greg Kroah-Hartman
2025-09-17 12:35 ` [PATCH 6.6 101/101] drm/i915/power: fix size for for_each_set_bit() in abox iteration Greg Kroah-Hartman
2025-09-17 18:04 ` [PATCH 6.6 000/101] 6.6.107-rc1 review Hardik Garg
2025-09-17 20:08 ` Jon Hunter
2025-09-18 0:47 ` Peter Schneider
2025-09-18 5:17 ` Brett A C Sheffield
2025-09-18 12:59 ` [PATCH 6.6 000/101] " Ron Economos
2025-09-18 13:25 ` Anders Roxell
2025-09-18 17:36 ` Florian Fainelli
2025-09-18 20:11 ` Mark Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250917123337.442184862@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=ast@kernel.org \
--cc=patches@lists.linux.dev \
--cc=sashal@kernel.org \
--cc=shakeel.butt@linux.dev \
--cc=stable@vger.kernel.org \
--cc=yepeilin@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox