From: sashiko-bot@kernel.org
Cc: mkoutny@suse.com, bpf@vger.kernel.org, tj@kernel.org, hannes@cmpxchg.org
Subject: Re: [PATCH RFC 1/4] mm/memcontrol: do not drain objcg stock when spinning is not allowed
Date: Wed, 24 Jun 2026 13:28:10 +0000 [thread overview]
Message-ID: <20260624132811.5C8A71F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260624-kmalloc-nolock-fixes-v1-1-fdf4d17351dd@kernel.org>
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] The fallback path in `__refill_obj_stock()` indefinitely accumulates memcg charge without uncharging it, leading to a massive memcg leak and premature OOM for `!allow_spin` workloads.
Pre-existing issues:
- [High] `__memcg_slab_free_hook()` unconditionally drops object references via `obj_cgroup_put()`, causing spinlock acquisition on PREEMPT_RT even when `allow_spin == false`.
- [High] `kfree_nolock()` can still acquire a sleeping lock on PREEMPT_RT through `obj_cgroup_uncharge_pages()` when `allow_spin == false`.
--
commit f6eb6f507d999767dfe0866c3a461084a0d9e774
Author: Harry Yoo (Oracle) <harry@kernel.org>
mm/memcontrol: do not drain objcg stock when spinning is not allowed
This commit updates kmalloc_nolock() and related objcg functions to avoid
draining the objcg stock when spinning is not allowed, aiming to fix
lockdep warnings on PREEMPT_RT. It falls back to atomics if the objcg
doesn't match and draining is unsafe.
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 29390ba13baa2..5bb5e75ef5b0c 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
[ ... ]
> @@ -3346,6 +3347,13 @@ static void __refill_obj_stock(struct obj_cgroup *objcg,
> out:
> if (nr_pages)
> obj_cgroup_uncharge_pages(objcg, nr_pages);
[Severity: High]
This is a pre-existing issue, but can this call to
obj_cgroup_uncharge_pages() still acquire a sleeping lock on PREEMPT_RT?
Even with allow_spin == false, __refill_obj_stock() calls
obj_cgroup_uncharge_pages() if the object's objcg matches the CPU's cached
objcg and the accumulated bytes exceed PAGE_SIZE, or if falling back with a
large object.
obj_cgroup_uncharge_pages() unconditionally calls refill_stock(), which
uses local_trylock(&memcg_stock.lock).
Per kernel guidelines, acquiring a local_lock (even via trylock) within a
raw spinlock or hardirq context causes an Invalid wait context lockdep splat
due to wait-type mismatch.
> + return;
> +
> +fallback:
> + nr_pages = nr_bytes >> PAGE_SHIFT;
> + nr_bytes = nr_bytes & (PAGE_SIZE - 1);
> + atomic_add(nr_bytes, &objcg->nr_charged_bytes);
> + goto out;
> }
[Severity: High]
Does this fallback path indefinitely accumulate memcg charges without ever
uncharging them?
When __refill_obj_stock() is called with allow_spin == false and the cached
objcg doesn't match, it jumps to this fallback path. It adds the remaining
bytes to objcg->nr_charged_bytes via atomic_add(). Because nr_bytes is
typically less than PAGE_SIZE, nr_pages is 0.
Since the function returns immediately via the out label without checking if
the newly accumulated nr_charged_bytes exceeds PAGE_SIZE, the charge is
never uncharged.
If a workload exclusively uses kfree_nolock(), could the memcg charge grow
unbounded and cause a premature OOM?
[ ... ]
> @@ -3535,7 +3547,7 @@ void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
> obj_ext->objcg = NULL;
>
> stock = trylock_stock();
> - __refill_obj_stock(objcg, stock, obj_size, true);
> + __refill_obj_stock(objcg, stock, obj_size, true, allow_spin);
> __account_obj_stock(objcg, stock, -obj_size,
> slab_pgdat(slab), cache_vmstat_idx(s));
> unlock_stock(stock);
[Severity: High]
This isn't a bug introduced by this patch, but does __memcg_slab_free_hook()
still unconditionally drop object references via obj_cgroup_put() right
after unlock_stock() returns?
If this drops the final reference to an offline memcg, obj_cgroup_put()
synchronously invokes the release callback obj_cgroup_release(), which
acquires the sleeping objcg_lock lock.
Could this trigger the exact Invalid wait context lockdep splat or deadlock
on PREEMPT_RT that this patch intends to prevent, even when allow_spin is
false?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260624-kmalloc-nolock-fixes-v1-0-fdf4d17351dd@kernel.org?part=1
next prev parent reply other threads:[~2026-06-24 13:28 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-24 13:11 [PATCH RFC 0/4] memcg,slab: kmalloc_nolock() fixes Harry Yoo (Oracle)
2026-06-24 13:11 ` [PATCH RFC 1/4] mm/memcontrol: do not drain objcg stock when spinning is not allowed Harry Yoo (Oracle)
2026-06-24 13:28 ` sashiko-bot [this message]
2026-06-24 13:11 ` [PATCH RFC 2/4] mm/slab: handle allow_spin in slab_free_hook() instead of open coding Harry Yoo (Oracle)
2026-06-24 13:24 ` sashiko-bot
2026-06-24 13:11 ` [PATCH RFC 3/4] mm/slab: fix a deadlock in memcg_alloc_abort_single() Harry Yoo (Oracle)
2026-06-24 13:11 ` [PATCH RFC 4/4] mm/slab: serialize defer_free_barrier() Harry Yoo (Oracle)
2026-06-24 13:25 ` sashiko-bot
2026-06-24 16:30 ` [PATCH RFC 0/4] memcg,slab: kmalloc_nolock() fixes Alexei Starovoitov
2026-06-24 20:19 ` Harry Yoo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260624132811.5C8A71F00A3A@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=mkoutny@suse.com \
--cc=sashiko-reviews@lists.linux.dev \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.