From: Shakeel Butt <shakeel.butt@linux.dev>
To: Muchun Song <muchun.song@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Qi Zheng <qi.zheng@linux.dev>, Alexandre Ghiti <alex@ghiti.fr>,
	 Joshua Hahn <joshua.hahnjy@gmail.com>,
	Harry Yoo <harry@kernel.org>,
	 Meta kernel team <kernel-team@meta.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	 linux-kernel@vger.kernel.org,
	kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH v2 4/4] memcg: multi objcg charge support
Date: Fri, 22 May 2026 09:37:33 -0700
Message-ID: <ahCGEAQ6USZDYrUo@linux.dev>
In-Reply-To: <9D6F8C2F-F3E7-4326-A4F6-D5B1433A6C55@linux.dev>

On Fri, May 22, 2026 at 02:33:36PM +0800, Muchun Song wrote:
> 
> 
> > On May 22, 2026, at 09:19, Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > 
> > Commit 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg
> > per-node type") split a memcg's single obj_cgroup into one per NUMA
> > node so that reparenting LRU folios can take per-node lru locks. As a
> > side effect, the per-CPU obj_stock_pcp -- which caches exactly one
> > cached_objcg -- thrashes on workloads where threads of the same memcg
> > run on different NUMA nodes. The kernel test robot reported a 67.7%
> > regression on stress-ng.switch.ops_per_sec from this pattern.
> > 
> > Mirror the multi-slot pattern already used by memcg_stock_pcp: turn
> > nr_bytes and cached_objcg into NR_OBJ_STOCK-element arrays, scan all
> > slots on consume/refill/account, prefer empty slots when inserting,
> > and evict a random slot only when full. With multiple slots a CPU can
> > hold the per-node objcg variants of one memcg plus a few siblings
> > without ever forcing a drain.
> > 
> > A single int8_t index records which slot the cached slab stats belong
> > to; the stats are flushed on slot or pgdat change. With NR_OBJ_STOCK
> > = 5 the layout (verified with pahole) is:
> > 
> >  offset 0  : lock(1) + index(1) + node_id(2) + slab stats(4) = 8B
> >  offset 8  : nr_bytes[5]                                     = 10B
> >  offset 18 : padding                                         = 6B
> >  offset 24 : cached[5]                                       = 40B
> >  offset 64 : (line 2) work_struct + flags (cold)
> > 
> > so consume_obj_stock, refill_obj_stock and the slab account path each
> > touch exactly one 64-byte cache line on non-debug 64-bit builds.
> > 
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202605121641.b6a60cb0-lkp@intel.com
> > Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > Tested-by: kernel test robot <oliver.sang@intel.com>
> > ---
> > 
> > Changes since v1:
> > - Use round robin for drain
> > 
> > mm/memcontrol.c | 188 ++++++++++++++++++++++++++++++++++--------------
> > 1 file changed, 136 insertions(+), 52 deletions(-)
> > 
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 78c02451312b..ba17633b0bd0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -150,14 +150,14 @@ static void obj_cgroup_release(struct percpu_ref *ref)
> > 	* However, it can be PAGE_SIZE or (x * PAGE_SIZE).
> > 	*
> > 	* The following sequence can lead to it:
> > - 	* 1) CPU0: objcg == stock->cached_objcg
> > + 	* 1) CPU0: objcg cached in one of stock->cached[i]
> > 	* 2) CPU1: we do a small allocation (e.g. 92 bytes),
> > 	*          PAGE_SIZE bytes are charged
> > 	* 3) CPU1: a process from another memcg is allocating something,
> > 	*          the stock if flushed,
> > 	*          objcg->nr_charged_bytes = PAGE_SIZE - 92
> > 	* 5) CPU0: we do release this object,
>            ^
>            4
> 
> Since you're already modifying the comments in this section,
> would you mind fixing the numbering as well? I noticed that the
> sequence was wrong a while back :)

Haha, I didn't even notice. If I send a new version, I will fix this; otherwise
I will ask Andrew to fix it in place.
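
For anyone following the thread, the multi-slot scheme described in the quoted
commit message (scan all slots, prefer an empty slot on insert, evict only when
full) can be sketched roughly as below. This is a userspace illustration, not
the patch itself: struct stock_sketch, consume_sketch(), refill_sketch() and
the next_victim counter are hypothetical names, the eviction order follows the
"round robin" note in the v2 changelog, and evicted bytes are simply dropped
where real code would return them to the objcg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_OBJ_STOCK 5

/* Hypothetical stand-in for the per-CPU stock; NULL marks an empty slot. */
struct stock_sketch {
	uint16_t    nr_bytes[NR_OBJ_STOCK];
	const void *cached[NR_OBJ_STOCK];
	unsigned    next_victim;	/* assumed round-robin eviction cursor */
};

/* Scan every slot; take nr bytes from the one caching objcg. 1 on success. */
static int consume_sketch(struct stock_sketch *s, const void *objcg,
			  uint16_t nr)
{
	assert(objcg != NULL);
	for (int i = 0; i < NR_OBJ_STOCK; i++) {
		if (s->cached[i] == objcg && s->nr_bytes[i] >= nr) {
			s->nr_bytes[i] -= nr;
			return 1;
		}
	}
	return 0;
}

/*
 * Add nr bytes for objcg and return the slot used: a matching slot first,
 * then the first empty slot, and only when all slots are taken evict one.
 * Overflow of the 16-bit counter is not handled in this sketch.
 */
static int refill_sketch(struct stock_sketch *s, const void *objcg,
			 uint16_t nr)
{
	int empty = -1;

	assert(objcg != NULL);
	for (int i = 0; i < NR_OBJ_STOCK; i++) {
		if (s->cached[i] == objcg) {
			s->nr_bytes[i] += nr;
			return i;
		}
		if (!s->cached[i] && empty < 0)
			empty = i;
	}

	int slot = empty;
	if (slot < 0)
		slot = (int)(s->next_victim++ % NR_OBJ_STOCK);

	/* A real implementation would drain the evicted slot here. */
	s->cached[slot] = objcg;
	s->nr_bytes[slot] = nr;
	return slot;
}
```

With five slots, the per-node objcgs of one memcg on a typical two- or
four-node machine all fit, so alternating between them only hits the
"matching slot" branch and never the eviction branch.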


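The layout table in the quoted commit message can also be checked at compile
time. The struct below is only a stand-in for the hot part of obj_stock_pcp:
the field names, the 1-byte lock placeholder and the split of the 4 bytes of
slab stats into two int16_t values are my assumptions; only the sizes and
offsets come from the table. It assumes a 64-bit build with 8-byte pointers.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

#define NR_OBJ_STOCK 5

/* Hypothetical mirror of the first cache line described in the table. */
struct obj_stock_hot {
	uint8_t  lock;			  /* 1-byte placeholder for the lock */
	int8_t   index;			  /* slot the slab stats belong to */
	int16_t  node_id;		  /* 2 bytes */
	int16_t  slab_stats[2];		  /* 4 bytes of cached slab stats */
	uint16_t nr_bytes[NR_OBJ_STOCK];  /* 10 bytes, offsets 8..17 */
	void    *cached[NR_OBJ_STOCK];	  /* 40 bytes, offsets 24..63 */
};

static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit build");
static_assert(offsetof(struct obj_stock_hot, nr_bytes) == 8,
	      "lock + index + node_id + slab stats fill the first 8 bytes");
static_assert(offsetof(struct obj_stock_hot, cached) == 24,
	      "pointer alignment pushes cached[] to offset 24");
static_assert(sizeof(struct obj_stock_hot) == 64,
	      "hot fields fit in one 64-byte cache line");

/* Padding the compiler inserts between nr_bytes[] and cached[]. */
static size_t obj_stock_hot_padding(void)
{
	return offsetof(struct obj_stock_hot, cached) -
	       (offsetof(struct obj_stock_hot, nr_bytes) +
		sizeof(((struct obj_stock_hot *)0)->nr_bytes));
}
```

The 6 bytes of padding come from aligning cached[] to 8 bytes after the
10-byte nr_bytes[] array ends at offset 18.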
