Date: Fri, 22 May 2026 09:37:33 -0700
From: Shakeel Butt
To: Muchun Song
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Qi Zheng, Alexandre Ghiti, Joshua Hahn, Harry Yoo, Meta kernel team,
 linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel test robot
Subject: Re: [PATCH v2 4/4] memcg: multi objcg charge support
References: <20260522011908.1669332-1-shakeel.butt@linux.dev>
 <20260522011908.1669332-5-shakeel.butt@linux.dev>
 <9D6F8C2F-F3E7-4326-A4F6-D5B1433A6C55@linux.dev>
In-Reply-To: <9D6F8C2F-F3E7-4326-A4F6-D5B1433A6C55@linux.dev>
X-Mailing-List: cgroups@vger.kernel.org

On Fri, May 22, 2026 at 02:33:36PM +0800, Muchun Song wrote:
>
>
> > On May 22, 2026, at 09:19, Shakeel Butt wrote:
> >
> > Commit 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg
> > per-node type") split a memcg's single obj_cgroup into one per NUMA
> > node so that reparenting LRU folios can take per-node lru locks. As a
> > side effect, the per-CPU obj_stock_pcp -- which caches exactly one
> > cached_objcg -- thrashes on workloads where threads of the same memcg
> > run on different NUMA nodes. The kernel test robot reported a 67.7%
> > regression on stress-ng.switch.ops_per_sec from this pattern.
> >
> > Mirror the multi-slot pattern already used by memcg_stock_pcp: turn
> > nr_bytes and cached_objcg into NR_OBJ_STOCK-element arrays, scan all
> > slots on consume/refill/account, prefer empty slots when inserting,
> > and evict a random slot only when full. With multiple slots a CPU can
> > hold the per-node objcg variants of one memcg plus a few siblings
> > without ever forcing a drain.
> >
> > A single int8_t index records which slot the cached slab stats belong
> > to; the stats are flushed on slot or pgdat change. With NR_OBJ_STOCK
> > = 5 the layout (verified with pahole) is:
> >
> >   offset 0  : lock(1) + index(1) + node_id(2) + slab stats(4) =  8B
> >   offset 8  : nr_bytes[5]                                     = 10B
> >   offset 18 : padding                                         =  6B
> >   offset 24 : cached[5]                                       = 40B
> >   offset 64 : (line 2) work_struct + flags (cold)
> >
> > so consume_obj_stock, refill_obj_stock and the slab account path each
> > touch exactly one 64-byte cache line on non-debug 64-bit builds.
> >
> > Reported-by: kernel test robot
> > Closes: https://lore.kernel.org/oe-lkp/202605121641.b6a60cb0-lkp@intel.com
> > Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
> > Signed-off-by: Shakeel Butt
> > Tested-by: kernel test robot
> > ---
> >
> > Changes since v1:
> > - Use round robin for drain
> >
> >  mm/memcontrol.c | 188 ++++++++++++++++++++++++++++++++++--------------
> >  1 file changed, 136 insertions(+), 52 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 78c02451312b..ba17633b0bd0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -150,14 +150,14 @@ static void obj_cgroup_release(struct percpu_ref *ref)
> >   * However, it can be PAGE_SIZE or (x * PAGE_SIZE).
> >   *
> >   * The following sequence can lead to it:
> > - * 1) CPU0: objcg == stock->cached_objcg
> > + * 1) CPU0: objcg cached in one of stock->cached[i]
> >   * 2) CPU1: we do a small allocation (e.g. 92 bytes),
> >   *          PAGE_SIZE bytes are charged
> >   * 3) CPU1: a process from another memcg is allocating something,
> >   *          the stock if flushed,
> >   *          objcg->nr_charged_bytes = PAGE_SIZE - 92
> >   * 5) CPU0: we do release this object,
>        ^
>        4
>
> Since you're already modifying the comments in this section,
> would you mind fixing the numbering as well? I noticed that the
> sequence was wrong a while back :)

Haha, I didn't even notice. If I send a new version, I will fix this;
otherwise I will ask Andrew to fix it in place.
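
For readers following the thread, the slot handling described in the quoted
commit message can be sketched as a small user-space model. This is an
illustration under stated assumptions, not the code from the patch: the names
struct objcg, struct obj_stock_model, next_victim, stock_consume and
stock_refill are hypothetical, the uint16_t element type is only inferred from
"nr_bytes[5] = 10B", and eviction uses the round-robin policy mentioned in the
v2 changelog. Locking, slab stats flushing and page-size overflow handling are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_OBJ_STOCK 5 /* slot count quoted in the commit message */

/* Hypothetical stand-in for the kernel's struct obj_cgroup. */
struct objcg { int id; };

/*
 * Hypothetical model of the hot part of the per-CPU stock. Field names and
 * types are assumptions chosen to reproduce the offsets quoted above on a
 * typical 64-bit ABI; the real structure lives in mm/memcontrol.c.
 */
struct obj_stock_model {
	uint8_t lock;                       /* offset 0: stand-in for the 1-byte lock */
	int8_t index;                       /* offset 1: slot owning the slab stats */
	int16_t node_id;                    /* offset 2 */
	int16_t slab_stats[2];              /* offset 4: 4 bytes of cached slab stats */
	uint16_t nr_bytes[NR_OBJ_STOCK];    /* offset 8: 10 bytes */
	struct objcg *cached[NR_OBJ_STOCK]; /* offset 24 after 6 bytes of padding */
	unsigned int next_victim;           /* round-robin cursor (placement assumed) */
};

/* Scan every slot; succeed only if objcg is cached with enough bytes. */
static bool stock_consume(struct obj_stock_model *s, struct objcg *objcg,
			  uint16_t nr_bytes)
{
	for (int i = 0; i < NR_OBJ_STOCK; i++) {
		if (s->cached[i] == objcg && s->nr_bytes[i] >= nr_bytes) {
			s->nr_bytes[i] = (uint16_t)(s->nr_bytes[i] - nr_bytes);
			return true;
		}
	}
	return false;
}

/*
 * Add bytes for objcg. Reuse its slot if present, otherwise take the first
 * empty slot, otherwise evict the round-robin victim. Returns the slot used;
 * *drained receives the bytes that belonged to an evicted slot (0 if none).
 */
static int stock_refill(struct obj_stock_model *s, struct objcg *objcg,
			uint16_t nr_bytes, unsigned int *drained)
{
	int slot = -1;

	assert(objcg != NULL);
	*drained = 0;

	for (int i = 0; i < NR_OBJ_STOCK; i++) {
		if (s->cached[i] == objcg) {
			s->nr_bytes[i] = (uint16_t)(s->nr_bytes[i] + nr_bytes);
			return i;
		}
		if (s->cached[i] == NULL && slot < 0)
			slot = i; /* remember the first empty slot */
	}

	if (slot < 0) {
		/* All slots busy: evict the next victim in round-robin order. */
		slot = (int)s->next_victim;
		s->next_victim = (s->next_victim + 1) % NR_OBJ_STOCK;
		*drained = s->nr_bytes[slot];
	}

	s->cached[slot] = objcg;
	s->nr_bytes[slot] = nr_bytes;
	return slot;
}
```

In this model, five distinct objcgs fill the five slots without any drain, and
only a sixth one forces an eviction. On a 64-bit build, offsetof() on nr_bytes
and cached gives 8 and 24, matching the quoted pahole layout.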