Date: Mon, 13 Apr 2026 09:23:38 +0200
From: Michal Hocko
To: Joshua Hahn
Cc: Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song,
	Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	"Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 0/8 RFC] mm/memcontrol, page_counter: move stock from mem_cgroup to page_counter
References: <20260410210742.550489-1-joshua.hahnjy@gmail.com>
In-Reply-To: <20260410210742.550489-1-joshua.hahnjy@gmail.com>

On Fri 10-04-26 14:06:54, Joshua Hahn wrote:
> Memcg currently keeps a "stock" of 64 pages per-cpu to cache pre-charged
> allocations, allowing small allocations and frees to avoid walking the
> expensive mem_cgroup hierarchy traversal on each charge. This design
> introduces a fastpath to charge/uncharge, but has several limitations:
>
> 1. Each CPU can track up to 7 (NR_MEMCG_STOCK) mem_cgroups. When more
>    than 7 mem_cgroups are actively charging on a single CPU, a random
>    victim is evicted, and its associated stock is drained, which
>    triggers unnecessary hierarchy walks.
>
> Note that previously there used to be a 1-1 mapping between CPU and
> memcg stock; it was bumped up to 7 in f735eebe55f8f ("multi-memcg
> percpu charge cache") because it was observed that stock would
> frequently get flushed and refilled.

All true, but it is quite important to note that this is all bounded by
nr_online_cpus * NR_MEMCG_STOCK * MEMCG_CHARGE_BATCH. You are proposing
to increase this to s@NR_MEMCG_STOCK@nr_leaf_cgroups@. In environments
with many cpus and many directly charged cgroups this can be a
considerable hidden overcharge. Have you considered that and evaluated
the potential impact?

> 2. Stock management is tightly coupled to struct mem_cgroup, which
>    makes it difficult to add a new page_counter to struct mem_cgroup
>    and do its own stock management, since each operation has to be
>    duplicated.

Could you expand on why this is a problem we need to address?

> 3. Each stock slot requires a css reference, as well as a traversal
>    overhead on every stock operation to check which cpu-memcg we are
>    trying to consume stock for.

Why is this a problem?

Please also be more explicit about what kind of workloads are going to
benefit from this change. The existing caching scheme is simple and
ineffective, but is it worth improving (likely your points 2 and 3 could
clarify that)?

All that being said, I like the resulting code, which is much easier to
follow. The caching is nicely transparent in the charging path, which is
a plus. My main worry is that caching has caused some confusion in the
past, and this change will amplify that by scaling the amount of cached
charge. This needs to be really carefully evaluated.

-- 
Michal Hocko
SUSE Labs