Date: Mon, 16 Jun 2025 13:13:19 -0700
From: Shakeel Butt
To: JP Kobryn
Cc: Tejun Heo, Andrew Morton, Johannes Weiner, Michal Hocko,
    Roman Gushchin, Muchun Song, Vlastimil Babka, Alexei Starovoitov,
    Sebastian Andrzej Siewior, Michal Koutný, Harry Yoo, Yosry Ahmed,
    bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, Meta kernel team
Subject: Re: [PATCH v2 0/4] cgroup: nmi safe css_rstat_updated
References: <20250611221532.2513772-1-shakeel.butt@linux.dev>
    <218e8b26-6b83-46a4-a57c-2346130a1597@gmail.com>
In-Reply-To: <218e8b26-6b83-46a4-a57c-2346130a1597@gmail.com>

On Mon, Jun 16, 2025 at 01:08:49PM -0700, JP Kobryn wrote:
> On 6/11/25 3:15 PM, Shakeel Butt wrote:
> > BPF programs can run in nmi context and may trigger memcg charged memory
> > allocation in such context. Recently Linux added support for nmi safe
> > page allocation along with memcg charging of such allocations. However,
> > the kmalloc/slab support and corresponding memcg charging is still
> > lacking.
> >
> > To provide nmi safe support for memcg charging for kmalloc/slab
> > allocations, we need nmi safe memcg stats because for kernel memory,
> > charging and stats happen together. At the moment, memcg charging and
> > memcg stats are nmi safe and the only thing which is not nmi safe is
> > adding the cgroup to the per-cpu rstat update tree, i.e.
> > css_rstat_updated(), which this series makes nmi safe.
> >
> > This series makes css_rstat_updated() nmi safe by using per-cpu lockless
> > lists whose node is embedded in individual struct cgroup_subsys_state and
> > whose per-cpu head is placed in struct cgroup_subsys. For rstat users
> > without cgroup_subsys, a global per-cpu lockless list head is created.
> > The main challenge in using a lockless list in this scenario was the
> > potential for multiple inserters from stacked contexts, i.e. process,
> > softirq, hardirq & nmi, potentially using the same per-cpu lockless node
> > of a given cgroup_subsys_state. The normal lockless list does not protect
> > against such a scenario.
> >
> > The multiple stacked inserters potentially using the same lockless node
> > were resolved by making one of them succeed in resetting the lockless
> > node; the winner gets to insert the lockless node in the corresponding
> > lockless list. The losers can assume the lockless list insertion will
> > eventually succeed and continue their operation.
> >
> > Changelog since v2:
> > - Add a clearer explanation in the cover letter and in the comment as
> >   suggested by Andrew, Michal & Tejun.
> > - Use this_cpu_cmpxchg() instead of try_cmpxchg() as suggested by Tejun.
> > - Remove the per-cpu ss locks as they are not needed anymore.
> >
> > Changelog since v1:
> > - Based on Yosry's suggestion, always use llist on the update side and
> >   create the update tree on the flush side.
> >
> > [v1] https://lore.kernel.org/cgroups/20250429061211.1295443-1-shakeel.butt@linux.dev/
> >
> >
> > Shakeel Butt (4):
> >   cgroup: support to enable nmi-safe css_rstat_updated
> >   cgroup: make css_rstat_updated nmi safe
> >   cgroup: remove per-cpu per-subsystem locks
> >   memcg: cgroup: call css_rstat_updated irrespective of in_nmi()
> >
> >  include/linux/cgroup-defs.h   |  11 +--
> >  include/trace/events/cgroup.h |  47 ----------
> >  kernel/cgroup/rstat.c         | 169 +++++++++++++---------------------
> >  mm/memcontrol.c               |  10 +-
> >  4 files changed, 74 insertions(+), 163 deletions(-)
> >
>
> I tested this series by doing some updates/flushes on a cgroup hierarchy
> with four levels. This tag can be added to the patches in this series.
>
> Tested-by: JP Kobryn
>

Thanks a lot.
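
For anyone following the thread, the "one winner inserts, losers carry on"
scheme from the cover letter can be illustrated roughly in userspace C11 as
below. This is only a sketch under assumptions, not the code from the
patches: struct lnode, struct lhead and every helper name here are
hypothetical stand-ins rather than the kernel's llist API, C11 atomics stand
in for this_cpu_cmpxchg(), and the convention that an off-list node points
to itself is an assumption of the sketch. The flush side, which would take
nodes off the list and reset them, is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's llist types. */
struct lnode {
	_Atomic(struct lnode *) next;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

static void lhead_init(struct lhead *h)
{
	atomic_init(&h->first, NULL);
}

/* Sketch convention (an assumption): an off-list node points to itself. */
static void lnode_init(struct lnode *n)
{
	atomic_init(&n->next, n);
}

static bool lnode_on_list(struct lnode *n)
{
	return atomic_load(&n->next) != n;
}

/*
 * Pick the winner: only one of several stacked callers can swap the
 * self pointer for NULL. Everyone else sees a non-self value and loses.
 */
static bool lnode_claim(struct lnode *n)
{
	struct lnode *expected = n;

	return atomic_compare_exchange_strong(&n->next, &expected, NULL);
}

/* Plain lockless push of an already claimed node onto the list head. */
static void lhead_push(struct lhead *h, struct lnode *n)
{
	struct lnode *first = atomic_load(&h->first);

	do {
		atomic_store(&n->next, first);
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));
}

/* Returns true only for the caller that actually inserted the node. */
static bool mark_updated(struct lhead *h, struct lnode *n)
{
	if (lnode_on_list(n))
		return false;	/* already queued, or a winner is mid-insert */
	if (!lnode_claim(n))
		return false;	/* lost the race; the winner will insert */
	lhead_push(h, n);
	return true;
}

/* Walk the list and count nodes (single-threaded helper for checking). */
static size_t lhead_count(struct lhead *h)
{
	size_t count = 0;

	for (struct lnode *n = atomic_load(&h->first); n;
	     n = atomic_load(&n->next))
		count++;
	return count;
}
```

A stacked context can be imitated in a single thread by calling
lnode_claim() first, then calling mark_updated() on the same node before the
matching lhead_push(): the nested call returns false and leaves the list
alone, and the outer caller completes the insertion.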