From: Shakeel Butt
To: Tejun Heo
Cc: Andrew Morton, JP Kobryn, Johannes Weiner, Michal Hocko,
    Roman Gushchin, Muchun Song, Vlastimil Babka, Alexei Starovoitov,
    Sebastian Andrzej Siewior, Michal Koutný, Harry Yoo, Yosry Ahmed,
    bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH v3 0/4] cgroup: nmi safe css_rstat_updated
Date: Tue, 17 Jun 2025 12:57:21 -0700
Message-ID: <20250617195725.1191132-1-shakeel.butt@linux.dev>

BPF programs can run in nmi context and may trigger memcg-charged memory
allocations in that context. Linux recently added support for nmi safe
page allocation along with memcg charging of such allocations.
However, kmalloc/slab support and the corresponding memcg charging are
still lacking.

To provide nmi safe memcg charging for kmalloc/slab allocations, we need
nmi safe memcg stats, because for kernel memory, charging and stats
happen together. At the moment, memcg charging and memcg stats are nmi
safe and the only thing which is not nmi safe is adding the cgroup to the
per-cpu rstat update tree, i.e. css_rstat_updated(), which this series
addresses.

This series makes css_rstat_updated() nmi safe by using per-cpu lockless
lists whose node is embedded in the individual struct cgroup_subsys_state
and whose per-cpu head is placed in struct cgroup_subsys. For rstat users
without a cgroup_subsys, a global per-cpu lockless list head is created.

The main challenge in using a lockless list in this scenario is the
potential for multiple inserters from stacked contexts, i.e. process,
softirq, hardirq & nmi, potentially using the same per-cpu lockless node
of a given cgroup_subsys_state. The normal lockless list does not protect
against such a scenario.

The multiple stacked inserters potentially using the same lockless node
are resolved by letting only one of them succeed in resetting the
lockless node; the winner gets to insert the lockless node into the
corresponding lockless list. The losers can assume the lockless list
insertion will eventually succeed and continue their operation.

Changelog since v3:
- Rebased on for-6.17 branch of cgroup tree

Changelog since v2:
- Add more clear explanation in cover letter and in the comment as
  suggested by Andrew, Michal & Tejun.
- Use this_cpu_cmpxchg() instead of try_cmpxchg() as suggested by Tejun.
- Remove the per-cpu ss locks as they are not needed anymore.

Changelog since v1:
- Based on Yosry's suggestion always use llist on the update side and
  create the update tree on flush side

[v1] https://lore.kernel.org/cgroups/20250429061211.1295443-1-shakeel.butt@linux.dev/

Shakeel Butt (4):
  cgroup: support to enable nmi-safe css_rstat_updated
  cgroup: make css_rstat_updated nmi safe
  cgroup: remove per-cpu per-subsystem locks
  memcg: cgroup: call css_rstat_updated irrespective of in_nmi()

 include/linux/cgroup-defs.h   |  11 +--
 include/trace/events/cgroup.h |  47 ----------
 kernel/cgroup/rstat.c         | 158 ++++++++++++++--------------------
 mm/memcontrol.c               |  10 +--
 4 files changed, 72 insertions(+), 154 deletions(-)

-- 
2.47.1
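
For readers who want to see the stacked-inserter scheme from the cover
letter in concrete form, here is a minimal userspace sketch. It is not
the code in kernel/cgroup/rstat.c. It assumes C11 atomics in place of the
kernel's this_cpu_cmpxchg() and llist helpers, and it assumes that an
off-list node is encoded by making its next pointer point to itself. All
names (struct lnode, struct lhead, lnode_init, mark_updated, flush_all)
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's lockless list types. */
struct lnode {
	_Atomic(struct lnode *) next;	/* assumed: points to itself while off-list */
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/* Mark a node as off-list by making it point to itself. */
static void lnode_init(struct lnode *n)
{
	atomic_store(&n->next, n);
}

/*
 * Update side. Returns true if this caller won the race and queued the
 * node, false if some other (possibly interrupted) context already
 * claimed it and will finish, or has finished, the insertion.
 */
static bool mark_updated(struct lhead *h, struct lnode *n)
{
	struct lnode *expected = n;
	struct lnode *first;

	/* Only one stacked context can flip next from "self" to NULL. */
	if (!atomic_compare_exchange_strong(&n->next, &expected, NULL))
		return false;

	/* The winner pushes the node with the usual lockless CAS loop. */
	first = atomic_load(&h->first);
	do {
		atomic_store(&n->next, first);
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));

	return true;
}

/*
 * Flush side. Detaches the whole list, resets every node to off-list so
 * it can be queued again, and returns how many nodes were queued.
 */
static int flush_all(struct lhead *h)
{
	struct lnode *n = atomic_exchange(&h->first, NULL);
	int count = 0;

	while (n) {
		struct lnode *next = atomic_load(&n->next);

		assert(next != n);	/* a queued node never points to itself */
		lnode_init(n);
		n = next;
		count++;
	}
	return count;
}
```

In this sketch a second mark_updated() call on the same node before a
flush returns false, which corresponds to a "loser" that simply carries
on. The claim step runs before the push so that a context which
interrupts the winner between the two steps sees a node that is no
longer self-pointing and backs off instead of corrupting the list.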