From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16C03C4345F for ; Fri, 26 Apr 2024 08:11:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7875B6B0093; Fri, 26 Apr 2024 04:11:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 736366B0096; Fri, 26 Apr 2024 04:11:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5FF136B0098; Fri, 26 Apr 2024 04:11:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 42E106B0093 for ; Fri, 26 Apr 2024 04:11:54 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id C81991A119F for ; Fri, 26 Apr 2024 08:11:53 +0000 (UTC) X-FDA: 82050964506.28.6778FC3 Received: from mail-oa1-f48.google.com (mail-oa1-f48.google.com [209.85.160.48]) by imf21.hostedemail.com (Postfix) with ESMTP id 051341C000D for ; Fri, 26 Apr 2024 08:11:51 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=X5RCqyrJ; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf21.hostedemail.com: domain of dennisszhou@gmail.com designates 209.85.160.48 as permitted sender) smtp.mailfrom=dennisszhou@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1714119112; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Abckr9PsZt6M9XfZKq0N2yL1inDajc/CKDpZfe0fw6I=; b=3GzB/Ah+5XC6/kAXO+s46zeLcV9dKIv79iA7Eq+6pakyYkHKGRFN1TDZpFfD71jF/wC2Fg zqksQzAlcZ+R3AOtuwwK5tJ8jZNIaUaLwWyCnPNVn9ixjowHF3Fottzoxy+NsH6bAxVfhO /6KecUmWl0+O72afmRVBwCy5VEykdHI= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=X5RCqyrJ; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf21.hostedemail.com: domain of dennisszhou@gmail.com designates 209.85.160.48 as permitted sender) smtp.mailfrom=dennisszhou@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1714119112; a=rsa-sha256; cv=none; b=292bJruxehTkmy5Mc5wjiuCpqxRewvJ5+54WeLfRoQi+nB56s72js607MdwT1BGKrR4XCj 0xniKe8S+h3XyaRZZSlhp3Ez1GDWGiGM+Jqr54rlLAT7nRWL5IEXkDNmpulHXk1RmBbv7I ENsAd+faMJzvkXOTGThZKeEkxqCbobY= Received: by mail-oa1-f48.google.com with SMTP id 586e51a60fabf-23935d89261so959660fac.1 for ; Fri, 26 Apr 2024 01:11:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714119111; x=1714723911; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Abckr9PsZt6M9XfZKq0N2yL1inDajc/CKDpZfe0fw6I=; b=X5RCqyrJsy3T3mO0H9wK4gribmCC/DbSN9lBtz6pk2HGL0RETI2nZNPeUeTlLA3fh4 cuto2oaQk3kt4pQRrSz6JrP1xTJBDaCRCu7XqsuNFpGbdCR3SRSHEPfbiEUwUggQpUKW sUIg410G5YcZR5MKws36WtOWY9v75pcyeNyYVBdmImcESLmy2esQDlG6sVOpj2rVlj3k zAPV6rBWhTUwMAu0OTstCa8Be2PY+2NMstft/p85zU24cVQvshUO82evPm/OwCeXuyPJ BYisZkPzO5YqFX82+vgQH/hJvi7IjA+NlrdPmLvYL8qCxp+Oecv5OUFSqvN3NvjPQwpd icTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714119111; x=1714723911; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Abckr9PsZt6M9XfZKq0N2yL1inDajc/CKDpZfe0fw6I=; b=s0rE6zZzKygTyot8Fk4+MPMAFzZHpjqF921pq9RqGfZid0BSoszJ+CTj1EYKx/XjBl fGWHuYgfk1ITTmwcLXA1+T/87WMw5Dxx41GN7OA6st7TzaI4bcbnruo0JkiWn9aW+Xe1 7Q3090mferc4V/8FrXIeMOQX0uWsPEScd0j9Wy8I8a9jQCAraom8SJ4mxjB8PaQfMMmw D3994czvEZs60Ck62ejrwsdMpLa64JxLAsYbOxJxJZV21Yu4FJ08xN9THz4RS9mOUWvb 58AHNCl27Dni7a6G5mCfx2QwE5LvmcnCDeKA7ik7/nnXDHNlAYXZu7BrWjOBP0Me1Hx7 wfQQ== X-Gm-Message-State: AOJu0Yx41NJKqw81+Orjrev8HOkx+s2goK3IK+E/9ZABCQmrYZEqKKOD aUVxrvOzlnszdIj4cy0oIMu03E976J7MG/H5Rbe3/wpzg9eE85dv X-Google-Smtp-Source: AGHT+IFc8NisYizEOwGS51vMjf9HgMa6gMziJgQYK6WQXSlM7tU7tZ0BqCZv8yQArgnVcaI5E0okpg== X-Received: by 2002:a05:6870:224a:b0:239:5419:bc74 with SMTP id j10-20020a056870224a00b002395419bc74mr2023347oaf.27.1714119110565; Fri, 26 Apr 2024 01:11:50 -0700 (PDT) Received: from snowbird ([136.25.84.117]) by smtp.gmail.com with ESMTPSA id r10-20020a6560ca000000b005fd74e632f0sm8339769pgv.38.2024.04.26.01.11.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Apr 2024 01:11:50 -0700 (PDT) Date: Fri, 26 Apr 2024 01:11:46 -0700 From: Dennis Zhou To: Peng Zhang Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, dennisszhou@gmail.com, shakeelb@google.com, jack@suse.cz, surenb@google.com, kent.overstreet@linux.dev, mhocko@suse.cz, vbabka@suse.cz, yuzhao@google.com, yu.ma@intel.com, wangkefeng.wang@huawei.com, sunnanyong@huawei.com Subject: Re: [RFC PATCH v2 1/2] percpu_counter: introduce atomic mode for percpu_counter Message-ID: References: <20240418142008.2775308-1-zhangpeng362@huawei.com> <20240418142008.2775308-2-zhangpeng362@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240418142008.2775308-2-zhangpeng362@huawei.com> X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 051341C000D X-Rspam-User: X-Stat-Signature: g8ghim18uwhm4sfxomthqgc8ow7xqnuh X-HE-Tag: 1714119111-465954 X-HE-Meta: U2FsdGVkX18rR7nBnuL9YpvNVBwd89G89BUfUVRec44saPLSymYAgGiI1M60D2C5V+407Q7jHDt34ja3KzTqChUOOJL9d3JK0uGKB9dBaYzNCFDYxotcOyUl4m3/5TAkjrJP4q8F26G6JRZ9QSQJsT2VwL8hwSphUVxcrY5GX4zhJl5FaL27G5xUUi7jUStuqGv/v/v3f8l19Kogm4toL1h+26EexQwLFzF/eop8tocchWffUrf2lj+4GDYYjy7c/8NFjejZVeIvNKIpcsdHmHhb10zFVkkIp9Ev9Nmawpvsid29b0IKisfvSEqZocmOjIzJ3YoXymLU/Xbp3VUfSUmT69W11AbdctgWvpOlOvoMg3NAp4pPV3Cx/5adU8/O360lhfOlsVHmcIRGkwHCK1gqpk66yXVAo3RK5SCF8gHNFbWidZTGvWWuJgTGcrvchBsILcW8BbBZtGmCtxymNOG6NjFCPBGEFVksWv4BL2Ws9WrhaPFvKuEfvajYM44Wg3H0x+CnxKmS+/LD/jUi9EkmPXg56ZDlowS0FrTaUHjVZY6nA1se9CzTojQwBh9a0jHvQhSOG7pHH6tcZGy8Dj1p9I7IFcqyhnuqriZsesJSjN8oXTCyALODULaLakm3eskjCLhZLhqeVG/PIW/RQEoPiBb0/abEvDU0t7Pya3pmS2PTsSCLqeXTnTjXprWEqPS/bNsGaoheOMcvbLuRv4GaCQHzGEvP9diKX/v5t2FOKzhaLoR3+3wHVcHbmQEMGXVFdMojV8OH8ymGHrH47gIJeiTgKmA3/mjdyfvksPjgMoE91tzvmEHz98T9y1jDCqTrzR9SRCWjII3mEV6p1YFV8XZrdYds92Svl1BWNrJoz5Q/ehMLtXXLTJ6GsSsOZ98yae25VEprtyUkblWBlWqjMs2osbpPyZBYdggtKblVMY3vd6L01+Ad1VPhgwRaiV2iehe9OADtv+37Ni/ 3e3F9IJo 0IS+Mgh7gN/thz6O/nZs+TgESalpjGjXe+7YhxsYvE81NIRc5d9tbr3p7QFOfutieL9xWUfIgENrXkqXwbgMB2q5FPSR0je7hnQ8ZmYraH4sESR4yGqDwktMmF2lw2JZaxayXZ4tGfP3i+PYEGiMCEfzJqIg1I7jAca1tDNEuepoL7Qd0r63JiD3/UEmGFMIbSA2nf6x0fAFGERSQug7hYRDY8idX0Z3U/tknNCE4oZWJ0faXwbUChYR9CtK+Dd1vAWHCt4pYKL4R67LVxC69dbCXLxqnu9Qb5aEETwpGySRA9DvkFLvrKdysxR6MZl4svt+V9lL0Rplab4Z9cix4VnNjlf69IfSSHurKxZx4de8ZuohHykjqCQjdfn7vTii96wsKWTdzZbgiwcALXuc0pCKYsfojt50xHEUw/2AvYu2iAYc2vFbKX5TnWpFdxSz5yEqTj3we3qLuCMXJXBNSzvhPrZHtadkZlDyAB+y4/IUUrzISi/p+BKjOUb5ZKOXgS8ey X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Apr 18, 2024 at 10:20:07PM +0800, Peng Zhang wrote: > From: ZhangPeng > > Depending on whether counters is NULL, we can support two modes: > atomic mode and perpcu mode. We implement both modes by grouping > the s64 count and atomic64_t count_atomic in a union. At the same time, > we create the interface for adding and reading in atomic mode and for > switching atomic mode to percpu mode. > > Suggested-by: Jan Kara > Signed-off-by: ZhangPeng > Signed-off-by: Kefeng Wang > --- > include/linux/percpu_counter.h | 43 +++++++++++++++++++++++++++++++--- > lib/percpu_counter.c | 31 ++++++++++++++++++++++-- > 2 files changed, 69 insertions(+), 5 deletions(-) > > diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h > index 3a44dd1e33d2..160f9734c0bb 100644 > --- a/include/linux/percpu_counter.h > +++ b/include/linux/percpu_counter.h > @@ -21,7 +21,13 @@ > > struct percpu_counter { > raw_spinlock_t lock; > - s64 count; > + /* Depending on whether counters is NULL, we can support two modes, > + * atomic mode using count_atomic and perpcu mode using count. > + */ > + union { > + s64 count; > + atomic64_t count_atomic; > + }; > #ifdef CONFIG_HOTPLUG_CPU > struct list_head list; /* All percpu_counters are on a list */ > #endif > @@ -32,14 +38,14 @@ extern int percpu_counter_batch; > > int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > gfp_t gfp, u32 nr_counters, > - struct lock_class_key *key); > + struct lock_class_key *key, bool switch_mode); Nit: the lock_class_key at the end. > > #define percpu_counter_init_many(fbc, value, gfp, nr_counters) \ > ({ \ > static struct lock_class_key __key; \ > \ > __percpu_counter_init_many(fbc, value, gfp, nr_counters,\ > - &__key); \ > + &__key, false); \ > }) > > > @@ -130,6 +136,20 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc) > return (fbc->counters != NULL); > } > > +static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc) > +{ > + return atomic64_read(&fbc->count_atomic); > +} > + > +static inline void percpu_counter_atomic_add(struct percpu_counter *fbc, > + s64 amount) > +{ > + atomic64_add(amount, &fbc->count_atomic); > +} > + > +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters); > + > #else /* !CONFIG_SMP */ > > struct percpu_counter { > @@ -260,6 +280,23 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc) > static inline void percpu_counter_sync(struct percpu_counter *fbc) > { > } > + > +static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc) > +{ > + return fbc->count; > +} > + > +static inline void percpu_counter_atomic_add(struct percpu_counter *fbc, > + s64 amount) > +{ > + percpu_counter_add(fbc, amount); > +} > + > +static inline int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters) > +{ > + return 0; > +} > #endif /* CONFIG_SMP */ > > static inline void percpu_counter_inc(struct percpu_counter *fbc) > diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c > index 44dd133594d4..95c4e038051a 100644 > --- a/lib/percpu_counter.c > +++ b/lib/percpu_counter.c > @@ -153,7 +153,7 @@ EXPORT_SYMBOL(__percpu_counter_sum); > > int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > gfp_t gfp, u32 nr_counters, > - struct lock_class_key *key) > + struct lock_class_key *key, bool switch_mode) > { > unsigned long flags __maybe_unused; > size_t counter_size; > @@ -174,7 +174,8 @@ int __percpu_counter_init_many(struct percpu_counter *fbc, s64 amount, > #ifdef CONFIG_HOTPLUG_CPU > INIT_LIST_HEAD(&fbc[i].list); > #endif > - fbc[i].count = amount; > + if (likely(!switch_mode)) > + fbc[i].count = amount; > fbc[i].counters = (void *)counters + (i * counter_size); > > debug_percpu_counter_activate(&fbc[i]); > @@ -357,6 +358,32 @@ bool __percpu_counter_limited_add(struct percpu_counter *fbc, > return good; > } > > +/* > + * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from > + * atomic mode to percpu mode. > + */ > +int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc, > + u32 nr_counters) > +{ > + static struct lock_class_key __key; This is an improper use of lockdep. Now all of the percpu_counters initialized on this path will key off of this lock_class_key. > + unsigned long flags; > + bool ret = 0; > + > + if (percpu_counter_initialized(fbc)) > + return 0; > + > + preempt_disable(); > + local_irq_save(flags); > + if (likely(!percpu_counter_initialized(fbc))) > + ret = __percpu_counter_init_many(fbc, 0, > + GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO, > + nr_counters, &__key, true); > + local_irq_restore(flags); > + preempt_enable(); > + > + return ret; > +} I'm staring at this API and I'm not in love with it. I think it hinges on that there's one user of mm_stats prior hence it's safe. Generically though, I can't see why this is safe. I need to give this a little more thought, but my gut reaction is I'd rather this look like percpu_refcount where we can init dead minus the percpu allocation. Maybe that's not quite right, but I'd feel better about it than requiring external synchronization on a percpu_counter to ensure that it's correct. > + > static int __init percpu_counter_startup(void) > { > int ret; > -- > 2.25.1 > Thanks, Dennis