From: Michal Hocko <mhocko@suse.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
"Paul E. McKenney" <paulmck@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@linux.com>,
Martin Liu <liumartin@google.com>,
David Rientjes <rientjes@google.com>,
christian.koenig@amd.com, Shakeel Butt <shakeel.butt@linux.dev>,
SeongJae Park <sj@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Sweet Tea Dorminy <sweettea-kernel@dorminy.me>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <liam.howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
Christian Brauner <brauner@kernel.org>,
Wei Yang <richard.weiyang@gmail.com>,
David Hildenbrand <david@redhat.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
Yu Zhao <yuzhao@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Mateusz Guzik <mjguzik@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Aboorva Devarajan <aboorvad@linux.ibm.com>
Subject: Re: [PATCH v16 1/3] lib: Introduce hierarchical per-cpu counters
Date: Wed, 14 Jan 2026 17:41:48 +0100 [thread overview]
Message-ID: <aWfHTLDA8-Fja_gD@tiehlicka> (raw)
In-Reply-To: <20260114145915.49926-2-mathieu.desnoyers@efficios.com>
On Wed 14-01-26 09:59:13, Mathieu Desnoyers wrote:
> * Motivation
>
> The purpose of this hierarchical split-counter scheme is to:
>
> - Minimize contention when incrementing and decrementing counters,
> - Provide fast access to a sum approximation,
> - Provide a sum approximation with an acceptable accuracy level when
> scaling to many-core systems.
> - Provide approximate and precise comparison of two counters, and
> between a counter and a value.
> - Provide possible precise sum ranges for a given sum approximation.
>
> Its goals are twofold:
>
> - Improve the accuracy of the approximated RSS counter values returned
> by proc interfaces [1],
> - Reduce the latency of the OOM killer on large many-core systems.
>
> * Design
>
> The hierarchical per-CPU counters propagate a sum approximation through
> a N-way tree. When reaching the batch size, the carry is propagated
> through a binary tree which consists of logN(nr_cpu_ids) levels. The
> batch size for each level is twice the batch size of the prior level.
>
> Example propagation diagram with 8 cpus through a binary tree:
>
> Level 0: 0 1 2 3 4 5 6 7
> | / | / | / | /
> | / | / | / | /
> | / | / | / | /
> Level 1: 0 1 2 3
> | / | /
> | / | /
> | / | /
> Level 2: 0 1
> | /
> | /
> | /
> Level 3: 0
>
> For a binary tree, the maximum inaccuracy is bound by:
> batch_size * log2(nr_cpus) * nr_cpus
> which evolves with O(n*log(n)) as the number of CPUs increases.
>
> For a N-way tree, the maximum inaccuracy can be pre-calculated
> based on the the N-arity of each level and the batch size.
One thing you should probably mention here is the memory consumption of
the structure.
I have briefly looked at the implementation and concluded that I do not
have enough time to make a thorough review. Sorry about that. As I've
said in previous version the overall idea is sound. Especially if the
additional memory consumption is not a factor. I will let others judge
implementation details.
Thanks!
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2026-01-14 16:41 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-14 14:59 [PATCH v16 0/3] Improve proc RSS accuracy and OOM killer latency Mathieu Desnoyers
2026-01-14 14:59 ` [PATCH v16 1/3] lib: Introduce hierarchical per-cpu counters Mathieu Desnoyers
2026-01-14 16:41 ` Michal Hocko [this message]
2026-01-14 19:19 ` Mathieu Desnoyers
2026-01-16 15:51 ` Michal Hocko
2026-01-26 16:34 ` Mathieu Desnoyers
2026-01-14 14:59 ` [PATCH v16 2/3] mm: Improve RSS counter approximation accuracy for proc interfaces Mathieu Desnoyers
2026-01-14 16:48 ` Michal Hocko
2026-01-14 19:21 ` Mathieu Desnoyers
2026-01-14 14:59 ` [PATCH v16 3/3] mm: Reduce latency of OOM killer task selection with 2-pass algorithm Mathieu Desnoyers
2026-01-14 17:06 ` Michal Hocko
2026-01-14 19:36 ` Mathieu Desnoyers
2026-01-16 15:55 ` Michal Hocko
2026-01-26 16:39 ` Mathieu Desnoyers
2026-01-26 17:47 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aWfHTLDA8-Fja_gD@tiehlicka \
--to=mhocko@suse.com \
--cc=aboorvad@linux.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=brauner@kernel.org \
--cc=christian.koenig@amd.com \
--cc=cl@linux.com \
--cc=david@redhat.com \
--cc=dennis@kernel.org \
--cc=hannes@cmpxchg.org \
--cc=liam.howlett@oracle.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=liumartin@google.com \
--cc=lorenzo.stoakes@oracle.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=mjguzik@gmail.com \
--cc=paulmck@kernel.org \
--cc=richard.weiyang@gmail.com \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=shakeel.butt@linux.dev \
--cc=sj@kernel.org \
--cc=surenb@google.com \
--cc=sweettea-kernel@dorminy.me \
--cc=tj@kernel.org \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.