From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 13E5DEEC0 for ; Fri, 20 Feb 2026 00:50:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771548631; cv=none; b=iEx3zjqhzm4uc2+5lHD7AlBFa+0IaGOJx4mGSdDL966HXsH+zSHCY+NrsD8pa6KBw6buD4Sx8Dop41r+NTRmYRVLu2exEYs+Z4YI09sQztu9FZ571nBvkHL7rcaVPtjvW8AZ8qxloWIhH1293/0/YSd8Rx98Qtx/YEOQQBGY0n0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771548631; c=relaxed/simple; bh=agyS6pEyVNpYMMPYyANH4wYJLVwjC9Mwf5QGrBTiy+c=; h=Date:To:From:Subject:Message-Id; b=kqR/2JaMsFURArJd9oyhSEH7hDfnYo+GhBKlkEX4HTOxJKWl7rlfxDHOeAucGggRnegSeJ7lKTHUpsaChsiVWiZjiCL8jYO86e94syV992oKGRrN4GO7SHNmGx0cRCvAWijwitJ5lrrS5BDATxIpiwcLsSpANtNY4WuUj66cio4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b=tBag+vrr; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="tBag+vrr" Received: by smtp.kernel.org (Postfix) with ESMTPSA id AEAD7C4CEF7; Fri, 20 Feb 2026 00:50:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1771548630; bh=agyS6pEyVNpYMMPYyANH4wYJLVwjC9Mwf5QGrBTiy+c=; h=Date:To:From:Subject:From; b=tBag+vrrBlQOhUYrQ3QMW/rYkJCBKOGXgGfg8NzysQvcg+V2yZx+lGmWGCPkzp0xt wssVsxlCbIV+rGVIzBn5irJL43ULZuT/eAINJlov/TjZ28rslZOAFbhNbVJFZHhZ4C X52iijg3QK3Z5okYe50ajDLHgqjsnBNFIcMrUmpQ= Date: Thu, 19 Feb 2026 16:50:30 -0800 To: mm-commits@vger.kernel.org,yuzhao@google.com,willy@infradead.org,viro@zeniv.linux.org.uk,vbabka@suse.cz,tj@kernel.org,sweettea-kernel@dorminy.me,surenb@google.com,sj@kernel.org,shakeel.butt@linux.dev,rppt@kernel.org,rostedt@goodmis.org,roman.gushchin@linux.dev,rientjes@google.com,richard.weiyang@gmail.com,paulmck@kernel.org,mjguzik@gmail.com,mhocko@suse.com,mhiramat@kernel.org,lorenzo.stoakes@oracle.com,liumartin@google.com,linmiaohe@huawei.com,liam.howlett@oracle.com,hannes@cmpxchg.org,dennis@kernel.org,david@redhat.com,cl@linux.com,christian.koenig@amd.com,brauner@kernel.org,baolin.wang@linux.alibaba.com,aboorvad@linux.ibm.com,mathieu.desnoyers@efficios.com,akpm@linux-foundation.org From: Andrew Morton Subject: + lib-test-hierarchical-per-cpu-counters.patch added to mm-new branch Message-Id: <20260220005030.AEAD7C4CEF7@smtp.kernel.org> Precedence: bulk X-Mailing-List: mm-commits@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: The patch titled Subject: lib: test hierarchical per-cpu counters has been added to the -mm mm-new branch. Its filename is lib-test-hierarchical-per-cpu-counters.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/lib-test-hierarchical-per-cpu-counters.patch This patch will later appear in the mm-new branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Note, mm-new is a provisional staging ground for work-in-progress patches, and acceptance into mm-new is a notification for others take notice and to finish up reviews. Please do not hesitate to respond to review feedback and post updated versions to replace or incrementally fixup patches in mm-new. The mm-new branch of mm.git is not included in linux-next Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via various branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there most days ------------------------------------------------------ From: Mathieu Desnoyers Subject: lib: test hierarchical per-cpu counters Date: Tue, 17 Feb 2026 11:10:05 -0500 Introduce Kunit tests for hierarchical per-cpu counters. Keep track of two sets of hierarchical counters, each meant to have the same precise sum at any time, but distributed differently across the topology. Keep track of an atomic counter along with each hierarchical counter, for sum validation. Those tests cover: - Single-threaded (no concurrency) updates. - Concurrent updates of counters from various CPUs. Perform the following validations: - Compare the precise sum of counters with the sum tracked by an atomic counter. - Compare the precise sum of two sets of hierarchical counters. - Approximated comparison of hierarchical counter with atomic counter. - Approximated comparison of two sets of hierarchical counters. - Validate the bounds of approximation ranges. Run with the following .kunit/.kunitconfig: CONFIG_KUNIT=y CONFIG_SMP=y CONFIG_PREEMPT=y CONFIG_NR_CPUS=32 CONFIG_HOTPLUG_CPU=y CONFIG_PERCPU_COUNTER_TREE_TEST=y With the following execution (to use SMP): ./tools/testing/kunit/kunit.py run --arch=x86_64 --qemu_args="-smp 12" Link: https://lkml.kernel.org/r/20260217161006.1105611-3-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers Cc: Aboorva Devarajan Cc: Al Viro Cc: Baolin Wang Cc: Christan König Cc: Christian Brauner Cc: Christoph Lameter Cc: David Hildenbrand Cc: David Rientjes Cc: Dennis Zhou Cc: Johannes Weiner Cc: "Liam R . Howlett" Cc: Lorenzo Stoakes Cc: Martin Liu Cc: Masami Hiramatsu Cc: Mateusz Guzik Cc: Matthew Wilcox Cc: Miaohe Lin Cc: Michal Hocko Cc: Mike Rapoport Cc: "Paul E. McKenney" Cc: Roman Gushchin Cc: SeongJae Park Cc: Shakeel Butt Cc: Steven Rostedt Cc: Suren Baghdasaryan Cc: Sweet Tea Dorminy Cc: Tejun Heo Cc: Vlastimil Babka Cc: Wei Yang Cc: Yu Zhao Signed-off-by: Andrew Morton --- lib/Kconfig | 12 lib/tests/Makefile | 2 lib/tests/percpu_counter_tree_kunit.c | 351 ++++++++++++++++++++++++ 3 files changed, 365 insertions(+) --- a/lib/Kconfig~lib-test-hierarchical-per-cpu-counters +++ a/lib/Kconfig @@ -52,6 +52,18 @@ config PACKING_KUNIT_TEST When in doubt, say N. +config PERCPU_COUNTER_TREE_TEST + tristate "Hierarchical Per-CPU counter test" if !KUNIT_ALL_TESTS + depends on KUNIT + default KUNIT_ALL_TESTS + help + This builds Kunit tests for the hierarchical per-cpu counters. + + For more information on KUnit and unit tests in general, + please refer to the KUnit documentation in Documentation/dev-tools/kunit/. + + When in doubt, say N. + config BITREVERSE tristate --- a/lib/tests/Makefile~lib-test-hierarchical-per-cpu-counters +++ a/lib/tests/Makefile @@ -56,4 +56,6 @@ obj-$(CONFIG_UTIL_MACROS_KUNIT) += util_ obj-$(CONFIG_RATELIMIT_KUNIT_TEST) += test_ratelimit.o obj-$(CONFIG_UUID_KUNIT_TEST) += uuid_kunit.o +obj-$(CONFIG_PERCPU_COUNTER_TREE_TEST) += percpu_counter_tree_kunit.o + obj-$(CONFIG_TEST_RUNTIME_MODULE) += module/ diff --git a/lib/tests/percpu_counter_tree_kunit.c a/lib/tests/percpu_counter_tree_kunit.c new file mode 100644 --- /dev/null +++ a/lib/tests/percpu_counter_tree_kunit.c @@ -0,0 +1,351 @@ +// SPDX-License-Identifier: GPL-2.0+ OR MIT +// SPDX-FileCopyrightText: 2026 Mathieu Desnoyers + +#include +#include +#include +#include +#include + +struct multi_thread_test_data { + long increment; + int nr_inc; + int counter_index; +}; + +/* Hierarchical per-CPU counter instances. */ +static struct percpu_counter_tree counter[2]; +static struct percpu_counter_tree_level_item *items[2]; + +/* Global atomic counters for validation. */ +static atomic_long_t global_counter[2]; + +static struct wait_queue_head kernel_threads_wq; +static atomic_t kernel_threads_to_run; + +static void complete_work(void) +{ + if (atomic_dec_and_test(&kernel_threads_to_run)) + wake_up(&kernel_threads_wq); +} + +static void hpcc_print_info(struct kunit *test) +{ + kunit_info(test, "Running test with %d CPUs\n", num_online_cpus()); +} + +static void add_to_counter(int counter_index, unsigned int nr_inc, long increment) +{ + unsigned int i; + + for (i = 0; i < nr_inc; i++) { + percpu_counter_tree_add(&counter[counter_index], increment); + atomic_long_add(increment, &global_counter[counter_index]); + } +} + +static void check_counters(struct kunit *test) +{ + int counter_index; + + /* Compare each counter with its global counter. */ + for (counter_index = 0; counter_index < 2; counter_index++) { + long v = atomic_long_read(&global_counter[counter_index]); + long approx_sum = percpu_counter_tree_approximate_sum(&counter[counter_index]); + unsigned long under_accuracy = 0, over_accuracy = 0; + long precise_min, precise_max; + + /* Precise comparison. */ + KUNIT_EXPECT_EQ(test, percpu_counter_tree_precise_sum(&counter[counter_index]), v); + KUNIT_EXPECT_EQ(test, 0, percpu_counter_tree_precise_compare_value(&counter[counter_index], v)); + + /* Approximate comparison. */ + KUNIT_EXPECT_EQ(test, 0, percpu_counter_tree_approximate_compare_value(&counter[counter_index], v)); + + /* Accuracy limits checks. */ + percpu_counter_tree_approximate_accuracy_range(&counter[counter_index], &under_accuracy, &over_accuracy); + + KUNIT_EXPECT_GE(test, (long)(approx_sum - (v - under_accuracy)), 0); + KUNIT_EXPECT_LE(test, (long)(approx_sum - (v + over_accuracy)), 0); + KUNIT_EXPECT_GT(test, (long)(approx_sum - (v - under_accuracy - 1)), 0); + KUNIT_EXPECT_LT(test, (long)(approx_sum - (v + over_accuracy + 1)), 0); + + /* Precise min/max range check. */ + percpu_counter_tree_approximate_min_max_range(approx_sum, under_accuracy, over_accuracy, &precise_min, &precise_max); + + KUNIT_EXPECT_GE(test, v - precise_min, 0); + KUNIT_EXPECT_LE(test, v - precise_max, 0); + KUNIT_EXPECT_GT(test, v - (precise_min - 1), 0); + KUNIT_EXPECT_LT(test, v - (precise_max + 1), 0); + } + /* Compare each counter with the second counter. */ + KUNIT_EXPECT_EQ(test, percpu_counter_tree_precise_sum(&counter[0]), percpu_counter_tree_precise_sum(&counter[1])); + KUNIT_EXPECT_EQ(test, 0, percpu_counter_tree_precise_compare(&counter[0], &counter[1])); + KUNIT_EXPECT_EQ(test, 0, percpu_counter_tree_approximate_compare(&counter[0], &counter[1])); +} + +static int multi_thread_worker_fn(void *data) +{ + struct multi_thread_test_data *td = data; + + add_to_counter(td->counter_index, td->nr_inc, td->increment); + complete_work(); + kfree(td); + return 0; +} + +static void test_run_on_specific_cpu(struct kunit *test, int target_cpu, int counter_index, unsigned int nr_inc, long increment) +{ + struct task_struct *task; + struct multi_thread_test_data *td = kzalloc(sizeof(struct multi_thread_test_data), GFP_KERNEL); + + KUNIT_EXPECT_PTR_NE(test, td, NULL); + td->increment = increment; + td->nr_inc = nr_inc; + td->counter_index = counter_index; + atomic_inc(&kernel_threads_to_run); + task = kthread_run_on_cpu(multi_thread_worker_fn, td, target_cpu, "kunit_multi_thread_worker"); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, task); +} + +static void init_kthreads(void) +{ + atomic_set(&kernel_threads_to_run, 1); + init_waitqueue_head(&kernel_threads_wq); +} + +static void fini_kthreads(void) +{ + /* Release our own reference. */ + complete_work(); + /* Wait for all others threads to run. */ + __wait_event(kernel_threads_wq, (atomic_read(&kernel_threads_to_run) == 0)); +} + +static void test_sync_kthreads(void) +{ + fini_kthreads(); + init_kthreads(); +} + +static void init_counters(struct kunit *test, unsigned long batch_size) +{ + int i, ret; + + for (i = 0; i < 2; i++) { + items[i] = kzalloc(percpu_counter_tree_items_size(), GFP_KERNEL); + KUNIT_EXPECT_PTR_NE(test, items[i], NULL); + ret = percpu_counter_tree_init(&counter[i], items[i], batch_size, GFP_KERNEL); + KUNIT_EXPECT_EQ(test, ret, 0); + + atomic_long_set(&global_counter[i], 0); + } +} + +static void fini_counters(void) +{ + int i; + + for (i = 0; i < 2; i++) { + percpu_counter_tree_destroy(&counter[i]); + kfree(items[i]); + } +} + +enum up_test_inc_type { + INC_ONE, + INC_MINUS_ONE, + INC_RANDOM, +}; + +/* + * Single-threaded tests. Those use many threads to run on various CPUs, + * but synchronize for completion of each thread before running the + * next, effectively making sure there are no concurrent updates. + */ +static void do_hpcc_test_single_thread(struct kunit *test, int _cpu0, int _cpu1, enum up_test_inc_type type) +{ + unsigned long batch_size_order = 5; + int cpu0 = _cpu0; + int cpu1 = _cpu1; + int i; + + init_counters(test, 1UL << batch_size_order); + init_kthreads(); + for (i = 0; i < 10000; i++) { + long increment; + + switch (type) { + case INC_ONE: + increment = 1; + break; + case INC_MINUS_ONE: + increment = -1; + break; + case INC_RANDOM: + increment = (long) get_random_long() % 50000; + break; + } + if (_cpu0 < 0) + cpu0 = cpumask_any_distribute(cpu_online_mask); + if (_cpu1 < 0) + cpu1 = cpumask_any_distribute(cpu_online_mask); + test_run_on_specific_cpu(test, cpu0, 0, 1, increment); + test_sync_kthreads(); + test_run_on_specific_cpu(test, cpu1, 1, 1, increment); + test_sync_kthreads(); + check_counters(test); + } + fini_kthreads(); + fini_counters(); +} + +static void hpcc_test_single_thread_first(struct kunit *test) +{ + int cpu = cpumask_first(cpu_online_mask); + + do_hpcc_test_single_thread(test, cpu, cpu, INC_ONE); + do_hpcc_test_single_thread(test, cpu, cpu, INC_MINUS_ONE); + do_hpcc_test_single_thread(test, cpu, cpu, INC_RANDOM); +} + +static void hpcc_test_single_thread_first_random(struct kunit *test) +{ + int cpu = cpumask_first(cpu_online_mask); + + do_hpcc_test_single_thread(test, cpu, -1, INC_ONE); + do_hpcc_test_single_thread(test, cpu, -1, INC_MINUS_ONE); + do_hpcc_test_single_thread(test, cpu, -1, INC_RANDOM); +} + +static void hpcc_test_single_thread_random(struct kunit *test) +{ + do_hpcc_test_single_thread(test, -1, -1, INC_ONE); + do_hpcc_test_single_thread(test, -1, -1, INC_MINUS_ONE); + do_hpcc_test_single_thread(test, -1, -1, INC_RANDOM); +} + +/* Multi-threaded SMP tests. */ + +static void do_hpcc_multi_thread_increment_each_cpu(struct kunit *test, unsigned long batch_size, unsigned int nr_inc, long increment) +{ + int cpu; + + init_counters(test, batch_size); + init_kthreads(); + for_each_online_cpu(cpu) { + test_run_on_specific_cpu(test, cpu, 0, nr_inc, increment); + test_run_on_specific_cpu(test, cpu, 1, nr_inc, increment); + } + fini_kthreads(); + check_counters(test); + fini_counters(); +} + +static void do_hpcc_multi_thread_increment_even_cpus(struct kunit *test, unsigned long batch_size, unsigned int nr_inc, long increment) +{ + int cpu; + + init_counters(test, batch_size); + init_kthreads(); + for_each_online_cpu(cpu) { + test_run_on_specific_cpu(test, cpu, 0, nr_inc, increment); + test_run_on_specific_cpu(test, cpu & ~1, 1, nr_inc, increment); /* even cpus. */ + } + fini_kthreads(); + check_counters(test); + fini_counters(); +} + +static void do_hpcc_multi_thread_increment_single_cpu(struct kunit *test, unsigned long batch_size, unsigned int nr_inc, long increment) +{ + int cpu; + + init_counters(test, batch_size); + init_kthreads(); + for_each_online_cpu(cpu) { + test_run_on_specific_cpu(test, cpu, 0, nr_inc, increment); + test_run_on_specific_cpu(test, cpumask_first(cpu_online_mask), 1, nr_inc, increment); + } + fini_kthreads(); + check_counters(test); + fini_counters(); +} + +static void do_hpcc_multi_thread_increment_random_cpu(struct kunit *test, unsigned long batch_size, unsigned int nr_inc, long increment) +{ + int cpu; + + init_counters(test, batch_size); + init_kthreads(); + for_each_online_cpu(cpu) { + test_run_on_specific_cpu(test, cpu, 0, nr_inc, increment); + test_run_on_specific_cpu(test, cpumask_any_distribute(cpu_online_mask), 1, nr_inc, increment); + } + fini_kthreads(); + check_counters(test); + fini_counters(); +} + +static void hpcc_test_multi_thread_batch_increment(struct kunit *test) +{ + unsigned long batch_size_order; + + for (batch_size_order = 2; batch_size_order < 10; batch_size_order++) { + unsigned int nr_inc; + + for (nr_inc = 1; nr_inc < 1024; nr_inc *= 2) { + long increment; + + for (increment = 1; increment < 100000; increment *= 10) { + do_hpcc_multi_thread_increment_each_cpu(test, 1UL << batch_size_order, nr_inc, increment); + do_hpcc_multi_thread_increment_even_cpus(test, 1UL << batch_size_order, nr_inc, increment); + do_hpcc_multi_thread_increment_single_cpu(test, 1UL << batch_size_order, nr_inc, increment); + do_hpcc_multi_thread_increment_random_cpu(test, 1UL << batch_size_order, nr_inc, increment); + } + } + } +} + +static void hpcc_test_multi_thread_random_walk(struct kunit *test) +{ + unsigned long batch_size_order = 5; + int loop; + + for (loop = 0; loop < 100; loop++) { + int i; + + init_counters(test, 1UL << batch_size_order); + init_kthreads(); + for (i = 0; i < 1000; i++) { + long increment = (long) get_random_long() % 512; + unsigned int nr_inc = ((unsigned long) get_random_long()) % 1024; + + test_run_on_specific_cpu(test, cpumask_any_distribute(cpu_online_mask), 0, nr_inc, increment); + test_run_on_specific_cpu(test, cpumask_any_distribute(cpu_online_mask), 1, nr_inc, increment); + } + fini_kthreads(); + check_counters(test); + fini_counters(); + } +} + +static struct kunit_case hpcc_test_cases[] = { + KUNIT_CASE(hpcc_print_info), + KUNIT_CASE(hpcc_test_single_thread_first), + KUNIT_CASE(hpcc_test_single_thread_first_random), + KUNIT_CASE(hpcc_test_single_thread_random), + KUNIT_CASE(hpcc_test_multi_thread_batch_increment), + KUNIT_CASE(hpcc_test_multi_thread_random_walk), + {} +}; + +static struct kunit_suite hpcc_test_suite = { + .name = "percpu_counter_tree", + .test_cases = hpcc_test_cases, +}; + +kunit_test_suite(hpcc_test_suite); + +MODULE_DESCRIPTION("Test cases for hierarchical per-CPU counters"); +MODULE_LICENSE("Dual MIT/GPL"); _ Patches currently in -mm which might be from mathieu.desnoyers@efficios.com are lib-introduce-hierarchical-per-cpu-counters.patch lib-test-hierarchical-per-cpu-counters.patch mm-improve-rss-counter-approximation-accuracy-for-proc-interfaces.patch