From: Jiri Olsa <olsajiri@gmail.com>
To: Andrii Nakryiko <andrii@kernel.org>
Cc: linux-trace-kernel@vger.kernel.org, rostedt@goodmis.org,
mhiramat@kernel.org, bpf@vger.kernel.org,
Matt Wu <wuqiang.matt@bytedance.com>
Subject: Re: [PATCH 2/2] objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
Date: Tue, 28 May 2024 22:30:41 +0200 [thread overview]
Message-ID: <ZlY-8ZI_irTK9MAk@krava> (raw)
In-Reply-To: <20240424215214.3956041-3-andrii@kernel.org>
On Wed, Apr 24, 2024 at 02:52:14PM -0700, Andrii Nakryiko wrote:
> Profiling shows that calling nr_possible_cpus() in objpool_pop() takes
> a noticeable amount of CPU (when profiled on 80-core machine), as we
> need to recalculate number of set bits in a CPU bit mask. This number
> can't change, so there is no point in paying the price for recalculating
> it. As such, cache this value in struct objpool_head and use it in
> objpool_pop().
>
> On the other hand, cached pool->nr_cpus isn't necessary, as it's not
> used in hot path and is also a pretty trivial value to retrieve. So drop
> pool->nr_cpus in favor of using nr_cpu_ids everywhere. This way the size
> of struct objpool_head remains the same, which is a nice bonus.
>
> Same BPF selftests benchmarks were used to evaluate the effect. Using
> changes in previous patch (inlining of objpool_pop/objpool_push) as
> baseline, here are the differences:
>
> BASELINE
> ========
> kretprobe : 9.937 ± 0.174M/s
> kretprobe-multi: 10.440 ± 0.108M/s
>
> AFTER
> =====
> kretprobe : 10.106 ± 0.120M/s (+1.7%)
> kretprobe-multi: 10.515 ± 0.180M/s (+0.7%)
nice, overall lgtm
jirka
>
> Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> include/linux/objpool.h | 6 +++---
> lib/objpool.c | 12 ++++++------
> 2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/objpool.h b/include/linux/objpool.h
> index d8b1f7b91128..cb1758eaa2d3 100644
> --- a/include/linux/objpool.h
> +++ b/include/linux/objpool.h
> @@ -73,7 +73,7 @@ typedef int (*objpool_fini_cb)(struct objpool_head *head, void *context);
> * struct objpool_head - object pooling metadata
> * @obj_size: object size, aligned to sizeof(void *)
> * @nr_objs: total objs (to be pre-allocated with objpool)
> - * @nr_cpus: local copy of nr_cpu_ids
> + * @nr_possible_cpus: cached value of num_possible_cpus()
> * @capacity: max objs can be managed by one objpool_slot
> * @gfp: gfp flags for kmalloc & vmalloc
> * @ref: refcount of objpool
> @@ -85,7 +85,7 @@ typedef int (*objpool_fini_cb)(struct objpool_head *head, void *context);
> struct objpool_head {
> int obj_size;
> int nr_objs;
> - int nr_cpus;
> + int nr_possible_cpus;
> int capacity;
> gfp_t gfp;
> refcount_t ref;
> @@ -176,7 +176,7 @@ static inline void *objpool_pop(struct objpool_head *pool)
> raw_local_irq_save(flags);
>
> cpu = raw_smp_processor_id();
> - for (i = 0; i < num_possible_cpus(); i++) {
> + for (i = 0; i < pool->nr_possible_cpus; i++) {
> obj = __objpool_try_get_slot(pool, cpu);
> if (obj)
> break;
> diff --git a/lib/objpool.c b/lib/objpool.c
> index f696308fc026..234f9d0bd081 100644
> --- a/lib/objpool.c
> +++ b/lib/objpool.c
> @@ -50,7 +50,7 @@ objpool_init_percpu_slots(struct objpool_head *pool, int nr_objs,
> {
> int i, cpu_count = 0;
>
> - for (i = 0; i < pool->nr_cpus; i++) {
> + for (i = 0; i < nr_cpu_ids; i++) {
>
> struct objpool_slot *slot;
> int nodes, size, rc;
> @@ -60,8 +60,8 @@ objpool_init_percpu_slots(struct objpool_head *pool, int nr_objs,
> continue;
>
> /* compute how many objects to be allocated with this slot */
> - nodes = nr_objs / num_possible_cpus();
> - if (cpu_count < (nr_objs % num_possible_cpus()))
> + nodes = nr_objs / pool->nr_possible_cpus;
> + if (cpu_count < (nr_objs % pool->nr_possible_cpus))
> nodes++;
> cpu_count++;
>
> @@ -103,7 +103,7 @@ static void objpool_fini_percpu_slots(struct objpool_head *pool)
> if (!pool->cpu_slots)
> return;
>
> - for (i = 0; i < pool->nr_cpus; i++)
> + for (i = 0; i < nr_cpu_ids; i++)
> kvfree(pool->cpu_slots[i]);
> kfree(pool->cpu_slots);
> }
> @@ -130,13 +130,13 @@ int objpool_init(struct objpool_head *pool, int nr_objs, int object_size,
>
> /* initialize objpool pool */
> memset(pool, 0, sizeof(struct objpool_head));
> - pool->nr_cpus = nr_cpu_ids;
> + pool->nr_possible_cpus = num_possible_cpus();
> pool->obj_size = object_size;
> pool->capacity = capacity;
> pool->gfp = gfp & ~__GFP_ZERO;
> pool->context = context;
> pool->release = release;
> - slot_size = pool->nr_cpus * sizeof(struct objpool_slot);
> + slot_size = nr_cpu_ids * sizeof(struct objpool_slot);
> pool->cpu_slots = kzalloc(slot_size, pool->gfp);
> if (!pool->cpu_slots)
> return -ENOMEM;
> --
> 2.43.0
>
>
next prev parent reply other threads:[~2024-05-28 20:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-24 21:52 [PATCH 0/2] Objpool performance improvements Andrii Nakryiko
2024-04-24 21:52 ` [PATCH 1/2] objpool: enable inlining objpool_push() and objpool_pop() operations Andrii Nakryiko
2024-05-07 13:55 ` Vlastimil Babka
2024-05-10 7:59 ` wuqiang.matt
2024-05-10 8:20 ` Vlastimil Babka
2024-05-10 9:15 ` wuqiang.matt
2024-05-28 16:41 ` Masami Hiramatsu
2024-04-24 21:52 ` [PATCH 2/2] objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids Andrii Nakryiko
2024-05-28 20:30 ` Jiri Olsa [this message]
2024-04-26 14:25 ` [PATCH 0/2] Objpool performance improvements Masami Hiramatsu
2024-04-26 16:05 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZlY-8ZI_irTK9MAk@krava \
--to=olsajiri@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=rostedt@goodmis.org \
--cc=wuqiang.matt@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.