BPF List
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii@kernel.org>
To: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>
Cc: <andrii@kernel.org>, <kernel-team@fb.com>
Subject: [PATCH bpf-next 8/9] libbpf: automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary
Date: Sun, 8 May 2022 17:41:47 -0700	[thread overview]
Message-ID: <20220509004148.1801791-9-andrii@kernel.org> (raw)
In-Reply-To: <20220509004148.1801791-1-andrii@kernel.org>

Kernel imposes a pretty particular restriction on ringbuf map size. It
has to be a power-of-2 multiple of page size. While generally this isn't
hard for user to satisfy, sometimes it's impossible to do this
declaratively in BPF source code or just plain inconvenient to do at
runtime.

One such example might be BPF libraries that are supposed to work on
different architectures, which might not agree on what the common page
size is.

Let libbpf find the right size for user instead, if it turns out to not
satisfy kernel requirements. If user didn't set size at all, that's most
probably a mistake so don't upsize such zero size to one full page,
though. Also we need to be careful about not overflowing __u32
max_entries.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 63c0f412266c..15117b9a4d1e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4943,6 +4943,44 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 
 static void bpf_map__destroy(struct bpf_map *map);
 
+static bool is_pow_of_2(size_t x)
+{
+	return x && (x & (x - 1));
+}
+
+static size_t adjust_ringbuf_sz(size_t sz)
+{
+	__u32 page_sz = sysconf(_SC_PAGE_SIZE);
+	__u32 i, mul;
+
+	/* if user forgot to set any size, make sure they see error */
+	if (sz == 0)
+		return 0;
+	/* Kernel expects BPF_MAP_TYPE_RINGBUF's max_entries to be
+	 * a power-of-2 multiple of kernel's page size. If user diligently
+	 * satisified these conditions, pass the size through.
+	 */
+	if ((sz % page_sz) == 0 && is_pow_of_2(sz / page_sz))
+		return sz;
+
+	/* Otherwise find closest (page_sz * power_of_2) product bigger than
+	 * user-set size to satisfy both user size request and kernel
+	 * requirements and substitute correct max_entries for map creation.
+	 */
+	for (i = 0, mul = 1; ; i++, mul <<= 1) {
+		if (mul > UINT_MAX / page_sz) /* prevent __u32 overflow */
+			break;
+		if (mul * page_sz > sz)
+			return mul * page_sz;
+	}
+
+	/* if it's impossible to satisfy the conditions (i.e., user size is
+	 * very close to UINT_MAX but is not a power-of-2 multiple of
+	 * page_size) then just return original size and let kernel reject it
+	 */
+	return sz;
+}
+
 static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner)
 {
 	LIBBPF_OPTS(bpf_map_create_opts, create_attr);
@@ -4981,6 +5019,9 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 	}
 
 	switch (def->type) {
+	case BPF_MAP_TYPE_RINGBUF:
+		map->def.max_entries = adjust_ringbuf_sz(map->def.max_entries);
+		/* fallthrough */
 	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
 	case BPF_MAP_TYPE_CGROUP_ARRAY:
 	case BPF_MAP_TYPE_STACK_TRACE:
@@ -4994,7 +5035,6 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_QUEUE:
 	case BPF_MAP_TYPE_STACK:
-	case BPF_MAP_TYPE_RINGBUF:
 		create_attr.btf_fd = 0;
 		create_attr.btf_key_type_id = 0;
 		create_attr.btf_value_type_id = 0;
-- 
2.30.2


  parent reply	other threads:[~2022-05-09  1:29 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-09  0:41 [PATCH bpf-next 0/9] Misc libbpf fixes and small improvements Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 1/9] selftests/bpf: prevent skeleton generation race Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 2/9] libbpf: make __kptr and __kptr_ref unconditionally use btf_type_tag() attr Andrii Nakryiko
2022-12-26 11:34   ` [PATCH bpf-next 2/9] libbpf: make __kptr and __kptr_ref Rong Tao
2022-12-28 19:03     ` Yonghong Song
2022-05-09  0:41 ` [PATCH bpf-next 3/9] libbpf: improve usability of field-based CO-RE helpers Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 4/9] selftests/bpf: use both syntaxes for " Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 5/9] libbpf: complete field-based CO-RE helpers with field offset helper Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 6/9] selftests/bpf: add bpf_core_field_offset() tests Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 7/9] libbpf: provide barrier() and barrier_var() in bpf_helpers.h Andrii Nakryiko
2022-05-09  0:41 ` Andrii Nakryiko [this message]
2022-05-10 15:34   ` [PATCH bpf-next 8/9] libbpf: automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary Nathan Chancellor
2022-05-10 18:47     ` Andrii Nakryiko
2022-05-09  0:41 ` [PATCH bpf-next 9/9] selftests/bpf: test libbpf's ringbuf size fix up logic Andrii Nakryiko
2022-05-09 15:20 ` [PATCH bpf-next 0/9] Misc libbpf fixes and small improvements patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220509004148.1801791-9-andrii@kernel.org \
    --to=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox