BPF List
* [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test
@ 2023-11-01  3:24 Hou Tao
  2023-11-01  3:24 ` [PATCH bpf-next 1/3] selftests/bpf: Use value with enough-size when updating per-cpu map Hou Tao
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Hou Tao @ 2023-11-01  3:24 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Song Liu,
	Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, Anton Protopopov,
	houtao1

From: Hou Tao <houtao1@huawei.com>

Hi,

BPF CI fails in map_percpu_stats_percpu_hash from time to time [1].
The failure seems to happen because the per-cpu BPF memory allocator
may fail to allocate a per-cpu pointer and cannot refill its free llist
in time, so bpf_map_update_elem() returns -ENOMEM.

Patch #1 fixes the size of the value passed to the per-cpu map update
API. The problem was found while fixing the ENOMEM problem, so it is
included in this patchset as well. Patches #2 and #3 mitigate the
ENOMEM problem by retrying the update operation for non-preallocated
per-cpu maps.

Please see individual patches for more details. And comments are always
welcome.

Regards,
Tao

[1]: https://github.com/kernel-patches/bpf/actions/runs/6713177520/job/18244865326?pr=5909

Hou Tao (3):
  selftests/bpf: Use value with enough-size when updating per-cpu map
  selftests/bpf: Export map_update_retriable()
  selftsets/bpf: Retry map update for non-preallocated per-cpu map

 .../bpf/map_tests/map_percpu_stats.c          | 39 +++++++++++++++++--
 tools/testing/selftests/bpf/test_maps.c       | 17 +++++---
 tools/testing/selftests/bpf/test_maps.h       |  5 +++
 3 files changed, 53 insertions(+), 8 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH bpf-next 1/3] selftests/bpf: Use value with enough-size when updating per-cpu map
  2023-11-01  3:24 [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test Hou Tao
@ 2023-11-01  3:24 ` Hou Tao
  2023-11-01  3:24 ` [PATCH bpf-next 2/3] selftests/bpf: Export map_update_retriable() Hou Tao
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Hou Tao @ 2023-11-01  3:24 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Song Liu,
	Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, Anton Protopopov,
	houtao1

From: Hou Tao <houtao1@huawei.com>

When updating a per-cpu map in the map_percpu_stats test,
patch_map_thread() only passes a 4-byte value to bpf_map_update_elem().
The expected size of the value is 8 * num_possible_cpus(), so fix it by
passing a value that is large enough for the per-cpu map update.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 .../bpf/map_tests/map_percpu_stats.c          | 21 ++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c b/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c
index 1a9eeefda9a87..8ad17d051ef8f 100644
--- a/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c
+++ b/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c
@@ -131,6 +131,12 @@ static bool is_lru(__u32 map_type)
 	       map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH;
 }
 
+static bool is_percpu(__u32 map_type)
+{
+	return map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	       map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH;
+}
+
 struct upsert_opts {
 	__u32 map_type;
 	int map_fd;
@@ -150,17 +156,26 @@ static int create_small_hash(void)
 
 static void *patch_map_thread(void *arg)
 {
+	/* 8KB is enough for 1024 CPUs. And it is shared between N_THREADS. */
+	static __u8 blob[8 << 10];
 	struct upsert_opts *opts = arg;
+	void *val_ptr;
 	int val;
 	int ret;
 	int i;
 
 	for (i = 0; i < opts->n; i++) {
-		if (opts->map_type == BPF_MAP_TYPE_HASH_OF_MAPS)
+		if (opts->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
 			val = create_small_hash();
-		else
+			val_ptr = &val;
+		} else if (is_percpu(opts->map_type)) {
+			val_ptr = blob;
+		} else {
 			val = rand();
-		ret = bpf_map_update_elem(opts->map_fd, &i, &val, 0);
+			val_ptr = &val;
+		}
+
+		ret = bpf_map_update_elem(opts->map_fd, &i, val_ptr, 0);
 		CHECK(ret < 0, "bpf_map_update_elem", "key=%d error: %s\n", i, strerror(errno));
 
 		if (opts->map_type == BPF_MAP_TYPE_HASH_OF_MAPS)
-- 
2.29.2



* [PATCH bpf-next 2/3] selftests/bpf: Export map_update_retriable()
  2023-11-01  3:24 [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test Hou Tao
  2023-11-01  3:24 ` [PATCH bpf-next 1/3] selftests/bpf: Use value with enough-size when updating per-cpu map Hou Tao
@ 2023-11-01  3:24 ` Hou Tao
  2023-11-01  3:24 ` [PATCH bpf-next 3/3] selftsets/bpf: Retry map update for non-preallocated per-cpu map Hou Tao
  2023-11-02 16:50 ` [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Hou Tao @ 2023-11-01  3:24 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Song Liu,
	Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, Anton Protopopov,
	houtao1

From: Hou Tao <houtao1@huawei.com>

Export map_update_retriable() to make it usable by other map_test
cases. These cases may need to retry only for specific errno values, so
add a callback parameter to let the caller of map_update_retriable()
decide whether or not the errno is retriable.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 tools/testing/selftests/bpf/test_maps.c | 17 ++++++++++++-----
 tools/testing/selftests/bpf/test_maps.h |  5 +++++
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 7fc00e423e4dd..767e0693df106 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -1396,13 +1396,18 @@ static void test_map_stress(void)
 #define MAX_DELAY_US 50000
 #define MIN_DELAY_RANGE_US 5000
 
-static int map_update_retriable(int map_fd, const void *key, const void *value,
-				int flags, int attempts)
+static bool retry_for_again_or_busy(int err)
+{
+	return (err == EAGAIN || err == EBUSY);
+}
+
+int map_update_retriable(int map_fd, const void *key, const void *value, int flags, int attempts,
+			 retry_for_error_fn need_retry)
 {
 	int delay = rand() % MIN_DELAY_RANGE_US;
 
 	while (bpf_map_update_elem(map_fd, key, value, flags)) {
-		if (!attempts || (errno != EAGAIN && errno != EBUSY))
+		if (!attempts || !need_retry(errno))
 			return -errno;
 
 		if (delay <= MAX_DELAY_US / 2)
@@ -1445,11 +1450,13 @@ static void test_update_delete(unsigned int fn, void *data)
 		key = value = i;
 
 		if (do_update) {
-			err = map_update_retriable(fd, &key, &value, BPF_NOEXIST, MAP_RETRIES);
+			err = map_update_retriable(fd, &key, &value, BPF_NOEXIST, MAP_RETRIES,
+						   retry_for_again_or_busy);
 			if (err)
 				printf("error %d %d\n", err, errno);
 			assert(err == 0);
-			err = map_update_retriable(fd, &key, &value, BPF_EXIST, MAP_RETRIES);
+			err = map_update_retriable(fd, &key, &value, BPF_EXIST, MAP_RETRIES,
+						   retry_for_again_or_busy);
 			if (err)
 				printf("error %d %d\n", err, errno);
 			assert(err == 0);
diff --git a/tools/testing/selftests/bpf/test_maps.h b/tools/testing/selftests/bpf/test_maps.h
index f6fbca761732f..e4ac704a536c1 100644
--- a/tools/testing/selftests/bpf/test_maps.h
+++ b/tools/testing/selftests/bpf/test_maps.h
@@ -4,6 +4,7 @@
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <stdbool.h>
 
 #define CHECK(condition, tag, format...) ({				\
 	int __ret = !!(condition);					\
@@ -16,4 +17,8 @@
 
 extern int skips;
 
+typedef bool (*retry_for_error_fn)(int err);
+int map_update_retriable(int map_fd, const void *key, const void *value, int flags, int attempts,
+			 retry_for_error_fn need_retry);
+
 #endif
-- 
2.29.2



* [PATCH bpf-next 3/3] selftsets/bpf: Retry map update for non-preallocated per-cpu map
  2023-11-01  3:24 [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test Hou Tao
  2023-11-01  3:24 ` [PATCH bpf-next 1/3] selftests/bpf: Use value with enough-size when updating per-cpu map Hou Tao
  2023-11-01  3:24 ` [PATCH bpf-next 2/3] selftests/bpf: Export map_update_retriable() Hou Tao
@ 2023-11-01  3:24 ` Hou Tao
  2023-11-02 16:50 ` [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Hou Tao @ 2023-11-01  3:24 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Song Liu,
	Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
	Stanislav Fomichev, Jiri Olsa, John Fastabend, Anton Protopopov,
	houtao1

From: Hou Tao <houtao1@huawei.com>

BPF CI fails in map_percpu_stats_percpu_hash from time to time [1].
The failure seems to happen because the per-cpu BPF memory allocator
may fail to allocate a per-cpu pointer and cannot refill its free llist
in time, so bpf_map_update_elem() returns -ENOMEM.

Mitigate the problem by retrying the update operation for
non-preallocated per-cpu maps.

[1]: https://github.com/kernel-patches/bpf/actions/runs/6713177520/job/18244865326?pr=5909

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 .../bpf/map_tests/map_percpu_stats.c          | 20 ++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c b/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c
index 8ad17d051ef8f..e152535e9e3ec 100644
--- a/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c
+++ b/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c
@@ -141,6 +141,7 @@ struct upsert_opts {
 	__u32 map_type;
 	int map_fd;
 	__u32 n;
+	bool retry_for_nomem;
 };
 
 static int create_small_hash(void)
@@ -154,6 +155,11 @@ static int create_small_hash(void)
 	return map_fd;
 }
 
+static bool retry_for_nomem_fn(int err)
+{
+	return err == ENOMEM;
+}
+
 static void *patch_map_thread(void *arg)
 {
 	/* 8KB is enough for 1024 CPUs. And it is shared between N_THREADS. */
@@ -175,7 +181,12 @@ static void *patch_map_thread(void *arg)
 			val_ptr = &val;
 		}
 
-		ret = bpf_map_update_elem(opts->map_fd, &i, val_ptr, 0);
+		/* 2 seconds may be enough ? */
+		if (opts->retry_for_nomem)
+			ret = map_update_retriable(opts->map_fd, &i, val_ptr, 0,
+						   40, retry_for_nomem_fn);
+		else
+			ret = bpf_map_update_elem(opts->map_fd, &i, val_ptr, 0);
 		CHECK(ret < 0, "bpf_map_update_elem", "key=%d error: %s\n", i, strerror(errno));
 
 		if (opts->map_type == BPF_MAP_TYPE_HASH_OF_MAPS)
@@ -296,6 +307,13 @@ static void __test(int map_fd)
 	else
 		opts.n /= 2;
 
+	/* per-cpu bpf memory allocator may not be able to allocate per-cpu
+	 * pointer successfully and it can not refill free llist timely, and
+	 * bpf_map_update_elem() will return -ENOMEM. so just retry to mitigate
+	 * the problem temporarily.
+	 */
+	opts.retry_for_nomem = is_percpu(opts.map_type) && (info.map_flags & BPF_F_NO_PREALLOC);
+
 	/*
 	 * Upsert keys [0, n) under some competition: with random values from
 	 * N_THREADS threads. Check values, then delete all elements and check
-- 
2.29.2



* Re: [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test
  2023-11-01  3:24 [PATCH bpf-next 0/3] selftests/bpf: Fixes for map_percpu_stats test Hou Tao
                   ` (2 preceding siblings ...)
  2023-11-01  3:24 ` [PATCH bpf-next 3/3] selftsets/bpf: Retry map update for non-preallocated per-cpu map Hou Tao
@ 2023-11-02 16:50 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-11-02 16:50 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, martin.lau, alexei.starovoitov, andrii, song, haoluo,
	yonghong.song, daniel, kpsingh, sdf, jolsa, john.fastabend, aspsk,
	houtao1

Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Wed,  1 Nov 2023 11:24:52 +0800 you wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Hi,
> 
> BPF CI failed due to map_percpu_stats_percpu_hash from time to time [1].
> It seems that the failure reason is per-cpu bpf memory allocator may not
> be able to allocate per-cpu pointer successfully and it can not refill
> free llist timely, and bpf_map_update_elem() will return -ENOMEM.
> 
> [...]

Here is the summary with links:
  - [bpf-next,1/3] selftests/bpf: Use value with enough-size when updating per-cpu map
    https://git.kernel.org/bpf/bpf-next/c/3f1f234e677b
  - [bpf-next,2/3] selftests/bpf: Export map_update_retriable()
    https://git.kernel.org/bpf/bpf-next/c/ff38534e8251
  - [bpf-next,3/3] selftsets/bpf: Retry map update for non-preallocated per-cpu map
    https://git.kernel.org/bpf/bpf-next/c/57688b2a543b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


