* [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes
@ 2019-02-27 22:46 Andrii Nakryiko
  2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-27 22:46 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, acme, netdev, bpf, daniel
  Cc: Andrii Nakryiko

This patchset fixes a bug in the btf_dedup() algorithm that causes an infinite
loop under a specific hash collision. It also exposes the ability to tune the
BTF deduplication table size, serving a double purpose: applications can
adjust the size according to the size of their BTF data, and tests get a
simple way to force hash collisions by setting the table size to 1.

- Patch #1 fixes bug in btf_dedup testing code that's checking strings
- Patch #2 fixes pointer arg formatting in btf.h
- Patch #3 adds option to specify custom dedup table size
- Patch #4 fixes aforementioned bug in btf_dedup
- Patch #5 adds test that validates the fix

Andrii Nakryiko (5):
  selftests/bpf: fix btf_dedup testing code
  libbpf: fix formatting for btf_ext__get_raw_data
  btf: allow to customize dedup hash table size
  btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  selftests/bpf: add btf_dedup test of FWD/STRUCT resolution

 tools/lib/bpf/btf.c                    | 49 ++++++++++++++++----------
 tools/lib/bpf/btf.h                    |  3 +-
 tools/testing/selftests/bpf/.gitignore |  1 +
 tools/testing/selftests/bpf/test_btf.c | 49 ++++++++++++++++++++++++--
 4 files changed, 81 insertions(+), 21 deletions(-)

-- 
2.17.1



* [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code
  2019-02-27 22:46 [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes Andrii Nakryiko
@ 2019-02-27 22:46 ` Andrii Nakryiko
  2019-02-28 18:17   ` Song Liu
                     ` (2 more replies)
  2019-02-27 22:46 ` [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data Andrii Nakryiko
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-27 22:46 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, acme, netdev, bpf, daniel
  Cc: Andrii Nakryiko

btf_dedup testing code doesn't account for length of struct btf_header
when calculating the start of a string section. This patch fixes this
problem.
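As a standalone sketch of the offset math being fixed (btf_header_sketch and
btf_strs() below are hypothetical stand-ins for illustration, not libbpf API):
the header's str_off is relative to the end of the header, so the header size
must be added when locating the string data.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for struct btf_header; field names follow the
 * real header, but this is an illustration, not libbpf code. */
struct btf_header_sketch {
	unsigned short magic;
	unsigned char version;
	unsigned char flags;
	unsigned int hdr_len;
	unsigned int type_off;
	unsigned int type_len;
	unsigned int str_off;	/* relative to the END of the header */
	unsigned int str_len;
};

/* str_off is relative to the end of the header, so sizeof(*hdr) must
 * be added when locating the string data -- the off-by-header bug this
 * patch fixes in the test code. */
static const char *btf_strs(const void *btf_data)
{
	const struct btf_header_sketch *hdr = btf_data;

	return (const char *)btf_data + sizeof(*hdr) + hdr->str_off;
}
```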

Fixes: 49b57e0d01db ("tools/bpf: remove btf__get_strings() superseded by raw data API")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/.gitignore | 1 +
 tools/testing/selftests/bpf/test_btf.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index e47168d1257d..3b74d23fffab 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -14,6 +14,7 @@ feature
 test_libbpf_open
 test_sock
 test_sock_addr
+test_sock_fields
 urandom_read
 test_btf
 test_sockmap
diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index 02d314383a9c..1426c0a905c8 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -5936,9 +5936,9 @@ static int do_test_dedup(unsigned int test_num)
 	}
 
 	test_hdr = test_btf_data;
-	test_strs = test_btf_data + test_hdr->str_off;
+	test_strs = test_btf_data + sizeof(*test_hdr) + test_hdr->str_off;
 	expect_hdr = expect_btf_data;
-	expect_strs = expect_btf_data + expect_hdr->str_off;
+	expect_strs = expect_btf_data + sizeof(*test_hdr) + expect_hdr->str_off;
 	if (CHECK(test_hdr->str_len != expect_hdr->str_len,
 		  "test_hdr->str_len:%u != expect_hdr->str_len:%u",
 		  test_hdr->str_len, expect_hdr->str_len)) {
-- 
2.17.1



* [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data
  2019-02-27 22:46 [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes Andrii Nakryiko
  2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
@ 2019-02-27 22:46 ` Andrii Nakryiko
  2019-02-28 18:17   ` Song Liu
  2019-02-28 18:53   ` Arnaldo Carvalho de Melo
  2019-02-27 22:46 ` [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size Andrii Nakryiko
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-27 22:46 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, acme, netdev, bpf, daniel
  Cc: Andrii Nakryiko

Fix invalid formatting of pointer arg.

Fixes: ae4ab4b4117d ("btf: expose API to work with raw btf_ext data")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/btf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 94bbc249b0f1..b60bb7cf5fff 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -76,7 +76,7 @@ LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
 
 LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);
 LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
-LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext* btf_ext,
+LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,
 					     __u32 *size);
 LIBBPF_API int btf_ext__reloc_func_info(const struct btf *btf,
 					const struct btf_ext *btf_ext,
-- 
2.17.1



* [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-27 22:46 [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes Andrii Nakryiko
  2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
  2019-02-27 22:46 ` [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data Andrii Nakryiko
@ 2019-02-27 22:46 ` Andrii Nakryiko
  2019-02-28 18:27   ` Song Liu
  2019-02-28 18:57   ` Arnaldo Carvalho de Melo
  2019-02-27 22:46 ` [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD Andrii Nakryiko
  2019-02-27 22:46 ` [PATCH bpf-next 5/5] selftests/bpf: add btf_dedup test of FWD/STRUCT resolution Andrii Nakryiko
  4 siblings, 2 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-27 22:46 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, acme, netdev, bpf, daniel
  Cc: Andrii Nakryiko

Default size of dedup table (16k) is good enough for most binaries, even
typical vmlinux images. But there are cases of binaries with huge amount
of BTF types (e.g., allyesconfig variants of kernel), which benefit from
having bigger dedup table size to lower amount of unnecessary hash
collisions. Tools like pahole, thus, can tune this parameter to reach
optimal performance.

This change also serves double purpose of allowing tests to force hash
collisions to test some corner cases, used in follow up patch.
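The size-selection logic in btf_dedup_new() can be sketched as a standalone
helper (the function name is illustrative, not part of the libbpf API): round
the requested size up to the next power of two, cap at 2^31, and fall back to
the 16k default when no size is given.

```c
#include <assert.h>

/* Mirror of the size-selection loop added to btf_dedup_new() in this
 * patch; illustrative helper, not libbpf API. */
static unsigned int pick_dedup_table_size(unsigned int requested)
{
	const unsigned int dflt = 1u << 14; /* BTF_DEDUP_TABLE_DEFAULT_SIZE */
	unsigned int i;

	if (!requested)
		requested = dflt;
	/* find smallest power of two >= requested, capped at 2^31 */
	for (i = 0; i < 31 && (1u << i) < requested; i++)
		;
	return 1u << i;
}
```

A power-of-two size matters because bucket selection masks the hash with
(size - 1), which only works as a modulo when size is a power of two.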

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
 tools/lib/bpf/btf.h |  1 +
 2 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 68b50e9bbde1..6bbb710216e6 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
 	return err;
 }
 
-#define BTF_DEDUP_TABLE_SIZE_LOG 14
-#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
+#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
 #define BTF_UNPROCESSED_ID ((__u32)-1)
 #define BTF_IN_PROGRESS_ID ((__u32)-2)
 
@@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
 #undef GOLDEN_RATIO_PRIME
 }
 
-#define for_each_hash_node(table, hash, node) \
-	for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
+#define for_each_dedup_cand(d, hash, node) \
+	for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
+	     node;							   \
+	     node = node->next)
 
 static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
 {
 	struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
+	int bucket = hash & (d->opts.dedup_table_size - 1);
 
 	if (!node)
 		return -ENOMEM;
 	node->type_id = type_id;
-	node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
-	d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
+	node->next = d->dedup_table[bucket];
+	d->dedup_table[bucket] = node;
 	return 0;
 }
 
@@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
 	if (!d->dedup_table)
 		return;
 
-	for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
+	for (i = 0; i < d->opts.dedup_table_size; i++) {
 		while (d->dedup_table[i]) {
 			tmp = d->dedup_table[i];
 			d->dedup_table[i] = tmp->next;
@@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
 	if (!d)
 		return ERR_PTR(-ENOMEM);
 
+	d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
+	/* ensure table size is power of two and limit to 2G */
+	d->opts.dedup_table_size = opts && opts->dedup_table_size
+		? opts->dedup_table_size
+		: BTF_DEDUP_TABLE_DEFAULT_SIZE;
+	for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
+		;
+	d->opts.dedup_table_size = 1 << i;
+
 	d->btf = btf;
 	d->btf_ext = btf_ext;
 
-	d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
+	d->dedup_table = calloc(d->opts.dedup_table_size,
 				sizeof(struct btf_dedup_node *));
 	if (!d->dedup_table) {
 		err = -ENOMEM;
@@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
 	for (i = 0; i <= btf->nr_types; i++)
 		d->hypot_map[i] = BTF_UNPROCESSED_ID;
 
-	d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
-
 done:
 	if (err) {
 		btf_dedup_free(d);
@@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
 
 	case BTF_KIND_INT:
 		h = btf_hash_int(t);
-		for_each_hash_node(d->dedup_table, h, cand_node) {
+		for_each_dedup_cand(d, h, cand_node) {
 			cand = d->btf->types[cand_node->type_id];
 			if (btf_equal_int(t, cand)) {
 				new_id = cand_node->type_id;
@@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
 
 	case BTF_KIND_ENUM:
 		h = btf_hash_enum(t);
-		for_each_hash_node(d->dedup_table, h, cand_node) {
+		for_each_dedup_cand(d, h, cand_node) {
 			cand = d->btf->types[cand_node->type_id];
 			if (btf_equal_enum(t, cand)) {
 				new_id = cand_node->type_id;
@@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
 
 	case BTF_KIND_FWD:
 		h = btf_hash_common(t);
-		for_each_hash_node(d->dedup_table, h, cand_node) {
+		for_each_dedup_cand(d, h, cand_node) {
 			cand = d->btf->types[cand_node->type_id];
 			if (btf_equal_common(t, cand)) {
 				new_id = cand_node->type_id;
@@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
 		return 0;
 
 	h = btf_hash_struct(t);
-	for_each_hash_node(d->dedup_table, h, cand_node) {
+	for_each_dedup_cand(d, h, cand_node) {
 		int eq;
 
 		btf_dedup_clear_hypot_map(d);
@@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
 		t->type = ref_type_id;
 
 		h = btf_hash_common(t);
-		for_each_hash_node(d->dedup_table, h, cand_node) {
+		for_each_dedup_cand(d, h, cand_node) {
 			cand = d->btf->types[cand_node->type_id];
 			if (btf_equal_common(t, cand)) {
 				new_id = cand_node->type_id;
@@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
 		info->index_type = ref_type_id;
 
 		h = btf_hash_array(t);
-		for_each_hash_node(d->dedup_table, h, cand_node) {
+		for_each_dedup_cand(d, h, cand_node) {
 			cand = d->btf->types[cand_node->type_id];
 			if (btf_equal_array(t, cand)) {
 				new_id = cand_node->type_id;
@@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
 		}
 
 		h = btf_hash_fnproto(t);
-		for_each_hash_node(d->dedup_table, h, cand_node) {
+		for_each_dedup_cand(d, h, cand_node) {
 			cand = d->btf->types[cand_node->type_id];
 			if (btf_equal_fnproto(t, cand)) {
 				new_id = cand_node->type_id;
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index b60bb7cf5fff..28a1e1e59861 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
 LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
 
 struct btf_dedup_opts {
+	unsigned int dedup_table_size;
 	bool dont_resolve_fwds;
 };
 
-- 
2.17.1



* [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  2019-02-27 22:46 [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2019-02-27 22:46 ` [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size Andrii Nakryiko
@ 2019-02-27 22:46 ` Andrii Nakryiko
  2019-02-28 18:18   ` Yonghong Song
  2019-02-28 18:29   ` Song Liu
  2019-02-27 22:46 ` [PATCH bpf-next 5/5] selftests/bpf: add btf_dedup test of FWD/STRUCT resolution Andrii Nakryiko
  4 siblings, 2 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-27 22:46 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, acme, netdev, bpf, daniel
  Cc: Andrii Nakryiko

When checking available canonical candidates for struct/union algorithm
utilizes btf_dedup_is_equiv to determine if candidate is suitable. This
check is not enough when candidate is corresponding FWD for that
struct/union, because according to equivalence logic they are
equivalent. When it so happens that FWD and STRUCT/UNION end in hashing
to the same bucket, it's possible to create remapping loop from FWD to
STRUCT and STRUCT to same FWD, which will cause btf_dedup() to loop
forever.

This patch fixes the issue by additionally checking that type and
canonical candidate are strictly equal (utilizing btf_equal_struct).
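To see why this loops forever, consider a toy model of the remap table (purely
illustrative, not the actual btf_dedup data structures): resolution follows the
map until an id maps to itself, and a FWD<->STRUCT back-edge means that fixed
point is never reached. The max_steps cap below stands in for the otherwise
infinite loop.

```c
#include <assert.h>

/* Toy model (not libbpf code): ids map to their canonical ids, and
 * resolution chases the map until it reaches a fixed point. If FWD
 * (id 1) maps to STRUCT (id 2) and STRUCT maps back to FWD, the walk
 * ping-pongs forever; max_steps bounds it for demonstration. */
static unsigned int resolve(const unsigned int *map, unsigned int id,
			    int max_steps)
{
	while (max_steps-- > 0 && map[id] != id)
		id = map[id];
	return id;
}
```

The patch prevents the back-edge from being created in the first place by
requiring strict equality before a candidate can become the canonical type.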

Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/btf.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 6bbb710216e6..53db26d158c9 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
 static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
 {
 	struct btf_dedup_node *cand_node;
-	struct btf_type *t;
+	struct btf_type *cand_type, *t;
 	/* if we don't find equivalent type, then we are canonical */
 	__u32 new_id = type_id;
 	__u16 kind;
@@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
 	for_each_dedup_cand(d, h, cand_node) {
 		int eq;
 
+		cand_type = d->btf->types[cand_node->type_id];
+		if (!btf_equal_struct(t, cand_type))
+			continue;
+
 		btf_dedup_clear_hypot_map(d);
 		eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
 		if (eq < 0)
-- 
2.17.1



* [PATCH bpf-next 5/5] selftests/bpf: add btf_dedup test of FWD/STRUCT resolution
  2019-02-27 22:46 [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2019-02-27 22:46 ` [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD Andrii Nakryiko
@ 2019-02-27 22:46 ` Andrii Nakryiko
  2019-02-28 18:29   ` Song Liu
  4 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-27 22:46 UTC (permalink / raw)
  To: andrii.nakryiko, kernel-team, ast, acme, netdev, bpf, daniel
  Cc: Andrii Nakryiko

This patch adds a btf_dedup test exercising the logic of STRUCT<->FWD
resolution and validating that a STRUCT is not resolved to a FWD. It also
forces hash collisions, making FWD and STRUCT candidates for each other.
Previously this condition caused an infinite loop, due to the FWD pointing
to the STRUCT and the STRUCT pointing back to its FWD.
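Why dedup_table_size = 1 forces collisions: buckets are selected by masking the
hash with (size - 1), so a one-bucket table has mask 0 and every type lands in
bucket 0. A minimal sketch of that bucketing (illustrative helper, not libbpf
API):

```c
#include <assert.h>

/* Bucket selection as used by the dedup table: mask the hash with
 * (table_size - 1). With table_size == 1 the mask is 0, so all hashes
 * collide into bucket 0. Illustrative, not libbpf code. */
static unsigned int bucket(unsigned int hash, unsigned int table_size)
{
	return hash & (table_size - 1);	/* table_size is a power of two */
}
```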

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/test_btf.c | 45 ++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index 1426c0a905c8..38797aa627a7 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -5731,6 +5731,51 @@ const struct btf_dedup_test dedup_tests[] = {
 		.dont_resolve_fwds = false,
 	},
 },
+{
+	.descr = "dedup: struct <-> fwd resolution w/ hash collision",
+	/*
+	 * // CU 1:
+	 * struct x;
+	 * struct s {
+	 *	struct x *x;
+	 * };
+	 * // CU 2:
+	 * struct x {};
+	 * struct s {
+	 *	struct x *x;
+	 * };
+	 */
+	.input = {
+		.raw_types = {
+			/* CU 1 */
+			BTF_FWD_ENC(NAME_TBD, 0 /* struct fwd */),	/* [1] fwd x      */
+			BTF_PTR_ENC(1),					/* [2] ptr -> [1] */
+			BTF_STRUCT_ENC(NAME_TBD, 1, 8),			/* [3] struct s   */
+				BTF_MEMBER_ENC(NAME_TBD, 2, 0),
+			/* CU 2 */
+			BTF_STRUCT_ENC(NAME_TBD, 0, 0),			/* [4] struct x   */
+			BTF_PTR_ENC(4),					/* [5] ptr -> [4] */
+			BTF_STRUCT_ENC(NAME_TBD, 1, 8),			/* [6] struct s   */
+				BTF_MEMBER_ENC(NAME_TBD, 5, 0),
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0x\0s\0x\0x\0s\0x\0"),
+	},
+	.expect = {
+		.raw_types = {
+			BTF_PTR_ENC(3),					/* [1] ptr -> [3] */
+			BTF_STRUCT_ENC(NAME_TBD, 1, 8),			/* [2] struct s   */
+				BTF_MEMBER_ENC(NAME_TBD, 1, 0),
+			BTF_STRUCT_ENC(NAME_NTH(2), 0, 0),		/* [3] struct x   */
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0s\0x"),
+	},
+	.opts = {
+		.dont_resolve_fwds = false,
+		.dedup_table_size = 1, /* force hash collisions */
+	},
+},
 {
 	.descr = "dedup: all possible kinds (no duplicates)",
 	.input = {
-- 
2.17.1



* Re: [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code
  2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
@ 2019-02-28 18:17   ` Song Liu
  2019-02-28 18:19   ` Yonghong Song
  2019-02-28 18:52   ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-02-28 18:17 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii.nakryiko, kernel-team, ast, Arnaldo Carvalho de Melo,
	Networking, bpf, Daniel Borkmann

On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> btf_dedup testing code doesn't account for length of struct btf_header
> when calculating the start of a string section. This patch fixes this
> problem.
>
> Fixes: 49b57e0d01db ("tools/bpf: remove btf__get_strings() superseded by raw data API")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/testing/selftests/bpf/.gitignore | 1 +
>  tools/testing/selftests/bpf/test_btf.c | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
> index e47168d1257d..3b74d23fffab 100644
> --- a/tools/testing/selftests/bpf/.gitignore
> +++ b/tools/testing/selftests/bpf/.gitignore
> @@ -14,6 +14,7 @@ feature
>  test_libbpf_open
>  test_sock
>  test_sock_addr
> +test_sock_fields
>  urandom_read
>  test_btf
>  test_sockmap
> diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> index 02d314383a9c..1426c0a905c8 100644
> --- a/tools/testing/selftests/bpf/test_btf.c
> +++ b/tools/testing/selftests/bpf/test_btf.c
> @@ -5936,9 +5936,9 @@ static int do_test_dedup(unsigned int test_num)
>         }
>
>         test_hdr = test_btf_data;
> -       test_strs = test_btf_data + test_hdr->str_off;
> +       test_strs = test_btf_data + sizeof(*test_hdr) + test_hdr->str_off;
>         expect_hdr = expect_btf_data;
> -       expect_strs = expect_btf_data + expect_hdr->str_off;
> +       expect_strs = expect_btf_data + sizeof(*test_hdr) + expect_hdr->str_off;
>         if (CHECK(test_hdr->str_len != expect_hdr->str_len,
>                   "test_hdr->str_len:%u != expect_hdr->str_len:%u",
>                   test_hdr->str_len, expect_hdr->str_len)) {
> --
> 2.17.1
>


* Re: [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data
  2019-02-27 22:46 ` [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data Andrii Nakryiko
@ 2019-02-28 18:17   ` Song Liu
  2019-02-28 18:53   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-02-28 18:17 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii.nakryiko, kernel-team, ast, Arnaldo Carvalho de Melo,
	Networking, bpf, Daniel Borkmann

On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> Fix invalid formatting of pointer arg.
>
> Fixes: ae4ab4b4117d ("btf: expose API to work with raw btf_ext data")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/lib/bpf/btf.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 94bbc249b0f1..b60bb7cf5fff 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -76,7 +76,7 @@ LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
>
>  LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);
>  LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
> -LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext* btf_ext,
> +LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,
>                                              __u32 *size);
>  LIBBPF_API int btf_ext__reloc_func_info(const struct btf *btf,
>                                         const struct btf_ext *btf_ext,
> --
> 2.17.1
>


* Re: [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  2019-02-27 22:46 ` [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD Andrii Nakryiko
@ 2019-02-28 18:18   ` Yonghong Song
  2019-02-28 19:07     ` Andrii Nakryiko
  2019-02-28 18:29   ` Song Liu
  1 sibling, 1 reply; 26+ messages in thread
From: Yonghong Song @ 2019-02-28 18:18 UTC (permalink / raw)
  To: Andrii Nakryiko, andrii.nakryiko@gmail.com, Kernel Team,
	Alexei Starovoitov, acme@kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, daniel@iogearbox.net



On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
> When checking available canonical candidates for struct/union algorithm
> utilizes btf_dedup_is_equiv to determine if candidate is suitable. This
> check is not enough when candidate is corresponding FWD for that
> struct/union, because according to equivalence logic they are
> equivalent. When it so happens that FWD and STRUCT/UNION end in hashing
> to the same bucket, it's possible to create remapping loop from FWD to
> STRUCT and STRUCT to same FWD, which will cause btf_dedup() to loop
> forever.
> 
> This patch fixes the issue by additionally checking that type and
> canonical candidate are strictly equal (utilizing btf_equal_struct).

It looks like btf_equal_struct() checks equality except for member type
ids. Maybe call it btf_almost_equal_struct() or
something like that?

> 
> Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>   tools/lib/bpf/btf.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 6bbb710216e6..53db26d158c9 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
>   static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>   {
>   	struct btf_dedup_node *cand_node;
> -	struct btf_type *t;
> +	struct btf_type *cand_type, *t;
>   	/* if we don't find equivalent type, then we are canonical */
>   	__u32 new_id = type_id;
>   	__u16 kind;
> @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>   	for_each_dedup_cand(d, h, cand_node) {
>   		int eq;
>   
> +		cand_type = d->btf->types[cand_node->type_id];
> +		if (!btf_equal_struct(t, cand_type))

The comment for this btf_equal_struct is not quite right.
/*
  * Check structural compatibility of two FUNC_PROTOs, ignoring 
referenced type
  * IDs. This check is performed during type graph equivalence check and
  * referenced types equivalence is checked separately.
  */
static bool btf_equal_struct(struct btf_type *t1, struct btf_type *t2)

It should be two "struct/union types".

> +			continue;
> +

I did not trace how the infinite loop happens in the algorithm. But the above
change is certainly a correct one; you want to do deduplication only
after everything else (except member types) is equal?

If the bug is due to a cycle in the struct->fwd and fwd->struct mappings,
maybe a simple check for whether such a cycle exists before updating
the mapping would also work? I am not proposing this fix, but want
to understand the issue better.




>   		btf_dedup_clear_hypot_map(d);
>   		eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
>   		if (eq < 0)
> 


* Re: [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code
  2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
  2019-02-28 18:17   ` Song Liu
@ 2019-02-28 18:19   ` Yonghong Song
  2019-02-28 18:52   ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 26+ messages in thread
From: Yonghong Song @ 2019-02-28 18:19 UTC (permalink / raw)
  To: Andrii Nakryiko, andrii.nakryiko@gmail.com, Kernel Team,
	Alexei Starovoitov, acme@kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, daniel@iogearbox.net



On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
> btf_dedup testing code doesn't account for length of struct btf_header
> when calculating the start of a string section. This patch fixes this
> problem.
> 
> Fixes: 49b57e0d01db ("tools/bpf: remove btf__get_strings() superseded by raw data API")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Yonghong Song <yhs@fb.com>
for Patch #1 - #3.

> ---
>   tools/testing/selftests/bpf/.gitignore | 1 +
>   tools/testing/selftests/bpf/test_btf.c | 4 ++--
>   2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
> index e47168d1257d..3b74d23fffab 100644
> --- a/tools/testing/selftests/bpf/.gitignore
> +++ b/tools/testing/selftests/bpf/.gitignore
> @@ -14,6 +14,7 @@ feature
>   test_libbpf_open
>   test_sock
>   test_sock_addr
> +test_sock_fields
>   urandom_read
>   test_btf
>   test_sockmap
> diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> index 02d314383a9c..1426c0a905c8 100644
> --- a/tools/testing/selftests/bpf/test_btf.c
> +++ b/tools/testing/selftests/bpf/test_btf.c
> @@ -5936,9 +5936,9 @@ static int do_test_dedup(unsigned int test_num)
>   	}
>   
>   	test_hdr = test_btf_data;
> -	test_strs = test_btf_data + test_hdr->str_off;
> +	test_strs = test_btf_data + sizeof(*test_hdr) + test_hdr->str_off;
>   	expect_hdr = expect_btf_data;
> -	expect_strs = expect_btf_data + expect_hdr->str_off;
> +	expect_strs = expect_btf_data + sizeof(*test_hdr) + expect_hdr->str_off;
>   	if (CHECK(test_hdr->str_len != expect_hdr->str_len,
>   		  "test_hdr->str_len:%u != expect_hdr->str_len:%u",
>   		  test_hdr->str_len, expect_hdr->str_len)) {
> 


* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-27 22:46 ` [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size Andrii Nakryiko
@ 2019-02-28 18:27   ` Song Liu
  2019-02-28 18:51     ` Andrii Nakryiko
  2019-02-28 18:57   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 26+ messages in thread
From: Song Liu @ 2019-02-28 18:27 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, kernel-team, ast, Arnaldo Carvalho de Melo,
	Networking, bpf, Daniel Borkmann

On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> Default size of dedup table (16k) is good enough for most binaries, even
> typical vmlinux images. But there are cases of binaries with huge amount
> of BTF types (e.g., allyesconfig variants of kernel), which benefit from
> having bigger dedup table size to lower amount of unnecessary hash
> collisions. Tools like pahole, thus, can tune this parameter to reach
> optimal performance.
>
> This change also serves double purpose of allowing tests to force hash
> collisions to test some corner cases, used in follow up patch.
>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
>  tools/lib/bpf/btf.h |  1 +
>  2 files changed, 27 insertions(+), 17 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 68b50e9bbde1..6bbb710216e6 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
>         return err;
>  }
>
> -#define BTF_DEDUP_TABLE_SIZE_LOG 14
> -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
> +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
>  #define BTF_UNPROCESSED_ID ((__u32)-1)
>  #define BTF_IN_PROGRESS_ID ((__u32)-2)
>
> @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
>  #undef GOLDEN_RATIO_PRIME
>  }
>
> -#define for_each_hash_node(table, hash, node) \
> -       for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
> +#define for_each_dedup_cand(d, hash, node) \
> +       for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
> +            node;                                                         \
> +            node = node->next)
>
>  static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
>  {
>         struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
> +       int bucket = hash & (d->opts.dedup_table_size - 1);
>
>         if (!node)
>                 return -ENOMEM;
>         node->type_id = type_id;
> -       node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
> -       d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
> +       node->next = d->dedup_table[bucket];
> +       d->dedup_table[bucket] = node;
>         return 0;
>  }
>
> @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
>         if (!d->dedup_table)
>                 return;
>
> -       for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
> +       for (i = 0; i < d->opts.dedup_table_size; i++) {
>                 while (d->dedup_table[i]) {
>                         tmp = d->dedup_table[i];
>                         d->dedup_table[i] = tmp->next;
> @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>         if (!d)
>                 return ERR_PTR(-ENOMEM);
>
> +       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> +       /* ensure table size is power of two and limit to 2G */
> +       d->opts.dedup_table_size = opts && opts->dedup_table_size
> +               ? opts->dedup_table_size
> +               : BTF_DEDUP_TABLE_DEFAULT_SIZE;
> +       for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
> +               ;
> +       d->opts.dedup_table_size = 1 << i;
> +
So this is the roundup log2 logic? How about we define some macros
to make it cleaner? Like

#define BTF_DEDUP_TABLE_MAX_SIZE xxxx

Also, how big a hash table do we need for allyesconfig? 2G seems really
big to me.

>         d->btf = btf;
>         d->btf_ext = btf_ext;
>
> -       d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
> +       d->dedup_table = calloc(d->opts.dedup_table_size,
>                                 sizeof(struct btf_dedup_node *));
>         if (!d->dedup_table) {
>                 err = -ENOMEM;
> @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>         for (i = 0; i <= btf->nr_types; i++)
>                 d->hypot_map[i] = BTF_UNPROCESSED_ID;
>
> -       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> -
>  done:
>         if (err) {
>                 btf_dedup_free(d);
> @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>
>         case BTF_KIND_INT:
>                 h = btf_hash_int(t);
> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> +               for_each_dedup_cand(d, h, cand_node) {
>                         cand = d->btf->types[cand_node->type_id];
>                         if (btf_equal_int(t, cand)) {
>                                 new_id = cand_node->type_id;
> @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>
>         case BTF_KIND_ENUM:
>                 h = btf_hash_enum(t);
> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> +               for_each_dedup_cand(d, h, cand_node) {
>                         cand = d->btf->types[cand_node->type_id];
>                         if (btf_equal_enum(t, cand)) {
>                                 new_id = cand_node->type_id;
> @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>
>         case BTF_KIND_FWD:
>                 h = btf_hash_common(t);
> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> +               for_each_dedup_cand(d, h, cand_node) {
>                         cand = d->btf->types[cand_node->type_id];
>                         if (btf_equal_common(t, cand)) {
>                                 new_id = cand_node->type_id;
> @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>                 return 0;
>
>         h = btf_hash_struct(t);
> -       for_each_hash_node(d->dedup_table, h, cand_node) {
> +       for_each_dedup_cand(d, h, cand_node) {
>                 int eq;
>
>                 btf_dedup_clear_hypot_map(d);
> @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>                 t->type = ref_type_id;
>
>                 h = btf_hash_common(t);
> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> +               for_each_dedup_cand(d, h, cand_node) {
>                         cand = d->btf->types[cand_node->type_id];
>                         if (btf_equal_common(t, cand)) {
>                                 new_id = cand_node->type_id;
> @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>                 info->index_type = ref_type_id;
>
>                 h = btf_hash_array(t);
> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> +               for_each_dedup_cand(d, h, cand_node) {
>                         cand = d->btf->types[cand_node->type_id];
>                         if (btf_equal_array(t, cand)) {
>                                 new_id = cand_node->type_id;
> @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>                 }
>
>                 h = btf_hash_fnproto(t);
> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> +               for_each_dedup_cand(d, h, cand_node) {
>                         cand = d->btf->types[cand_node->type_id];
>                         if (btf_equal_fnproto(t, cand)) {
>                                 new_id = cand_node->type_id;
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index b60bb7cf5fff..28a1e1e59861 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
>  LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
>
>  struct btf_dedup_opts {
> +       unsigned int dedup_table_size;
>         bool dont_resolve_fwds;
>  };
>
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  2019-02-27 22:46 ` [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD Andrii Nakryiko
  2019-02-28 18:18   ` Yonghong Song
@ 2019-02-28 18:29   ` Song Liu
  1 sibling, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-02-28 18:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, kernel-team, ast, Arnaldo Carvalho de Melo,
	Networking, bpf, Daniel Borkmann

On Wed, Feb 27, 2019 at 2:48 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> When checking available canonical candidates for a struct/union, the
> algorithm uses btf_dedup_is_equiv to determine whether a candidate is
> suitable. This check is not enough when the candidate is the
> corresponding FWD for that struct/union, because according to the
> equivalence logic they are equivalent. When the FWD and STRUCT/UNION
> happen to hash into the same bucket, it's possible to create a
> remapping loop from the FWD to the STRUCT and from the STRUCT back to
> the same FWD, which causes btf_dedup() to loop forever.
>
> This patch fixes the issue by additionally checking that the type and
> the canonical candidate are strictly equal (using btf_equal_struct).
>
> Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/lib/bpf/btf.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 6bbb710216e6..53db26d158c9 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
>  static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>  {
>         struct btf_dedup_node *cand_node;
> -       struct btf_type *t;
> +       struct btf_type *cand_type, *t;
>         /* if we don't find equivalent type, then we are canonical */
>         __u32 new_id = type_id;
>         __u16 kind;
> @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>         for_each_dedup_cand(d, h, cand_node) {
>                 int eq;
>
> +               cand_type = d->btf->types[cand_node->type_id];
> +               if (!btf_equal_struct(t, cand_type))
> +                       continue;
> +
>                 btf_dedup_clear_hypot_map(d);
>                 eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
>                 if (eq < 0)
> --
> 2.17.1
>


* Re: [PATCH bpf-next 5/5] selftests/bpf: add btf_dedup test of FWD/STRUCT resolution
  2019-02-27 22:46 ` [PATCH bpf-next 5/5] selftests/bpf: add btf_dedup test of FWD/STRUCT resolution Andrii Nakryiko
@ 2019-02-28 18:29   ` Song Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-02-28 18:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, kernel-team, ast, Arnaldo Carvalho de Melo,
	Networking, bpf, Daniel Borkmann

On Wed, Feb 27, 2019 at 2:48 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> This patch adds a btf_dedup test exercising the logic of STRUCT<->FWD
> resolution and validating that a STRUCT is not resolved to a FWD. It
> also forces hash collisions, making the FWD and the STRUCT candidates
> for each other. Previously this condition caused an infinite loop, with
> the FWD pointing to the STRUCT and the STRUCT pointing back to its FWD.
>
> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/testing/selftests/bpf/test_btf.c | 45 ++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> index 1426c0a905c8..38797aa627a7 100644
> --- a/tools/testing/selftests/bpf/test_btf.c
> +++ b/tools/testing/selftests/bpf/test_btf.c
> @@ -5731,6 +5731,51 @@ const struct btf_dedup_test dedup_tests[] = {
>                 .dont_resolve_fwds = false,
>         },
>  },
> +{
> +       .descr = "dedup: struct <-> fwd resolution w/ hash collision",
> +       /*
> +        * // CU 1:
> +        * struct x;
> +        * struct s {
> +        *      struct x *x;
> +        * };
> +        * // CU 2:
> +        * struct x {};
> +        * struct s {
> +        *      struct x *x;
> +        * };
> +        */
> +       .input = {
> +               .raw_types = {
> +                       /* CU 1 */
> +                       BTF_FWD_ENC(NAME_TBD, 0 /* struct fwd */),      /* [1] fwd x      */
> +                       BTF_PTR_ENC(1),                                 /* [2] ptr -> [1] */
> +                       BTF_STRUCT_ENC(NAME_TBD, 1, 8),                 /* [3] struct s   */
> +                               BTF_MEMBER_ENC(NAME_TBD, 2, 0),
> +                       /* CU 2 */
> +                       BTF_STRUCT_ENC(NAME_TBD, 0, 0),                 /* [4] struct x   */
> +                       BTF_PTR_ENC(4),                                 /* [5] ptr -> [4] */
> +                       BTF_STRUCT_ENC(NAME_TBD, 1, 8),                 /* [6] struct s   */
> +                               BTF_MEMBER_ENC(NAME_TBD, 5, 0),
> +                       BTF_END_RAW,
> +               },
> +               BTF_STR_SEC("\0x\0s\0x\0x\0s\0x\0"),
> +       },
> +       .expect = {
> +               .raw_types = {
> +                       BTF_PTR_ENC(3),                                 /* [1] ptr -> [3] */
> +                       BTF_STRUCT_ENC(NAME_TBD, 1, 8),                 /* [2] struct s   */
> +                               BTF_MEMBER_ENC(NAME_TBD, 1, 0),
> +                       BTF_STRUCT_ENC(NAME_NTH(2), 0, 0),              /* [3] struct x   */
> +                       BTF_END_RAW,
> +               },
> +               BTF_STR_SEC("\0s\0x"),
> +       },
> +       .opts = {
> +               .dont_resolve_fwds = false,
> +               .dedup_table_size = 1, /* force hash collisions */
> +       },
> +},
>  {
>         .descr = "dedup: all possible kinds (no duplicates)",
>         .input = {
> --
> 2.17.1
>


* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-28 18:27   ` Song Liu
@ 2019-02-28 18:51     ` Andrii Nakryiko
  2019-02-28 19:02       ` Song Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 18:51 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Arnaldo Carvalho de Melo, Networking, bpf, Daniel Borkmann

On Thu, Feb 28, 2019 at 10:27 AM Song Liu <liu.song.a23@gmail.com> wrote:
>
> On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > The default size of the dedup table (16k) is good enough for most
> > binaries, even typical vmlinux images. But there are binaries with a
> > huge number of BTF types (e.g., allyesconfig variants of the kernel)
> > that benefit from a bigger dedup table size, which lowers the number
> > of unnecessary hash collisions. Tools like pahole can thus tune this
> > parameter to reach optimal performance.
> >
> > This change also serves the double purpose of letting tests force hash
> > collisions to exercise corner cases; this is used in a follow-up patch.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
> >  tools/lib/bpf/btf.h |  1 +
> >  2 files changed, 27 insertions(+), 17 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 68b50e9bbde1..6bbb710216e6 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
> >         return err;
> >  }
> >
> > -#define BTF_DEDUP_TABLE_SIZE_LOG 14
> > -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
> > +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
> >  #define BTF_UNPROCESSED_ID ((__u32)-1)
> >  #define BTF_IN_PROGRESS_ID ((__u32)-2)
> >
> > @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
> >  #undef GOLDEN_RATIO_PRIME
> >  }
> >
> > -#define for_each_hash_node(table, hash, node) \
> > -       for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
> > +#define for_each_dedup_cand(d, hash, node) \
> > +       for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
> > +            node;                                                         \
> > +            node = node->next)
> >
> >  static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
> >  {
> >         struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
> > +       int bucket = hash & (d->opts.dedup_table_size - 1);
> >
> >         if (!node)
> >                 return -ENOMEM;
> >         node->type_id = type_id;
> > -       node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
> > -       d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
> > +       node->next = d->dedup_table[bucket];
> > +       d->dedup_table[bucket] = node;
> >         return 0;
> >  }
> >
> > @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
> >         if (!d->dedup_table)
> >                 return;
> >
> > -       for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
> > +       for (i = 0; i < d->opts.dedup_table_size; i++) {
> >                 while (d->dedup_table[i]) {
> >                         tmp = d->dedup_table[i];
> >                         d->dedup_table[i] = tmp->next;
> > @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
> >         if (!d)
> >                 return ERR_PTR(-ENOMEM);
> >
> > +       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> > +       /* ensure table size is power of two and limit to 2G */
> > +       d->opts.dedup_table_size = opts && opts->dedup_table_size
> > +               ? opts->dedup_table_size
> > +               : BTF_DEDUP_TABLE_DEFAULT_SIZE;
> > +       for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
> > +               ;
> > +       d->opts.dedup_table_size = 1 << i;
> > +
> So this is the roundup log2 logic? How about we define some macros
> to make it cleaner? Like
>
> #define BTF_DEDUP_TABLE_MAX_SIZE xxxx

Do you mean hiding this loop behind a macro, or specifying the max size
as a macro? If the former, I'd rather do a static function, something
like roundup_pow_of_2_with_max?

>
> Also, how big a hash table do we need for allyesconfig? 2G seems really
> big to me.

It works even with 16k and takes about 45 seconds. I didn't want to
artificially limit this to something small, so I left it up to the user:
this algorithm can be used for arbitrary binaries after pahole's
dwarf2btf conversion, which could be much bigger than the kernel.
Realistically, it's unlikely that you'll need more than 64k-128k.

>
> >         d->btf = btf;
> >         d->btf_ext = btf_ext;
> >
> > -       d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
> > +       d->dedup_table = calloc(d->opts.dedup_table_size,
> >                                 sizeof(struct btf_dedup_node *));
> >         if (!d->dedup_table) {
> >                 err = -ENOMEM;
> > @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
> >         for (i = 0; i <= btf->nr_types; i++)
> >                 d->hypot_map[i] = BTF_UNPROCESSED_ID;
> >
> > -       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> > -
> >  done:
> >         if (err) {
> >                 btf_dedup_free(d);
> > @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >
> >         case BTF_KIND_INT:
> >                 h = btf_hash_int(t);
> > -               for_each_hash_node(d->dedup_table, h, cand_node) {
> > +               for_each_dedup_cand(d, h, cand_node) {
> >                         cand = d->btf->types[cand_node->type_id];
> >                         if (btf_equal_int(t, cand)) {
> >                                 new_id = cand_node->type_id;
> > @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >
> >         case BTF_KIND_ENUM:
> >                 h = btf_hash_enum(t);
> > -               for_each_hash_node(d->dedup_table, h, cand_node) {
> > +               for_each_dedup_cand(d, h, cand_node) {
> >                         cand = d->btf->types[cand_node->type_id];
> >                         if (btf_equal_enum(t, cand)) {
> >                                 new_id = cand_node->type_id;
> > @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >
> >         case BTF_KIND_FWD:
> >                 h = btf_hash_common(t);
> > -               for_each_hash_node(d->dedup_table, h, cand_node) {
> > +               for_each_dedup_cand(d, h, cand_node) {
> >                         cand = d->btf->types[cand_node->type_id];
> >                         if (btf_equal_common(t, cand)) {
> >                                 new_id = cand_node->type_id;
> > @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >                 return 0;
> >
> >         h = btf_hash_struct(t);
> > -       for_each_hash_node(d->dedup_table, h, cand_node) {
> > +       for_each_dedup_cand(d, h, cand_node) {
> >                 int eq;
> >
> >                 btf_dedup_clear_hypot_map(d);
> > @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >                 t->type = ref_type_id;
> >
> >                 h = btf_hash_common(t);
> > -               for_each_hash_node(d->dedup_table, h, cand_node) {
> > +               for_each_dedup_cand(d, h, cand_node) {
> >                         cand = d->btf->types[cand_node->type_id];
> >                         if (btf_equal_common(t, cand)) {
> >                                 new_id = cand_node->type_id;
> > @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >                 info->index_type = ref_type_id;
> >
> >                 h = btf_hash_array(t);
> > -               for_each_hash_node(d->dedup_table, h, cand_node) {
> > +               for_each_dedup_cand(d, h, cand_node) {
> >                         cand = d->btf->types[cand_node->type_id];
> >                         if (btf_equal_array(t, cand)) {
> >                                 new_id = cand_node->type_id;
> > @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >                 }
> >
> >                 h = btf_hash_fnproto(t);
> > -               for_each_hash_node(d->dedup_table, h, cand_node) {
> > +               for_each_dedup_cand(d, h, cand_node) {
> >                         cand = d->btf->types[cand_node->type_id];
> >                         if (btf_equal_fnproto(t, cand)) {
> >                                 new_id = cand_node->type_id;
> > diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> > index b60bb7cf5fff..28a1e1e59861 100644
> > --- a/tools/lib/bpf/btf.h
> > +++ b/tools/lib/bpf/btf.h
> > @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
> >  LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
> >
> >  struct btf_dedup_opts {
> > +       unsigned int dedup_table_size;
> >         bool dont_resolve_fwds;
> >  };
> >
> > --
> > 2.17.1
> >


* Re: [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code
  2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
  2019-02-28 18:17   ` Song Liu
  2019-02-28 18:19   ` Yonghong Song
@ 2019-02-28 18:52   ` Arnaldo Carvalho de Melo
  2019-02-28 19:19     ` Andrii Nakryiko
  2 siblings, 1 reply; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-02-28 18:52 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: andrii.nakryiko, kernel-team, ast, netdev, bpf, daniel

On Wed, Feb 27, 2019 at 02:46:37PM -0800, Andrii Nakryiko wrote:
> btf_dedup testing code doesn't account for the length of struct
> btf_header when calculating the start of the string section. This
> patch fixes that.
> 
> Fixes: 49b57e0d01db ("tools/bpf: remove btf__get_strings() superseded by raw data API")

I think this clarifies things, but a Fixes tag seems excessive, right?
I.e. if you missed it on both sides of the (a != b) expression, the test
would be just as valid.

I say this because Fixes tags are now tracked and generate backporting
efforts that sometimes end up causing unnecessary human exchanges when
the patches don't apply because of some other patch.

Or is there some other use of 'test_strs' and 'expect_strs' further
down in that do_test_dedup() function?

/me scratches head, probably missing something...

- Arnaldo

> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/testing/selftests/bpf/.gitignore | 1 +
>  tools/testing/selftests/bpf/test_btf.c | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
> index e47168d1257d..3b74d23fffab 100644
> --- a/tools/testing/selftests/bpf/.gitignore
> +++ b/tools/testing/selftests/bpf/.gitignore
> @@ -14,6 +14,7 @@ feature
>  test_libbpf_open
>  test_sock
>  test_sock_addr
> +test_sock_fields
>  urandom_read
>  test_btf
>  test_sockmap
> diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> index 02d314383a9c..1426c0a905c8 100644
> --- a/tools/testing/selftests/bpf/test_btf.c
> +++ b/tools/testing/selftests/bpf/test_btf.c
> @@ -5936,9 +5936,9 @@ static int do_test_dedup(unsigned int test_num)
>  	}
>  
>  	test_hdr = test_btf_data;
> -	test_strs = test_btf_data + test_hdr->str_off;
> +	test_strs = test_btf_data + sizeof(*test_hdr) + test_hdr->str_off;
>  	expect_hdr = expect_btf_data;
> -	expect_strs = expect_btf_data + expect_hdr->str_off;
> +	expect_strs = expect_btf_data + sizeof(*test_hdr) + expect_hdr->str_off;
>  	if (CHECK(test_hdr->str_len != expect_hdr->str_len,
>  		  "test_hdr->str_len:%u != expect_hdr->str_len:%u",
>  		  test_hdr->str_len, expect_hdr->str_len)) {
> -- 
> 2.17.1

-- 

- Arnaldo


* Re: [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data
  2019-02-27 22:46 ` [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data Andrii Nakryiko
  2019-02-28 18:17   ` Song Liu
@ 2019-02-28 18:53   ` Arnaldo Carvalho de Melo
  2019-02-28 19:20     ` Andrii Nakryiko
  1 sibling, 1 reply; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-02-28 18:53 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: andrii.nakryiko, kernel-team, ast, netdev, bpf, daniel

On Wed, Feb 27, 2019 at 02:46:38PM -0800, Andrii Nakryiko wrote:
> Fix invalid formatting of pointer arg.
> 
> Fixes: ae4ab4b4117d ("btf: expose API to work with raw btf_ext data")

Also I think Fixes here is not applicable :-)

- Arnaldo

> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/lib/bpf/btf.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 94bbc249b0f1..b60bb7cf5fff 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -76,7 +76,7 @@ LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
>  
>  LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);
>  LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
> -LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext* btf_ext,
> +LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,
>  					     __u32 *size);
>  LIBBPF_API int btf_ext__reloc_func_info(const struct btf *btf,
>  					const struct btf_ext *btf_ext,
> -- 
> 2.17.1

-- 

- Arnaldo


* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-27 22:46 ` [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size Andrii Nakryiko
  2019-02-28 18:27   ` Song Liu
@ 2019-02-28 18:57   ` Arnaldo Carvalho de Melo
  2019-02-28 20:08     ` Andrii Nakryiko
  1 sibling, 1 reply; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-02-28 18:57 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: andrii.nakryiko, kernel-team, ast, netdev, bpf, daniel

On Wed, Feb 27, 2019 at 02:46:39PM -0800, Andrii Nakryiko wrote:
> The default size of the dedup table (16k) is good enough for most
> binaries, even typical vmlinux images. But there are binaries with a
> huge number of BTF types (e.g., allyesconfig variants of the kernel)
> that benefit from a bigger dedup table size, which lowers the number
> of unnecessary hash collisions. Tools like pahole can thus tune this
> parameter to reach optimal performance.
>
> This change also serves the double purpose of letting tests force hash
> collisions to exercise corner cases; this is used in a follow-up patch.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
>  tools/lib/bpf/btf.h |  1 +
>  2 files changed, 27 insertions(+), 17 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 68b50e9bbde1..6bbb710216e6 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
>  	return err;
>  }
>  
> -#define BTF_DEDUP_TABLE_SIZE_LOG 14
> -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
> +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
>  #define BTF_UNPROCESSED_ID ((__u32)-1)
>  #define BTF_IN_PROGRESS_ID ((__u32)-2)
>  
> @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
>  #undef GOLDEN_RATIO_PRIME
>  }
>  
> -#define for_each_hash_node(table, hash, node) \
> -	for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
> +#define for_each_dedup_cand(d, hash, node) \
> +	for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
> +	     node;							   \
> +	     node = node->next)
>  
>  static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
>  {
>  	struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
> +	int bucket = hash & (d->opts.dedup_table_size - 1);
>  
>  	if (!node)
>  		return -ENOMEM;
>  	node->type_id = type_id;
> -	node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
> -	d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
> +	node->next = d->dedup_table[bucket];
> +	d->dedup_table[bucket] = node;
>  	return 0;
>  }
>  
> @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
>  	if (!d->dedup_table)
>  		return;
>  
> -	for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
> +	for (i = 0; i < d->opts.dedup_table_size; i++) {
>  		while (d->dedup_table[i]) {
>  			tmp = d->dedup_table[i];
>  			d->dedup_table[i] = tmp->next;
> @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>  	if (!d)
>  		return ERR_PTR(-ENOMEM);
>  
> +	d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;

Is the above line a leftover from some other patch? Doesn't seem related
to this patch.

The rest seems ok.

> +	/* ensure table size is power of two and limit to 2G */
> +	d->opts.dedup_table_size = opts && opts->dedup_table_size
> +		? opts->dedup_table_size
> +		: BTF_DEDUP_TABLE_DEFAULT_SIZE;
> +	for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
> +		;
> +	d->opts.dedup_table_size = 1 << i;
> +
>  	d->btf = btf;
>  	d->btf_ext = btf_ext;
>  
> -	d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
> +	d->dedup_table = calloc(d->opts.dedup_table_size,
>  				sizeof(struct btf_dedup_node *));
>  	if (!d->dedup_table) {
>  		err = -ENOMEM;
> @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>  	for (i = 0; i <= btf->nr_types; i++)
>  		d->hypot_map[i] = BTF_UNPROCESSED_ID;
>  
> -	d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> -
>  done:
>  	if (err) {
>  		btf_dedup_free(d);
> @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>  
>  	case BTF_KIND_INT:
>  		h = btf_hash_int(t);
> -		for_each_hash_node(d->dedup_table, h, cand_node) {
> +		for_each_dedup_cand(d, h, cand_node) {
>  			cand = d->btf->types[cand_node->type_id];
>  			if (btf_equal_int(t, cand)) {
>  				new_id = cand_node->type_id;
> @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>  
>  	case BTF_KIND_ENUM:
>  		h = btf_hash_enum(t);
> -		for_each_hash_node(d->dedup_table, h, cand_node) {
> +		for_each_dedup_cand(d, h, cand_node) {
>  			cand = d->btf->types[cand_node->type_id];
>  			if (btf_equal_enum(t, cand)) {
>  				new_id = cand_node->type_id;
> @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>  
>  	case BTF_KIND_FWD:
>  		h = btf_hash_common(t);
> -		for_each_hash_node(d->dedup_table, h, cand_node) {
> +		for_each_dedup_cand(d, h, cand_node) {
>  			cand = d->btf->types[cand_node->type_id];
>  			if (btf_equal_common(t, cand)) {
>  				new_id = cand_node->type_id;
> @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>  		return 0;
>  
>  	h = btf_hash_struct(t);
> -	for_each_hash_node(d->dedup_table, h, cand_node) {
> +	for_each_dedup_cand(d, h, cand_node) {
>  		int eq;
>  
>  		btf_dedup_clear_hypot_map(d);
> @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>  		t->type = ref_type_id;
>  
>  		h = btf_hash_common(t);
> -		for_each_hash_node(d->dedup_table, h, cand_node) {
> +		for_each_dedup_cand(d, h, cand_node) {
>  			cand = d->btf->types[cand_node->type_id];
>  			if (btf_equal_common(t, cand)) {
>  				new_id = cand_node->type_id;
> @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>  		info->index_type = ref_type_id;
>  
>  		h = btf_hash_array(t);
> -		for_each_hash_node(d->dedup_table, h, cand_node) {
> +		for_each_dedup_cand(d, h, cand_node) {
>  			cand = d->btf->types[cand_node->type_id];
>  			if (btf_equal_array(t, cand)) {
>  				new_id = cand_node->type_id;
> @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>  		}
>  
>  		h = btf_hash_fnproto(t);
> -		for_each_hash_node(d->dedup_table, h, cand_node) {
> +		for_each_dedup_cand(d, h, cand_node) {
>  			cand = d->btf->types[cand_node->type_id];
>  			if (btf_equal_fnproto(t, cand)) {
>  				new_id = cand_node->type_id;
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index b60bb7cf5fff..28a1e1e59861 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
>  LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
>  
>  struct btf_dedup_opts {
> +	unsigned int dedup_table_size;
>  	bool dont_resolve_fwds;
>  };
>  
> -- 
> 2.17.1

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-28 18:51     ` Andrii Nakryiko
@ 2019-02-28 19:02       ` Song Liu
  2019-02-28 19:40         ` Andrii Nakryiko
  0 siblings, 1 reply; 26+ messages in thread
From: Song Liu @ 2019-02-28 19:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Song Liu, Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Arnaldo Carvalho de Melo, Networking, bpf@vger.kernel.org,
	Daniel Borkmann



> On Feb 28, 2019, at 10:51 AM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Thu, Feb 28, 2019 at 10:27 AM Song Liu <liu.song.a23@gmail.com> wrote:
>> 
>> On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
>>> 
>>> The default size of the dedup table (16k) is good enough for most
>>> binaries, even typical vmlinux images. But there are cases of binaries
>>> with a huge amount of BTF types (e.g., allyesconfig variants of the
>>> kernel), which benefit from a bigger dedup table size to lower the
>>> number of unnecessary hash collisions. Tools like pahole can thus tune
>>> this parameter to reach optimal performance.
>>> 
>>> This change also serves the double purpose of allowing tests to force
>>> hash collisions to test some corner cases, used in a follow-up patch.
>>> 
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>> ---
>>> tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
>>> tools/lib/bpf/btf.h |  1 +
>>> 2 files changed, 27 insertions(+), 17 deletions(-)
>>> 
>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>> index 68b50e9bbde1..6bbb710216e6 100644
>>> --- a/tools/lib/bpf/btf.c
>>> +++ b/tools/lib/bpf/btf.c
>>> @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
>>>        return err;
>>> }
>>> 
>>> -#define BTF_DEDUP_TABLE_SIZE_LOG 14
>>> -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
>>> +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
>>> #define BTF_UNPROCESSED_ID ((__u32)-1)
>>> #define BTF_IN_PROGRESS_ID ((__u32)-2)
>>> 
>>> @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
>>> #undef GOLDEN_RATIO_PRIME
>>> }
>>> 
>>> -#define for_each_hash_node(table, hash, node) \
>>> -       for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
>>> +#define for_each_dedup_cand(d, hash, node) \
>>> +       for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
>>> +            node;                                                         \
>>> +            node = node->next)
>>> 
>>> static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
>>> {
>>>        struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
>>> +       int bucket = hash & (d->opts.dedup_table_size - 1);
>>> 
>>>        if (!node)
>>>                return -ENOMEM;
>>>        node->type_id = type_id;
>>> -       node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
>>> -       d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
>>> +       node->next = d->dedup_table[bucket];
>>> +       d->dedup_table[bucket] = node;
>>>        return 0;
>>> }
>>> 
>>> @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
>>>        if (!d->dedup_table)
>>>                return;
>>> 
>>> -       for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
>>> +       for (i = 0; i < d->opts.dedup_table_size; i++) {
>>>                while (d->dedup_table[i]) {
>>>                        tmp = d->dedup_table[i];
>>>                        d->dedup_table[i] = tmp->next;
>>> @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>>>        if (!d)
>>>                return ERR_PTR(-ENOMEM);
>>> 
>>> +       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
>>> +       /* ensure table size is power of two and limit to 2G */
>>> +       d->opts.dedup_table_size = opts && opts->dedup_table_size
>>> +               ? opts->dedup_table_size
>>> +               : BTF_DEDUP_TABLE_DEFAULT_SIZE;
>>> +       for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
>>> +               ;
>>> +       d->opts.dedup_table_size = 1 << i;
>>> +
>> So this is the roundup log2 logic? How about we define some macros
>> to make it cleaner? Like
>> 
>> #define BTF_DEDUP_TABLE_MAX_SIZE xxxx
> 
> You mean hide this loop behind a macro? Or specify the max size as a macro?
> If the former, I'd rather do a static function, something like
> roundup_pow_of_2_with_max?

I meant specifying the max size as a macro. Doing a static function is also a
good idea.

> 
>> 
>> Also, how big a hash table do we need for allyesconfig? 2G seems really
>> big to me.
> 
> It works even with 16k and takes about 45 seconds. I didn't want to
> artificially limit this to something small and left it up to the user.
> This algorithm can be used for arbitrary binaries after pahole's
> dwarf2btf conversion, which could be much bigger than the kernel, so I
> didn't want to cap the size artificially. Realistically, it's unlikely
> that you'll need more than 64k-128k.

How about we show some warning like "You are using a really big size"
for too-big numbers, like 16M+?

Thanks,
Song

> 
>> 
>>>        d->btf = btf;
>>>        d->btf_ext = btf_ext;
>>> 
>>> -       d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
>>> +       d->dedup_table = calloc(d->opts.dedup_table_size,
>>>                                sizeof(struct btf_dedup_node *));
>>>        if (!d->dedup_table) {
>>>                err = -ENOMEM;
>>> @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>>>        for (i = 0; i <= btf->nr_types; i++)
>>>                d->hypot_map[i] = BTF_UNPROCESSED_ID;
>>> 
>>> -       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
>>> -
>>> done:
>>>        if (err) {
>>>                btf_dedup_free(d);
>>> @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>>> 
>>>        case BTF_KIND_INT:
>>>                h = btf_hash_int(t);
>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>                        cand = d->btf->types[cand_node->type_id];
>>>                        if (btf_equal_int(t, cand)) {
>>>                                new_id = cand_node->type_id;
>>> @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>>> 
>>>        case BTF_KIND_ENUM:
>>>                h = btf_hash_enum(t);
>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>                        cand = d->btf->types[cand_node->type_id];
>>>                        if (btf_equal_enum(t, cand)) {
>>>                                new_id = cand_node->type_id;
>>> @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>>> 
>>>        case BTF_KIND_FWD:
>>>                h = btf_hash_common(t);
>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>                        cand = d->btf->types[cand_node->type_id];
>>>                        if (btf_equal_common(t, cand)) {
>>>                                new_id = cand_node->type_id;
>>> @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>>>                return 0;
>>> 
>>>        h = btf_hash_struct(t);
>>> -       for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +       for_each_dedup_cand(d, h, cand_node) {
>>>                int eq;
>>> 
>>>                btf_dedup_clear_hypot_map(d);
>>> @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>>>                t->type = ref_type_id;
>>> 
>>>                h = btf_hash_common(t);
>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>                        cand = d->btf->types[cand_node->type_id];
>>>                        if (btf_equal_common(t, cand)) {
>>>                                new_id = cand_node->type_id;
>>> @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>>>                info->index_type = ref_type_id;
>>> 
>>>                h = btf_hash_array(t);
>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>                        cand = d->btf->types[cand_node->type_id];
>>>                        if (btf_equal_array(t, cand)) {
>>>                                new_id = cand_node->type_id;
>>> @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>>>                }
>>> 
>>>                h = btf_hash_fnproto(t);
>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>                        cand = d->btf->types[cand_node->type_id];
>>>                        if (btf_equal_fnproto(t, cand)) {
>>>                                new_id = cand_node->type_id;
>>> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
>>> index b60bb7cf5fff..28a1e1e59861 100644
>>> --- a/tools/lib/bpf/btf.h
>>> +++ b/tools/lib/bpf/btf.h
>>> @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
>>> LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
>>> 
>>> struct btf_dedup_opts {
>>> +       unsigned int dedup_table_size;
>>>        bool dont_resolve_fwds;
>>> };
>>> 
>>> --
>>> 2.17.1



* Re: [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  2019-02-28 18:18   ` Yonghong Song
@ 2019-02-28 19:07     ` Andrii Nakryiko
  2019-02-28 19:41       ` Yonghong Song
  0 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 19:07 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, acme@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org, daniel@iogearbox.net

On Thu, Feb 28, 2019 at 10:19 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
> > When checking available canonical candidates for a struct/union, the
> > algorithm utilizes btf_dedup_is_equiv to determine if a candidate is
> > suitable. This check is not enough when the candidate is the
> > corresponding FWD for that struct/union, because according to the
> > equivalence logic they are equivalent. When it so happens that FWD and
> > STRUCT/UNION end up hashing to the same bucket, it's possible to create
> > a remapping loop from FWD to STRUCT and from STRUCT to the same FWD,
> > which will cause btf_dedup() to loop forever.
> >
> > This patch fixes the issue by additionally checking that type and
> > canonical candidate are strictly equal (utilizing btf_equal_struct).
>
> It looks like btf_equal_struct() is checking equality except for
> member type IDs. Maybe calling it btf_almost_equal_struct() or
> something like that?

Yes, for struct/union we can't compare types directly; that's what
btf_dedup_is_equiv is doing. I think btf_equal_struct w/ a comment
explaining this particular behavior is good enough. If you insist,
though, I'd rather go with something like btf_shallow_equal_struct or
something along those lines.

>
> >
> > Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
> > Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >   tools/lib/bpf/btf.c | 6 +++++-
> >   1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 6bbb710216e6..53db26d158c9 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
> >   static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >   {
> >       struct btf_dedup_node *cand_node;
> > -     struct btf_type *t;
> > +     struct btf_type *cand_type, *t;
> >       /* if we don't find equivalent type, then we are canonical */
> >       __u32 new_id = type_id;
> >       __u16 kind;
> > @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >       for_each_dedup_cand(d, h, cand_node) {
> >               int eq;
> >
> > +             cand_type = d->btf->types[cand_node->type_id];
> > +             if (!btf_equal_struct(t, cand_type))
>
> The comment for this btf_equal_struct is not quite right.
> /*
>   * Check structural compatibility of two FUNC_PROTOs, ignoring
> referenced type
>   * IDs. This check is performed during type graph equivalence check and
>   * referenced types equivalence is checked separately.
>   */
> static bool btf_equal_struct(struct btf_type *t1, struct btf_type *t2)
>
> It should be two "struct/union types".

Yep, good catch, will fix!

>
> > +                     continue;
> > +
>
> I did not trace how the infinite loop happens in the algorithm. But the above

Check the test in the follow-up patch. It has a minimal example that
triggers this bug. It happens when we have some FWD x, which we
discover should be resolved to some STRUCT x (as a result of the
equivalence check/resolution of some other struct s that references
struct x internally). But that struct x might not have been
deduplicated yet; we just record this FWD -> STRUCT mapping so that we
don't lose this connection. Later, once we get to deduplication of
struct x, FWD x will be (in case of a hash collision) one possible
candidate to consider for deduplication. At that point,
btf_dedup_is_equiv will consider them equivalent (but they are not
equal (!), and that's where the bug is), so we'll try to resolve STRUCT x
-> FWD x, which creates a loop.

In btf_dedup_merge_hypot_map(), which is used to record discovered
"equivalences" during the struct/union type graph equivalence check, we
have an explicit check to never resolve STRUCT/UNION into an equivalent
FWD, so such a loop shouldn't happen, except I missed the case of having
a FWD as a possible dedup candidate due to hash collision.

> change is certainly a correct one, you want to do deduplication only
> after everything else (except member types) is equal?

Well, if not for the special case of FWD == STRUCT/UNION when
deduplicating structs, btf_dedup_is_equiv would be enough, because it
already checks for btf_equal_struct internally when both types are
struct/union. It's just the special bit at the beginning of the is_equiv
check, which allows FWD and STRUCT/UNION with the same name to be
declared equivalent, that throws this off.

>
> If the bug is due to a cycle in struct->fwd and fwd->struct mappings,
> maybe a simple check for whether such a cycle exists before updating
> the mapping will also work? I am not proposing this fix, but want
> to understand the issue better.

That's essentially what we use btf_equal_struct for here, really. We
could equivalently just check BTF_INFO_KIND(t) == BTF_INFO_KIND(cand)
explicitly, but btf_equal_struct feels a bit more generic and
obviously correct.

>
>
>
>
> >               btf_dedup_clear_hypot_map(d);
> >               eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
> >               if (eq < 0)
> >


* Re: [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code
  2019-02-28 18:52   ` Arnaldo Carvalho de Melo
@ 2019-02-28 19:19     ` Andrii Nakryiko
  0 siblings, 0 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 19:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, Networking, bpf,
	Daniel Borkmann

On Thu, Feb 28, 2019 at 10:52 AM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Wed, Feb 27, 2019 at 02:46:37PM -0800, Andrii Nakryiko escreveu:
> > btf_dedup testing code doesn't account for the length of struct btf_header
> > when calculating the start of the string section. This patch fixes this
> > problem.
> >
> > Fixes: 49b57e0d01db ("tools/bpf: remove btf__get_strings() superseded by raw data API")
>
> I think this clarifies things, but a Fixes seems excessive, right? I.e.
> if you missed it in both sides of the (a != b) expression, the test will
> be just as valid.
>
> I say this because Fixes tags are now tracked and generate backporting
> efforts that sometimes end up causing unnecessary human exchanges when
> the patches don't apply because of some other patch.

I think this is a legitimate fix. The bug wasn't exposed until I added
a new test from this patchset, but still. See below.

>
> Or is there some further use of 'test_strs' and 'expect_strs' further
> down that do_test_dedup() function?

Yes, a few lines further down we use those pointers to iterate over all
strings in BTF string section and compare them:

        test_str_cur = test_strs;
        test_str_end = test_strs + test_hdr->str_len;
        expect_str_cur = expect_strs;
        expect_str_end = expect_strs + expect_hdr->str_len;
        while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) {
                ... compare strings ...
        }

With this bug, what ended up happening was that we never tested the last
sizeof(struct btf_header) bytes of the string section. It worked fine for
existing tests by luck, but in general it would fail, because after
that loop we check that we finished the iteration exactly where we should:

        if (CHECK(test_str_cur != test_str_end,
                  "test_str_cur:%p != test_str_end:%p",
                  test_str_cur, test_str_end)) {
                err = -1;
                goto done;
        }

The new test I added didn't have the lucky byte arrangement that made
test_str_cur == test_str_end true for the previous tests.


>
> /me scratches head, probably missing something...
>
> - Arnaldo
>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/testing/selftests/bpf/.gitignore | 1 +
> >  tools/testing/selftests/bpf/test_btf.c | 4 ++--
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
> > index e47168d1257d..3b74d23fffab 100644
> > --- a/tools/testing/selftests/bpf/.gitignore
> > +++ b/tools/testing/selftests/bpf/.gitignore
> > @@ -14,6 +14,7 @@ feature
> >  test_libbpf_open
> >  test_sock
> >  test_sock_addr
> > +test_sock_fields
> >  urandom_read
> >  test_btf
> >  test_sockmap
> > diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
> > index 02d314383a9c..1426c0a905c8 100644
> > --- a/tools/testing/selftests/bpf/test_btf.c
> > +++ b/tools/testing/selftests/bpf/test_btf.c
> > @@ -5936,9 +5936,9 @@ static int do_test_dedup(unsigned int test_num)
> >       }
> >
> >       test_hdr = test_btf_data;
> > -     test_strs = test_btf_data + test_hdr->str_off;
> > +     test_strs = test_btf_data + sizeof(*test_hdr) + test_hdr->str_off;
> >       expect_hdr = expect_btf_data;
> > -     expect_strs = expect_btf_data + expect_hdr->str_off;
> > +     expect_strs = expect_btf_data + sizeof(*test_hdr) + expect_hdr->str_off;
> >       if (CHECK(test_hdr->str_len != expect_hdr->str_len,
> >                 "test_hdr->str_len:%u != expect_hdr->str_len:%u",
> >                 test_hdr->str_len, expect_hdr->str_len)) {
> > --
> > 2.17.1
>
> --
>
> - Arnaldo


* Re: [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data
  2019-02-28 18:53   ` Arnaldo Carvalho de Melo
@ 2019-02-28 19:20     ` Andrii Nakryiko
  0 siblings, 0 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 19:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, Networking, bpf,
	Daniel Borkmann

On Thu, Feb 28, 2019 at 10:53 AM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Wed, Feb 27, 2019 at 02:46:38PM -0800, Andrii Nakryiko escreveu:
> > Fix invalid formatting of pointer arg.
> >
> > Fixes: ae4ab4b4117d ("btf: expose API to work with raw btf_ext data")
>
> Also I think Fixes here is not applicable :-)

Yeah, I was probably too pedantic, I'll remove it from v2 :)

>
> - Arnaldo
>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/lib/bpf/btf.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> > index 94bbc249b0f1..b60bb7cf5fff 100644
> > --- a/tools/lib/bpf/btf.h
> > +++ b/tools/lib/bpf/btf.h
> > @@ -76,7 +76,7 @@ LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> >
> >  LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);
> >  LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
> > -LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext* btf_ext,
> > +LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,
> >                                            __u32 *size);
> >  LIBBPF_API int btf_ext__reloc_func_info(const struct btf *btf,
> >                                       const struct btf_ext *btf_ext,
> > --
> > 2.17.1
>
> --
>
> - Arnaldo


* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-28 19:02       ` Song Liu
@ 2019-02-28 19:40         ` Andrii Nakryiko
  2019-02-28 19:56           ` Song Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 19:40 UTC (permalink / raw)
  To: Song Liu
  Cc: Song Liu, Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Arnaldo Carvalho de Melo, Networking, bpf@vger.kernel.org,
	Daniel Borkmann

On Thu, Feb 28, 2019 at 11:02 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Feb 28, 2019, at 10:51 AM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Feb 28, 2019 at 10:27 AM Song Liu <liu.song.a23@gmail.com> wrote:
> >>
> >> On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
> >>>
> >>> Default size of dedup table (16k) is good enough for most binaries, even
> >>> typical vmlinux images. But there are cases of binaries with huge amount
> >>> of BTF types (e.g., allyesconfig variants of kernel), which benefit from
> >>> having bigger dedup table size to lower amount of unnecessary hash
> >>> collisions. Tools like pahole, thus, can tune this parameter to reach
> >>> optimal performance.
> >>>
> >>> This change also serves double purpose of allowing tests to force hash
> >>> collisions to test some corner cases, used in follow up patch.
> >>>
> >>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> >>> ---
> >>> tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
> >>> tools/lib/bpf/btf.h |  1 +
> >>> 2 files changed, 27 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> >>> index 68b50e9bbde1..6bbb710216e6 100644
> >>> --- a/tools/lib/bpf/btf.c
> >>> +++ b/tools/lib/bpf/btf.c
> >>> @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
> >>>        return err;
> >>> }
> >>>
> >>> -#define BTF_DEDUP_TABLE_SIZE_LOG 14
> >>> -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
> >>> +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
> >>> #define BTF_UNPROCESSED_ID ((__u32)-1)
> >>> #define BTF_IN_PROGRESS_ID ((__u32)-2)
> >>>
> >>> @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
> >>> #undef GOLDEN_RATIO_PRIME
> >>> }
> >>>
> >>> -#define for_each_hash_node(table, hash, node) \
> >>> -       for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
> >>> +#define for_each_dedup_cand(d, hash, node) \
> >>> +       for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
> >>> +            node;                                                         \
> >>> +            node = node->next)
> >>>
> >>> static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
> >>> {
> >>>        struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
> >>> +       int bucket = hash & (d->opts.dedup_table_size - 1);
> >>>
> >>>        if (!node)
> >>>                return -ENOMEM;
> >>>        node->type_id = type_id;
> >>> -       node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
> >>> -       d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
> >>> +       node->next = d->dedup_table[bucket];
> >>> +       d->dedup_table[bucket] = node;
> >>>        return 0;
> >>> }
> >>>
> >>> @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
> >>>        if (!d->dedup_table)
> >>>                return;
> >>>
> >>> -       for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
> >>> +       for (i = 0; i < d->opts.dedup_table_size; i++) {
> >>>                while (d->dedup_table[i]) {
> >>>                        tmp = d->dedup_table[i];
> >>>                        d->dedup_table[i] = tmp->next;
> >>> @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
> >>>        if (!d)
> >>>                return ERR_PTR(-ENOMEM);
> >>>
> >>> +       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> >>> +       /* ensure table size is power of two and limit to 2G */
> >>> +       d->opts.dedup_table_size = opts && opts->dedup_table_size
> >>> +               ? opts->dedup_table_size
> >>> +               : BTF_DEDUP_TABLE_DEFAULT_SIZE;
> >>> +       for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
> >>> +               ;
> >>> +       d->opts.dedup_table_size = 1 << i;
> >>> +
> >> So this is the roundup log2 logic? How about we define some macros
> >> to make it cleaner? Like
> >>
> >> #define BTF_DEDUP_TABLE_MAX_SIZE xxxx
> >
> > You mean hide this loop behind macro? Or specify max size as a macro?
> > If former, I'd rather do static function, something like
> > roundup_pow_of_2_with_max?
>
> I meant specifying the max size as a macro. Doing a static function is
> also a good idea.

Ok, will do.

>
> >
> >>
> >> Also, how big hash table do we need for allyesconfig? 2G seems really
> >> big to me.
> >
> > It works even with 16k and takes about 45 seconds. I didn't want to
> > artificially limit this to something small and left it up to the user.
> > This algorithm can be used for arbitrary binaries after pahole's
> > dwarf2btf conversion, which could be much bigger than the kernel, so I
> > didn't want to cap the size artificially. Realistically, it's unlikely
> > that you'll need more than 64k-128k.
>
> How about we show some warning like "You are using really big size" for
> too big numbers, like 16M+?

Tbh, adding an extra arbitrary constant and emitting a warning seems
excessive. If it's a huge number, allocation will fail and we'll return
an error. If it's just big, RSS will be high, and if the user cares he
should be able to deduce that he specified an unreasonable size. In
general, I doubt this value will be overridden often; most uses will
probably just use the default setting.

>
> Thanks,
> Song
>
> >
> >>
> >>>        d->btf = btf;
> >>>        d->btf_ext = btf_ext;
> >>>
> >>> -       d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
> >>> +       d->dedup_table = calloc(d->opts.dedup_table_size,
> >>>                                sizeof(struct btf_dedup_node *));
> >>>        if (!d->dedup_table) {
> >>>                err = -ENOMEM;
> >>> @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
> >>>        for (i = 0; i <= btf->nr_types; i++)
> >>>                d->hypot_map[i] = BTF_UNPROCESSED_ID;
> >>>
> >>> -       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> >>> -
> >>> done:
> >>>        if (err) {
> >>>                btf_dedup_free(d);
> >>> @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >>>
> >>>        case BTF_KIND_INT:
> >>>                h = btf_hash_int(t);
> >>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +               for_each_dedup_cand(d, h, cand_node) {
> >>>                        cand = d->btf->types[cand_node->type_id];
> >>>                        if (btf_equal_int(t, cand)) {
> >>>                                new_id = cand_node->type_id;
> >>> @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >>>
> >>>        case BTF_KIND_ENUM:
> >>>                h = btf_hash_enum(t);
> >>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +               for_each_dedup_cand(d, h, cand_node) {
> >>>                        cand = d->btf->types[cand_node->type_id];
> >>>                        if (btf_equal_enum(t, cand)) {
> >>>                                new_id = cand_node->type_id;
> >>> @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >>>
> >>>        case BTF_KIND_FWD:
> >>>                h = btf_hash_common(t);
> >>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +               for_each_dedup_cand(d, h, cand_node) {
> >>>                        cand = d->btf->types[cand_node->type_id];
> >>>                        if (btf_equal_common(t, cand)) {
> >>>                                new_id = cand_node->type_id;
> >>> @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >>>                return 0;
> >>>
> >>>        h = btf_hash_struct(t);
> >>> -       for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +       for_each_dedup_cand(d, h, cand_node) {
> >>>                int eq;
> >>>
> >>>                btf_dedup_clear_hypot_map(d);
> >>> @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >>>                t->type = ref_type_id;
> >>>
> >>>                h = btf_hash_common(t);
> >>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +               for_each_dedup_cand(d, h, cand_node) {
> >>>                        cand = d->btf->types[cand_node->type_id];
> >>>                        if (btf_equal_common(t, cand)) {
> >>>                                new_id = cand_node->type_id;
> >>> @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >>>                info->index_type = ref_type_id;
> >>>
> >>>                h = btf_hash_array(t);
> >>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +               for_each_dedup_cand(d, h, cand_node) {
> >>>                        cand = d->btf->types[cand_node->type_id];
> >>>                        if (btf_equal_array(t, cand)) {
> >>>                                new_id = cand_node->type_id;
> >>> @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >>>                }
> >>>
> >>>                h = btf_hash_fnproto(t);
> >>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
> >>> +               for_each_dedup_cand(d, h, cand_node) {
> >>>                        cand = d->btf->types[cand_node->type_id];
> >>>                        if (btf_equal_fnproto(t, cand)) {
> >>>                                new_id = cand_node->type_id;
> >>> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> >>> index b60bb7cf5fff..28a1e1e59861 100644
> >>> --- a/tools/lib/bpf/btf.h
> >>> +++ b/tools/lib/bpf/btf.h
> >>> @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
> >>> LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
> >>>
> >>> struct btf_dedup_opts {
> >>> +       unsigned int dedup_table_size;
> >>>        bool dont_resolve_fwds;
> >>> };
> >>>
> >>> --
> >>> 2.17.1
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  2019-02-28 19:07     ` Andrii Nakryiko
@ 2019-02-28 19:41       ` Yonghong Song
  2019-02-28 20:26         ` Andrii Nakryiko
  0 siblings, 1 reply; 26+ messages in thread
From: Yonghong Song @ 2019-02-28 19:41 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, acme@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org, daniel@iogearbox.net



On 2/28/19 11:07 AM, Andrii Nakryiko wrote:
> On Thu, Feb 28, 2019 at 10:19 AM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
> >>> When checking available canonical candidates for a struct/union, the
> >>> algorithm utilizes btf_dedup_is_equiv to determine if a candidate is
> >>> suitable. This check is not enough when the candidate is the
> >>> corresponding FWD for that struct/union, because according to the
> >>> equivalence logic they are equivalent. When the FWD and STRUCT/UNION
> >>> happen to hash to the same bucket, it's possible to create a remapping
> >>> loop from FWD to STRUCT and from STRUCT back to the same FWD, which
> >>> will cause btf_dedup() to loop forever.
>>>
>>> This patch fixes the issue by additionally checking that type and
>>> canonical candidate are strictly equal (utilizing btf_equal_struct).
>>
>> It looks like btf_equal_struct() checks equality except for
>> member type ids. Maybe call it btf_almost_equal_struct() or
>> something like that?
> 
> Yes, for struct/union we can't compare types directly; that's what
> btf_dedup_is_equiv is doing. I think btf_equal_struct w/ a comment
> explaining this particular behavior is good enough. If you insist,
> though, I'd rather go with something like btf_shallow_equal_struct or
> something along those lines.

btf_shallow_equal_struct() will be fine.

> 
>>
>>>
>>> Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
>>> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>> ---
>>>    tools/lib/bpf/btf.c | 6 +++++-
>>>    1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>> index 6bbb710216e6..53db26d158c9 100644
>>> --- a/tools/lib/bpf/btf.c
>>> +++ b/tools/lib/bpf/btf.c
>>> @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
>>>    static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>>>    {
>>>        struct btf_dedup_node *cand_node;
>>> -     struct btf_type *t;
>>> +     struct btf_type *cand_type, *t;
>>>        /* if we don't find equivalent type, then we are canonical */
>>>        __u32 new_id = type_id;
>>>        __u16 kind;
>>> @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>>>        for_each_dedup_cand(d, h, cand_node) {
>>>                int eq;
>>>
>>> +             cand_type = d->btf->types[cand_node->type_id];
>>> +             if (!btf_equal_struct(t, cand_type))
>>
>> The comment for this btf_equal_struct is not quite right.
>> /*
>>    * Check structural compatibility of two FUNC_PROTOs, ignoring
>>    * referenced type IDs. This check is performed during type graph
>>    * equivalence check and referenced types equivalence is checked
>>    * separately.
>>    */
>> static bool btf_equal_struct(struct btf_type *t1, struct btf_type *t2)
>>
>> It should be two "struct/union types".
> 
> Yep, good catch, will fix!
> 
>>
>>> +                     continue;
>>> +
>>
>> I did not trace through the algorithm to see how the infinite loop happens. But the above
> 
> Check the test in the follow-up patch. It has a minimal example that
> triggers this bug. It happens when we have some FWD x, which we
> discover should be resolved to some STRUCT x (as a result of the
> equivalence check/resolution of some other struct s that references
> struct x internally). But that struct x might not have been
> deduplicated yet, we just record this FWD -> STRUCT mapping so that we
> don't lose this connection. Later, once we get to deduplication of
> struct x, FWD x will be (in case of hash collision) one possible
> candidate to consider for deduplication. At that point,
> btf_dedup_is_equiv will consider them equivalent (but they are not
> equal (!), that's where the bug is), so we'll try to resolve STRUCT x
> -> FWD x, which creates a loop.
> 
> In btf_dedup_merge_hypot_map(), which is used to record discovered
> "equivalences" during the struct/union type graph equivalence check, we
> have an explicit check to never resolve a STRUCT/UNION into an
> equivalent FWD, so such a loop shouldn't happen, except that I missed
> the case of having a FWD as a possible dedup candidate due to a hash
> collision.
> 
>> change is certainly a correct one; you want to do deduplication only
>> after everything else (except member types) is equal?
> 
> Well, if not for the special case of FWD == STRUCT/UNION when
> deduplicating structs, btf_dedup_is_equiv would be enough, because it
> already checks btf_equal_struct internally when both types are
> struct/union. It's just the special bit at the beginning of the
> is_equiv check (the one that allows a FWD and a STRUCT/UNION with the
> same name to be declared equivalent) that throws this off.
> 
>>
>> If the bug is due to a cycle in the struct->fwd and fwd->struct
>> mappings, maybe a simple check for whether such a cycle exists before
>> updating the mapping would also work? I am not proposing this fix, but
>> want to understand the issue better.
> 
> That's essentially what we use btf_equal_struct for here, really. We
> could equivalently just check BTF_INFO_KIND(t) == BTF_INFO_KIND(cand)
> explicitly, but btf_equal_struct feels a bit more generic and
> obviously correct.

Okay, I see. So the goal is really to prevent processing a FWD in the
struct/union dedup candidate list. It would be good to summarize
the above detailed explanation in the commit message.

With the above suggested changes,
   Acked-by: Yonghong Song <yhs@fb.com>

> 
>>
>>
>>
>>
>>>                btf_dedup_clear_hypot_map(d);
>>>                eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
>>>                if (eq < 0)
>>>


* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-28 19:40         ` Andrii Nakryiko
@ 2019-02-28 19:56           ` Song Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-02-28 19:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Song Liu, Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Arnaldo Carvalho de Melo, Networking, bpf@vger.kernel.org,
	Daniel Borkmann



> On Feb 28, 2019, at 11:40 AM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Thu, Feb 28, 2019 at 11:02 AM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Feb 28, 2019, at 10:51 AM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>> 
>>> On Thu, Feb 28, 2019 at 10:27 AM Song Liu <liu.song.a23@gmail.com> wrote:
>>>> 
>>>> On Wed, Feb 27, 2019 at 2:47 PM Andrii Nakryiko <andriin@fb.com> wrote:
>>>>> 
>>>>> The default size of the dedup table (16k) is good enough for most
>>>>> binaries, even typical vmlinux images. But there are binaries with a
>>>>> huge number of BTF types (e.g., allyesconfig variants of the kernel)
>>>>> that benefit from a bigger dedup table size, which lowers the number
>>>>> of unnecessary hash collisions. Tools like pahole can thus tune this
>>>>> parameter to reach optimal performance.
>>>>>
>>>>> This change also serves the double purpose of allowing tests to force
>>>>> hash collisions to exercise some corner cases, used in a follow-up
>>>>> patch.
>>>>> 
>>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>>>> ---
>>>>> tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
>>>>> tools/lib/bpf/btf.h |  1 +
>>>>> 2 files changed, 27 insertions(+), 17 deletions(-)
>>>>> 
>>>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>>>> index 68b50e9bbde1..6bbb710216e6 100644
>>>>> --- a/tools/lib/bpf/btf.c
>>>>> +++ b/tools/lib/bpf/btf.c
>>>>> @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
>>>>>       return err;
>>>>> }
>>>>> 
>>>>> -#define BTF_DEDUP_TABLE_SIZE_LOG 14
>>>>> -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
>>>>> +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
>>>>> #define BTF_UNPROCESSED_ID ((__u32)-1)
>>>>> #define BTF_IN_PROGRESS_ID ((__u32)-2)
>>>>> 
>>>>> @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
>>>>> #undef GOLDEN_RATIO_PRIME
>>>>> }
>>>>> 
>>>>> -#define for_each_hash_node(table, hash, node) \
>>>>> -       for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
>>>>> +#define for_each_dedup_cand(d, hash, node) \
>>>>> +       for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
>>>>> +            node;                                                         \
>>>>> +            node = node->next)
>>>>> 
>>>>> static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
>>>>> {
>>>>>       struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
>>>>> +       int bucket = hash & (d->opts.dedup_table_size - 1);
>>>>> 
>>>>>       if (!node)
>>>>>               return -ENOMEM;
>>>>>       node->type_id = type_id;
>>>>> -       node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
>>>>> -       d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
>>>>> +       node->next = d->dedup_table[bucket];
>>>>> +       d->dedup_table[bucket] = node;
>>>>>       return 0;
>>>>> }
>>>>> 
>>>>> @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
>>>>>       if (!d->dedup_table)
>>>>>               return;
>>>>> 
>>>>> -       for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
>>>>> +       for (i = 0; i < d->opts.dedup_table_size; i++) {
>>>>>               while (d->dedup_table[i]) {
>>>>>                       tmp = d->dedup_table[i];
>>>>>                       d->dedup_table[i] = tmp->next;
>>>>> @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>>>>>       if (!d)
>>>>>               return ERR_PTR(-ENOMEM);
>>>>> 
>>>>> +       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
>>>>> +       /* ensure table size is power of two and limit to 2G */
>>>>> +       d->opts.dedup_table_size = opts && opts->dedup_table_size
>>>>> +               ? opts->dedup_table_size
>>>>> +               : BTF_DEDUP_TABLE_DEFAULT_SIZE;
>>>>> +       for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
>>>>> +               ;
>>>>> +       d->opts.dedup_table_size = 1 << i;
>>>>> +
>>>> So this is the roundup log2 logic? How about we define some macros
>>>> to make it cleaner? Like
>>>> 
>>>> #define BTF_DEDUP_TABLE_MAX_SIZE xxxx
>>> 
>>> You mean hide this loop behind a macro? Or specify the max size as a
>>> macro? If the former, I'd rather do a static function, something like
>>> roundup_pow_of_2_with_max?
>> 
>> I meant specify the max size as a macro. Doing a static function is
>> also a good idea.
> 
> Ok, will do.
> 
>> 
>>> 
>>>> 
>>>> Also, how big hash table do we need for allyesconfig? 2G seems really
>>>> big to me.
>>> 
>>> It works even with 16k and takes about 45 seconds. I didn't want to
>>> artificially limit this to something small and left it up to the user.
>>> This algorithm can be used for arbitrary binaries after pahole's
>>> dwarf2btf conversion, which could be much bigger than the kernel, so I
>>> didn't want to limit it artificially. Realistically, it's unlikely that
>>> you'll need more than 64k-128k.
>> 
>> How about we show some warning like "You are using really big size" for
>> too big numbers, like 16M+?
> 
> Tbh, adding an extra arbitrary constant and emitting a warning seems
> excessive. If it's a huge number, the allocation will fail and we'll
> return an error. If it's just big, RSS will be high, and if the user
> cares they should be able to deduce that they specified an unreasonable
> size. In general, I doubt this value will be overridden often; most
> uses will probably just use the default setting.

Fair enough. Let's go with 2G as-is then. 

> 
>> 
>> Thanks,
>> Song
>> 
>>> 
>>>> 
>>>>>       d->btf = btf;
>>>>>       d->btf_ext = btf_ext;
>>>>> 
>>>>> -       d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
>>>>> +       d->dedup_table = calloc(d->opts.dedup_table_size,
>>>>>                               sizeof(struct btf_dedup_node *));
>>>>>       if (!d->dedup_table) {
>>>>>               err = -ENOMEM;
>>>>> @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
>>>>>       for (i = 0; i <= btf->nr_types; i++)
>>>>>               d->hypot_map[i] = BTF_UNPROCESSED_ID;
>>>>> 
>>>>> -       d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
>>>>> -
>>>>> done:
>>>>>       if (err) {
>>>>>               btf_dedup_free(d);
>>>>> @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>>>>> 
>>>>>       case BTF_KIND_INT:
>>>>>               h = btf_hash_int(t);
>>>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>>>                       cand = d->btf->types[cand_node->type_id];
>>>>>                       if (btf_equal_int(t, cand)) {
>>>>>                               new_id = cand_node->type_id;
>>>>> @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>>>>> 
>>>>>       case BTF_KIND_ENUM:
>>>>>               h = btf_hash_enum(t);
>>>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>>>                       cand = d->btf->types[cand_node->type_id];
>>>>>                       if (btf_equal_enum(t, cand)) {
>>>>>                               new_id = cand_node->type_id;
>>>>> @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
>>>>> 
>>>>>       case BTF_KIND_FWD:
>>>>>               h = btf_hash_common(t);
>>>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>>>                       cand = d->btf->types[cand_node->type_id];
>>>>>                       if (btf_equal_common(t, cand)) {
>>>>>                               new_id = cand_node->type_id;
>>>>> @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
>>>>>               return 0;
>>>>> 
>>>>>       h = btf_hash_struct(t);
>>>>> -       for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +       for_each_dedup_cand(d, h, cand_node) {
>>>>>               int eq;
>>>>> 
>>>>>               btf_dedup_clear_hypot_map(d);
>>>>> @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>>>>>               t->type = ref_type_id;
>>>>> 
>>>>>               h = btf_hash_common(t);
>>>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>>>                       cand = d->btf->types[cand_node->type_id];
>>>>>                       if (btf_equal_common(t, cand)) {
>>>>>                               new_id = cand_node->type_id;
>>>>> @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>>>>>               info->index_type = ref_type_id;
>>>>> 
>>>>>               h = btf_hash_array(t);
>>>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>>>                       cand = d->btf->types[cand_node->type_id];
>>>>>                       if (btf_equal_array(t, cand)) {
>>>>>                               new_id = cand_node->type_id;
>>>>> @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
>>>>>               }
>>>>> 
>>>>>               h = btf_hash_fnproto(t);
>>>>> -               for_each_hash_node(d->dedup_table, h, cand_node) {
>>>>> +               for_each_dedup_cand(d, h, cand_node) {
>>>>>                       cand = d->btf->types[cand_node->type_id];
>>>>>                       if (btf_equal_fnproto(t, cand)) {
>>>>>                               new_id = cand_node->type_id;
>>>>> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
>>>>> index b60bb7cf5fff..28a1e1e59861 100644
>>>>> --- a/tools/lib/bpf/btf.h
>>>>> +++ b/tools/lib/bpf/btf.h
>>>>> @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
>>>>> LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
>>>>> 
>>>>> struct btf_dedup_opts {
>>>>> +       unsigned int dedup_table_size;
>>>>>       bool dont_resolve_fwds;
>>>>> };
>>>>> 
>>>>> --
>>>>> 2.17.1



* Re: [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size
  2019-02-28 18:57   ` Arnaldo Carvalho de Melo
@ 2019-02-28 20:08     ` Andrii Nakryiko
  0 siblings, 0 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 20:08 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, Networking, bpf,
	Daniel Borkmann

On Thu, Feb 28, 2019 at 10:57 AM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Wed, Feb 27, 2019 at 02:46:39PM -0800, Andrii Nakryiko escreveu:
> > The default size of the dedup table (16k) is good enough for most
> > binaries, even typical vmlinux images. But there are binaries with a
> > huge number of BTF types (e.g., allyesconfig variants of the kernel)
> > that benefit from a bigger dedup table size, which lowers the number
> > of unnecessary hash collisions. Tools like pahole can thus tune this
> > parameter to reach optimal performance.
> >
> > This change also serves the double purpose of allowing tests to force
> > hash collisions to exercise some corner cases, used in a follow-up
> > patch.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/lib/bpf/btf.c | 43 ++++++++++++++++++++++++++-----------------
> >  tools/lib/bpf/btf.h |  1 +
> >  2 files changed, 27 insertions(+), 17 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 68b50e9bbde1..6bbb710216e6 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -1070,8 +1070,7 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
> >       return err;
> >  }
> >
> > -#define BTF_DEDUP_TABLE_SIZE_LOG 14
> > -#define BTF_DEDUP_TABLE_MOD ((1 << BTF_DEDUP_TABLE_SIZE_LOG) - 1)
> > +#define BTF_DEDUP_TABLE_DEFAULT_SIZE (1 << 14)
> >  #define BTF_UNPROCESSED_ID ((__u32)-1)
> >  #define BTF_IN_PROGRESS_ID ((__u32)-2)
> >
> > @@ -1128,18 +1127,21 @@ static inline __u32 hash_combine(__u32 h, __u32 value)
> >  #undef GOLDEN_RATIO_PRIME
> >  }
> >
> > -#define for_each_hash_node(table, hash, node) \
> > -     for (node = table[hash & BTF_DEDUP_TABLE_MOD]; node; node = node->next)
> > +#define for_each_dedup_cand(d, hash, node) \
> > +     for (node = d->dedup_table[hash & (d->opts.dedup_table_size - 1)]; \
> > +          node;                                                         \
> > +          node = node->next)
> >
> >  static int btf_dedup_table_add(struct btf_dedup *d, __u32 hash, __u32 type_id)
> >  {
> >       struct btf_dedup_node *node = malloc(sizeof(struct btf_dedup_node));
> > +     int bucket = hash & (d->opts.dedup_table_size - 1);
> >
> >       if (!node)
> >               return -ENOMEM;
> >       node->type_id = type_id;
> > -     node->next = d->dedup_table[hash & BTF_DEDUP_TABLE_MOD];
> > -     d->dedup_table[hash & BTF_DEDUP_TABLE_MOD] = node;
> > +     node->next = d->dedup_table[bucket];
> > +     d->dedup_table[bucket] = node;
> >       return 0;
> >  }
> >
> > @@ -1177,7 +1179,7 @@ static void btf_dedup_table_free(struct btf_dedup *d)
> >       if (!d->dedup_table)
> >               return;
> >
> > -     for (i = 0; i < (1 << BTF_DEDUP_TABLE_SIZE_LOG); i++) {
> > +     for (i = 0; i < d->opts.dedup_table_size; i++) {
> >               while (d->dedup_table[i]) {
> >                       tmp = d->dedup_table[i];
> >                       d->dedup_table[i] = tmp->next;
> > @@ -1221,10 +1223,19 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
> >       if (!d)
> >               return ERR_PTR(-ENOMEM);
> >
> > +     d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
>
> Is the above line a leftover from some other patch? Doesn't seem related
> to this patch.
>
> The rest seems ok.

No, I added another option and moved options processing to the top of
btf_dedup_new() (it used to be at the end, but it makes sense to first
process options, because they can be required during construction of
btf__dedup, as is the case for dedup_table_size).

>
> > +     /* ensure table size is power of two and limit to 2G */
> > +     d->opts.dedup_table_size = opts && opts->dedup_table_size
> > +             ? opts->dedup_table_size
> > +             : BTF_DEDUP_TABLE_DEFAULT_SIZE;
> > +     for (i = 0; i < 31 && (1 << i) < d->opts.dedup_table_size;  i++)
> > +             ;
> > +     d->opts.dedup_table_size = 1 << i;
> > +
> >       d->btf = btf;
> >       d->btf_ext = btf_ext;
> >
> > -     d->dedup_table = calloc(1 << BTF_DEDUP_TABLE_SIZE_LOG,
> > +     d->dedup_table = calloc(d->opts.dedup_table_size,
> >                               sizeof(struct btf_dedup_node *));
> >       if (!d->dedup_table) {
> >               err = -ENOMEM;
> > @@ -1249,8 +1260,6 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
> >       for (i = 0; i <= btf->nr_types; i++)
> >               d->hypot_map[i] = BTF_UNPROCESSED_ID;
> >
> > -     d->opts.dont_resolve_fwds = opts && opts->dont_resolve_fwds;
> > -
> >  done:
> >       if (err) {
> >               btf_dedup_free(d);
> > @@ -1824,7 +1833,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >
> >       case BTF_KIND_INT:
> >               h = btf_hash_int(t);
> > -             for_each_hash_node(d->dedup_table, h, cand_node) {
> > +             for_each_dedup_cand(d, h, cand_node) {
> >                       cand = d->btf->types[cand_node->type_id];
> >                       if (btf_equal_int(t, cand)) {
> >                               new_id = cand_node->type_id;
> > @@ -1835,7 +1844,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >
> >       case BTF_KIND_ENUM:
> >               h = btf_hash_enum(t);
> > -             for_each_hash_node(d->dedup_table, h, cand_node) {
> > +             for_each_dedup_cand(d, h, cand_node) {
> >                       cand = d->btf->types[cand_node->type_id];
> >                       if (btf_equal_enum(t, cand)) {
> >                               new_id = cand_node->type_id;
> > @@ -1846,7 +1855,7 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
> >
> >       case BTF_KIND_FWD:
> >               h = btf_hash_common(t);
> > -             for_each_hash_node(d->dedup_table, h, cand_node) {
> > +             for_each_dedup_cand(d, h, cand_node) {
> >                       cand = d->btf->types[cand_node->type_id];
> >                       if (btf_equal_common(t, cand)) {
> >                               new_id = cand_node->type_id;
> > @@ -2263,7 +2272,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >               return 0;
> >
> >       h = btf_hash_struct(t);
> > -     for_each_hash_node(d->dedup_table, h, cand_node) {
> > +     for_each_dedup_cand(d, h, cand_node) {
> >               int eq;
> >
> >               btf_dedup_clear_hypot_map(d);
> > @@ -2349,7 +2358,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >               t->type = ref_type_id;
> >
> >               h = btf_hash_common(t);
> > -             for_each_hash_node(d->dedup_table, h, cand_node) {
> > +             for_each_dedup_cand(d, h, cand_node) {
> >                       cand = d->btf->types[cand_node->type_id];
> >                       if (btf_equal_common(t, cand)) {
> >                               new_id = cand_node->type_id;
> > @@ -2372,7 +2381,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >               info->index_type = ref_type_id;
> >
> >               h = btf_hash_array(t);
> > -             for_each_hash_node(d->dedup_table, h, cand_node) {
> > +             for_each_dedup_cand(d, h, cand_node) {
> >                       cand = d->btf->types[cand_node->type_id];
> >                       if (btf_equal_array(t, cand)) {
> >                               new_id = cand_node->type_id;
> > @@ -2403,7 +2412,7 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id)
> >               }
> >
> >               h = btf_hash_fnproto(t);
> > -             for_each_hash_node(d->dedup_table, h, cand_node) {
> > +             for_each_dedup_cand(d, h, cand_node) {
> >                       cand = d->btf->types[cand_node->type_id];
> >                       if (btf_equal_fnproto(t, cand)) {
> >                               new_id = cand_node->type_id;
> > diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> > index b60bb7cf5fff..28a1e1e59861 100644
> > --- a/tools/lib/bpf/btf.h
> > +++ b/tools/lib/bpf/btf.h
> > @@ -90,6 +90,7 @@ LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
> >  LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
> >
> >  struct btf_dedup_opts {
> > +     unsigned int dedup_table_size;
> >       bool dont_resolve_fwds;
> >  };
> >
> > --
> > 2.17.1
>
> --
>
> - Arnaldo


* Re: [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD
  2019-02-28 19:41       ` Yonghong Song
@ 2019-02-28 20:26         ` Andrii Nakryiko
  0 siblings, 0 replies; 26+ messages in thread
From: Andrii Nakryiko @ 2019-02-28 20:26 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, acme@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org, daniel@iogearbox.net

On Thu, Feb 28, 2019 at 11:42 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 2/28/19 11:07 AM, Andrii Nakryiko wrote:
> > On Thu, Feb 28, 2019 at 10:19 AM Yonghong Song <yhs@fb.com> wrote:
> >>
> >>
> >>
> >> On 2/27/19 2:46 PM, Andrii Nakryiko wrote:
> >>> When checking available canonical candidates for a struct/union, the
> >>> algorithm utilizes btf_dedup_is_equiv to determine if a candidate is
> >>> suitable. This check is not enough when the candidate is the
> >>> corresponding FWD for that struct/union, because according to the
> >>> equivalence logic they are equivalent. When the FWD and STRUCT/UNION
> >>> happen to hash to the same bucket, it's possible to create a remapping
> >>> loop from FWD to STRUCT and from STRUCT back to the same FWD, which
> >>> will cause btf_dedup() to loop forever.
> >>>
> >>> This patch fixes the issue by additionally checking that type and
> >>> canonical candidate are strictly equal (utilizing btf_equal_struct).
> >>
> >> It looks like btf_equal_struct() checks equality except for
> >> member type ids. Maybe call it btf_almost_equal_struct() or
> >> something like that?
> >
> > Yes, for struct/union we can't compare types directly; that's what
> > btf_dedup_is_equiv is doing. I think btf_equal_struct w/ a comment
> > explaining this particular behavior is good enough. If you insist,
> > though, I'd rather go with something like btf_shallow_equal_struct or
> > something along those lines.
>
> btf_shallow_equal_struct() will be fine.

Ok.

>
> >
> >>
> >>>
> >>> Fixes: d5caef5b5655 ("btf: add BTF types deduplication algorithm")
> >>> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> >>> ---
> >>>    tools/lib/bpf/btf.c | 6 +++++-
> >>>    1 file changed, 5 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> >>> index 6bbb710216e6..53db26d158c9 100644
> >>> --- a/tools/lib/bpf/btf.c
> >>> +++ b/tools/lib/bpf/btf.c
> >>> @@ -2255,7 +2255,7 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
> >>>    static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >>>    {
> >>>        struct btf_dedup_node *cand_node;
> >>> -     struct btf_type *t;
> >>> +     struct btf_type *cand_type, *t;
> >>>        /* if we don't find equivalent type, then we are canonical */
> >>>        __u32 new_id = type_id;
> >>>        __u16 kind;
> >>> @@ -2275,6 +2275,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
> >>>        for_each_dedup_cand(d, h, cand_node) {
> >>>                int eq;
> >>>
> >>> +             cand_type = d->btf->types[cand_node->type_id];
> >>> +             if (!btf_equal_struct(t, cand_type))
> >>
> >> The comment for this btf_equal_struct is not quite right.
> >> /*
> >>    * Check structural compatibility of two FUNC_PROTOs, ignoring
> >>    * referenced type IDs. This check is performed during type graph
> >>    * equivalence check and referenced types equivalence is checked
> >>    * separately.
> >>    */
> >> static bool btf_equal_struct(struct btf_type *t1, struct btf_type *t2)
> >>
> >> It should be two "struct/union types".
> >
> > Yep, good catch, will fix!
> >
> >>
> >>> +                     continue;
> >>> +
> >>
> >> I did not trace through the algorithm to see how the infinite loop happens. But the above
> >
> > Check the test in the follow-up patch. It has a minimal example that
> > triggers this bug. It happens when we have some FWD x, which we
> > discover should be resolved to some STRUCT x (as a result of the
> > equivalence check/resolution of some other struct s that references
> > struct x internally). But that struct x might not have been
> > deduplicated yet, we just record this FWD -> STRUCT mapping so that we
> > don't lose this connection. Later, once we get to deduplication of
> > struct x, FWD x will be (in case of hash collision) one possible
> > candidate to consider for deduplication. At that point,
> > btf_dedup_is_equiv will consider them equivalent (but they are not
> > equal (!), that's where the bug is), so we'll try to resolve STRUCT x
> > -> FWD x, which creates a loop.
> >
> > In btf_dedup_merge_hypot_map(), which is used to record discovered
> > "equivalences" during the struct/union type graph equivalence check, we
> > have an explicit check to never resolve a STRUCT/UNION into an
> > equivalent FWD, so such a loop shouldn't happen, except that I missed
> > the case of having a FWD as a possible dedup candidate due to a hash
> > collision.
> >
> >> change is certainly a correct one; you want to do deduplication only
> >> after everything else (except member types) is equal?
> >
> > Well, if not for the special case of FWD == STRUCT/UNION when
> > deduplicating structs, btf_dedup_is_equiv would be enough, because it
> > already checks btf_equal_struct internally when both types are
> > struct/union. It's just the special bit at the beginning of the
> > is_equiv check (the one that allows a FWD and a STRUCT/UNION with the
> > same name to be declared equivalent) that throws this off.
> >
> >>
> >> If the bug is due to a cycle in the struct->fwd and fwd->struct
> >> mappings, maybe a simple check for whether such a cycle exists before
> >> updating the mapping would also work? I am not proposing this fix, but
> >> want to understand the issue better.
> >
> > That's essentially what we use btf_equal_struct for here, really. We
> > could equivalently just check BTF_INFO_KIND(t) == BTF_INFO_KIND(cand)
> > explicitly, but btf_equal_struct feels a bit more generic and
> > obviously correct.
>
> Okay, I see. So the goal is really to prevent processing FWDs in the
> struct/union dedup candidate list. It would be good to summarize
> the above detailed explanation in the commit message.

Ok, will try to do this more succinctly.

>
> With the above suggested changes,
>    Acked-by: Yonghong Song <yhs@fb.com>
>
> >
> >>
> >>
> >>
> >>
> >>>                btf_dedup_clear_hypot_map(d);
> >>>                eq = btf_dedup_is_equiv(d, type_id, cand_node->type_id);
> >>>                if (eq < 0)
> >>>


end of thread, other threads:[~2019-02-28 20:26 UTC | newest]

Thread overview: 26+ messages
2019-02-27 22:46 [PATCH bpf-next 0/5] btf_dedup algorithm and test fixes Andrii Nakryiko
2019-02-27 22:46 ` [PATCH bpf-next 1/5] selftests/bpf: fix btf_dedup testing code Andrii Nakryiko
2019-02-28 18:17   ` Song Liu
2019-02-28 18:19   ` Yonghong Song
2019-02-28 18:52   ` Arnaldo Carvalho de Melo
2019-02-28 19:19     ` Andrii Nakryiko
2019-02-27 22:46 ` [PATCH bpf-next 2/5] libbpf: fix formatting for btf_ext__get_raw_data Andrii Nakryiko
2019-02-28 18:17   ` Song Liu
2019-02-28 18:53   ` Arnaldo Carvalho de Melo
2019-02-28 19:20     ` Andrii Nakryiko
2019-02-27 22:46 ` [PATCH bpf-next 3/5] btf: allow to customize dedup hash table size Andrii Nakryiko
2019-02-28 18:27   ` Song Liu
2019-02-28 18:51     ` Andrii Nakryiko
2019-02-28 19:02       ` Song Liu
2019-02-28 19:40         ` Andrii Nakryiko
2019-02-28 19:56           ` Song Liu
2019-02-28 18:57   ` Arnaldo Carvalho de Melo
2019-02-28 20:08     ` Andrii Nakryiko
2019-02-27 22:46 ` [PATCH bpf-next 4/5] btf: fix bug with resolving STRUCT/UNION into corresponding FWD Andrii Nakryiko
2019-02-28 18:18   ` Yonghong Song
2019-02-28 19:07     ` Andrii Nakryiko
2019-02-28 19:41       ` Yonghong Song
2019-02-28 20:26         ` Andrii Nakryiko
2019-02-28 18:29   ` Song Liu
2019-02-27 22:46 ` [PATCH bpf-next 5/5] selftests/bpf: add btf_dedup test of FWD/STRUCT resolution Andrii Nakryiko
2019-02-28 18:29   ` Song Liu
