* [PATCH 2/3] ss: add support for BPF socket-local storage
2023-11-28 2:30 [PATCH 0/3] ss: pretty-printing BPF socket-local storage Quentin Deslandes
2023-11-28 2:30 ` [PATCH 1/3] ss: prevent "Process" column from being printed unless requested Quentin Deslandes
@ 2023-11-28 2:30 ` Quentin Deslandes
2023-11-28 23:35 ` Martin KaFai Lau
2023-11-28 2:30 ` [PATCH 3/3] ss: pretty-print " Quentin Deslandes
2023-11-28 22:43 ` [PATCH 0/3] ss: pretty-printing " Stephen Hemminger
3 siblings, 1 reply; 10+ messages in thread
From: Quentin Deslandes @ 2023-11-28 2:30 UTC (permalink / raw)
To: netdev; +Cc: David Ahern, Martin KaFai Lau, Quentin Deslandes
While sock_diag is able to return BPF socket-local storage in response
to INET_DIAG_REQ_SK_BPF_STORAGES requests, ss doesn't request it.
This change introduces the --bpf-maps and --bpf-map-id= options to request
BPF socket-local storage for all SK_STORAGE maps, or only specific ones.
The bigger part of this change checks the requested map IDs and
ensures they are valid. A new column named "Socket storage" has been
added to print the list of map IDs a given socket has data defined for.
This column is disabled unless --bpf-maps or --bpf-map-id= is used.
Signed-off-by: Quentin Deslandes <qde@naccy.de>
Co-authored-by: Martin KaFai Lau <martin.lau@kernel.org>
---
misc/ss.c | 273 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 270 insertions(+), 3 deletions(-)
diff --git a/misc/ss.c b/misc/ss.c
index 09dc1f37..5b255ce3 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -51,6 +51,11 @@
#include <linux/tls.h>
#include <linux/mptcp.h>
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#endif
+
#if HAVE_RPC
#include <rpc/rpc.h>
#include <rpc/xdr.h>
@@ -101,6 +106,7 @@ enum col_id {
COL_RADDR,
COL_RSERV,
COL_PROC,
+ COL_SKSTOR,
COL_EXT,
COL_MAX
};
@@ -130,6 +136,7 @@ static struct column columns[] = {
{ ALIGN_RIGHT, "Peer Address:", " ", 0, 0, 0 },
{ ALIGN_LEFT, "Port", "", 0, 0, 0 },
{ ALIGN_LEFT, "Process", "", 0, 0, 0 },
+ { ALIGN_LEFT, "Socket storage", "", 1, 0, 0 },
{ ALIGN_LEFT, "", "", 0, 0, 0 },
};
@@ -3368,6 +3375,222 @@ static void parse_diag_msg(struct nlmsghdr *nlh, struct sockstat *s)
memcpy(s->remote.data, r->id.idiag_dst, s->local.bytelen);
}
+#ifdef HAVE_LIBBPF
+
+#define MAX_NR_BPF_MAP_ID_OPTS 32
+
+struct btf;
+
+static struct bpf_map_opts {
+ unsigned int nr_maps;
+ struct bpf_sk_storage_map_info {
+ unsigned int id;
+ int fd;
+ } maps[MAX_NR_BPF_MAP_ID_OPTS];
+ bool show_all;
+ struct btf *kernel_btf;
+} bpf_map_opts;
+
+static void bpf_map_opts_mixed_error(void)
+{
+ fprintf(stderr,
+ "ss: --bpf-maps and --bpf-map-id cannot be used together\n");
+}
+
+static int bpf_map_opts_add_all(void)
+{
+ unsigned int i;
+ int fd;
+ uint32_t id = 0;
+ int r;
+
+ if (bpf_map_opts.nr_maps) {
+ bpf_map_opts_mixed_error();
+ return -1;
+ }
+
+ while (1) {
+ struct bpf_map_info info = {};
+ uint32_t len = sizeof(info);
+
+ r = bpf_map_get_next_id(id, &id);
+ if (r) {
+ if (errno == ENOENT)
+ break;
+
+ fprintf(stderr, "ss: failed to fetch BPF map ID\n");
+ goto err;
+ }
+
+ fd = bpf_map_get_fd_by_id(id);
+ if (fd == -1) {
+ fprintf(stderr, "ss: cannot get fd for BPF map ID %u%s\n",
+ id, errno == EPERM ?
+ ": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
+ goto err;
+ }
+
+ r = bpf_obj_get_info_by_fd(fd, &info, &len);
+ if (r) {
+ fprintf(stderr, "ss: failed to get info for BPF map ID %u\n",
+ id);
+ close(fd);
+ goto err;
+ }
+
+ if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
+ close(fd);
+ continue;
+ }
+
+ if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
+ fprintf(stderr, "ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %u\n",
+ MAX_NR_BPF_MAP_ID_OPTS, id);
+ close(fd);
+ continue;
+ }
+
+ bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+ }
+
+ bpf_map_opts.show_all = true;
+
+ return 0;
+
+err:
+ for (i = 0; i < bpf_map_opts.nr_maps; ++i)
+ close(bpf_map_opts.maps[i].fd);
+
+ return -1;
+}
+
+static int bpf_map_opts_add_id(const char *optarg)
+{
+ struct bpf_map_info info = {};
+ uint32_t len = sizeof(info);
+ size_t optarg_len;
+ unsigned long id;
+ unsigned int i;
+ char *end;
+ int fd;
+ int r;
+
+ if (bpf_map_opts.show_all) {
+ bpf_map_opts_mixed_error();
+ return -1;
+ }
+
+ optarg_len = strlen(optarg);
+ id = strtoul(optarg, &end, 0);
+ if (end != optarg + optarg_len || id == 0 || id > UINT32_MAX) {
+ fprintf(stderr, "ss: invalid BPF map ID %s\n", optarg);
+ return -1;
+ }
+
+ for (i = 0; i < bpf_map_opts.nr_maps; i++) {
+ if (bpf_map_opts.maps[i].id == id)
+ return 0;
+ }
+
+ if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
+ fprintf(stderr, "ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %lu\n",
+ MAX_NR_BPF_MAP_ID_OPTS, id);
+ return 0;
+ }
+
+ fd = bpf_map_get_fd_by_id(id);
+ if (fd == -1) {
+ fprintf(stderr, "ss: cannot get fd for BPF map ID %lu%s\n",
+ id, errno == EPERM ?
+ ": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
+ return -1;
+ }
+
+ r = bpf_obj_get_info_by_fd(fd, &info, &len);
+ if (r) {
+ fprintf(stderr, "ss: failed to get info for BPF map ID %lu\n", id);
+ close(fd);
+ return -1;
+ }
+
+ if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
+ fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
+ optarg, libbpf_bpf_map_type_str(info.type));
+ close(fd);
+ return -1;
+ }
+
+ bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+
+ return 0;
+}
+
+static inline bool bpf_map_opts_is_enabled(void)
+{
+ return bpf_map_opts.nr_maps;
+}
+
+static struct rtattr *bpf_map_opts_alloc_rta(void)
+{
+ size_t total_size = RTA_LENGTH(RTA_LENGTH(sizeof(int)) * bpf_map_opts.nr_maps);
+ struct rtattr *stgs_rta, *fd_rta;
+ unsigned int i;
+ void *buf;
+
+ /* Allocate the nested attribute directly into its final buffer; a
+ * separate bootstrap allocation would be overwritten below and
+ * leaked, so none is made here.
+ */
+ buf = malloc(total_size);
+ if (!buf)
+ return NULL;
+
+ stgs_rta = buf;
+ stgs_rta->rta_type = INET_DIAG_REQ_SK_BPF_STORAGES | NLA_F_NESTED;
+ stgs_rta->rta_len = total_size;
+
+ buf = RTA_DATA(stgs_rta);
+ for (i = 0; i < bpf_map_opts.nr_maps; i++) {
+ int *fd;
+
+ fd_rta = buf;
+ fd_rta->rta_type = SK_DIAG_BPF_STORAGE_REQ_MAP_FD;
+ fd_rta->rta_len = RTA_LENGTH(sizeof(int));
+
+ fd = RTA_DATA(fd_rta);
+ *fd = bpf_map_opts.maps[i].fd;
+
+ buf += fd_rta->rta_len;
+ }
+
+ return stgs_rta;
+}
+
+static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
+{
+ struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
+ unsigned int rem;
+
+ for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
+ RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
+
+ if ((bpf_stg->rta_type & NLA_TYPE_MASK) != SK_DIAG_BPF_STORAGE)
+ continue;
+
+ parse_rtattr_nested(tb, SK_DIAG_BPF_STORAGE_MAX,
+ (struct rtattr *)bpf_stg);
+
+ if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
+ out("map_id:%u",
+ rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+ }
+ }
+}
+
+#endif
+
static int inet_show_sock(struct nlmsghdr *nlh,
struct sockstat *s)
{
@@ -3375,8 +3598,8 @@ static int inet_show_sock(struct nlmsghdr *nlh,
struct inet_diag_msg *r = NLMSG_DATA(nlh);
unsigned char v6only = 0;
- parse_rtattr(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
- nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
+ parse_rtattr_flags(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
+ nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)), NLA_F_NESTED);
if (tb[INET_DIAG_PROTOCOL])
s->type = rta_getattr_u8(tb[INET_DIAG_PROTOCOL]);
@@ -3473,6 +3696,11 @@ static int inet_show_sock(struct nlmsghdr *nlh,
}
sctp_ino = s->ino;
+ if (tb[INET_DIAG_SK_BPF_STORAGES]) {
+ field_set(COL_SKSTOR);
+ show_sk_bpf_storages(tb[INET_DIAG_SK_BPF_STORAGES]);
+ }
+
return 0;
}
@@ -3554,13 +3782,14 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
{
struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
DIAG_REQUEST(req, struct inet_diag_req_v2 r);
+ struct rtattr *bpf_stgs_rta = NULL;
char *bc = NULL;
int bclen;
__u32 proto;
struct msghdr msg;
struct rtattr rta_bc;
struct rtattr rta_proto;
- struct iovec iov[5];
+ struct iovec iov[6];
int iovlen = 1;
if (family == PF_UNSPEC)
@@ -3613,6 +3842,17 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
iovlen += 2;
}
+ if (bpf_map_opts_is_enabled()) {
+ bpf_stgs_rta = bpf_map_opts_alloc_rta();
+ if (!bpf_stgs_rta) {
+ fprintf(stderr, "ss: cannot alloc request for --bpf-map\n");
+ return -1;
+ }
+
+ iov[iovlen++] = (struct iovec){ bpf_stgs_rta, bpf_stgs_rta->rta_len };
+ req.nlh.nlmsg_len += bpf_stgs_rta->rta_len;
+ }
+
msg = (struct msghdr) {
.msg_name = (void *)&nladdr,
.msg_namelen = sizeof(nladdr),
@@ -3621,10 +3861,13 @@ static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
};
if (sendmsg(fd, &msg, 0) < 0) {
+ free(bpf_stgs_rta);
close(fd);
return -1;
}
+ free(bpf_stgs_rta);
+
return 0;
}
@@ -5344,6 +5587,10 @@ static void _usage(FILE *dest)
" --tos show tos and priority information\n"
" --cgroup show cgroup information\n"
" -b, --bpf show bpf filter socket information\n"
+#ifdef HAVE_LIBBPF
+" --bpf-maps show all BPF socket-local storage maps\n"
+" --bpf-map-id=MAP-ID show a BPF socket-local storage map\n"
+#endif
" -E, --events continually display sockets as they are destroyed\n"
" -Z, --context display task SELinux security contexts\n"
" -z, --contexts display task and socket SELinux security contexts\n"
@@ -5460,6 +5707,9 @@ static int scan_state(const char *state)
#define OPT_INET_SOCKOPT 262
+#define OPT_BPF_MAPS 263
+#define OPT_BPF_MAP_ID 264
+
static const struct option long_opts[] = {
{ "numeric", 0, 0, 'n' },
{ "resolve", 0, 0, 'r' },
@@ -5504,6 +5754,10 @@ static const struct option long_opts[] = {
{ "mptcp", 0, 0, 'M' },
{ "oneline", 0, 0, 'O' },
{ "inet-sockopt", 0, 0, OPT_INET_SOCKOPT },
+#ifdef HAVE_LIBBPF
+ { "bpf-maps", 0, 0, OPT_BPF_MAPS},
+ { "bpf-map-id", 1, 0, OPT_BPF_MAP_ID},
+#endif
{ 0 }
};
@@ -5706,6 +5960,16 @@ int main(int argc, char *argv[])
case OPT_INET_SOCKOPT:
show_inet_sockopt = 1;
break;
+#ifdef HAVE_LIBBPF
+ case OPT_BPF_MAPS:
+ if (bpf_map_opts_add_all())
+ exit(1);
+ break;
+ case OPT_BPF_MAP_ID:
+ if (bpf_map_opts_add_id(optarg))
+ exit(1);
+ break;
+#endif
case 'h':
help();
case '?':
@@ -5804,6 +6068,9 @@ int main(int argc, char *argv[])
if (!(current_filter.states & (current_filter.states - 1)))
columns[COL_STATE].disabled = 1;
+ if (bpf_map_opts.nr_maps)
+ columns[COL_SKSTOR].disabled = 0;
+
if (show_header)
print_header();
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 3/3] ss: pretty-print BPF socket-local storage
2023-11-28 2:30 [PATCH 0/3] ss: pretty-printing BPF socket-local storage Quentin Deslandes
2023-11-28 2:30 ` [PATCH 1/3] ss: prevent "Process" column from being printed unless requested Quentin Deslandes
2023-11-28 2:30 ` [PATCH 2/3] ss: add support for BPF socket-local storage Quentin Deslandes
@ 2023-11-28 2:30 ` Quentin Deslandes
2023-11-28 23:42 ` Martin KaFai Lau
2023-11-28 22:43 ` [PATCH 0/3] ss: pretty-printing " Stephen Hemminger
3 siblings, 1 reply; 10+ messages in thread
From: Quentin Deslandes @ 2023-11-28 2:30 UTC (permalink / raw)
To: netdev; +Cc: David Ahern, Martin KaFai Lau, Quentin Deslandes
ss is able to print the map ID(s) for which a given socket has BPF
socket-local storage defined (using --bpf-maps or --bpf-map-id=). However,
the actual content of the map remains hidden.
This change aims to pretty-print the socket-local storage content following
the socket details, similar to what `bpftool map dump` would do. The exact
output format is inspired by drgn, while the BTF data processing is similar
to bpftool's.
ss will print the map's content in a best-effort fashion: BTF types that can
be printed will be displayed, while types that are not yet supported
(e.g. BTF_KIND_VAR) will be replaced by a placeholder. For readability
reasons, the --oneline option is not compatible with this change.
The new out_prefix_t type is introduced to ease the printing of compound
types (e.g. structs, unions), it defines the prefix to print before the actual
value to ensure the output is properly indented. COL_SKSTOR's header is
replaced with an empty string, as it doesn't need to be printed anymore;
it's used as a "virtual" column to refer to the socket-local storage dump,
which will be printed under the socket information. The column's width is
fixed to 1, so it doesn't mess up ss' output.
ss' output remains unchanged unless --bpf-maps or --bpf-map-id= is used,
in which case each socket containing BPF local storage will be followed by
the content of the storage before the next socket's info is displayed.
Signed-off-by: Quentin Deslandes <qde@naccy.de>
---
misc/ss.c | 558 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 551 insertions(+), 7 deletions(-)
diff --git a/misc/ss.c b/misc/ss.c
index 5b255ce3..545e5475 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -51,8 +51,13 @@
#include <linux/tls.h>
#include <linux/mptcp.h>
+#ifdef HAVE_LIBBPF
+#include <linux/btf.h>
+#endif
+
#ifdef HAVE_LIBBPF
#include <bpf/bpf.h>
+#include <bpf/btf.h>
#include <bpf/libbpf.h>
#endif
@@ -136,7 +141,7 @@ static struct column columns[] = {
{ ALIGN_RIGHT, "Peer Address:", " ", 0, 0, 0 },
{ ALIGN_LEFT, "Port", "", 0, 0, 0 },
{ ALIGN_LEFT, "Process", "", 0, 0, 0 },
- { ALIGN_LEFT, "Socket storage", "", 1, 0, 0 },
+ { ALIGN_LEFT, "", "", 1, 0, 0 },
{ ALIGN_LEFT, "", "", 0, 0, 0 },
};
@@ -1212,6 +1217,9 @@ static void render_calc_width(void)
*/
c->width = min(c->width, screen_width);
+ if (c == &columns[COL_SKSTOR])
+ c->width = 1;
+
if (c->width)
first = 0;
}
@@ -3386,6 +3394,8 @@ static struct bpf_map_opts {
struct bpf_sk_storage_map_info {
unsigned int id;
int fd;
+ struct bpf_map_info info;
+ struct btf *btf;
} maps[MAX_NR_BPF_MAP_ID_OPTS];
bool show_all;
struct btf *kernel_btf;
@@ -3397,6 +3407,32 @@ static void bpf_map_opts_mixed_error(void)
"ss: --bpf-maps and --bpf-map-id cannot be used together\n");
}
+static int bpf_maps_opts_load_btf(struct bpf_map_info *info, struct btf **btf)
+{
+ if (info->btf_vmlinux_value_type_id) {
+ if (!bpf_map_opts.kernel_btf) {
+ bpf_map_opts.kernel_btf = libbpf_find_kernel_btf();
+ if (!bpf_map_opts.kernel_btf) {
+ fprintf(stderr, "ss: failed to load kernel BTF\n");
+ return -1;
+ }
+ }
+
+ *btf = bpf_map_opts.kernel_btf;
+ } else if (info->btf_value_type_id) {
+ *btf = btf__load_from_kernel_by_id(info->btf_id);
+ if (!*btf) {
+ fprintf(stderr, "ss: failed to load BTF for map ID %u\n",
+ info->id);
+ return -1;
+ }
+ } else {
+ *btf = NULL;
+ }
+
+ return 0;
+}
+
static int bpf_map_opts_add_all(void)
{
unsigned int i;
@@ -3412,6 +3448,7 @@ static int bpf_map_opts_add_all(void)
while (1) {
struct bpf_map_info info = {};
uint32_t len = sizeof(info);
+ struct btf *btf;
r = bpf_map_get_next_id(id, &id);
if (r) {
@@ -3450,8 +3487,18 @@ static int bpf_map_opts_add_all(void)
continue;
}
+ r = bpf_maps_opts_load_btf(&info, &btf);
+ if (r) {
+ fprintf(stderr, "ss: failed to get BTF data for BPF map ID: %u\n",
+ id);
+ close(fd);
+ goto err;
+ }
+
bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
- bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps].fd = fd;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps].info = info;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps++].btf = btf;
}
bpf_map_opts.show_all = true;
@@ -3470,6 +3517,7 @@ static int bpf_map_opts_add_id(const char *optarg)
struct bpf_map_info info = {};
uint32_t len = sizeof(info);
size_t optarg_len;
+ struct btf *btf;
unsigned long id;
unsigned int i;
char *end;
@@ -3521,12 +3569,34 @@ static int bpf_map_opts_add_id(const char *optarg)
return -1;
}
+ r = bpf_maps_opts_load_btf(&info, &btf);
+ if (r) {
+ fprintf(stderr, "ss: failed to get BTF data for BPF map ID: %lu\n",
+ id);
+ return -1;
+ }
+
bpf_map_opts.maps[bpf_map_opts.nr_maps].id = id;
- bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps].fd = fd;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps].info = info;
+ bpf_map_opts.maps[bpf_map_opts.nr_maps++].btf = btf;
return 0;
}
+static const struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
+ unsigned int map_id)
+{
+ unsigned int i;
+
+ for (i = 0; i < bpf_map_opts.nr_maps; ++i) {
+ if (bpf_map_opts.maps[i].id == map_id)
+ return &bpf_map_opts.maps[i];
+ }
+
+ return NULL;
+}
+
static inline bool bpf_map_opts_is_enabled(void)
{
return bpf_map_opts.nr_maps;
@@ -3568,10 +3638,472 @@ static struct rtattr *bpf_map_opts_alloc_rta(void)
return stgs_rta;
}
+#define OUT_PREFIX_LEN 65
+
+/* Print a prefixed formatted string. Used to dump BPF socket-local storage
+ * nested structures properly. */
+#define OUT_P(p, fmt, ...) out("%s" fmt, *(p), ##__VA_ARGS__)
+
+typedef char(out_prefix_t)[OUT_PREFIX_LEN];
+
+static void out_prefix_push(out_prefix_t *prefix)
+{
+ size_t len = strlen(*prefix);
+
+ if (len + 5 > OUT_PREFIX_LEN)
+ return;
+
+ strncpy(&(*prefix)[len], " ", 5);
+}
+
+static void out_prefix_pop(out_prefix_t *prefix)
+{
+ size_t len = strlen(*prefix);
+
+ if (len < 4)
+ return;
+
+ (*prefix)[len - 4] = '\0';
+}
+
+static inline const char *btf_typename_or_fallback(const struct btf *btf,
+ unsigned int name_off)
+{
+ static const char *fallback = "<invalid name_off>";
+ static const char *anon = "<anon>";
+ const char *typename;
+
+ typename = btf__name_by_offset(btf, name_off);
+ if (!typename)
+ return fallback;
+
+ if (strcmp(typename, "") == 0)
+ return anon;
+
+ return typename;
+}
+
+static void out_btf_int128(const struct btf *btf, const struct btf_type *type,
+ const void *data, out_prefix_t *prefix)
+{
+ uint64_t high, low;
+ const char *typename;
+
+#ifdef __BIG_ENDIAN_BITFIELD
+ high = *(uint64_t *)data;
+ low = *(uint64_t *)(data + 8);
+#else
+ high = *(uint64_t *)(data + 8);
+ low = *(uint64_t *)data;
+#endif
+
+ typename = btf_typename_or_fallback(btf, type->name_off);
+
+ if (high == 0)
+ OUT_P(prefix, "(%s)0x%lx,\n", typename, low);
+ else
+ OUT_P(prefix, "(%s)0x%lx%016lx,\n", typename, high, low);
+}
+
+#define BITS_PER_BYTE_MASKED(bits) ((bits) & 7)
+#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
+#define BITS_ROUNDUP_BYTES(bits) \
+ (BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits))
+
+static void out_btf_bitfield(const struct btf *btf, const struct btf_type *type,
+ uint32_t bitfield_offset, uint8_t bitfield_size, const void *data,
+ out_prefix_t *prefix)
+{
+ int left_shift_bits, right_shift_bits;
+ uint64_t high, low;
+ uint64_t print_num[2] = {};
+ int bits_to_copy;
+ const char *typename;
+
+ bits_to_copy = bitfield_offset + bitfield_size;
+ memcpy(print_num, data, BITS_ROUNDUP_BYTES(bits_to_copy));
+
+ right_shift_bits = 128 - bitfield_size;
+#if defined(__BIG_ENDIAN_BITFIELD)
+ high = print_num[0];
+ low = print_num[1];
+ left_shift_bits = bitfield_offset;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+ high = print_num[1];
+ low = print_num[0];
+ left_shift_bits = 128 - bits_to_copy;
+#else
+#error neither big nor little endian
+#endif
+
+ /* shake out un-needed bits by shift/or operations */
+ if (left_shift_bits >= 64) {
+ high = low << (left_shift_bits - 64);
+ low = 0;
+ } else {
+ high = (high << left_shift_bits) | (low >> (64 - left_shift_bits));
+ low = low << left_shift_bits;
+ }
+
+ if (right_shift_bits >= 64) {
+ low = high >> (right_shift_bits - 64);
+ high = 0;
+ } else {
+ low = (low >> right_shift_bits) | (high << (64 - right_shift_bits));
+ high = high >> right_shift_bits;
+ }
+
+ typename = btf_typename_or_fallback(btf, type->name_off);
+
+ if (high == 0) {
+ OUT_P(prefix, "(%s:%d)0x%lx,\n", typename, bitfield_size, low);
+ } else {
+ OUT_P(prefix, "(%s:%d)0x%lx%016lx,\n", typename, bitfield_size,
+ high, low);
+ }
+}
+
+static void out_btf_int(const struct btf *btf, const struct btf_type *type,
+ uint32_t bit_offset, const void *data, out_prefix_t *prefix)
+{
+ uint32_t *int_type = (uint32_t *)(type + 1);
+ uint32_t nbits = BTF_INT_BITS(*int_type);
+ const char *typename;
+
+ typename = btf_typename_or_fallback(btf, type->name_off);
+
+ if (bit_offset || BTF_INT_OFFSET(*int_type) ||
+ BITS_PER_BYTE_MASKED(nbits)) {
+ out_btf_bitfield(btf, type, BTF_INT_OFFSET(*int_type), nbits,
+ data, prefix);
+ return;
+ }
+
+ if (nbits == 128) {
+ out_btf_int128(btf, type, data, prefix);
+ return;
+ }
+
+ switch (BTF_INT_ENCODING(*int_type)) {
+ case 0:
+ if (BTF_INT_BITS(*int_type) == 64)
+ OUT_P(prefix, "(%s)%lu,\n", typename, *(uint64_t *)data);
+ else if (BTF_INT_BITS(*int_type) == 32)
+ OUT_P(prefix, "(%s)%u,\n", typename, *(uint32_t *)data);
+ else if (BTF_INT_BITS(*int_type) == 16)
+ OUT_P(prefix, "(%s)%hu,\n", typename, *(uint16_t *)data);
+ else if (BTF_INT_BITS(*int_type) == 8)
+ OUT_P(prefix, "(%s)%hhu,\n", typename, *(uint8_t *)data);
+ else
+ OUT_P(prefix, "<invalid unsigned int type>,");
+ break;
+ case BTF_INT_SIGNED:
+ if (BTF_INT_BITS(*int_type) == 64)
+ OUT_P(prefix, "(%s)%ld,\n", typename, *(int64_t *)data);
+ else if (BTF_INT_BITS(*int_type) == 32)
+ OUT_P(prefix, "(%s)%d,\n", typename, *(int32_t *)data);
+ else if (BTF_INT_BITS(*int_type) == 16)
+ OUT_P(prefix, "(%s)%hd,\n", typename, *(int16_t *)data);
+ else if (BTF_INT_BITS(*int_type) == 8)
+ OUT_P(prefix, "(%s)%hhd,\n", typename, *(int8_t *)data);
+ else
+ OUT_P(prefix, "<invalid signed int type>,");
+ break;
+ case BTF_INT_CHAR:
+ OUT_P(prefix, "(%s)0x%hhx,\n", typename, *(char *)data);
+ break;
+ case BTF_INT_BOOL:
+ OUT_P(prefix, "(%s)%s,\n", typename,
+ *(bool *)data ? "true" : "false");
+ break;
+ default:
+ OUT_P(prefix, "<unknown type>,\n");
+ break;
+ }
+}
+
+static void out_btf_ptr(const struct btf *btf, const struct btf_type *type,
+ const void *data, out_prefix_t *prefix)
+{
+ unsigned long value = *(unsigned long *)data;
+ int actual_type_id;
+ const struct btf_type *actual_type;
+ const char *typename = NULL;
+
+ actual_type_id = btf__resolve_type(btf, type->type);
+ if (actual_type_id > 0) {
+ actual_type = btf__type_by_id(btf, actual_type_id);
+ if (actual_type)
+ typename = btf__name_by_offset(btf, actual_type->name_off);
+ }
+
+ typename = typename ? : "void";
+
+ OUT_P(prefix, "(%s *)%p,\n", typename, (void *)value);
+}
+
+static void out_btf_dump_type(const struct btf *btf, int bit_offset,
+ uint32_t type_id, const void *data, size_t len, out_prefix_t *prefix);
+
+static void out_btf_array(const struct btf *btf, const struct btf_type *type,
+ const void *data, out_prefix_t *prefix)
+{
+ const struct btf_array *array = (struct btf_array *)(type + 1);
+ const struct btf_type *elem_type;
+ long long elem_size;
+
+ elem_type = btf__type_by_id(btf, array->type);
+ if (!elem_type) {
+ OUT_P(prefix, "<invalid type_id %u>,\n", array->type);
+ return;
+ }
+
+ elem_size = btf__resolve_size(btf, array->type);
+ if (elem_size < 0) {
+ OUT_P(prefix, "<can't resolve size for type_id %u>,\n", array->type);
+ return;
+ }
+
+ for (int i = 0; i < array->nelems; ++i) {
+ out_btf_dump_type(btf, 0, array->type, data + i * elem_size,
+ elem_size, prefix);
+ }
+}
+
+static void out_btf_struct(const struct btf *btf, const struct btf_type *type,
+ const void *data, out_prefix_t *prefix)
+{
+ struct btf_member *member = (struct btf_member *)(type + 1);
+ const struct btf_type *member_type;
+ const void *member_data;
+ out_prefix_t prefix_override = {};
+ unsigned int i;
+
+ for (i = 0; i < BTF_INFO_VLEN(type->info); i++) {
+ uint32_t bitfield_offset = member[i].offset;
+ uint32_t bitfield_size = 0;
+
+ if (BTF_INFO_KFLAG(type->info)) {
+ /* If btf_type.info.kind_flag is set, then
+ * btf_member.offset is composed of:
+ * bitfield_offset << 24 | bitfield_size
+ */
+ bitfield_size = BTF_MEMBER_BITFIELD_SIZE(bitfield_offset);
+ bitfield_offset = BTF_MEMBER_BIT_OFFSET(bitfield_offset);
+ }
+
+ OUT_P(prefix, ".%s = ",
+ btf_typename_or_fallback(btf, member[i].name_off));
+
+ /* The prefix has to be overwritten as this function prints the
+ * field's name, so we don't print the prefix once here before
+ * the name, then again in out_btf_bitfield() or out_btf_int()
+ * before printing the actual value on the same line. */
+
+ member_type = btf__type_by_id(btf, member[i].type);
+ if (!member_type) {
+ OUT_P(&prefix_override, "<invalid type_id %u>,\n",
+ member[i].type);
+ return;
+ }
+
+ member_data = data + BITS_ROUNDDOWN_BYTES(bitfield_offset);
+ bitfield_offset = BITS_PER_BYTE_MASKED(bitfield_offset);
+
+ if (bitfield_size) {
+ out_btf_bitfield(btf, member_type, bitfield_offset,
+ bitfield_size, member_data, &prefix_override);
+ } else {
+ out_btf_dump_type(btf, bitfield_offset, member[i].type,
+ member_data, 0, &prefix_override);
+ }
+ }
+}
+
+static void out_btf_enum(const struct btf *btf, const struct btf_type *type,
+ const void *data, out_prefix_t *prefix)
+{
+ const struct btf_enum *enums = (struct btf_enum *)(type + 1);
+ int64_t value;
+ unsigned int i;
+
+ switch (type->size) {
+ case 8:
+ value = *(int64_t *)data;
+ break;
+ case 4:
+ value = *(int32_t *)data;
+ break;
+ case 2:
+ value = *(int16_t*)data;
+ break;
+ case 1:
+ value = *(int8_t *)data;
+ break;
+ default:
+ OUT_P(prefix, "<invalid type size %u>,\n", type->size);
+ return;
+ }
+
+ for (i = 0; i < BTF_INFO_VLEN(type->info); ++i) {
+ if (value == enums[i].val) {
+ OUT_P(prefix, "(enum %s)%s\n",
+ btf_typename_or_fallback(btf, type->name_off),
+ btf_typename_or_fallback(btf, enums[i].name_off));
+ return;
+ }
+ }
+}
+
+static void out_btf_enum64(const struct btf *btf, const struct btf_type *type,
+ const void *data, out_prefix_t *prefix)
+{
+ const struct btf_enum64 *enums = (struct btf_enum64 *)(type + 1);
+ uint32_t lo32, hi32;
+ uint64_t value;
+ unsigned int i;
+
+ value = *(uint64_t *)data;
+ lo32 = (uint32_t)value;
+ hi32 = value >> 32;
+
+ for (i = 0; i < BTF_INFO_VLEN(type->info); i++) {
+ if (lo32 == enums[i].val_lo32 && hi32 == enums[i].val_hi32) {
+ OUT_P(prefix, "(enum %s)%s\n",
+ btf_typename_or_fallback(btf, type->name_off),
+ btf__name_by_offset(btf, enums[i].name_off));
+ return;
+ }
+ }
+}
+
+static out_prefix_t out_global_prefix = {};
+
+static void out_btf_dump_type(const struct btf *btf, int bit_offset,
+ uint32_t type_id, const void *data, size_t len, out_prefix_t *prefix)
+{
+ const struct btf_type *type;
+ out_prefix_t *global_prefix = &out_global_prefix;
+
+ if (!btf) {
+ OUT_P(prefix, "<missing BTF information>,\n");
+ return;
+ }
+
+ type = btf__type_by_id(btf, type_id);
+ if (!type) {
+ OUT_P(prefix, "<invalid type_id %u>,\n", type_id);
+ return;
+ }
+
+ switch (BTF_INFO_KIND(type->info)) {
+ case BTF_KIND_UNION:
+ case BTF_KIND_STRUCT:
+ OUT_P(prefix, "(%s %s) {\n",
+ BTF_INFO_KIND(type->info) == BTF_KIND_STRUCT ? "struct" : "union",
+ btf_typename_or_fallback(btf, type->name_off));
+
+ out_prefix_push(global_prefix);
+ out_btf_struct(btf, type, data, global_prefix);
+ out_prefix_pop(global_prefix);
+ OUT_P(global_prefix, "},\n");
+ break;
+ case BTF_KIND_ARRAY:
+ {
+ struct btf_array *array = (struct btf_array *)(type + 1);
+ const struct btf_type *content_type = btf__type_by_id(btf, array->type);
+
+ if (!content_type) {
+ OUT_P(prefix, "<invalid type_id %u>,\n", array->type);
+ return;
+ }
+
+ OUT_P(prefix, "(%s[]) {\n",
+ btf_typename_or_fallback(btf, content_type->name_off));
+ out_prefix_push(global_prefix);
+ out_btf_array(btf, type, data, global_prefix);
+ out_prefix_pop(global_prefix);
+ OUT_P(global_prefix, "},\n");
+ }
+ break;
+ case BTF_KIND_TYPEDEF:
+ case BTF_KIND_VOLATILE:
+ case BTF_KIND_CONST:
+ case BTF_KIND_RESTRICT:
+ {
+ int actual_type_id = btf__resolve_type(btf, type_id);
+
+ if (actual_type_id < 0) {
+ OUT_P(prefix, "<invalid type_id %u>,\n", type_id);
+ return;
+ }
+
+ return out_btf_dump_type(btf, 0, actual_type_id, data,
+ len, prefix);
+ }
+ break;
+ case BTF_KIND_INT:
+ out_btf_int(btf, type, bit_offset, data, prefix);
+ break;
+ case BTF_KIND_PTR:
+ out_btf_ptr(btf, type, data, prefix);
+ break;
+ case BTF_KIND_ENUM:
+ out_btf_enum(btf, type, data, prefix);
+ break;
+ case BTF_KIND_ENUM64:
+ out_btf_enum64(btf, type, data, prefix);
+ break;
+ case BTF_KIND_FWD:
+ OUT_P(prefix, "<forward kind invalid>,\n");
+ break;
+ case BTF_KIND_UNKN:
+ OUT_P(prefix, "<unknown>,\n");
+ break;
+ case BTF_KIND_VAR:
+ case BTF_KIND_DATASEC:
+ default:
+ OUT_P(prefix, "<unsupported kind %u>,\n",
+ BTF_INFO_KIND(type->info));
+ break;
+ }
+}
+
+static void out_bpf_sk_storage(int map_id, const void *data, size_t len,
+ out_prefix_t *prefix)
+{
+ uint32_t type_id;
+ const struct bpf_sk_storage_map_info *map_info;
+
+ map_info = bpf_map_opts_get_info(map_id);
+ if (!map_info) {
+ OUT_P(prefix, "map_id: %d: missing map info\n", map_id);
+ return;
+ }
+
+ if (map_info->info.value_size != len) {
+ OUT_P(prefix, "map_id: %d: invalid value size, expecting %u, got %zu\n",
+ map_id, map_info->info.value_size, len);
+ return;
+ }
+
+ type_id = map_info->info.btf_vmlinux_value_type_id ?: map_info->info.btf_value_type_id;
+
+ OUT_P(prefix, "map_id: %d [\n", map_id);
+ out_prefix_push(prefix);
+
+ out_btf_dump_type(map_info->btf, 0, type_id, data, len, prefix);
+
+ out_prefix_pop(prefix);
+ OUT_P(prefix, "]");
+}
+
static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
{
- struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
- unsigned int rem;
+ struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX+1], *bpf_stg;
+ out_prefix_t *global_prefix = &out_global_prefix;
+ unsigned int rem, map_id;
+ struct rtattr *value;
for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
@@ -3583,8 +4115,15 @@ static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
(struct rtattr *)bpf_stg);
if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
- out("map_id:%u",
- rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+ out("\n");
+
+ map_id = rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]);
+ value = tb[SK_DIAG_BPF_STORAGE_MAP_VALUE];
+
+ out_prefix_push(global_prefix);
+ out_bpf_sk_storage(map_id, RTA_DATA(value),
+ RTA_PAYLOAD(value), global_prefix);
+ out_prefix_pop(global_prefix);
}
}
}
@@ -5978,6 +6517,11 @@ int main(int argc, char *argv[])
}
}
+ if (oneline && (bpf_map_opts.nr_maps || bpf_map_opts.show_all)) {
+ fprintf(stderr, "ss: --oneline, --bpf-maps, and --bpf-map-id are incompatible\n");
+ exit(1);
+ }
+
if (show_processes || show_threads || show_proc_ctx || show_sock_ctx)
user_ent_hash_build();
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread