Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] net: ethernet: ti: cpsw: use for mcast entries only host port
From: Grygorii Strashko @ 2018-10-13  0:41 UTC (permalink / raw)
  To: Ivan Khoronzhuk, davem; +Cc: linux-omap, netdev, linux-kernel
In-Reply-To: <20181012160629.7245-1-ivan.khoronzhuk@linaro.org>



On 10/12/2018 11:06 AM, Ivan Khoronzhuk wrote:
> In dual-emac mode the cpsw driver sends directed packets, that means
> that packets go to the directed port, but an ALE lookup is performed
> to determine untagged egress only. It means that on tx side no need
> to add port bit for ALE mcast entry mask, and basically ALE entry
> for port identification is needed only on rx side.
> 
> So, add only host port in dual_emac mode as used directed
> transmission, and no need in one more port. For single port boards
> and switch mode all ports used, as usual, so no changes for them.
> Also it simplifies farther changes.
> 
> In other words, mcast entries for dual-emac should behave exactly
> like unicast. It also can help avoid leaking packets between ports
> with same vlan on h/w level if ports could became members of same vid.
> 
> So now, for instance, if mcast address 33:33:00:00:00:01 is added then
> entries in ALE table:
> 
> vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x1
> vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x1
> 
> Instead of:
> vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x3
> vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x5
> 
> With the same considerations, set only host port for unregistered
> mcast for dual-emac mode in case of IFF_ALLMULTI is set, exactly like
> it's done in cpsw_ale_set_allmulti().
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
> 
> Based on net-next/master



Thank you.
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>

-- 
regards,
-grygorii

^ permalink raw reply

* Re: [PATCH net-next] net: bridge: add support for per-port vlan stats
From: David Miller @ 2018-10-12 17:20 UTC (permalink / raw)
  To: nikolay; +Cc: netdev, bridge, roopa
In-Reply-To: <20181012104116.12751-1-nikolay@cumulusnetworks.com>

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Fri, 12 Oct 2018 13:41:16 +0300

> This patch adds an option to have per-port vlan stats instead of the
> default global stats. The option can be set only when there are no port
> vlans in the bridge since we need to allocate the stats if it is set
> when vlans are being added to ports (and respectively free them
> when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
> largest user of options. The current stats design allows us to add
> these without any changes to the fast-path, it all comes down to
> the per-vlan stats pointer which, if this option is enabled, will
> be allocated for each port vlan instead of using the global bridge-wide
> one.
> 
> CC: bridge@lists.linux-foundation.org
> CC: Roopa Prabhu <roopa@cumulusnetworks.com>
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] netpoll: allow cleanup to be synchronous
From: Banerjee, Debabrata @ 2018-10-12 17:27 UTC (permalink / raw)
  To: David S . Miller, netdev@vger.kernel.org; +Cc: Neil Horman
In-Reply-To: <20181012165929.20098-1-dbanerje@akamai.com>

Actually I realized this patch might be problematic, although someone might be holding rtnl, it might not be the current callstack. A different solution is needed I think. Input appreciated.

-Deb

^ permalink raw reply

* [PATCH net-next] netpoll: allow cleanup to be synchronous
From: Debabrata Banerjee @ 2018-10-12 16:59 UTC (permalink / raw)
  To: David S . Miller, netdev; +Cc: dbanerje, Neil Horman

This fixes a problem introduced by:
commit 2cde6acd49da ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")

When using netconsole on a bond, __netpoll_cleanup can asynchronously
recurse multiple times, each __netpoll_free_async call can result in
more __netpoll_free_async's. This means there is now a race between
cleanup_work queues on multiple netpoll_info's on multiple devices and
the configuration of a new netpoll. For example if a netconsole is set
to enable 0, reconfigured, and enable 1 immediately, this netconsole
will likely not work.

Given the reason for __netpoll_free_async is it can be called when rtnl
is not locked, if it is locked, we should be able to execute
synchronously.

CC: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
 net/core/netpoll.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index de1d1ba92f2d..b899cbfbe639 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -826,7 +826,10 @@ static void netpoll_async_cleanup(struct work_struct *work)

 void __netpoll_free_async(struct netpoll *np)
 {
-	schedule_work(&np->cleanup_work);
+	if (rtnl_is_locked())
+		__netpoll_cleanup(np);
+	else
+		schedule_work(&np->cleanup_work);
 }
 EXPORT_SYMBOL_GPL(__netpoll_free_async);

-- 
2.19.1

^ permalink raw reply related

* Re: pull-request: mac80211-next 2018-10-12
From: David Miller @ 2018-10-12 17:57 UTC (permalink / raw)
  To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <20181012111354.30062-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 12 Oct 2018 13:13:53 +0200

> Here's another set of updates. Note that there's a patch here
> (lib80211 skcipher removal to simplify the code) that will
> cause conflicts when merging with the crypto tree, the new
> code from this should be used. Please advise Greg/Linus unless
> the crypto tree will somehow end up in your tree first?
> 
> Please pull and let me know if there's any problem.

Thanks I will keep that merge issue in mind.

Even if I forget, Linus and Greg are adults and usually can sort
things like that out. :-)

Really nice to see the new netlink validation code put to use.

^ permalink raw reply

* Re: [PATCH v2] netlink: replace __NLA_ENSURE implementation
From: David Miller @ 2018-10-12 18:00 UTC (permalink / raw)
  To: johannes; +Cc: linux-wireless, netdev, john.garry, johannes.berg
In-Reply-To: <20181012105300.25747-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 12 Oct 2018 12:53:00 +0200

> From: Johannes Berg <johannes.berg@intel.com>
> 
> We already have BUILD_BUG_ON_ZERO() which I just hadn't found
> before, so we should use it here instead of open-coding another
> implementation thereof.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> v2: remove all the language about -Wvla, I misunderstood John
>     and he was just referring to *other* VLA warnings in the
>     wifi stack (which we know about and are being fixed by the
>     crypto tree)

Applied to net-next.

^ permalink raw reply

* [PATCH bpf-next] tools: bpftool: add map create command
From: Jakub Kicinski @ 2018-10-12 18:06 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jakub Kicinski

Add a way of creating maps from user space.  The command takes
as parameters most of the attributes of the map creation system
call command.  After map is created its pinned to bpffs.  This makes
it possible to easily and dynamically (without rebuilding programs)
test various corner cases related to map creation.

Map type names are taken from bpftool's array used for printing.
In general these days we try to make use of libbpf type names, but
there are no map type names in libbpf as of today.

As with most features I add the motivation is testing (offloads) :)

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 .../bpf/bpftool/Documentation/bpftool-map.rst |  15 ++-
 tools/bpf/bpftool/bash-completion/bpftool     |  38 ++++++-
 tools/bpf/bpftool/common.c                    |  21 ++++
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/map.c                       | 105 +++++++++++++++++-
 5 files changed, 176 insertions(+), 4 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index a6258bc8ec4f..5c1baa714fbf 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -15,13 +15,15 @@ SYNOPSIS
 	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
 
 	*COMMANDS* :=
-	{ **show** | **list** | **dump** | **update** | **lookup** | **getnext** | **delete**
-	| **pin** | **help** }
+	{ **show** | **list** | **create** | **dump** | **update** | **lookup** | **getnext**
+	| **delete** | **pin** | **help** }
 
 MAP COMMANDS
 =============
 
 |	**bpftool** **map { show | list }**   [*MAP*]
+|	**bpftool** **map create**     *FILE* **type** *TYPE* **key** *KEY_SIZE* **value** *VALUE_SIZE* \
+|		**entries** *MAX_ENTRIES* [**name** *NAME*] [**flags** *FLAGS*] [**dev** *NAME*]
 |	**bpftool** **map dump**       *MAP*
 |	**bpftool** **map update**     *MAP*  **key** *DATA*   **value** *VALUE* [*UPDATE_FLAGS*]
 |	**bpftool** **map lookup**     *MAP*  **key** *DATA*
@@ -36,6 +38,11 @@ MAP COMMANDS
 |	*PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* }
 |	*VALUE* := { *DATA* | *MAP* | *PROG* }
 |	*UPDATE_FLAGS* := { **any** | **exist** | **noexist** }
+|	*TYPE* := { **hash** | **array** | **prog_array** | **perf_event_array** | **percpu_hash**
+|		| **percpu_array** | **stack_trace** | **cgroup_array** | **lru_hash**
+|		| **lru_percpu_hash** | **lpm_trie** | **array_of_maps** | **hash_of_maps**
+|		| **devmap** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
+|		| **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage** }
 
 DESCRIPTION
 ===========
@@ -47,6 +54,10 @@ DESCRIPTION
 		  Output will start with map ID followed by map type and
 		  zero or more named attributes (depending on kernel version).
 
+	**bpftool map create** *FILE* **type** *TYPE* **key** *KEY_SIZE* **value** *VALUE_SIZE*  **entries** *MAX_ENTRIES* [**name** *NAME*] [**flags** *FLAGS*] [**dev** *NAME*]
+		  Create a new map with given parameters and pin it to *bpffs*
+		  as *FILE*.
+
 	**bpftool map dump**    *MAP*
 		  Dump all entries in a given *MAP*.
 
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index df1060b852c1..277bab46cd05 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -370,6 +370,42 @@ _bpftool()
                             ;;
                     esac
                     ;;
+                create)
+                    case $prev in
+                        $command)
+                            _filedir
+                            return 0
+                            ;;
+                        type)
+                            COMPREPLY=( $( compgen -W 'hash array prog_array \
+                                perf_event_array percpu_hash percpu_array \
+                                stack_trace cgroup_array lru_hash \
+                                lru_percpu_hash lpm_trie array_of_maps \
+                                hash_of_maps devmap sockmap cpumap xskmap \
+                                sockhash cgroup_storage reuseport_sockarray \
+                                percpu_cgroup_storage' -- \
+                                                   "$cur" ) )
+                            return 0
+                            ;;
+                        key|value|flags|name|entries)
+                            return 0
+                            ;;
+                        dev)
+                            _sysfs_get_netdevs
+                            return 0
+                            ;;
+                        *)
+                            _bpftool_once_attr 'type'
+                            _bpftool_once_attr 'key'
+                            _bpftool_once_attr 'value'
+                            _bpftool_once_attr 'entries'
+                            _bpftool_once_attr 'name'
+                            _bpftool_once_attr 'flags'
+                            _bpftool_once_attr 'dev'
+                            return 0
+                            ;;
+                    esac
+                    ;;
                 lookup|getnext|delete)
                     case $prev in
                         $command)
@@ -483,7 +519,7 @@ _bpftool()
                 *)
                     [[ $prev == $object ]] && \
                         COMPREPLY=( $( compgen -W 'delete dump getnext help \
-                            lookup pin event_pipe show list update' -- \
+                            lookup pin event_pipe show list update create' -- \
                             "$cur" ) )
                     ;;
             esac
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index b3a0709ea7ed..3318da8060bd 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -618,3 +618,24 @@ void print_dev_json(__u32 ifindex, __u64 ns_dev, __u64 ns_inode)
 		jsonw_string_field(json_wtr, "ifname", name);
 	jsonw_end_object(json_wtr);
 }
+
+int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what)
+{
+	char *endptr;
+
+	NEXT_ARGP();
+
+	if (*val) {
+		p_err("%s already specified", what);
+		return -1;
+	}
+
+	*val = strtoul(**argv, &endptr, 0);
+	if (*endptr) {
+		p_err("can't parse %s as %s", **argv, what);
+		return -1;
+	}
+	NEXT_ARGP();
+
+	return 0;
+}
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 40492cdc4e53..cf29c691ca35 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -138,6 +138,7 @@ int do_cgroup(int argc, char **arg);
 int do_perf(int argc, char **arg);
 int do_net(int argc, char **arg);
 
+int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what);
 int prog_parse_fd(int *argc, char ***argv);
 int map_parse_fd(int *argc, char ***argv);
 int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len);
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 6003e9598973..eb4d781901c8 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -36,6 +36,7 @@
 #include <fcntl.h>
 #include <linux/err.h>
 #include <linux/kernel.h>
+#include <net/if.h>
 #include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -94,6 +95,17 @@ static bool map_is_map_of_progs(__u32 type)
 	return type == BPF_MAP_TYPE_PROG_ARRAY;
 }
 
+static int map_type_from_str(const char *type)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(map_type_name); i++)
+		/* Don't allow prefixing in case of possible future shadowing */
+		if (map_type_name[i] && !strcmp(map_type_name[i], type))
+			return i;
+	return -1;
+}
+
 static void *alloc_value(struct bpf_map_info *info)
 {
 	if (map_is_per_cpu(info->type))
@@ -1024,6 +1036,87 @@ static int do_pin(int argc, char **argv)
 	return err;
 }
 
+static int do_create(int argc, char **argv)
+{
+	struct bpf_create_map_attr attr = { NULL, };
+	const char *pinfile;
+	int err, fd;
+
+	if (!REQ_ARGS(7))
+		return -1;
+	pinfile = GET_ARG();
+
+	while (argc) {
+		if (!REQ_ARGS(2))
+			return -1;
+
+		if (is_prefix(*argv, "type")) {
+			NEXT_ARG();
+
+			if (attr.map_type) {
+				p_err("map type already specified");
+				return -1;
+			}
+
+			attr.map_type = map_type_from_str(*argv);
+			if ((int)attr.map_type < 0) {
+				p_err("unrecognized map type: %s", *argv);
+				return -1;
+			}
+			NEXT_ARG();
+		} else if (is_prefix(*argv, "name")) {
+			NEXT_ARG();
+			attr.name = GET_ARG();
+		} else if (is_prefix(*argv, "key")) {
+			if (parse_u32_arg(&argc, &argv, &attr.key_size,
+					  "key size"))
+				return -1;
+		} else if (is_prefix(*argv, "value")) {
+			if (parse_u32_arg(&argc, &argv, &attr.value_size,
+					  "value size"))
+				return -1;
+		} else if (is_prefix(*argv, "entries")) {
+			if (parse_u32_arg(&argc, &argv, &attr.max_entries,
+					  "max entries"))
+				return -1;
+		} else if (is_prefix(*argv, "flags")) {
+			if (parse_u32_arg(&argc, &argv, &attr.map_flags,
+					  "flags"))
+				return -1;
+		} else if (is_prefix(*argv, "dev")) {
+			NEXT_ARG();
+
+			if (attr.map_ifindex) {
+				p_err("offload device already specified");
+				return -1;
+			}
+
+			attr.map_ifindex = if_nametoindex(*argv);
+			if (!attr.map_ifindex) {
+				p_err("unrecognized netdevice '%s': %s",
+				      *argv, strerror(errno));
+				return -1;
+			}
+			NEXT_ARG();
+		}
+	}
+
+	fd = bpf_create_map_xattr(&attr);
+	if (fd < 0) {
+		p_err("map create failed: %s", strerror(errno));
+		return -1;
+	}
+
+	err = do_pin_fd(fd, pinfile);
+	close(fd);
+	if (err)
+		return err;
+
+	if (json_output)
+		jsonw_null(json_wtr);
+	return 0;
+}
+
 static int do_help(int argc, char **argv)
 {
 	if (json_output) {
@@ -1033,6 +1126,9 @@ static int do_help(int argc, char **argv)
 
 	fprintf(stderr,
 		"Usage: %s %s { show | list }   [MAP]\n"
+		"       %s %s create     FILE type TYPE key KEY_SIZE value VALUE_SIZE \\\n"
+		"                              entries MAX_ENTRIES [name NAME] [flags FLAGS] \\\n"
+		"                              [dev NAME]\n"
 		"       %s %s dump       MAP\n"
 		"       %s %s update     MAP  key DATA value VALUE [UPDATE_FLAGS]\n"
 		"       %s %s lookup     MAP  key DATA\n"
@@ -1047,11 +1143,17 @@ static int do_help(int argc, char **argv)
 		"       " HELP_SPEC_PROGRAM "\n"
 		"       VALUE := { DATA | MAP | PROG }\n"
 		"       UPDATE_FLAGS := { any | exist | noexist }\n"
+		"       TYPE := { hash | array | prog_array | perf_event_array | percpu_hash |\n"
+		"                 percpu_array | stack_trace | cgroup_array | lru_hash |\n"
+		"                 lru_percpu_hash | lpm_trie | array_of_maps | hash_of_maps |\n"
+		"                 devmap | sockmap | cpumap | xskmap | sockhash |\n"
+		"                 cgroup_storage | reuseport_sockarray | percpu_cgroup_storage }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
 		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
-		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2]);
 
 	return 0;
 }
@@ -1067,6 +1169,7 @@ static const struct cmd cmds[] = {
 	{ "delete",	do_delete },
 	{ "pin",	do_pin },
 	{ "event_pipe",	do_event_pipe },
+	{ "create",	do_create },
 	{ 0 }
 };
 
-- 
2.17.1

^ permalink raw reply related

* [PATCH net-next] nfp: devlink port split support for 1x100G CXP NIC
From: Jakub Kicinski @ 2018-10-12 18:09 UTC (permalink / raw)
  To: davem; +Cc: netdev, oss-drivers, Ryan C Goodfellow

From: Ryan C Goodfellow <rgoodfel@isi.edu>

This commit makes it possible to use devlink to split the 100G CXP
Netronome into two 40G interfaces. Currently when you ask for 2
interfaces, the math in src/nfp_devlink.c:nfp_devlink_port_split
calculates that you want 5 lanes per port because for some reason
eth_port.port_lanes=10 (shouldn't this be 12 for CXP?). What we really
want when asking for 2 breakout interfaces is 4 lanes per port. This
commit makes that happen by calculating based on 8 lanes if 10 are
present.

Signed-off-by: Ryan C Goodfellow <rgoodfel@isi.edu>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Greg Weeks <greg.weeks@netronome.com>
---
 .../net/ethernet/netronome/nfp/nfp_devlink.c    | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 107b048b33b4..808647ec3573 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -66,6 +66,7 @@ nfp_devlink_port_split(struct devlink *devlink, unsigned int port_index,
 {
 	struct nfp_pf *pf = devlink_priv(devlink);
 	struct nfp_eth_table_port eth_port;
+	unsigned int lanes;
 	int ret;
 
 	if (count < 2)
@@ -84,8 +85,12 @@ nfp_devlink_port_split(struct devlink *devlink, unsigned int port_index,
 		goto out;
 	}
 
-	ret = nfp_devlink_set_lanes(pf, eth_port.index,
-				    eth_port.port_lanes / count);
+	/* Special case the 100G CXP -> 2x40G split */
+	lanes = eth_port.port_lanes / count;
+	if (eth_port.lanes == 10 && count == 2)
+		lanes = 8 / count;
+
+	ret = nfp_devlink_set_lanes(pf, eth_port.index, lanes);
 out:
 	mutex_unlock(&pf->lock);
 
@@ -98,6 +103,7 @@ nfp_devlink_port_unsplit(struct devlink *devlink, unsigned int port_index,
 {
 	struct nfp_pf *pf = devlink_priv(devlink);
 	struct nfp_eth_table_port eth_port;
+	unsigned int lanes;
 	int ret;
 
 	mutex_lock(&pf->lock);
@@ -113,7 +119,12 @@ nfp_devlink_port_unsplit(struct devlink *devlink, unsigned int port_index,
 		goto out;
 	}
 
-	ret = nfp_devlink_set_lanes(pf, eth_port.index, eth_port.port_lanes);
+	/* Special case the 100G CXP -> 2x40G unsplit */
+	lanes = eth_port.port_lanes;
+	if (eth_port.port_lanes == 8)
+		lanes = 10;
+
+	ret = nfp_devlink_set_lanes(pf, eth_port.index, lanes);
 out:
 	mutex_unlock(&pf->lock);
 
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH net] rxrpc: Fix incorrect conditional on IPV6
From: David Miller @ 2018-10-12 18:11 UTC (permalink / raw)
  To: eric.dumazet; +Cc: dhowells, arnd, netdev, linux-afs
In-Reply-To: <9ee1a767-b865-6afa-91a2-311422b8338d@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 12 Oct 2018 08:19:20 -0700

> 
> 
> On 10/12/2018 07:52 AM, David Howells wrote:
>> The udpv6_encap_enable() function is part of the ipv6 code, and if that is
>> configured as a loadable module and rxrpc is built in then a build failure
>> will occur because the conditional check is wrong:
>> 
>>   net/rxrpc/local_object.o: In function `rxrpc_lookup_local':
>>   local_object.c:(.text+0x2688): undefined reference to `udpv6_encap_enable'
>> 
>> Use the correct config symbol (CONFIG_AF_RXRPC_IPV6) in the conditional
>> check rather than CONFIG_IPV6 as that will do the right thing.
>> 
>> Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook")
>> Signed-off-by: David Howells <dhowells@redhat.com>
>> cc: Arnd Bergmann <arnd@arndb.de>
> 
> Nit : Correct attribution would require a Reported-by: tag

Right.

^ permalink raw reply

* Re: [PATCH net-next 0/4] s390/qeth: updates 2018-10-12
From: David Miller @ 2018-10-12 18:27 UTC (permalink / raw)
  To: jwi; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20181012152715.6153-1-jwi@linux.ibm.com>

From: Julian Wiedmann <jwi@linux.ibm.com>
Date: Fri, 12 Oct 2018 17:27:11 +0200

> please apply one more patchset for net-next. This extends the TSO support
> in qeth.

Series applied.

^ permalink raw reply

* [PATCH] net: qla3xxx: Remove overflowing shift statement
From: Nathan Chancellor @ 2018-10-13  2:14 UTC (permalink / raw)
  To: Dept-GELinuxNICDev, David S. Miller
  Cc: netdev, Sudarsana Reddy Kalluru, Tomer Tayar, Michal Kalderon,
	Ariel Elior, linux-kernel, Nathan Chancellor

Clang currently warns:

drivers/net/ethernet/qlogic/qla3xxx.c:384:24: warning: signed shift
result (0xF00000000) requires 37 bits to represent, but 'int' only has
32 bits [-Wshift-overflow]
                    ((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data));
                      ~~~~~~~~~~~~~~ ^  ~~
1 warning generated.

The warning is certainly accurate since ISP_NVRAM_MASK is defined as
(0x000F << 16) which is then shifted by 16, resulting in 64424509440,
well above UINT_MAX.

Given that this is the only location in this driver where ISP_NVRAM_MASK
is shifted again, it seems likely that ISP_NVRAM_MASK was originally
defined without a shift and during the move of the shift to the
definition, this statement wasn't properly removed (since ISP_NVRAM_MASK
is used in the statenent right above this). Only the maintainers can
confirm this since this statment has been here since the driver was
first added to the kernel.

Link: https://github.com/ClangBuiltLinux/linux/issues/127
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---

I sent this diff in a RFC email to the email in the MAINTAINERS file
with no response. Given that this driver has not received any actual
Cavium/QLogic treatment since at least the moving of this file in
commit aa43c2158d5a ("qlogic: Move the QLogic drivers") back in 2011,
I'd be curious to see who is actually using this driver but I'll leave
that for another time.

I've added a few of the active 'drivers/net/ethernet/qlogic' Cavium
developers (hope they don't mind) to see if this makes any sense since I
don't have the hardware to test and I know they're active.

 drivers/net/ethernet/qlogic/qla3xxx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c
index b48f76182049..10b075bc5959 100644
--- a/drivers/net/ethernet/qlogic/qla3xxx.c
+++ b/drivers/net/ethernet/qlogic/qla3xxx.c
@@ -380,8 +380,6 @@ static void fm93c56a_select(struct ql3_adapter *qdev)

 	qdev->eeprom_cmd_data = AUBURN_EEPROM_CS_1;
 	ql_write_nvram_reg(qdev, spir, ISP_NVRAM_MASK | qdev->eeprom_cmd_data);
-	ql_write_nvram_reg(qdev, spir,
-			   ((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data));
 }

 /*
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH net-next v2] vxlan: support NTF_USE refresh of fdb entries
From: Stephen Hemminger @ 2018-10-12 18:49 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: davem, netdev
In-Reply-To: <1539286513-26165-1-git-send-email-roopa@cumulusnetworks.com>

On Thu, 11 Oct 2018 12:35:13 -0700
Roopa Prabhu <roopa@cumulusnetworks.com> wrote:

> From: Roopa Prabhu <roopa@cumulusnetworks.com>
> 
> This makes use of NTF_USE in vxlan driver consistent
> with bridge driver.
> 
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> v2: fix patch prefix
> 
>  drivers/net/vxlan.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

Acked-by: Stephen Hemminger <stephen@networkplumber.org>

^ permalink raw reply

* Re: [rtnetlink] Potential bug in Linux (rt)netlink code
From: Stephen Hemminger @ 2018-10-12 18:51 UTC (permalink / raw)
  To: Henning Rogge; +Cc: netdev
In-Reply-To: <4d7a11b7-1f43-5669-6f19-3c746cc88306@fkie.fraunhofer.de>

On Fri, 12 Oct 2018 09:30:40 +0200
Henning Rogge <henning.rogge@fkie.fraunhofer.de> wrote:

> Hi,
> 
> I am working on a self-written routing agent 
> (https://github.com/OLSR/OONF) and am stuck on a problem with netlink 
> that I cannot explain with an userspace error.
> 
> I am using a netlink socket for setting routes 
> (RTM_NEWROUTE/RTM_DELROUTE), querying the kernel for the current routes 
> in the database (via a RTM_GETROUTE dump) and for getting multicast 
> messages for ongoing routing changes.
> 
> After a few netlink messages I get to the point where the kernel just 
> does not responst to a RTM_NEWROUTE. No error, no answer, despite the 
> NLM_F_ACK flag set)... but sometime when (during shutdown of the routing 
> agent) the program sends another route command (most times a 
> RTM_DELROUTE) I get a single netlink packet with a "successful" response 
> for both the "missing" RTM_NEWROUTE and one for the new RTM DELROUTE 
> sequence number.
> 
> I am testing two routing agents, each of them in a systemd-nspawn based 
> container connected over a bridge on the host system on a current Debian 
> Testing (kernel 4.18.0-1-amd64).
> 
> I am directly using the netlink sockets, without any other userspace 
> library in between.
> 
> I have checked the hexdumps of a couple of netlink messages (including 
> the ones just before the bug happens) by hand and they seem to be okay.
> 
> When I tried to add a "netlink listener" socket for futher debugging (ip 
> link add nlmon0 type nlmon) the problem vanished until I removed the 
> listener socket again.
> 
> Any ideas how to debug this problem? Unfortunately I have no short 
> example program to trigger the bug... I have rarely seen the problem for 
> years (once every couple of months), but until a few days ago I never 
> managed to reproduce it.
> 
> Henning Rogge

Are you reading the responses to your requests?  If you don't read
the response, the socket will get flow blocked.

^ permalink raw reply

* [net-next PATCH] net: sched: cls_flower: Classify packets using port ranges
From: Amritha Nambiar @ 2018-10-12 13:53 UTC (permalink / raw)
  To: netdev, davem
  Cc: jakub.kicinski, amritha.nambiar, sridhar.samudrala, jhs,
	xiyou.wangcong, jiri

Added support in tc flower for filtering based on port ranges.
This is a rework of the RFC patch at:
https://patchwork.ozlabs.org/patch/969595/

Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
  action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  dst_port_min 20
  dst_port_max 30
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 181 sec used 5 sec
        Action statistics:
        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
  skip_hw action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto tcp
  dst_ip 192.168.1.1
  dst_port_min 100
  dst_port_max 200
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1 installed 28 sec used 6 sec
        Action statistics:
        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
---
 include/uapi/linux/pkt_cls.h |    5 ++
 net/sched/cls_flower.c       |  134 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 132 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 401d0c1..b569308 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -405,6 +405,11 @@ enum {
 	TCA_FLOWER_KEY_UDP_SRC,		/* be16 */
 	TCA_FLOWER_KEY_UDP_DST,		/* be16 */
 
+	TCA_FLOWER_KEY_PORT_SRC_MIN,	/* be16 */
+	TCA_FLOWER_KEY_PORT_SRC_MAX,	/* be16 */
+	TCA_FLOWER_KEY_PORT_DST_MIN,	/* be16 */
+	TCA_FLOWER_KEY_PORT_DST_MAX,	/* be16 */
+
 	TCA_FLOWER_FLAGS,
 	TCA_FLOWER_KEY_VLAN_ID,		/* be16 */
 	TCA_FLOWER_KEY_VLAN_PRIO,	/* u8   */
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 9aada2d..5f135f0 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -55,6 +55,9 @@ struct fl_flow_key {
 	struct flow_dissector_key_ip ip;
 	struct flow_dissector_key_ip enc_ip;
 	struct flow_dissector_key_enc_opts enc_opts;
+
+	struct flow_dissector_key_ports tp_min;
+	struct flow_dissector_key_ports tp_max;
 } __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */
 
 struct fl_flow_mask_range {
@@ -103,6 +106,11 @@ struct cls_fl_filter {
 	struct net_device *hw_dev;
 };
 
+enum fl_endpoint {
+	FLOWER_ENDPOINT_DST,
+	FLOWER_ENDPOINT_SRC
+};
+
 static const struct rhashtable_params mask_ht_params = {
 	.key_offset = offsetof(struct fl_flow_mask, key),
 	.key_len = sizeof(struct fl_flow_key),
@@ -179,11 +187,86 @@ static void fl_clear_masked_range(struct fl_flow_key *key,
 	memset(fl_key_get_start(key, mask), 0, fl_mask_range(mask));
 }
 
+static int fl_range_compare_params(struct cls_fl_filter *filter,
+				   struct fl_flow_key *key,
+				   struct fl_flow_key *mkey,
+				   enum fl_endpoint endpoint)
+{
+	__be16 min_mask, max_mask, min_val, max_val;
+
+	if (endpoint == FLOWER_ENDPOINT_DST) {
+		min_mask = htons(filter->mask->key.tp_min.dst);
+		max_mask = htons(filter->mask->key.tp_max.dst);
+		min_val = htons(filter->key.tp_min.dst);
+		max_val = htons(filter->key.tp_max.dst);
+
+		if (min_mask && max_mask) {
+			if (htons(key->tp.dst) < min_val ||
+			    htons(key->tp.dst) > max_val)
+				return -1;
+
+			/* skb does not have min and max values */
+			mkey->tp_min.dst = filter->mkey.tp_min.dst;
+			mkey->tp_max.dst = filter->mkey.tp_max.dst;
+		}
+	} else {
+		min_mask = htons(filter->mask->key.tp_min.src);
+		max_mask = htons(filter->mask->key.tp_max.src);
+		min_val = htons(filter->key.tp_min.src);
+		max_val = htons(filter->key.tp_max.src);
+
+		if (min_mask && max_mask) {
+			if (htons(key->tp.src) < min_val ||
+			    htons(key->tp.src) > max_val)
+				return -1;
+
+			/* skb does not have min and max values */
+			mkey->tp_min.src = filter->mkey.tp_min.src;
+			mkey->tp_max.src = filter->mkey.tp_max.src;
+		}
+	}
+	return 0;
+}
+
+static struct cls_fl_filter *fl_lookup_range(struct fl_flow_mask *mask,
+					     struct fl_flow_key *mkey,
+					     struct fl_flow_key *key)
+{
+	struct cls_fl_filter *filter, *f;
+	int ret;
+
+	list_for_each_entry_rcu(filter, &mask->filters, list) {
+		ret = fl_range_compare_params(filter, key, mkey,
+					      FLOWER_ENDPOINT_DST);
+		if (ret < 0)
+			continue;
+
+		ret = fl_range_compare_params(filter, key, mkey,
+					      FLOWER_ENDPOINT_SRC);
+		if (ret < 0)
+			continue;
+
+		f = rhashtable_lookup_fast(&mask->ht,
+					   fl_key_get_start(mkey, mask),
+					   mask->filter_ht_params);
+		if (f)
+			return f;
+	}
+	return NULL;
+}
+
 static struct cls_fl_filter *fl_lookup(struct fl_flow_mask *mask,
-				       struct fl_flow_key *mkey)
+				       struct fl_flow_key *mkey,
+				       struct fl_flow_key *key, bool is_skb)
 {
-	return rhashtable_lookup_fast(&mask->ht, fl_key_get_start(mkey, mask),
-				      mask->filter_ht_params);
+	if ((!(mask->key.tp_min.dst && mask->key.tp_max.dst) &&
+	     !(mask->key.tp_min.src && mask->key.tp_max.src)) || !is_skb) {
+		return  rhashtable_lookup_fast(&mask->ht,
+					       fl_key_get_start(mkey, mask),
+					       mask->filter_ht_params);
+	}
+	/* Classify based on range */
+	return fl_lookup_range(mask, mkey, key);
 }
 
 static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
@@ -207,8 +290,8 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 		skb_flow_dissect(skb, &mask->dissector, &skb_key, 0);
 
 		fl_set_masked_key(&skb_mkey, &skb_key, mask);
+		f = fl_lookup(mask, &skb_mkey, &skb_key, true);
 
-		f = fl_lookup(mask, &skb_mkey);
 		if (f && !tc_skip_sw(f->flags)) {
 			*res = f->res;
 			return tcf_exts_exec(skb, &f->exts, res);
@@ -909,6 +992,23 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
 			       sizeof(key->arp.tha));
 	}
 
+	if (key->basic.ip_proto == IPPROTO_TCP ||
+	    key->basic.ip_proto == IPPROTO_UDP ||
+	    key->basic.ip_proto == IPPROTO_SCTP) {
+		fl_set_key_val(tb, &key->tp_min.dst,
+			       TCA_FLOWER_KEY_PORT_DST_MIN, &mask->tp_min.dst,
+			       TCA_FLOWER_UNSPEC, sizeof(key->tp_min.dst));
+		fl_set_key_val(tb, &key->tp_max.dst,
+			       TCA_FLOWER_KEY_PORT_DST_MAX, &mask->tp_max.dst,
+			       TCA_FLOWER_UNSPEC, sizeof(key->tp_max.dst));
+		fl_set_key_val(tb, &key->tp_min.src,
+			       TCA_FLOWER_KEY_PORT_SRC_MIN, &mask->tp_min.src,
+			       TCA_FLOWER_UNSPEC, sizeof(key->tp_min.src));
+		fl_set_key_val(tb, &key->tp_max.src,
+			       TCA_FLOWER_KEY_PORT_SRC_MAX, &mask->tp_max.src,
+			       TCA_FLOWER_UNSPEC, sizeof(key->tp_max.src));
+	}
+
 	if (tb[TCA_FLOWER_KEY_ENC_IPV4_SRC] ||
 	    tb[TCA_FLOWER_KEY_ENC_IPV4_DST]) {
 		key->enc_control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
@@ -1026,8 +1126,7 @@ static void fl_init_dissector(struct flow_dissector *dissector,
 			     FLOW_DISSECTOR_KEY_IPV4_ADDRS, ipv4);
 	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_IPV6_ADDRS, ipv6);
-	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
-			     FLOW_DISSECTOR_KEY_PORTS, tp);
+	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_PORTS, tp);
 	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
 			     FLOW_DISSECTOR_KEY_IP, ip);
 	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
@@ -1227,7 +1326,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
 		goto errout_idr;
 
 	if (!tc_skip_sw(fnew->flags)) {
-		if (!fold && fl_lookup(fnew->mask, &fnew->mkey)) {
+		if (!fold && fl_lookup(fnew->mask, &fnew->mkey, NULL, false)) {
 			err = -EEXIST;
 			goto errout_mask;
 		}
@@ -1800,6 +1899,27 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net,
 				  sizeof(key->arp.tha))))
 		goto nla_put_failure;
 
+	if ((key->basic.ip_proto == IPPROTO_TCP ||
+	     key->basic.ip_proto == IPPROTO_UDP ||
+	     key->basic.ip_proto == IPPROTO_SCTP) &&
+	     (fl_dump_key_val(skb, &key->tp_min.dst,
+			      TCA_FLOWER_KEY_PORT_DST_MIN,
+			      &mask->tp_min.dst, TCA_FLOWER_UNSPEC,
+			      sizeof(key->tp_min.dst)) ||
+	      fl_dump_key_val(skb, &key->tp_max.dst,
+			      TCA_FLOWER_KEY_PORT_DST_MAX,
+			      &mask->tp_max.dst, TCA_FLOWER_UNSPEC,
+			      sizeof(key->tp_max.dst)) ||
+	      fl_dump_key_val(skb, &key->tp_min.src,
+			      TCA_FLOWER_KEY_PORT_SRC_MIN,
+			      &mask->tp_min.src, TCA_FLOWER_UNSPEC,
+			      sizeof(key->tp_min.src)) ||
+	      fl_dump_key_val(skb, &key->tp_max.src,
+			      TCA_FLOWER_KEY_PORT_SRC_MAX,
+			      &mask->tp_max.src, TCA_FLOWER_UNSPEC,
+			      sizeof(key->tp_max.src))))
+		goto nla_put_failure;
+
 	if (key->enc_control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS &&
 	    (fl_dump_key_val(skb, &key->enc_ipv4.src,
 			    TCA_FLOWER_KEY_ENC_IPV4_SRC, &mask->enc_ipv4.src,

^ permalink raw reply related

* [PATCH bpf-next 00/13] bpf: add btf func info support
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team

The BTF support was added to kernel by Commit 69b693f0aefa
("bpf: btf: Introduce BPF Type Format (BTF)"), which introduced
.BTF section into ELF file and is primarily
used for map pretty print.
pahole is used to convert dwarf to BTF for ELF files.

The next step would be add func type info and debug line info
into BTF. For debug line info, it is desirable to encode
source code directly in the BTF to ease deployment and
introspection.

The func type and debug line info are relative to byte code offset.
Also since byte codes may need to be relocated by the loader,
func info and line info are placed in a different section,
.BTF.ext, so the loader could manupilate it according to how
byte codes are placed, before loading into the kernel.

LLVM commit https://reviews.llvm.org/rL344366 (in llvm trunk)
now can generate type/func/line info.
For the below example, with a llvm compiler built with Debug mode,
the following is generated:

  -bash-4.2$ cat test.c
  int foo(int (*bar)(int)) { return bar(5); }
  -bash-4.2$ clang -target bpf -g -O2 -mllvm -debug-only=btf -c test.c
  Type Table:
  [1] FUNC name_off=1 info=0x0c000001 size/type=2
        param_type=3
  [2] INT name_off=11 info=0x01000000 size/type=4
        desc=0x01000020
  [3] PTR name_off=0 info=0x02000000 size/type=4
  [4] FUNC_PROTO name_off=0 info=0x0d000001 size/type=2
        param_type=2

  String Table:
  0 : 
  1 : foo
  5 : .text
  11 : int
  15 : test.c
  22 : int foo(int (*bar)(int)) { return bar(5); }

  FuncInfo Table:
  sec_name_off=5
        insn_offset=<Omitted> type_id=1

  LineInfo Table:
  sec_name_off=5
        insn_offset=<Omitted> file_name_off=15 line_off=22 line_num=1 column_num=0
        insn_offset=<Omitted> file_name_off=15 line_off=22 line_num=1 column_num=35
        insn_offset=<Omitted> file_name_off=15 line_off=22 line_num=1 column_num=28

In the above, type and string tables are in .BTF section, and
func and line info tables in .BTF.ext. The "<Omitted>" is the
insn offset which is not available during the dump time but
resolved during later compilation process.
Following the format specification at Patch #9 and examine the
raw data in .BTF.ext section, we have
  FuncInfo Table:
  sec_name_off=5
        insn_offset=0 type_id=1

  LineInfo Table:
  sec_name_off=5
        insn_offset=0  file_name_off=15 line_off=22 line_num=1 column_num=0
        insn_offset=8  file_name_off=15 line_off=22 line_num=1 column_num=35
        insn_offset=24 file_name_off=15 line_off=22 line_num=1 column_num=28
In the above insn_offset is the byte offset.

With this support, better ksym for bpf programs and functions can be
generated. Below is a demonstration from Patch #13.
  $ bpftool prog dump jited id 1
  int _dummy_tracepoint(struct dummy_tracepoint_args * ):
  bpf_prog_b07ccb89267cf242__dummy_tracepoint:
         0:   push   %rbp
         1:   mov    %rsp,%rbp
        ......
        3c:   add    $0x28,%rbp
        40:   leaveq
        41:   retq
    
  int test_long_fname_1(struct dummy_tracepoint_args * ):
  bpf_prog_2dcecc18072623fc_test_long_fname_1:
         0:   push   %rbp
         1:   mov    %rsp,%rbp
        ......
        3a:   add    $0x28,%rbp
        3e:   leaveq
        3f:   retq
    
  int test_long_fname_2(struct dummy_tracepoint_args * ):
  bpf_prog_89d64e4abf0f0126_test_long_fname_2:
         0:   push   %rbp
         1:   mov    %rsp,%rbp
        ......
        80:   add    $0x28,%rbp
        84:   leaveq
        85:   retq

For the patchset,
Patch #1  refactors the code to break up btf_type_is_void().
Patch #2  introduces new BTF types BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO.
Patch #3  syncs btf.h header to tools directory.
Patch #4  adds btf func/func_proto self tests in test_btf.
Patch #5  adds kernel interface to load func_info to kernel
          and pass func_info to userspace.
Patch #6  syncs bpf.h header to tools directory.
Patch #7  adds news btf/func_info related fields in libbpf
          program load function.
Patch #8  extends selftest test_btf to test load/retrieve func_type info.
Patch #9  adds .BTF.ext func info support.
Patch #10 changes Makefile to avoid using pahole if llvm is capable of
          generating BTF sections.
Patch #11 refactors to have btf_get_from_id() in libbpf for reuse.
Patch #12 enhance test_btf file testing to test func info.
Patch #13 adds bpftool support for func signature dump.

Yonghong Song (13):
  bpf: btf: Break up btf_type_is_void()
  bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
  tools/bpf: sync kernel btf.h header
  tools/bpf: add btf func/func_proto unit tests in selftest test_btf
  bpf: get better bpf_prog ksyms based on btf func type_id
  tools/bpf: sync kernel uapi bpf.h header to tools directory
  tools/bpf: add new fields for program load in lib/bpf
  tools/bpf: extends test_btf to test load/retrieve func_type info
  tools/bpf: add support to read .BTF.ext sections
  tools/bpf: do not use pahole if clang/llvm can generate BTF sections
  tools/bpf: refactor to implement btf_get_from_id() in lib/bpf
  tools/bpf: enhance test_btf file testing to test func info
  tools/bpf: bpftool: add support for jited func types

 include/linux/bpf.h                          |   2 +
 include/linux/bpf_verifier.h                 |   1 +
 include/linux/btf.h                          |   2 +
 include/uapi/linux/bpf.h                     |  11 +
 include/uapi/linux/btf.h                     |   9 +-
 kernel/bpf/btf.c                             | 322 +++++++++--
 kernel/bpf/core.c                            |   9 +
 kernel/bpf/syscall.c                         |  86 ++-
 kernel/bpf/verifier.c                        |  50 ++
 samples/bpf/Makefile                         |   8 +
 tools/bpf/bpftool/btf_dumper.c               |  96 ++++
 tools/bpf/bpftool/main.h                     |   2 +
 tools/bpf/bpftool/map.c                      |  68 +--
 tools/bpf/bpftool/prog.c                     |  54 ++
 tools/include/uapi/linux/bpf.h               |  11 +
 tools/include/uapi/linux/btf.h               |   9 +-
 tools/lib/bpf/bpf.c                          |   3 +
 tools/lib/bpf/bpf.h                          |   3 +
 tools/lib/bpf/btf.c                          | 305 ++++++++++
 tools/lib/bpf/btf.h                          |  32 ++
 tools/lib/bpf/libbpf.c                       |  53 +-
 tools/testing/selftests/bpf/Makefile         |   8 +
 tools/testing/selftests/bpf/test_btf.c       | 566 ++++++++++++++++++-
 tools/testing/selftests/bpf/test_btf_haskv.c |  16 +-
 tools/testing/selftests/bpf/test_btf_nokv.c  |  16 +-
 25 files changed, 1613 insertions(+), 129 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [PATCH bpf-next 01/13] bpf: btf: Break up btf_type_is_void()
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185424.2378502-1-yhs@fb.com>

This patch breaks up btf_type_is_void() into
btf_type_is_void() and btf_type_is_fwd().

It also adds btf_type_nosize() to better describe it is
testing a type has nosize info.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/btf.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 378cef70341c..be406d8906ce 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -306,15 +306,22 @@ static bool btf_type_is_modifier(const struct btf_type *t)
 
 static bool btf_type_is_void(const struct btf_type *t)
 {
-	/* void => no type and size info.
-	 * Hence, FWD is also treated as void.
-	 */
-	return t == &btf_void || BTF_INFO_KIND(t->info) == BTF_KIND_FWD;
+	return t == &btf_void;
+}
+
+static bool btf_type_is_fwd(const struct btf_type *t)
+{
+	return BTF_INFO_KIND(t->info) == BTF_KIND_FWD;
+}
+
+static bool btf_type_nosize(const struct btf_type *t)
+{
+	return btf_type_is_void(t) || btf_type_is_fwd(t);
 }
 
-static bool btf_type_is_void_or_null(const struct btf_type *t)
+static bool btf_type_nosize_or_null(const struct btf_type *t)
 {
-	return !t || btf_type_is_void(t);
+	return !t || btf_type_nosize(t);
 }
 
 /* union is only a special case of struct:
@@ -826,7 +833,7 @@ const struct btf_type *btf_type_id_size(const struct btf *btf,
 	u32 size = 0;
 
 	size_type = btf_type_by_id(btf, size_type_id);
-	if (btf_type_is_void_or_null(size_type))
+	if (btf_type_nosize_or_null(size_type))
 		return NULL;
 
 	if (btf_type_has_size(size_type)) {
@@ -842,7 +849,7 @@ const struct btf_type *btf_type_id_size(const struct btf *btf,
 		size = btf->resolved_sizes[size_type_id];
 		size_type_id = btf->resolved_ids[size_type_id];
 		size_type = btf_type_by_id(btf, size_type_id);
-		if (btf_type_is_void(size_type))
+		if (btf_type_nosize_or_null(size_type))
 			return NULL;
 	}
 
@@ -1164,7 +1171,7 @@ static int btf_modifier_resolve(struct btf_verifier_env *env,
 	}
 
 	/* "typedef void new_void", "const void"...etc */
-	if (btf_type_is_void(next_type))
+	if (btf_type_is_void(next_type) || btf_type_is_fwd(next_type))
 		goto resolved;
 
 	if (!env_type_is_resolve_sink(env, next_type) &&
@@ -1178,7 +1185,7 @@ static int btf_modifier_resolve(struct btf_verifier_env *env,
 	 * pretty print).
 	 */
 	if (!btf_type_id_size(btf, &next_type_id, &next_type_size) &&
-	    !btf_type_is_void(btf_type_id_resolve(btf, &next_type_id))) {
+	    !btf_type_nosize(btf_type_id_resolve(btf, &next_type_id))) {
 		btf_verifier_log_type(env, v->t, "Invalid type_id");
 		return -EINVAL;
 	}
@@ -1205,7 +1212,7 @@ static int btf_ptr_resolve(struct btf_verifier_env *env,
 	}
 
 	/* "void *" */
-	if (btf_type_is_void(next_type))
+	if (btf_type_is_void(next_type) || btf_type_is_fwd(next_type))
 		goto resolved;
 
 	if (!env_type_is_resolve_sink(env, next_type) &&
@@ -1235,7 +1242,7 @@ static int btf_ptr_resolve(struct btf_verifier_env *env,
 	}
 
 	if (!btf_type_id_size(btf, &next_type_id, &next_type_size) &&
-	    !btf_type_is_void(btf_type_id_resolve(btf, &next_type_id))) {
+	    !btf_type_nosize(btf_type_id_resolve(btf, &next_type_id))) {
 		btf_verifier_log_type(env, v->t, "Invalid type_id");
 		return -EINVAL;
 	}
@@ -1396,7 +1403,7 @@ static int btf_array_resolve(struct btf_verifier_env *env,
 	/* Check array->index_type */
 	index_type_id = array->index_type;
 	index_type = btf_type_by_id(btf, index_type_id);
-	if (btf_type_is_void_or_null(index_type)) {
+	if (btf_type_nosize_or_null(index_type)) {
 		btf_verifier_log_type(env, v->t, "Invalid index");
 		return -EINVAL;
 	}
@@ -1415,7 +1422,7 @@ static int btf_array_resolve(struct btf_verifier_env *env,
 	/* Check array->type */
 	elem_type_id = array->type;
 	elem_type = btf_type_by_id(btf, elem_type_id);
-	if (btf_type_is_void_or_null(elem_type)) {
+	if (btf_type_nosize_or_null(elem_type)) {
 		btf_verifier_log_type(env, v->t,
 				      "Invalid elem");
 		return -EINVAL;
@@ -1615,7 +1622,7 @@ static int btf_struct_resolve(struct btf_verifier_env *env,
 		const struct btf_type *member_type = btf_type_by_id(env->btf,
 								member_type_id);
 
-		if (btf_type_is_void_or_null(member_type)) {
+		if (btf_type_nosize_or_null(member_type)) {
 			btf_verifier_log_member(env, v->t, member,
 						"Invalid member");
 			return -EINVAL;
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 03/13] tools/bpf: sync kernel btf.h header
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185424.2378502-1-yhs@fb.com>

The kernel uapi btf.h is synced to the tools directory.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/include/uapi/linux/btf.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
index 972265f32871..63f8500e6f34 100644
--- a/tools/include/uapi/linux/btf.h
+++ b/tools/include/uapi/linux/btf.h
@@ -40,7 +40,8 @@ struct btf_type {
 	/* "size" is used by INT, ENUM, STRUCT and UNION.
 	 * "size" tells the size of the type it is describing.
 	 *
-	 * "type" is used by PTR, TYPEDEF, VOLATILE, CONST and RESTRICT.
+	 * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
+	 * FUNC and FUNC_PROTO.
 	 * "type" is a type_id referring to another type.
 	 */
 	union {
@@ -64,8 +65,10 @@ struct btf_type {
 #define BTF_KIND_VOLATILE	9	/* Volatile	*/
 #define BTF_KIND_CONST		10	/* Const	*/
 #define BTF_KIND_RESTRICT	11	/* Restrict	*/
-#define BTF_KIND_MAX		11
-#define NR_BTF_KINDS		12
+#define BTF_KIND_FUNC		12	/* Function	*/
+#define BTF_KIND_FUNC_PROTO	13	/* Function Prototype	*/
+#define BTF_KIND_MAX		13
+#define NR_BTF_KINDS		14
 
 /* For some specific BTF_KIND, "struct btf_type" is immediately
  * followed by extra data.
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 08/13] tools/bpf: extends test_btf to test load/retrieve func_type info
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185446.2379289-1-yhs@fb.com>

A two function bpf program is loaded with btf and func_info.
After successful prog load, the bpf_get_info syscall is called
to retrieve prog info to ensure the types returned from the
kernel matches the types passed to the kernel from the
user space.

Several negative tests are also added to test loading/retriving
of func_type info.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/test_btf.c | 278 ++++++++++++++++++++++++-
 1 file changed, 275 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index b6461c3c5e11..e03a8cea4bb7 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -5,6 +5,7 @@
 #include <linux/btf.h>
 #include <linux/err.h>
 #include <linux/kernel.h>
+#include <linux/filter.h>
 #include <bpf/bpf.h>
 #include <sys/resource.h>
 #include <libelf.h>
@@ -22,9 +23,13 @@
 #include "bpf_rlimit.h"
 #include "bpf_util.h"
 
+#define MAX_INSNS	512
+#define MAX_SUBPROGS	16
+
 static uint32_t pass_cnt;
 static uint32_t error_cnt;
 static uint32_t skip_cnt;
+static bool jit_enabled;
 
 #define CHECK(condition, format...) ({					\
 	int __ret = !!(condition);					\
@@ -60,6 +65,24 @@ static int __base_pr(const char *format, ...)
 	return err;
 }
 
+static bool is_jit_enabled(void)
+{
+	const char *jit_sysctl = "/proc/sys/net/core/bpf_jit_enable";
+	bool enabled = false;
+	int sysctl_fd;
+
+	sysctl_fd = open(jit_sysctl, 0, O_RDONLY);
+	if (sysctl_fd != -1) {
+		char tmpc;
+
+		if (read(sysctl_fd, &tmpc, sizeof(tmpc)) == 1)
+			enabled = (tmpc != '0');
+		close(sysctl_fd);
+	}
+
+	return enabled;
+}
+
 #define BTF_INFO_ENC(kind, root, vlen)			\
 	((!!(root) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
 
@@ -103,6 +126,7 @@ static struct args {
 	bool get_info_test;
 	bool pprint_test;
 	bool always_log;
+	bool func_type_test;
 } args;
 
 static char btf_log_buf[BTF_LOG_BUF_SIZE];
@@ -2693,16 +2717,256 @@ static int test_pprint(void)
 	return err;
 }
 
+static struct btf_func_type_test {
+	const char *descr;
+	const char *str_sec;
+	__u32 raw_types[MAX_NR_RAW_TYPES];
+	__u32 str_sec_size;
+	struct bpf_insn insns[MAX_INSNS];
+	__u32 prog_type;
+	struct bpf_func_info func_info[MAX_SUBPROGS];
+	__u32 func_info_len;
+	bool expected_prog_load_failure;
+} func_type_test[] = {
+
+{
+	.descr = "func_type test #1",
+	.str_sec = "\0int\0unsigned int\0funcA\0funcB",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(NAME_TBD, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */
+		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 32, 4),               /* [2] */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [3] */
+		1, 2,
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [4] */
+		2, 1,
+		BTF_END_RAW,
+	},
+	.str_sec_size = sizeof("\0int\0unsigned int\0funcA\0funcB"),
+	.insns = {
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.func_info = { {0, 3}, {3, 4} },
+	.func_info_len = 2 * sizeof(struct bpf_func_info),
+},
+
+{
+	.descr = "func_type test #2",
+	.str_sec = "\0int\0unsigned int\0funcA\0funcB",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(NAME_TBD, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */
+		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 32, 4),               /* [2] */
+		/* incorrect func type */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 2), 1),  /* [3] */
+		1, 2,
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [4] */
+		2, 1,
+		BTF_END_RAW,
+	},
+	.str_sec_size = sizeof("\0int\0unsigned int\0funcA\0funcB"),
+	.insns = {
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.func_info = { {0, 3}, {3, 4} },
+	.func_info_len = 2 * sizeof(struct bpf_func_info),
+	.expected_prog_load_failure = true,
+},
+
+{
+	.descr = "func_type test #3",
+	.str_sec = "\0int\0unsigned int\0funcA\0funcB",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(NAME_TBD, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */
+		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 32, 4),               /* [2] */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [3] */
+		1, 2,
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [4] */
+		2, 1,
+		BTF_END_RAW,
+	},
+	.str_sec_size = sizeof("\0int\0unsigned int\0funcA\0funcB"),
+	.insns = {
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.func_info = { {0, 3}, {3, 4} },
+	/* incorrect func_info_len */
+	.func_info_len = sizeof(struct bpf_func_info),
+	.expected_prog_load_failure = true,
+},
+
+{
+	.descr = "func_type test #4",
+	.str_sec = "\0int\0unsigned int\0funcA\0funcB",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(NAME_TBD, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */
+		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 32, 4),               /* [2] */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [3] */
+		1, 2,
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),  /* [4] */
+		2, 1,
+		BTF_END_RAW,
+	},
+	.str_sec_size = sizeof("\0int\0unsigned int\0funcA\0funcB"),
+	.insns = {
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	/* incorrect func_info insn_offset */
+	.func_info = { {0, 3}, {2, 4} },
+	.func_info_len = 2 * sizeof(struct bpf_func_info),
+	.expected_prog_load_failure = true,
+},
+
+};
+
+static size_t probe_prog_length(const struct bpf_insn *fp)
+{
+	size_t len;
+
+	for (len = MAX_INSNS - 1; len > 0; --len)
+		if (fp[len].code != 0 || fp[len].imm != 0)
+			break;
+	return len + 1;
+}
+
+static int do_test_func_type(int test_num)
+{
+	const struct btf_func_type_test *test = &func_type_test[test_num];
+	__u32 func_lens[4], func_types[4], info_len;
+	struct bpf_load_program_attr attr = {};
+	unsigned int raw_btf_size;
+	struct bpf_prog_info info = {};
+	int btf_fd, prog_fd, err = 0;
+	void *raw_btf;
+
+	fprintf(stderr, "%s......", test->descr);
+	raw_btf = btf_raw_create(&hdr_tmpl, test->raw_types,
+				 test->str_sec, test->str_sec_size,
+				 &raw_btf_size);
+
+	if (!raw_btf)
+		return -1;
+
+	*btf_log_buf = '\0';
+	btf_fd = bpf_load_btf(raw_btf, raw_btf_size,
+			      btf_log_buf, BTF_LOG_BUF_SIZE,
+			      args.always_log);
+	free(raw_btf);
+
+	if (CHECK(btf_fd == -1, "invalid btf_fd errno:%d", errno)) {
+		fprintf(stderr, "%s\n", btf_log_buf);
+		err = -1;
+		goto done;
+	}
+
+	attr.prog_type = test->prog_type;
+	attr.insns = test->insns;
+	attr.insns_cnt = probe_prog_length(attr.insns);
+	attr.license = "GPL";
+	attr.prog_btf_fd = btf_fd;
+	attr.func_info_len = test->func_info_len;
+	attr.func_info = test->func_info;
+
+	*btf_log_buf = '\0';
+	prog_fd = bpf_load_program_xattr(&attr, btf_log_buf,
+					 BTF_LOG_BUF_SIZE);
+	if (test->expected_prog_load_failure && prog_fd == -1) {
+		err = 0;
+		goto done;
+	}
+	if (CHECK(prog_fd == -1, "invalid prog_id errno:%d", errno)) {
+		fprintf(stderr, "%s\n", btf_log_buf);
+		err = -1;
+		goto done;
+	}
+	if (!jit_enabled) {
+		skip_cnt++;
+		fprintf(stderr, "SKIPPED, please enable sysctl bpf_jit_enable\n");
+		err = 0;
+		goto done;
+	}
+
+	info_len = sizeof(struct bpf_prog_info);
+	info.nr_jited_func_types = 4;
+	info.nr_jited_func_lens = 4;
+	info.jited_func_types = ptr_to_u64(&func_types[0]);
+	info.jited_func_lens = ptr_to_u64(&func_lens[0]);
+	err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
+	if (CHECK(err == -1, "invalid get info errno:%d", errno)) {
+		fprintf(stderr, "%s\n", btf_log_buf);
+		err = -1;
+		goto done;
+	}
+	if (CHECK(info.nr_jited_func_lens != 2,
+		  "incorrect info.nr_jited_func_lens %d\n",
+		  info.nr_jited_func_lens)) {
+		err = -1;
+		goto done;
+	}
+	if (CHECK(info.nr_jited_func_types != 2,
+		  "incorrect info.nr_jited_func_types %d\n",
+		  info.nr_jited_func_types)) {
+		err = -1;
+		goto done;
+	}
+	if (CHECK(info.btf_id == 0, "incorrect btf_id = 0")) {
+		err = -1;
+		goto done;
+	}
+	if (CHECK(func_types[0] != test->func_info[0].type_id ||
+		  func_types[1] != test->func_info[1].type_id,
+		  "incorrect func_types (%u, %u) expected (%u %d)",
+		  func_types[0], func_types[1],
+		  test->func_info[0].type_id, test->func_info[1].type_id)) {
+		err = -1;
+		goto done;
+	}
+
+done:
+	return err;
+}
+
+static int test_func_type(void)
+{
+	unsigned int i;
+	int err = 0;
+
+	for (i = 0; i < ARRAY_SIZE(func_type_test); i++)
+		err |= count_result(do_test_func_type(i));
+
+	return err;
+}
+
 static void usage(const char *cmd)
 {
-	fprintf(stderr, "Usage: %s [-l] [[-r test_num (1 - %zu)] | [-g test_num (1 - %zu)] | [-f test_num (1 - %zu)] | [-p]]\n",
+	fprintf(stderr, "Usage: %s [-l] [[-r test_num (1 - %zu)] |"
+			" [-g test_num (1 - %zu)] |"
+			" [-f test_num (1 - %zu)] | [-p] | [-k] ]\n",
 		cmd, ARRAY_SIZE(raw_tests), ARRAY_SIZE(get_info_tests),
 		ARRAY_SIZE(file_tests));
 }
 
 static int parse_args(int argc, char **argv)
 {
-	const char *optstr = "lpf:r:g:";
+	const char *optstr = "lpkf:r:g:";
 	int opt;
 
 	while ((opt = getopt(argc, argv, optstr)) != -1) {
@@ -2725,6 +2989,9 @@ static int parse_args(int argc, char **argv)
 		case 'p':
 			args.pprint_test = true;
 			break;
+		case 'k':
+			args.func_type_test = true;
+			break;
 		case 'h':
 			usage(argv[0]);
 			exit(0);
@@ -2778,6 +3045,8 @@ int main(int argc, char **argv)
 	if (args.always_log)
 		libbpf_set_print(__base_pr, __base_pr, __base_pr);
 
+	jit_enabled = is_jit_enabled();
+
 	if (args.raw_test)
 		err |= test_raw();
 
@@ -2790,8 +3059,11 @@ int main(int argc, char **argv)
 	if (args.pprint_test)
 		err |= test_pprint();
 
+	if (args.func_type_test)
+		err |= test_func_type();
+
 	if (args.raw_test || args.get_info_test || args.file_test ||
-	    args.pprint_test)
+	    args.pprint_test || args.func_type_test)
 		goto done;
 
 	err |= test_raw();
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 02/13] bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185424.2378502-1-yhs@fb.com>

This patch adds BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
support to the type section. BTF_KIND_FUNC_PROTO is used
to specify the type of a function pointer. With this,
BTF has a complete set of C types (except float).

BTF_KIND_FUNC is used to specify the signature of a
defined subprogram. BTF_KIND_FUNC_PROTO can be referenced
by another type, e.g., a pointer type, and BTF_KIND_FUNC
type cannot be referenced by another type.

For both BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO types,
the func return type is in t->type (where t is a
"struct btf_type" object). The func args are an array of
u32s immediately following object "t".

As a concrete example, for the C program below,
  $ cat test.c
  int foo(int (*bar)(int)) { return bar(5); }
with latest llvm trunk built with Debug mode, we have
  $ clang -target bpf -g -O2 -mllvm -debug-only=btf -c test.c
  Type Table:
  [1] FUNC name_off=1 info=0x0c000001 size/type=2
          param_type=3
  [2] INT name_off=11 info=0x01000000 size/type=4
          desc=0x01000020
  [3] PTR name_off=0 info=0x02000000 size/type=4
  [4] FUNC_PROTO name_off=0 info=0x0d000001 size/type=2
          param_type=2

  String Table:
  0 :
  1 : foo
  5 : .text
  11 : int
  15 : test.c
  22 : int foo(int (*bar)(int)) { return bar(5); }

  FuncInfo Table:
  sec_name_off=5
          insn_offset=<Omitted> type_id=1

  ...

(Eventually we shall have bpftool to dump btf information
 like the above.)

Function "foo" has a FUNC type (type_id = 1).
The parameter of "foo" has type_id 3 which is PTR->FUNC_PROTO,
where FUNC_PROTO refers to function pointer "bar".

In FuncInfo Table, for section .text, the function,
with to-be-determined offset (marked as <Omitted>),
has type_id=1 which refers to a FUNC type.
This way, the function signature is
available to both kernel and user space.
Here, the insn offset is not available during the dump time
as relocation is resolved pretty late in the compilation process.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/uapi/linux/btf.h |   9 +-
 kernel/bpf/btf.c         | 277 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 250 insertions(+), 36 deletions(-)

diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index 972265f32871..63f8500e6f34 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -40,7 +40,8 @@ struct btf_type {
 	/* "size" is used by INT, ENUM, STRUCT and UNION.
 	 * "size" tells the size of the type it is describing.
 	 *
-	 * "type" is used by PTR, TYPEDEF, VOLATILE, CONST and RESTRICT.
+	 * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
+	 * FUNC and FUNC_PROTO.
 	 * "type" is a type_id referring to another type.
 	 */
 	union {
@@ -64,8 +65,10 @@ struct btf_type {
 #define BTF_KIND_VOLATILE	9	/* Volatile	*/
 #define BTF_KIND_CONST		10	/* Const	*/
 #define BTF_KIND_RESTRICT	11	/* Restrict	*/
-#define BTF_KIND_MAX		11
-#define NR_BTF_KINDS		12
+#define BTF_KIND_FUNC		12	/* Function	*/
+#define BTF_KIND_FUNC_PROTO	13	/* Function Prototype	*/
+#define BTF_KIND_MAX		13
+#define NR_BTF_KINDS		14
 
 /* For some specific BTF_KIND, "struct btf_type" is immediately
  * followed by extra data.
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index be406d8906ce..794a185f11bf 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5,6 +5,7 @@
 #include <uapi/linux/types.h>
 #include <linux/seq_file.h>
 #include <linux/compiler.h>
+#include <linux/ctype.h>
 #include <linux/errno.h>
 #include <linux/slab.h>
 #include <linux/anon_inodes.h>
@@ -259,6 +260,8 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = {
 	[BTF_KIND_VOLATILE]	= "VOLATILE",
 	[BTF_KIND_CONST]	= "CONST",
 	[BTF_KIND_RESTRICT]	= "RESTRICT",
+	[BTF_KIND_FUNC]		= "FUNC",
+	[BTF_KIND_FUNC_PROTO]	= "FUNC_PROTO",
 };
 
 struct btf_kind_operations {
@@ -281,6 +284,9 @@ struct btf_kind_operations {
 static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS];
 static struct btf_type btf_void;
 
+static int btf_resolve(struct btf_verifier_env *env,
+		       const struct btf_type *t, u32 type_id);
+
 static bool btf_type_is_modifier(const struct btf_type *t)
 {
 	/* Some of them is not strictly a C modifier
@@ -314,9 +320,20 @@ static bool btf_type_is_fwd(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_FWD;
 }
 
+static bool btf_type_is_func(const struct btf_type *t)
+{
+	return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC;
+}
+
+static bool btf_type_is_func_proto(const struct btf_type *t)
+{
+	return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
+}
+
 static bool btf_type_nosize(const struct btf_type *t)
 {
-	return btf_type_is_void(t) || btf_type_is_fwd(t);
+	return btf_type_is_void(t) || btf_type_is_fwd(t) ||
+	       btf_type_is_func(t) || btf_type_is_func_proto(t);
 }
 
 static bool btf_type_nosize_or_null(const struct btf_type *t)
@@ -433,6 +450,24 @@ static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
 		offset < btf->hdr.str_len;
 }
 
+static bool btf_name_valid_identifier(const struct btf *btf, u32 offset)
+{
+	/* offset must be valid */
+	const char *src = &btf->strings[offset];
+
+	if (!isalpha(*src) && *src != '_')
+		return false;
+
+	src++;
+	while (*src) {
+		if (!isalnum(*src) && *src != '_')
+			return false;
+		src++;
+	}
+
+	return true;
+}
+
 static const char *btf_name_by_offset(const struct btf *btf, u32 offset)
 {
 	if (!offset)
@@ -747,7 +782,9 @@ static bool env_type_is_resolve_sink(const struct btf_verifier_env *env,
 		/* int, enum or void is a sink */
 		return !btf_type_needs_resolve(next_type);
 	case RESOLVE_PTR:
-		/* int, enum, void, struct or array is a sink for ptr */
+		/* int, enum, void, struct, array or func_ptoto is a sink
+		 * for ptr
+		 */
 		return !btf_type_is_modifier(next_type) &&
 			!btf_type_is_ptr(next_type);
 	case RESOLVE_STRUCT_OR_ARRAY:
@@ -1170,6 +1207,14 @@ static int btf_modifier_resolve(struct btf_verifier_env *env,
 		return -EINVAL;
 	}
 
+	if (btf_type_is_func_proto(next_type)) {
+		if (env->resolve_mode == RESOLVE_PTR)
+			goto resolved;
+
+		btf_verifier_log_type(env, v->t, "Invalid type_id");
+		return -EINVAL;
+	}
+
 	/* "typedef void new_void", "const void"...etc */
 	if (btf_type_is_void(next_type) || btf_type_is_fwd(next_type))
 		goto resolved;
@@ -1185,7 +1230,8 @@ static int btf_modifier_resolve(struct btf_verifier_env *env,
 	 * pretty print).
 	 */
 	if (!btf_type_id_size(btf, &next_type_id, &next_type_size) &&
-	    !btf_type_nosize(btf_type_id_resolve(btf, &next_type_id))) {
+	    (!btf_type_nosize(btf_type_id_resolve(btf, &next_type_id)) ||
+	     btf_type_is_func(next_type))) {
 		btf_verifier_log_type(env, v->t, "Invalid type_id");
 		return -EINVAL;
 	}
@@ -1212,7 +1258,8 @@ static int btf_ptr_resolve(struct btf_verifier_env *env,
 	}
 
 	/* "void *" */
-	if (btf_type_is_void(next_type) || btf_type_is_fwd(next_type))
+	if (btf_type_is_void(next_type) || btf_type_is_fwd(next_type) ||
+	    btf_type_is_func_proto(next_type))
 		goto resolved;
 
 	if (!env_type_is_resolve_sink(env, next_type) &&
@@ -1242,7 +1289,8 @@ static int btf_ptr_resolve(struct btf_verifier_env *env,
 	}
 
 	if (!btf_type_id_size(btf, &next_type_id, &next_type_size) &&
-	    !btf_type_nosize(btf_type_id_resolve(btf, &next_type_id))) {
+	    (!btf_type_nosize(btf_type_id_resolve(btf, &next_type_id)) ||
+	     btf_type_is_func(next_type))) {
 		btf_verifier_log_type(env, v->t, "Invalid type_id");
 		return -EINVAL;
 	}
@@ -1550,6 +1598,14 @@ static s32 btf_struct_check_meta(struct btf_verifier_env *env,
 			return -EINVAL;
 		}
 
+		if (member->name_off &&
+		    !btf_name_valid_identifier(btf, member->name_off)) {
+			btf_verifier_log_member(env, t, member,
+						"Invalid member name:%s",
+						btf_name_by_offset(btf, member->name_off));
+			return -EINVAL;
+		}
+
 		/* A member cannot be in type void */
 		if (!member->type || !BTF_TYPE_ID_VALID(member->type)) {
 			btf_verifier_log_member(env, t, member,
@@ -1787,6 +1843,142 @@ static struct btf_kind_operations enum_ops = {
 	.seq_show = btf_enum_seq_show,
 };
 
+static s32 btf_func_check_meta(struct btf_verifier_env *env,
+			       const struct btf_type *t,
+			       u32 meta_left)
+{
+	u32 meta_needed = btf_type_vlen(t) * sizeof(u32);
+
+	if (meta_left < meta_needed) {
+		btf_verifier_log_basic(env, t,
+				       "meta_left:%u meta_needed:%u",
+				       meta_left, meta_needed);
+		return -EINVAL;
+	}
+
+	btf_verifier_log_type(env, t, NULL);
+
+	return meta_needed;
+}
+
+static void btf_func_log(struct btf_verifier_env *env,
+			 const struct btf_type *t)
+{
+	u16 nr_args = btf_type_vlen(t), i;
+	u32 *args = (u32 *)(t + 1);
+
+	btf_verifier_log(env, "return=%u args=(", t->type);
+	if (!nr_args) {
+		btf_verifier_log(env, "void");
+		goto done;
+	}
+
+	if (nr_args == 1 && !args[0]) {
+		/* Only one vararg */
+		btf_verifier_log(env, "vararg");
+		goto done;
+	}
+
+	btf_verifier_log(env, "%u", args[0]);
+	for (i = 1; i < nr_args - 1; i++)
+		btf_verifier_log(env, ", %u", args[i]);
+	if (nr_args > 1) {
+		u32 last_arg = args[nr_args - 1];
+
+		if (last_arg)
+			btf_verifier_log(env, ", %u", last_arg);
+		else
+			btf_verifier_log(env, ", vararg");
+	}
+
+done:
+	btf_verifier_log(env, ")");
+}
+
+static int btf_func_check(struct btf_verifier_env *env,
+			  const struct btf_type *t,
+			  u32 type_id)
+{
+	u16 nr_args = btf_type_vlen(t), i;
+	const struct btf *btf = env->btf;
+	const struct btf_type *ret_type;
+	u32 *args = (u32 *)(t + 1);
+	u32 ret_type_id;
+	int err;
+
+	/* Check func return type which could be "void" (t->type == 0) */
+	ret_type_id = t->type;
+	if (ret_type_id) {
+		/* return type is not "void" */
+		ret_type = btf_type_by_id(btf, ret_type_id);
+		if (btf_type_needs_resolve(ret_type) &&
+		    !env_type_is_resolved(env, ret_type_id)) {
+			err = btf_resolve(env, ret_type, ret_type_id);
+			if (err)
+				return err;
+		}
+
+		/* Ensure the return type is a type that has a size */
+		if (!btf_type_id_size(btf, &ret_type_id, NULL)) {
+			btf_verifier_log_type(env, t, "Invalid return type");
+			return -EINVAL;
+		}
+	}
+
+	if (!nr_args)
+		return 0;
+
+	/* Last func arg type_id could be 0 if it is a vararg */
+	if (!args[nr_args - 1])
+		nr_args--;
+
+	err = 0;
+	for (i = 0; i < nr_args; i++) {
+		const struct btf_type *arg_type;
+		u32 arg_type_id;
+
+		arg_type_id = args[i];
+		arg_type = btf_type_by_id(btf, arg_type_id);
+		if (!arg_type) {
+			btf_verifier_log_type(env, t, "Invalid arg#%u", i + 1);
+			err = -EINVAL;
+			break;
+		}
+
+		if (btf_type_needs_resolve(arg_type) &&
+		    !env_type_is_resolved(env, arg_type_id)) {
+			err = btf_resolve(env, arg_type, arg_type_id);
+			if (err)
+				break;
+		}
+
+		if (!btf_type_id_size(btf, &arg_type_id, NULL)) {
+			btf_verifier_log_type(env, t, "Invalid arg#%u", i + 1);
+			err = -EINVAL;
+			break;
+		}
+	}
+
+	return err;
+}
+
+static struct btf_kind_operations func_ops = {
+	.check_meta = btf_func_check_meta,
+	.resolve = btf_df_resolve,
+	/*
+	 * BPF_KIND_FUNC_PROTO cannot be directly referred by
+	 * a struct's member.
+	 *
+	 * It should be a funciton pointer instead.
+	 * (i.e. struct's member -> BTF_KIND_PTR -> BTF_KIND_FUNC_PROTO)
+	 *
+	 * Hence, there is no btf_func_check_member().
+	 */
+	.check_member = btf_df_check_member,
+	.log_details = btf_func_log,
+	.seq_show = btf_df_seq_show,
+};
+
 static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS] = {
 	[BTF_KIND_INT] = &int_ops,
 	[BTF_KIND_PTR] = &ptr_ops,
@@ -1799,6 +1991,8 @@ static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS] = {
 	[BTF_KIND_VOLATILE] = &modifier_ops,
 	[BTF_KIND_CONST] = &modifier_ops,
 	[BTF_KIND_RESTRICT] = &modifier_ops,
+	[BTF_KIND_FUNC] = &func_ops,
+	[BTF_KIND_FUNC_PROTO] = &func_ops,
 };
 
 static s32 btf_check_meta(struct btf_verifier_env *env,
@@ -1870,30 +2064,6 @@ static int btf_check_all_metas(struct btf_verifier_env *env)
 	return 0;
 }
 
-static int btf_resolve(struct btf_verifier_env *env,
-		       const struct btf_type *t, u32 type_id)
-{
-	const struct resolve_vertex *v;
-	int err = 0;
-
-	env->resolve_mode = RESOLVE_TBD;
-	env_stack_push(env, t, type_id);
-	while (!err && (v = env_stack_peak(env))) {
-		env->log_type_id = v->type_id;
-		err = btf_type_ops(v->t)->resolve(env, v);
-	}
-
-	env->log_type_id = type_id;
-	if (err == -E2BIG)
-		btf_verifier_log_type(env, t,
-				      "Exceeded max resolving depth:%u",
-				      MAX_RESOLVE_DEPTH);
-	else if (err == -EEXIST)
-		btf_verifier_log_type(env, t, "Loop detected");
-
-	return err;
-}
-
 static bool btf_resolve_valid(struct btf_verifier_env *env,
 			      const struct btf_type *t,
 			      u32 type_id)
@@ -1927,6 +2097,39 @@ static bool btf_resolve_valid(struct btf_verifier_env *env,
 	return false;
 }
 
+static int btf_resolve(struct btf_verifier_env *env,
+		       const struct btf_type *t, u32 type_id)
+{
+	u32 save_log_type_id = env->log_type_id;
+	const struct resolve_vertex *v;
+	int err = 0;
+
+	env->resolve_mode = RESOLVE_TBD;
+	env_stack_push(env, t, type_id);
+	while (!err && (v = env_stack_peak(env))) {
+		env->log_type_id = v->type_id;
+		err = btf_type_ops(v->t)->resolve(env, v);
+	}
+
+	env->log_type_id = type_id;
+	if (err == -E2BIG) {
+		btf_verifier_log_type(env, t,
+				      "Exceeded max resolving depth:%u",
+				      MAX_RESOLVE_DEPTH);
+	} else if (err == -EEXIST) {
+		btf_verifier_log_type(env, t, "Loop detected");
+	}
+
+	/* Final sanity check */
+	if (!err && !btf_resolve_valid(env, t, type_id)) {
+		btf_verifier_log_type(env, t, "Invalid resolve state");
+		err = -EINVAL;
+	}
+
+	env->log_type_id = save_log_type_id;
+	return err;
+}
+
 static int btf_check_all_types(struct btf_verifier_env *env)
 {
 	struct btf *btf = env->btf;
@@ -1949,10 +2152,18 @@ static int btf_check_all_types(struct btf_verifier_env *env)
 				return err;
 		}
 
-		if (btf_type_needs_resolve(t) &&
-		    !btf_resolve_valid(env, t, type_id)) {
-			btf_verifier_log_type(env, t, "Invalid resolve state");
-			return -EINVAL;
+		if (btf_type_is_func(t) || btf_type_is_func_proto(t)) {
+			if (btf_type_is_func(t) &&
+			    (!btf_name_offset_valid(btf, t->name_off) ||
+			     !btf_name_valid_identifier(btf, t->name_off))) {
+				btf_verifier_log_type(env, t,
+						      "Invalid type_id");
+				return -EINVAL;
+			}
+
+			err = btf_func_check(env, t, type_id);
+			if (err)
+				return err;
 		}
 	}
 
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 06/13] tools/bpf: sync kernel uapi bpf.h header to tools directory
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185446.2379289-1-yhs@fb.com>

The kernel uapi bpf.h is synced to tools directory.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/include/uapi/linux/bpf.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f9187b41dff6..7ebbf4f06a65 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -332,6 +332,9 @@ union bpf_attr {
 		 * (context accesses, allowed helpers, etc).
 		 */
 		__u32		expected_attach_type;
+		__u32		prog_btf_fd;	/* fd pointing to BTF type data */
+		__u32		func_info_len;	/* func_info length */
+		__aligned_u64	func_info;	/* func type info */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -2585,6 +2588,9 @@ struct bpf_prog_info {
 	__u32 nr_jited_func_lens;
 	__aligned_u64 jited_ksyms;
 	__aligned_u64 jited_func_lens;
+	__u32 btf_id;
+	__u32 nr_jited_func_types;
+	__aligned_u64 jited_func_types;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
@@ -2896,4 +2902,9 @@ struct bpf_flow_keys {
 	};
 };
 
+struct bpf_func_info {
+	__u32	insn_offset;
+	__u32	type_id;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 05/13] bpf: get better bpf_prog ksyms based on btf func type_id
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team

This patch added interface to load a program with the following
additional information:
   . prog_btf_fd
   . func_info and func_info_len
where func_info will provides function range and type_id
corresponding to each function.

If verifier agrees with function range provided by the user,
the bpf_prog ksym for each function will use the func name
provided in the type_id, which is supposed to provide better
encoding as it is not limited by 16 bytes program name
limitation and this is better for bpf program which contains
multiple subprograms.

The bpf_prog_info interface is also extended to
return btf_id and jited_func_types, so user spaces can
print out the function prototype for each jited function.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/bpf.h          |  2 +
 include/linux/bpf_verifier.h |  1 +
 include/linux/btf.h          |  2 +
 include/uapi/linux/bpf.h     | 11 +++++
 kernel/bpf/btf.c             | 16 +++++++
 kernel/bpf/core.c            |  9 ++++
 kernel/bpf/syscall.c         | 86 +++++++++++++++++++++++++++++++++++-
 kernel/bpf/verifier.c        | 50 +++++++++++++++++++++
 8 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9b558713447f..e9c63ffa01af 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -308,6 +308,8 @@ struct bpf_prog_aux {
 	void *security;
 #endif
 	struct bpf_prog_offload *offload;
+	struct btf *btf;
+	u32 type_id; /* type id for this prog/func */
 	union {
 		struct work_struct work;
 		struct rcu_head	rcu;
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 9e8056ec20fa..e84782ec50ac 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -201,6 +201,7 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
 struct bpf_subprog_info {
 	u32 start; /* insn idx of function entry point */
 	u16 stack_depth; /* max. stack depth used by this function */
+	u32 type_id; /* btf type_id for this subprog */
 };
 
 /* single container for all structs
diff --git a/include/linux/btf.h b/include/linux/btf.h
index e076c4697049..90e91b52aa90 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -46,5 +46,7 @@ void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
 		       struct seq_file *m);
 int btf_get_fd_by_id(u32 id);
 u32 btf_id(const struct btf *btf);
+bool is_btf_func_type(const struct btf *btf, u32 type_id);
+const char *btf_get_name_by_id(const struct btf *btf, u32 type_id);
 
 #endif
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f9187b41dff6..7ebbf4f06a65 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -332,6 +332,9 @@ union bpf_attr {
 		 * (context accesses, allowed helpers, etc).
 		 */
 		__u32		expected_attach_type;
+		__u32		prog_btf_fd;	/* fd pointing to BTF type data */
+		__u32		func_info_len;	/* func_info length */
+		__aligned_u64	func_info;	/* func type info */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -2585,6 +2588,9 @@ struct bpf_prog_info {
 	__u32 nr_jited_func_lens;
 	__aligned_u64 jited_ksyms;
 	__aligned_u64 jited_func_lens;
+	__u32 btf_id;
+	__u32 nr_jited_func_types;
+	__aligned_u64 jited_func_types;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
@@ -2896,4 +2902,9 @@ struct bpf_flow_keys {
 	};
 };
 
+struct bpf_func_info {
+	__u32	insn_offset;
+	__u32	type_id;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 794a185f11bf..85b8eeccddbd 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -486,6 +486,15 @@ static const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id)
 	return btf->types[type_id];
 }
 
+bool is_btf_func_type(const struct btf *btf, u32 type_id)
+{
+	const struct btf_type *type = btf_type_by_id(btf, type_id);
+
+	if (!type || BTF_INFO_KIND(type->info) != BTF_KIND_FUNC)
+		return false;
+	return true;
+}
+
 /*
  * Regular int is not a bit field and it must be either
  * u8/u16/u32/u64.
@@ -2579,3 +2588,10 @@ u32 btf_id(const struct btf *btf)
 {
 	return btf->id;
 }
+
+const char *btf_get_name_by_id(const struct btf *btf, u32 type_id)
+{
+	const struct btf_type *t = btf_type_by_id(btf, type_id);
+
+	return btf_name_by_offset(btf, t->name_off);
+}
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 3f5bf1af0826..add3866a120e 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -27,6 +27,7 @@
 #include <linux/random.h>
 #include <linux/moduleloader.h>
 #include <linux/bpf.h>
+#include <linux/btf.h>
 #include <linux/frame.h>
 #include <linux/rbtree_latch.h>
 #include <linux/kallsyms.h>
@@ -387,6 +388,7 @@ bpf_get_prog_addr_region(const struct bpf_prog *prog,
 static void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)
 {
 	const char *end = sym + KSYM_NAME_LEN;
+	const char *func_name;
 
 	BUILD_BUG_ON(sizeof("bpf_prog_") +
 		     sizeof(prog->tag) * 2 +
@@ -401,6 +403,13 @@ static void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)
 
 	sym += snprintf(sym, KSYM_NAME_LEN, "bpf_prog_");
 	sym  = bin2hex(sym, prog->tag, sizeof(prog->tag));
+
+	if (prog->aux->btf) {
+		func_name = btf_get_name_by_id(prog->aux->btf, prog->aux->type_id);
+		snprintf(sym, (size_t)(end - sym), "_%s", func_name);
+		return;
+	}
+
 	if (prog->aux->name[0])
 		snprintf(sym, (size_t)(end - sym), "_%s", prog->aux->name);
 	else
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4f416234251f..aa4688a1a137 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1120,6 +1120,7 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
 		/* bpf_prog_free_id() must be called first */
 		bpf_prog_free_id(prog, do_idr_lock);
 		bpf_prog_kallsyms_del_all(prog);
+		btf_put(prog->aux->btf);
 
 		call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
 	}
@@ -1343,8 +1344,45 @@ bpf_prog_load_check_attach_type(enum bpf_prog_type prog_type,
 	}
 }
 
+static int prog_check_btf(const struct bpf_prog *prog, const struct btf *btf,
+			  union bpf_attr *attr)
+{
+	struct bpf_func_info __user *uinfo, info;
+	int i, nfuncs, usize;
+	u32 prev_offset;
+
+	usize = sizeof(struct bpf_func_info);
+	if (attr->func_info_len % usize)
+		return -EINVAL;
+
+	/* func_info section should have increasing and valid insn_offset
+	 * and type should be BTF_KIND_FUNC.
+	 */
+	nfuncs = attr->func_info_len / usize;
+	uinfo = u64_to_user_ptr(attr->func_info);
+	for (i = 0; i < nfuncs; i++) {
+		if (copy_from_user(&info, uinfo, usize))
+			return -EFAULT;
+
+		if (!is_btf_func_type(btf, info.type_id))
+			return -EINVAL;
+
+		if (i == 0) {
+			if (info.insn_offset)
+				return -EINVAL;
+			prev_offset = 0;
+		} else if (info.insn_offset < prev_offset) {
+			return -EINVAL;
+		}
+
+		prev_offset = info.insn_offset;
+	}
+
+	return 0;
+}
+
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD expected_attach_type
+#define	BPF_PROG_LOAD_LAST_FIELD func_info
 
 static int bpf_prog_load(union bpf_attr *attr)
 {
@@ -1431,6 +1469,27 @@ static int bpf_prog_load(union bpf_attr *attr)
 	if (err)
 		goto free_prog;
 
+	/* copy func_info before verifier which may make
+	 * some adjustment.
+	 */
+	if (attr->func_info_len) {
+		struct btf *btf;
+
+		btf = btf_get_by_fd(attr->prog_btf_fd);
+		if (IS_ERR(btf)) {
+			err = PTR_ERR(btf);
+			goto free_prog;
+		}
+
+		err = prog_check_btf(prog, btf, attr);
+		if (err) {
+			btf_put(btf);
+			goto free_prog;
+		}
+
+		prog->aux->btf = btf;
+	}
+
 	/* run eBPF verifier */
 	err = bpf_check(&prog, attr);
 	if (err < 0)
@@ -1463,6 +1522,7 @@ static int bpf_prog_load(union bpf_attr *attr)
 	bpf_prog_kallsyms_del_subprogs(prog);
 	free_used_maps(prog->aux);
 free_prog:
+	btf_put(prog->aux->btf);
 	bpf_prog_uncharge_memlock(prog);
 free_prog_sec:
 	security_bpf_prog_free(prog->aux);
@@ -2108,6 +2168,30 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 		}
 	}
 
+	if (prog->aux->btf) {
+		info.btf_id = btf_id(prog->aux->btf);
+
+		ulen = info.nr_jited_func_types;
+		info.nr_jited_func_types = prog->aux->func_cnt;
+		if (info.nr_jited_func_types && ulen) {
+			if (bpf_dump_raw_ok()) {
+				u32 __user *user_types;
+				u32 func_type, i;
+
+				ulen = min_t(u32, info.nr_jited_func_types,
+					     ulen);
+				user_types = u64_to_user_ptr(info.jited_func_types);
+				for (i = 0; i < ulen; i++) {
+					func_type = prog->aux->func[i]->aux->type_id;
+					if (put_user(func_type, &user_types[i]))
+						return -EFAULT;
+				}
+			} else {
+				info.jited_func_types = 0;
+			}
+		}
+	}
+
 done:
 	if (copy_to_user(uinfo, &info, info_len) ||
 	    put_user(info_len, &uattr->info.info_len))
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3f93a548a642..97c408e84322 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4589,6 +4589,50 @@ static int check_cfg(struct bpf_verifier_env *env)
 	return ret;
 }
 
+static int check_btf_func(struct bpf_prog *prog, struct bpf_verifier_env *env,
+			  union bpf_attr *attr)
+{
+	struct bpf_func_info *data;
+	int i, nfuncs, ret = 0;
+
+	if (!attr->func_info_len)
+		return 0;
+
+	nfuncs = attr->func_info_len / sizeof(struct bpf_func_info);
+	if (env->subprog_cnt != nfuncs) {
+		verbose(env, "number of funcs in func_info does not match verifier\n");
+		return -EINVAL;
+	}
+
+	data = kvmalloc(attr->func_info_len, GFP_KERNEL | __GFP_NOWARN);
+	if (!data) {
+		verbose(env, "no memory to allocate attr func_info\n");
+		return -ENOMEM;
+	}
+
+	if (copy_from_user(data, u64_to_user_ptr(attr->func_info),
+			   attr->func_info_len)) {
+		verbose(env, "memory copy error for attr func_info\n");
+		ret = -EFAULT;
+		goto cleanup;
+		}
+
+	for (i = 0; i < nfuncs; i++) {
+		if (env->subprog_info[i].start != data[i].insn_offset) {
+			verbose(env, "func_info subprog start (%d) does not match verifier (%d)\n",
+				env->subprog_info[i].start, data[i].insn_offset);
+			ret = -EINVAL;
+			goto cleanup;
+		}
+		env->subprog_info[i].type_id = data[i].type_id;
+	}
+
+	prog->aux->type_id = data[0].type_id;
+cleanup:
+	kvfree(data);
+	return ret;
+}
+
 /* check %cur's range satisfies %old's */
 static bool range_within(struct bpf_reg_state *old,
 			 struct bpf_reg_state *cur)
@@ -5873,6 +5917,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->name[0] = 'F';
 		func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
 		func[i]->jit_requested = 1;
+		func[i]->aux->btf = prog->aux->btf;
+		func[i]->aux->type_id = env->subprog_info[i].type_id;
 		func[i] = bpf_int_jit_compile(func[i]);
 		if (!func[i]->jited) {
 			err = -ENOTSUPP;
@@ -6307,6 +6353,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
 	if (ret < 0)
 		goto skip_full_check;
 
+	ret = check_btf_func(env->prog, env, attr);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = do_check(env);
 	if (env->cur_state) {
 		free_verifier_state(env->cur_state, true);
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 04/13] tools/bpf: add btf func/func_proto unit tests in selftest test_btf
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185424.2378502-1-yhs@fb.com>

Add several BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
unit tests in bpf selftest test_btf.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/lib/bpf/btf.c                    |   4 +
 tools/testing/selftests/bpf/test_btf.c | 216 +++++++++++++++++++++++++
 2 files changed, 220 insertions(+)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 449591aa9900..33095fc1860b 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -165,6 +165,10 @@ static int btf_parse_type_sec(struct btf *btf, btf_print_fn_t err_log)
 		case BTF_KIND_ENUM:
 			next_type += vlen * sizeof(struct btf_enum);
 			break;
+		case BTF_KIND_FUNC:
+		case BTF_KIND_FUNC_PROTO:
+			next_type += vlen * sizeof(int);
+			break;
 		case BTF_KIND_TYPEDEF:
 		case BTF_KIND_PTR:
 		case BTF_KIND_FWD:
diff --git a/tools/testing/selftests/bpf/test_btf.c b/tools/testing/selftests/bpf/test_btf.c
index f42b3396d622..b6461c3c5e11 100644
--- a/tools/testing/selftests/bpf/test_btf.c
+++ b/tools/testing/selftests/bpf/test_btf.c
@@ -1374,6 +1374,222 @@ static struct btf_raw_test raw_tests[] = {
 	.map_create_err = true,
 },
 
+{
+	.descr = "func pointer #1",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		/* int (*func)(int, unsigned int) */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 2), 1),	/* [3] */
+		1, 2,
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3),	/* [4] */
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+},
+
+{
+	.descr = "func pointer #2",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		/* void (*func)(int, unsigned int, ....) */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 3), 0),	/* [3] */
+		1, 2, 0,
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3),	/* [4] */
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+},
+
+{
+	.descr = "func pointer #3",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		/* void (*func)(void, int, unsigned int) */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 3), 0),	/* [3] */
+		1, 0, 2,
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3),	/* [4] */
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+	.btf_load_err = true,
+	.err_str = "Invalid arg#2",
+},
+
+{
+	.descr = "func pointer #4",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		/*
+		 * Testing:
+		 * BTF_KIND_CONST => BTF_KIND_TYPEDEF => BTF_KIND_PTR =>
+		 * BTF_KIND_FUNC_PROTO
+		 */
+		/* typedef void (*func_ptr)(int, unsigned int) */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 5),/* [3] */
+		/* const func_ptr */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_CONST, 0, 0), 3),	/* [4] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 6),	/* [5] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 2), 0),	/* [6] */
+		1, 2,
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+},
+
+{
+	.descr = "func pointer #5",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		/*
+		 * Skipped the BTF_KIND_PTR.
+		 * BTF_KIND_CONST => BTF_KIND_TYPEDEF => BTF_KIND_FUNC_PROTO
+		 */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_CONST, 0, 0), 4),	/* [3] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 5),/* [4] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 2), 0),	/* [5] */
+		1, 2,
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+	.btf_load_err = true,
+	.err_str = "Invalid type_id",
+},
+
+{
+	/* Test btf_resolve() in btf_check_func() */
+	.descr = "func pointer #6",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		/* void (*func)(const void *) */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 0),	/* [2] */
+		4,
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 2),	/* [3] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_CONST, 0, 0), 5),	/* [4] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 0),	/* [5] */
+		BTF_END_RAW,
+	},
+	.str_sec = "",
+	.str_sec_size = sizeof(""),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+},
+
+{
+	.descr = "func #1",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),	/* [3] */
+		1, 2,
+		BTF_END_RAW,
+	},
+	.str_sec = "\0A\0",
+	.str_sec_size = sizeof("\0A\0"),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+},
+
+{
+	.descr = "func #2",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 1),	/* [3] */
+		1, 2,
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 3),	/* [4] */
+		BTF_END_RAW,
+	},
+	.str_sec = "\0A\0",
+	.str_sec_size = sizeof("\0A\0"),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+	.btf_load_err = true,
+	.err_str = "Invalid type_id",
+},
+
+{
+	.descr = "func #3",
+	.raw_types = {
+		BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+		BTF_TYPE_INT_ENC(0, 0, 0, 32, 4),		/* [2] */
+		BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_TYPEDEF, 0, 0), 4),/* [3] */
+		BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 2), 0),	/* [4] */
+		1, 2,
+		BTF_END_RAW,
+	},
+	.str_sec = "\0A\0",
+	.str_sec_size = sizeof("\0A\0"),
+	.map_type = BPF_MAP_TYPE_ARRAY,
+	.map_name = "func_type_check_btf",
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.key_type_id = 1,
+	.value_type_id = 1,
+	.max_entries = 4,
+	.btf_load_err = true,
+	.err_str = "Invalid type_id",
+},
+
 }; /* struct btf_raw_test raw_tests[] */
 
 static const char *get_next_str(const char *start, const char *end)
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 07/13] tools/bpf: add new fields for program load in lib/bpf
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185446.2379289-1-yhs@fb.com>

The new fields are added for program load in lib/bpf so
application uses api bpf_load_program_xattr() is able
to load program with btf and func_info data.

This functionality will be used in next patch
by bpf selftest test_btf.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/lib/bpf/bpf.c | 3 +++
 tools/lib/bpf/bpf.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index d70a255cb05e..d8d48ab34220 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -196,6 +196,9 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 	attr.log_level = 0;
 	attr.kern_version = load_attr->kern_version;
 	attr.prog_ifindex = load_attr->prog_ifindex;
+	attr.prog_btf_fd = load_attr->prog_btf_fd;
+	attr.func_info_len = load_attr->func_info_len;
+	attr.func_info = ptr_to_u64(load_attr->func_info);
 	memcpy(attr.prog_name, load_attr->name,
 	       min(name_len, BPF_OBJ_NAME_LEN - 1));
 
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 87520a87a75f..d8fe72b22168 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -67,6 +67,9 @@ struct bpf_load_program_attr {
 	const char *license;
 	__u32 kern_version;
 	__u32 prog_ifindex;
+	__u32 prog_btf_fd;
+	__u32 func_info_len;
+	const struct bpf_func_info *func_info;
 };
 
 /* Recommend log buffer size */
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 09/13] tools/bpf: add support to read .BTF.ext sections
From: Yonghong Song @ 2018-10-12 18:54 UTC (permalink / raw)
  To: ast, kafai, daniel, netdev; +Cc: kernel-team
In-Reply-To: <20181012185446.2379289-1-yhs@fb.com>

The .BTF section is already available to encode types.
These types can be used for map
pretty print. The whole .BTF will be passed to the
kernel as well for which kernel can verify and return
to the user space for pretty print etc.

Recently landed llvm patch "[BPF] Add BTF generation
for BPF target" (https://reviews.llvm.org/rL344366)
will generate .BTF section and one more section
.BTF.ext. The .BTF.ext section encodes function type
information and line information. For line information,
the actual source code is encoded in the section, which
makes compiler itself as an ideal place for section
generation.

The .BTF section does not depend on any other section,
and .BTF.ext has dependency on .BTF for strings and types.

The .BTF section can be directly loaded into the
kernel, and the .BTF.ext section cannot. The loader
may need to do some relocation and merging,
similar to merging multiple code sections, before
loading into the kernel.

In this patch, only func type info is processed.
The functionality is implemented in libbpf.
In this patch, the header for .BTF.ext is the same
as the one in LLVM
(https://github.com/llvm-mirror/llvm/blob/master/include/llvm/MC/MCBTFContext.h).

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/lib/bpf/btf.c    | 232 +++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/btf.h    |  31 ++++++
 tools/lib/bpf/libbpf.c |  53 +++++++++-
 3 files changed, 312 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 33095fc1860b..4748e0bacd2b 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -37,6 +37,11 @@ struct btf {
 	int fd;
 };
 
+struct btf_ext {
+	void *func_info;
+	__u32 func_info_len;
+};
+
 static int btf_add_type(struct btf *btf, struct btf_type *t)
 {
 	if (btf->types_size - btf->nr_types < 2) {
@@ -397,3 +402,230 @@ const char *btf__name_by_offset(const struct btf *btf, __u32 offset)
 	else
 		return NULL;
 }
+
+static int btf_ext_validate_func_info(const struct btf_sec_func_info *sinfo,
+				      __u32 size, btf_print_fn_t err_log)
+{
+	int sec_hdrlen = sizeof(struct btf_sec_func_info);
+	__u32 record_size = sizeof(struct bpf_func_info);
+	__u32 size_left = size, num_records;
+	__u64 total_record_size;
+
+	while (size_left) {
+		if (size_left < sec_hdrlen) {
+			elog("BTF.ext func_info header not found");
+			return -EINVAL;
+		}
+
+		num_records = sinfo->num_func_info;
+		if (num_records == 0) {
+			elog("incorrect BTF.ext num_func_info");
+			return -EINVAL;
+		}
+
+		total_record_size = sec_hdrlen +
+				    (__u64)num_records * record_size;
+		if (size_left < total_record_size) {
+			elog("incorrect BTF.ext num_func_info");
+			return -EINVAL;
+		}
+
+		size_left -= total_record_size;
+		sinfo = (void *)sinfo + total_record_size;
+	}
+
+	return 0;
+}
+static int btf_ext_parse_hdr(__u8 *data, __u32 data_size,
+			     btf_print_fn_t err_log)
+{
+	const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
+	const struct btf_sec_func_info *sinfo;
+	__u32 meta_left, last_func_info_pos;
+
+	if (data_size < sizeof(*hdr)) {
+		elog("BTF.ext header not found");
+		return -EINVAL;
+	}
+
+	if (hdr->magic != BTF_MAGIC) {
+		elog("Invalid BTF.ext magic:%x\n", hdr->magic);
+		return -EINVAL;
+	}
+
+	if (hdr->version != BTF_VERSION) {
+		elog("Unsupported BTF.ext version:%u\n", hdr->version);
+		return -ENOTSUP;
+	}
+
+	if (hdr->flags) {
+		elog("Unsupported BTF.ext flags:%x\n", hdr->flags);
+		return -ENOTSUP;
+	}
+
+	meta_left = data_size - sizeof(*hdr);
+	if (!meta_left) {
+		elog("BTF.ext has no data\n");
+		return -EINVAL;
+	}
+
+	if (meta_left < hdr->func_info_off) {
+		elog("Invalid BTF.ext func_info section offset:%u\n",
+		     hdr->func_info_off);
+		return -EINVAL;
+	}
+
+	if (hdr->func_info_off & 0x02) {
+		elog("BTF.ext func_info section is not aligned to 4 bytes\n");
+		return -EINVAL;
+	}
+
+	last_func_info_pos = sizeof(*hdr) + hdr->func_info_off +
+			     hdr->func_info_len;
+	if (last_func_info_pos > data_size) {
+		elog("Invalid BTF.ext func_info section size:%u\n",
+		     hdr->func_info_len);
+		return -EINVAL;
+	}
+
+	sinfo = (const struct btf_sec_func_info *)(data + sizeof(*hdr) +
+						   hdr->func_info_off);
+	return btf_ext_validate_func_info(sinfo, hdr->func_info_len,
+					  err_log);
+}
+
+void btf_ext__free(struct btf_ext *btf_ext)
+{
+	if (!btf_ext)
+		return;
+
+	free(btf_ext->func_info);
+	free(btf_ext);
+}
+
+struct btf_ext *btf_ext__new(__u8 *data, __u32 size, btf_print_fn_t err_log)
+{
+	int hdrlen = sizeof(struct btf_ext_header);
+	const struct btf_ext_header *hdr;
+	struct btf_ext *btf_ext;
+	void *fdata;
+	int err;
+
+	err = btf_ext_parse_hdr(data, size, err_log);
+	if (err)
+		return ERR_PTR(err);
+
+	btf_ext = calloc(1, sizeof(struct btf_ext));
+	if (!btf_ext)
+		return ERR_PTR(-ENOMEM);
+
+	hdr = (const struct btf_ext_header *)data;
+	fdata = malloc(hdr->func_info_len);
+	if (!fdata)
+		return ERR_PTR(-ENOMEM);
+
+	memcpy(fdata, data + hdrlen + hdr->func_info_off, hdr->func_info_len);
+	btf_ext->func_info = fdata;
+	btf_ext->func_info_len = hdr->func_info_len;
+
+	return btf_ext;
+}
+
+int btf_ext_reloc_init(struct btf *btf, struct btf_ext *btf_ext,
+		       const char *sec_name, __u32 *btf_fd,
+		       void **func_info, __u32 *func_info_len)
+{
+	int sec_hdrlen = sizeof(struct btf_sec_func_info);
+	int record_len = sizeof(struct bpf_func_info);
+	struct btf_sec_func_info *sinfo;
+	int i, remain_len, records_len;
+	const char *info_sec_name;
+	void *data;
+
+	if (!btf || !btf_ext) {
+		*btf_fd = 0;
+		*func_info = NULL;
+		*func_info_len = 0;
+		return 0;
+	}
+
+	sinfo = btf_ext->func_info;
+	remain_len = btf_ext->func_info_len;
+	*btf_fd = btf__fd(btf);
+
+	while (remain_len > 0) {
+		records_len = sinfo->num_func_info * record_len;
+		info_sec_name = btf__name_by_offset(btf, sinfo->sec_name_off);
+		if (strcmp(info_sec_name, sec_name)) {
+			remain_len -= sec_hdrlen + records_len;
+			sinfo = (void *)sinfo + sec_hdrlen + records_len;
+			continue;
+		}
+
+		data = malloc(records_len);
+		if (!data)
+			return -ENOMEM;
+
+		memcpy(data, sinfo->data, records_len);
+
+		/* adjust the insn_offset, the data in .BTF.ext is
+		 * the actual byte offset, and the kernel expects
+		 * the offset in term of bpf_insn.
+		 */
+		for (i = 0; i < sinfo->num_func_info; i++)
+			((struct bpf_func_info *)data)[i].insn_offset
+				/= sizeof(struct bpf_insn);
+
+		*func_info = data;
+		*func_info_len = records_len;
+		break;
+	}
+
+	return 0;
+}
+
+int btf_ext_reloc(struct btf *btf, struct btf_ext *btf_ext,
+		  const char *sec_name, __u32 insns_cnt,
+		  void **func_info, __u32 *func_info_len)
+{
+	int sec_hdrlen = sizeof(struct btf_sec_func_info);
+	int record_len = sizeof(struct bpf_func_info);
+	__u32 i, remain_len, records_len;
+	struct btf_sec_func_info *sinfo;
+	struct bpf_func_info *record;
+	const char *info_sec_name;
+	__u32 existing_flen;
+	void *data;
+
+	if (!*func_info)
+		return 0;
+
+	sinfo = btf_ext->func_info;
+	remain_len = btf_ext->func_info_len;
+	while (remain_len > 0) {
+		records_len = sinfo->num_func_info * record_len;
+		info_sec_name = btf__name_by_offset(btf, sinfo->sec_name_off);
+		if (strcmp(info_sec_name, sec_name)) {
+			remain_len -= sec_hdrlen + records_len;
+			sinfo = (void *)sinfo + sec_hdrlen + records_len;
+			continue;
+		}
+
+		existing_flen = *func_info_len;
+		data = realloc(*func_info, existing_flen + records_len);
+		if (!data)
+			return -ENOMEM;
+
+		memcpy(data + existing_flen, sinfo->data, records_len);
+		record = data + existing_flen;
+		for (i = 0; i < sinfo->num_func_info; i++)
+			record[i].insn_offset =
+				record[i].insn_offset/sizeof(struct bpf_insn) +
+				insns_cnt;
+		*func_info = data;
+		*func_info_len = existing_flen + records_len;
+		break;
+	}
+
+	return 0;
+}
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 6db5462bb2ef..2bbcc9e41cf5 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -7,10 +7,32 @@
 #include <linux/types.h>
 
 #define BTF_ELF_SEC ".BTF"
+#define BTF_EXT_ELF_SEC ".BTF.ext"
 
 struct btf;
+struct btf_ext;
 struct btf_type;
 
+struct btf_ext_header {
+	__u16	magic;
+	__u8	version;
+	__u8	flags;
+	__u32	hdr_len;
+
+	/* All offsets are in bytes relative to the end of this header */
+	__u32	func_info_off;
+	__u32	func_info_len;
+	__u32	line_info_off;
+	__u32	line_info_len;
+};
+
+struct btf_sec_func_info {
+	__u32	sec_name_off;
+	__u32	num_func_info;
+	/* followed by num_func_info number of bpf_func_info records */
+	__u8	data[0];
+};
+
 typedef int (*btf_print_fn_t)(const char *, ...)
 	__attribute__((format(printf, 1, 2)));
 
@@ -23,4 +45,13 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id);
 int btf__fd(const struct btf *btf);
 const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
 
+struct btf_ext *btf_ext__new(__u8 *data, __u32 size, btf_print_fn_t err_log);
+void btf_ext__free(struct btf_ext *btf_ext);
+int btf_ext_reloc_init(struct btf *btf, struct btf_ext *btf_ext,
+		       const char *sec_name, __u32 *btf_fd,
+		       void **func_info, __u32 *func_info_len);
+int btf_ext_reloc(struct btf *btf, struct btf_ext *btf_ext,
+		  const char *sec_name, __u32 insns_cnt, void **func_info,
+		  __u32 *func_info_len);
+
 #endif /* __LIBBPF_BTF_H */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 176cf5523728..9271b06330f7 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -151,6 +151,9 @@ struct bpf_program {
 	bpf_program_clear_priv_t clear_priv;
 
 	enum bpf_attach_type expected_attach_type;
+	__u32 btf_fd;
+	void *func_info;
+	__u32 func_info_len;
 };
 
 struct bpf_map {
@@ -207,6 +210,7 @@ struct bpf_object {
 	struct list_head list;
 
 	struct btf *btf;
+	struct btf_ext *btf_ext;
 
 	void *priv;
 	bpf_object_clear_priv_t clear_priv;
@@ -781,6 +785,15 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 					   BTF_ELF_SEC, PTR_ERR(obj->btf));
 				obj->btf = NULL;
 			}
+		} else if (strcmp(name, BTF_EXT_ELF_SEC) == 0) {
+			obj->btf_ext = btf_ext__new(data->d_buf, data->d_size,
+						    __pr_debug);
+			if (IS_ERR(obj->btf_ext)) {
+				pr_warning("Error loading ELF section %s: %ld. Ignored and continue.\n",
+					   BTF_EXT_ELF_SEC,
+					   PTR_ERR(obj->btf_ext));
+				obj->btf_ext = NULL;
+			}
 		} else if (sh.sh_type == SHT_SYMTAB) {
 			if (obj->efile.symbols) {
 				pr_warning("bpf: multiple SYMTAB in %s\n",
@@ -1164,6 +1177,7 @@ bpf_program__reloc_text(struct bpf_program *prog, struct bpf_object *obj,
 	struct bpf_insn *insn, *new_insn;
 	struct bpf_program *text;
 	size_t new_cnt;
+	int err;
 
 	if (relo->type != RELO_CALL)
 		return -LIBBPF_ERRNO__RELOC;
@@ -1186,6 +1200,16 @@ bpf_program__reloc_text(struct bpf_program *prog, struct bpf_object *obj,
 			pr_warning("oom in prog realloc\n");
 			return -ENOMEM;
 		}
+
+		err = btf_ext_reloc(obj->btf, obj->btf_ext, text->section_name,
+				    prog->insns_cnt, &prog->func_info,
+				    &prog->func_info_len);
+		if (err) {
+			pr_warning("error in btf_ext_reloc for sec %s\n",
+				   text->section_name);
+			return err;
+		}
+
 		memcpy(new_insn + prog->insns_cnt, text->insns,
 		       text->insns_cnt * sizeof(*insn));
 		prog->insns = new_insn;
@@ -1205,7 +1229,19 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 {
 	int i, err;
 
-	if (!prog || !prog->reloc_desc)
+	if (!prog)
+		return 0;
+
+	err = btf_ext_reloc_init(obj->btf, obj->btf_ext, prog->section_name,
+				 &prog->btf_fd, &prog->func_info,
+				 &prog->func_info_len);
+	if (err) {
+		pr_warning("err in btf_ext_reloc_init for sec %s\n",
+			   prog->section_name);
+		return err;
+	}
+
+	if (!prog->reloc_desc)
 		return 0;
 
 	for (i = 0; i < prog->nr_reloc; i++) {
@@ -1295,7 +1331,8 @@ static int bpf_object__collect_reloc(struct bpf_object *obj)
 static int
 load_program(enum bpf_prog_type type, enum bpf_attach_type expected_attach_type,
 	     const char *name, struct bpf_insn *insns, int insns_cnt,
-	     char *license, __u32 kern_version, int *pfd, int prog_ifindex)
+	     char *license, __u32 kern_version, int *pfd, int prog_ifindex,
+	     u32 btf_fd, struct bpf_func_info *func_info, u32 func_info_len)
 {
 	struct bpf_load_program_attr load_attr;
 	char *cp, errmsg[STRERR_BUFSIZE];
@@ -1311,6 +1348,9 @@ load_program(enum bpf_prog_type type, enum bpf_attach_type expected_attach_type,
 	load_attr.license = license;
 	load_attr.kern_version = kern_version;
 	load_attr.prog_ifindex = prog_ifindex;
+	load_attr.prog_btf_fd = btf_fd;
+	load_attr.func_info = func_info;
+	load_attr.func_info_len = func_info_len;
 
 	if (!load_attr.insns || !load_attr.insns_cnt)
 		return -EINVAL;
@@ -1394,7 +1434,9 @@ bpf_program__load(struct bpf_program *prog,
 		err = load_program(prog->type, prog->expected_attach_type,
 				   prog->name, prog->insns, prog->insns_cnt,
 				   license, kern_version, &fd,
-				   prog->prog_ifindex);
+				   prog->prog_ifindex,
+				   prog->btf_fd, prog->func_info,
+				   prog->func_info_len);
 		if (!err)
 			prog->instances.fds[0] = fd;
 		goto out;
@@ -1426,7 +1468,9 @@ bpf_program__load(struct bpf_program *prog,
 				   prog->name, result.new_insn_ptr,
 				   result.new_insn_cnt,
 				   license, kern_version, &fd,
-				   prog->prog_ifindex);
+				   prog->prog_ifindex,
+				   prog->btf_fd, prog->func_info,
+				   prog->func_info_len);
 
 		if (err) {
 			pr_warning("Loading the %dth instance of program '%s' failed\n",
@@ -1835,6 +1879,7 @@ void bpf_object__close(struct bpf_object *obj)
 	bpf_object__elf_finish(obj);
 	bpf_object__unload(obj);
 	btf__free(obj->btf);
+	btf_ext__free(obj->btf_ext);
 
 	for (i = 0; i < obj->nr_maps; i++) {
 		zfree(&obj->maps[i].name);
-- 
2.17.1

^ permalink raw reply related

* [iproute2 PATCH] tc: flower: Classify packets based port ranges
From: Amritha Nambiar @ 2018-10-12 13:54 UTC (permalink / raw)
  To: stephen, netdev
  Cc: jakub.kicinski, amritha.nambiar, sridhar.samudrala, jhs,
	xiyou.wangcong, jiri

Added support for filtering based on port ranges.

Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
  action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  dst_port_min 20
  dst_port_max 30
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 181 sec used 5 sec
        Action statistics:
        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
  skip_hw action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto tcp
  dst_ip 192.168.1.1
  dst_port_min 100
  dst_port_max 200
  skip_hw
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1 installed 28 sec used 6 sec
        Action statistics:
        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
---
 include/uapi/linux/pkt_cls.h |    5 +
 tc/f_flower.c                |  145 +++++++++++++++++++++++++++++++++++++++---
 2 files changed, 140 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index be382fb..3d9727f 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -405,6 +405,11 @@ enum {
 	TCA_FLOWER_KEY_UDP_SRC,		/* be16 */
 	TCA_FLOWER_KEY_UDP_DST,		/* be16 */
 
+	TCA_FLOWER_KEY_PORT_SRC_MIN,	/* be16 */
+	TCA_FLOWER_KEY_PORT_SRC_MAX,	/* be16 */
+	TCA_FLOWER_KEY_PORT_DST_MIN,	/* be16 */
+	TCA_FLOWER_KEY_PORT_DST_MAX,	/* be16 */
+
 	TCA_FLOWER_FLAGS,
 	TCA_FLOWER_KEY_VLAN_ID,		/* be16 */
 	TCA_FLOWER_KEY_VLAN_PRIO,	/* u8   */
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 59e5f57..1a7bc80 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -33,6 +33,11 @@ enum flower_endpoint {
 	FLOWER_ENDPOINT_DST
 };
 
+struct range_type {
+	__be16 min_port_type;
+	__be16 max_port_type;
+};
+
 enum flower_icmp_field {
 	FLOWER_ICMP_FIELD_TYPE,
 	FLOWER_ICMP_FIELD_CODE
@@ -493,6 +498,64 @@ static int flower_parse_port(char *str, __u8 ip_proto,
 	return 0;
 }
 
+static int flower_port_range_attr_type(__u8 ip_proto, enum flower_endpoint type,
+				       struct range_type *range)
+{
+	if (ip_proto == IPPROTO_TCP || ip_proto == IPPROTO_UDP ||
+	    ip_proto == IPPROTO_SCTP) {
+		if (type == FLOWER_ENDPOINT_SRC) {
+			range->min_port_type = TCA_FLOWER_KEY_PORT_SRC_MIN;
+			range->max_port_type = TCA_FLOWER_KEY_PORT_SRC_MAX;
+		} else {
+			range->min_port_type = TCA_FLOWER_KEY_PORT_DST_MIN;
+			range->max_port_type = TCA_FLOWER_KEY_PORT_DST_MAX;
+		}
+	} else {
+		return -1;
+	}
+
+	return 0;
+}
+
+static int flower_parse_port_range(__be16 *min, __be16 *max, __u8 ip_proto,
+				   enum flower_endpoint endpoint,
+				   struct nlmsghdr *n)
+{
+	struct range_type range;
+
+	flower_port_range_attr_type(ip_proto, endpoint, &range);
+	addattr16(n, MAX_MSG, range.min_port_type, *min);
+	addattr16(n, MAX_MSG, range.max_port_type, *max);
+
+	return 0;
+}
+
+static int get_range(__be16 *min, __be16 *max, char *argv)
+{
+	char *r;
+
+	r = strchr(argv, '-');
+	if (r) {
+		*r = '\0';
+		if (get_be16(min, argv, 10)) {
+			fprintf(stderr, "invalid min range\n");
+			return -1;
+		}
+		if (get_be16(max, r + 1, 10)) {
+			fprintf(stderr, "invalid max range\n");
+			return -1;
+		}
+		if (htons(*max) <= htons(*min)) {
+			fprintf(stderr, "max value should be greater than min value\n");
+			return -1;
+		}
+	} else {
+		fprintf(stderr, "Illegal range format\n");
+		return -1;
+	}
+	return 0;
+}
+
 #define TCP_FLAGS_MAX_MASK 0xfff
 
 static int flower_parse_tcp_flags(char *str, int flags_type, int mask_type,
@@ -887,20 +950,54 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 				return -1;
 			}
 		} else if (matches(*argv, "dst_port") == 0) {
+			__be16 min, max;
+
 			NEXT_ARG();
-			ret = flower_parse_port(*argv, ip_proto,
-						FLOWER_ENDPOINT_DST, n);
-			if (ret < 0) {
-				fprintf(stderr, "Illegal \"dst_port\"\n");
-				return -1;
+			if (matches(*argv, "range") == 0) {
+				NEXT_ARG();
+				ret = get_range(&min, &max, *argv);
+				if (ret < 0)
+					return -1;
+				ret = flower_parse_port_range(&min, &max,
+							      ip_proto,
+							      FLOWER_ENDPOINT_DST,
+							      n);
+				if (ret < 0) {
+					fprintf(stderr, "Illegal \"dst_port range\"\n");
+					return -1;
+				}
+			} else {
+				ret = flower_parse_port(*argv, ip_proto,
+							FLOWER_ENDPOINT_DST, n);
+				if (ret < 0) {
+					fprintf(stderr, "Illegal \"dst_port\"\n");
+					return -1;
+				}
 			}
 		} else if (matches(*argv, "src_port") == 0) {
+			__be16 min, max;
+
 			NEXT_ARG();
-			ret = flower_parse_port(*argv, ip_proto,
-						FLOWER_ENDPOINT_SRC, n);
-			if (ret < 0) {
-				fprintf(stderr, "Illegal \"src_port\"\n");
-				return -1;
+			if (matches(*argv, "range") == 0) {
+				NEXT_ARG();
+				ret = get_range(&min, &max, *argv);
+				if (ret < 0)
+					return -1;
+				ret = flower_parse_port_range(&min, &max,
+							      ip_proto,
+							      FLOWER_ENDPOINT_SRC,
+							      n);
+				if (ret < 0) {
+					fprintf(stderr, "Illegal \"src_port range\"\n");
+					return -1;
+				}
+			} else {
+				ret = flower_parse_port(*argv, ip_proto,
+							FLOWER_ENDPOINT_SRC, n);
+				if (ret < 0) {
+					fprintf(stderr, "Illegal \"src_port\"\n");
+					return -1;
+				}
 			}
 		} else if (matches(*argv, "tcp_flags") == 0) {
 			NEXT_ARG();
@@ -1309,6 +1406,17 @@ static void flower_print_port(char *name, struct rtattr *attr)
 	print_hu(PRINT_ANY, name, namefrm, rta_getattr_be16(attr));
 }
 
+static void flower_print_port_range(char *name, struct rtattr *attr)
+{
+	SPRINT_BUF(namefrm);
+
+	if (!attr)
+		return;
+
+	sprintf(namefrm, "\n  %s %%u", name);
+	print_uint(PRINT_ANY, name, namefrm, rta_getattr_be16(attr));
+}
+
 static void flower_print_tcp_flags(const char *name, struct rtattr *flags_attr,
 				   struct rtattr *mask_attr)
 {
@@ -1398,6 +1506,7 @@ static int flower_print_opt(struct filter_util *qu, FILE *f,
 			    struct rtattr *opt, __u32 handle)
 {
 	struct rtattr *tb[TCA_FLOWER_MAX + 1];
+	struct range_type range;
 	int nl_type, nl_mask_type;
 	__be16 eth_type = 0;
 	__u8 ip_proto = 0xff;
@@ -1516,6 +1625,22 @@ static int flower_print_opt(struct filter_util *qu, FILE *f,
 	if (nl_type >= 0)
 		flower_print_port("src_port", tb[nl_type]);
 
+	if (flower_port_range_attr_type(ip_proto, FLOWER_ENDPOINT_DST, &range)
+	    == 0) {
+		flower_print_port_range("dst_port_min",
+					tb[range.min_port_type]);
+		flower_print_port_range("dst_port_max",
+					tb[range.max_port_type]);
+	}
+
+	if (flower_port_range_attr_type(ip_proto, FLOWER_ENDPOINT_SRC, &range)
+	    == 0) {
+		flower_print_port_range("src_port_min",
+					tb[range.min_port_type]);
+		flower_print_port_range("src_port_max",
+					tb[range.max_port_type]);
+	}
+
 	flower_print_tcp_flags("tcp_flags", tb[TCA_FLOWER_KEY_TCP_FLAGS],
 			       tb[TCA_FLOWER_KEY_TCP_FLAGS_MASK]);
 

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox