Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] kcm: Remove redundant unlikely()
From: David Miller @ 2017-09-26 16:54 UTC (permalink / raw)
  To: tklauser; +Cc: netdev
In-Reply-To: <20170926092258.21391-1-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Tue, 26 Sep 2017 11:22:58 +0200

> IS_ERR() already implies unlikely(), so it can be omitted.
> 
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] ipv6: Remove redundant unlikely()
From: David Miller @ 2017-09-26 16:54 UTC (permalink / raw)
  To: tklauser; +Cc: netdev, kuznet, yoshfuji
In-Reply-To: <20170926092231.21289-1-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Tue, 26 Sep 2017 11:22:31 +0200

> IS_ERR() already implies unlikely(), so it can be omitted.
> 
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] datagram: Remove redundant unlikely()
From: David Miller @ 2017-09-26 16:54 UTC (permalink / raw)
  To: tklauser; +Cc: netdev
In-Reply-To: <20170926092142.21156-1-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Tue, 26 Sep 2017 11:21:42 +0200

> IS_ERR() already implies unlikely(), so it can be omitted.
> 
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: ena: Remove redundant unlikely()
From: David Miller @ 2017-09-26 16:54 UTC (permalink / raw)
  To: tklauser; +Cc: netdev, netanel, saeed, zorik
In-Reply-To: <20170926090423.16937-1-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Tue, 26 Sep 2017 11:04:23 +0200

> IS_ERR() already implies unlikely(), so it can be omitted.
> 
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.

^ permalink raw reply

* Re: [PATCH v2] p54: don't unregister leds when they are not initialized
From: Christian Lamparter @ 2017-09-26 16:53 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Kalle Valo, linux-wireless, netdev, linux-kernel, Dmitry Vyukov,
	Kostya Serebryany
In-Reply-To: <17c60ebcc8ce7f20de41a55087d24dfdfca09c67.1506438620.git.andreyknvl@google.com>

On Tuesday, September 26, 2017 5:11:33 PM CEST Andrey Konovalov wrote:
> ieee80211_register_hw() in p54_register_common() may fail and leds won't
> get initialized. Currently p54_unregister_common() doesn't check that and
> always calls p54_unregister_leds(). The fix is to check priv->registered
> flag before calling p54_unregister_leds().
> 
> Found by syzkaller.
> 
> [...]
>  process_scheduled_works kernel/workqueue.c:2179
>  worker_thread+0xb2b/0x1850 kernel/workqueue.c:2255
>  kthread+0x3a1/0x470 kernel/kthread.c:231
>  ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
> 
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: stable@vger.kernel.org
Acked-by: Christian Lamparter <chunkeey@googlemail.com>

Thanks for making the patch too!

^ permalink raw reply

* Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN
From: Andrey Ryabinin @ 2017-09-26 16:49 UTC (permalink / raw)
  To: Arnd Bergmann, David Laight
  Cc: Mauro Carvalho Chehab, Jiri Pirko, Arend van Spriel, Kalle Valo,
	David S. Miller, Alexander Potapenko, Dmitry Vyukov,
	Masahiro Yamada, Michal Marek, Andrew Morton, Kees Cook,
	Geert Uytterhoeven, Greg Kroah-Hartman,
	linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-wireless@vger.kernel.org
In-Reply-To: <CAK8P3a37Ts5q7BvA2JWse87huyAp+=e18CUXEt8731RrBnB+Ow@mail.gmail.com>



On 09/26/2017 09:47 AM, Arnd Bergmann wrote:
> On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Mon, Sep 25, 2017 at 7:41 AM, David Laight <David.Laight@aculab.com> wrote:
>>> From: Arnd Bergmann
>>>> Sent: 22 September 2017 22:29
>>> ...
>>>> It seems that this is triggered in part by using strlcpy(), which the
>>>> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy
>>>> is not part of the C standard.
>>>
>>> Neither is strncpy().
>>>
>>> It'll almost certainly be a marker in a header file somewhere,
>>> so it should be possibly to teach it about other functions.
>>
>> I'm currently travelling and haven't investigated in detail, but from
>> taking a closer look here, I found that the hardened 'strlcpy()'
>> in include/linux/string.h triggers it. There is also a hardened
>> (much shorted) 'strncpy()' that doesn't trigger it in the same file,
>> and having only the extern declaration of strncpy also doesn't.
> 
> And a little more experimenting leads to this simple patch that fixes
> the problem:
> 
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -254,7 +254,7 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const
> char *q, size_t size)
>         size_t q_size = __builtin_object_size(q, 0);
>         if (p_size == (size_t)-1 && q_size == (size_t)-1)
>                 return __real_strlcpy(p, q, size);
> -       ret = strlen(q);
> +       ret = __builtin_strlen(q);


I think this is not correct. Fortified strlen called here on purpose. If sizeof q is known at compile time
and 'q' contains not-null fortified strlen() will panic.


>         if (size) {
>                 size_t len = (ret >= size) ? size - 1 : ret;
>                 if (__builtin_constant_p(len) && len >= p_size)
> 
> The problem is apparently that the fortified strlcpy calls the fortified strlen,
> which in turn calls strnlen and that ends up calling the extern '__real_strnlen'
> that gcc cannot reduce to a constant expression for a constant input.


Per my observation, it's the code like this:
	if () 
		fortify_panic(__func__);


somehow prevent gcc to merge several "struct i2c_board_info info;" into one stack slot.
With the hack bellow, stack usage reduced to ~1,6K:

---
 include/linux/string.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/include/linux/string.h b/include/linux/string.h
index 54d21783e18d..9a96ff3ebf94 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -261,8 +261,6 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p)
 	if (p_size == (size_t)-1)
 		return __builtin_strlen(p);
 	ret = strnlen(p, p_size);
-	if (p_size <= ret)
-		fortify_panic(__func__);
 	return ret;
 }
 
@@ -271,8 +269,6 @@ __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, __kernel_size_t maxlen)
 {
 	size_t p_size = __builtin_object_size(p, 0);
 	__kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : p_size);
-	if (p_size <= ret && maxlen != ret)
-		fortify_panic(__func__);
 	return ret;
 }




> Not sure if that change is the best fix, but it seems to address the problem in
> this driver and probably leads to better code in other places as well.
> 

Probably it would be better to solve this on the strlcpy side, but I haven't found the way to do this right.
Alternative solutions:

 - use memcpy() instead of strlcpy(). All source strings are smaller than I2C_NAME_SIZE, so we could
   do something like this - memcpy(info.type, "si2168", sizeof("si2168"));
   Also this should be faster.

 - Move code under different "case:" in the switch(dev->model) to the separate function should help as well.
   But it might be harder to backport into stables.

^ permalink raw reply related

* Re: Can libpcap filter on vlan tags when vlans are hardware-accelerated?
From: Ben Greear @ 2017-09-26 16:41 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: netdev
In-Reply-To: <20170912202617.ny3bbi2dthxxffh3@unicorn.suse.cz>

On 09/12/2017 01:26 PM, Michal Kubecek wrote:
> On Tue, Sep 12, 2017 at 11:54:43AM -0700, Ben Greear wrote:
>> It does not appear to work on Fedora-26, and I'm curious if someone
>> knows what needs doing to get this support working?
>
> It's rather complicated. The "vlan" and "vlan <id>" filters didn't
> handle the case when vlan information is passed in metadata until commit
> 04660eb1e561 ("Use BPF extensions in compiled filters"), i.e. libpcap
> 1.7.0. Unfortunately that commit made libpcap always check only metadata
> for the first outermost vlan tag so that it broke the case when vlan
> information is passed in packet itself (which is less frequent today).
>
> To handle both cases correctly, you would need libpcap with commits
> d739b068ac29 ("Make VLAN filter handle both metadata and inline tags")
> and 7c7a19fbd9af ("Fix logic of combined VLAN test") and also the
> optimizer fix from
>
>   https://github.com/the-tcpdump-group/libpcap/pull/582/commits/075015a3d17a
>
> (without it the filters generate incorrect BPF in some cases unless the
> optimizer is disabled). As far as I can see, these commits are not in
> any release yet.
>
>                                                        Michal Kubecek
>

So, I cloned the latest libpcap, and I'm going to start poking at this.

Do you happen to know if I need to do anything special other than
'pcap_compile()'?  I'm curious how the library would know if it can use
newer kernel API or not...or maybe it is somehow magically backwards/forward
compatible?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* [iproute PATCH v2 0/3] Check user supplied interface name lengths
From: Phil Sutter @ 2017-09-26 16:35 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

This series adds explicit checks for user-supplied interface names to
make sure their length fits Linux's requirements.

The first two patches simplify interface name parsing in some places -
these are side-effects of working on the actual implementation provided
in patch three.

Changes since v1:
- Patches 1 and 2 introduced.
- Changes to patch 3 are listed in there.

Phil Sutter (3):
  ip{6,}tunnel: Avoid copying user-supplied interface name around
  tc: flower: No need to cache indev arg
  Check user supplied interface name lengths

 include/utils.h |  1 +
 ip/ip6tunnel.c  |  9 +++++----
 ip/ipl2tp.c     |  3 ++-
 ip/iplink.c     | 27 ++++++++-------------------
 ip/ipmaddr.c    |  1 +
 ip/iprule.c     |  4 ++++
 ip/iptunnel.c   | 27 +++++++++++++--------------
 ip/iptuntap.c   |  4 +++-
 lib/utils.c     | 10 ++++++++++
 misc/arpd.c     |  1 +
 tc/f_flower.c   |  6 ++----
 11 files changed, 50 insertions(+), 43 deletions(-)

-- 
2.13.1

^ permalink raw reply

* [iproute PATCH v2 3/3] Check user supplied interface name lengths
From: Phil Sutter @ 2017-09-26 16:35 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20170926163548.24347-1-phil@nwl.cc>

The original problem was that something like:

| strncpy(ifr.ifr_name, *argv, IFNAMSIZ);

might leave ifr.ifr_name unterminated if length of *argv exceeds
IFNAMSIZ. In order to fix this, I thought about replacing all those
cases with (equivalent) calls to snprintf() or even introducing
strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting
the latter from being added to glibc, truncating a string without
notifying the user is not to be considered good practice. So let's
excercise what he suggested and reject empty or overlong interface names
right from the start - this way calls to strncpy() like shown above
become safe and the user has a chance to reconsider what he was trying
to do.

Note that this doesn't add calls to check_ifname() to all places where
user supplied interface name is parsed. In many cases, the interface
must exist already and is therefore looked up using ll_name_to_index(),
so if_nametoindex() will perform the necessary checks already.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v1:
- added missing check to tc/f_flower.c
- Drop some useless checks from ip/ip{6,}tunnel.c (ll_name_to_index()
  will detect illegal interface names for us).
- Renamed assert_valid_dev_name() to the shorter check_ifname().
- iplink: Check 'name' and 'dev' parameters right where they are parsed.
- ipl2tp: Drop needless check for p->ifname[0].
---
 include/utils.h |  1 +
 ip/ip6tunnel.c  |  3 ++-
 ip/ipl2tp.c     |  3 ++-
 ip/iplink.c     | 27 ++++++++-------------------
 ip/ipmaddr.c    |  1 +
 ip/iprule.c     |  4 ++++
 ip/iptunnel.c   |  5 ++++-
 ip/iptuntap.c   |  4 +++-
 lib/utils.c     | 10 ++++++++++
 misc/arpd.c     |  1 +
 tc/f_flower.c   |  1 +
 11 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index c9ed230b96044..12f0735e8aa0d 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -133,6 +133,7 @@ void missarg(const char *) __attribute__((noreturn));
 void invarg(const char *, const char *) __attribute__((noreturn));
 void duparg(const char *, const char *) __attribute__((noreturn));
 void duparg2(const char *, const char *) __attribute__((noreturn));
+void check_ifname(const char *, const char *);
 int matches(const char *arg, const char *pattern);
 int inet_addr_match(const inet_prefix *a, const inet_prefix *b, int bits);
 
diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index c12d700e74189..2c10b9c3fa7b5 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -273,7 +273,8 @@ static int parse_args(int argc, char **argv, int cmd, struct ip6_tnl_parm2 *p)
 				usage();
 			if (p->name[0])
 				duparg2("name", *argv);
-			strncpy(p->name, *argv, IFNAMSIZ - 1);
+			check_ifname("name", *argv);
+			strncpy(p->name, *argv, IFNAMSIZ);
 			if (cmd == SIOCCHGTUNNEL && count == 0) {
 				struct ip6_tnl_parm2 old_p = {};
 
diff --git a/ip/ipl2tp.c b/ip/ipl2tp.c
index 88664c909e11f..06f1bd064c914 100644
--- a/ip/ipl2tp.c
+++ b/ip/ipl2tp.c
@@ -182,7 +182,7 @@ static int create_session(struct l2tp_parm *p)
 	if (p->peer_cookie_len)
 		addattr_l(&req.n, 1024, L2TP_ATTR_PEER_COOKIE,
 			  p->peer_cookie,  p->peer_cookie_len);
-	if (p->ifname && p->ifname[0])
+	if (p->ifname)
 		addattrstrz(&req.n, 1024, L2TP_ATTR_IFNAME, p->ifname);
 
 	if (rtnl_talk(&genl_rth, &req.n, NULL, 0) < 0)
@@ -545,6 +545,7 @@ static int parse_args(int argc, char **argv, int cmd, struct l2tp_parm *p)
 			}
 		} else if (strcmp(*argv, "name") == 0) {
 			NEXT_ARG();
+			check_ifname("name", *argv);
 			p->ifname = *argv;
 		} else if (strcmp(*argv, "remote") == 0) {
 			NEXT_ARG();
diff --git a/ip/iplink.c b/ip/iplink.c
index ff5b56c038d28..f000a7992c92e 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -573,6 +573,7 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req,
 			req->i.ifi_flags &= ~IFF_UP;
 		} else if (strcmp(*argv, "name") == 0) {
 			NEXT_ARG();
+			check_ifname("name", *argv);
 			*name = *argv;
 		} else if (strcmp(*argv, "index") == 0) {
 			NEXT_ARG();
@@ -848,6 +849,7 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req,
 				NEXT_ARG();
 			if (*dev)
 				duparg2("dev", *argv);
+			check_ifname("dev", *argv);
 			*dev = *argv;
 			dev_index = ll_name_to_index(*dev);
 		}
@@ -870,7 +872,6 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req,
 
 static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv)
 {
-	int len;
 	char *dev = NULL;
 	char *name = NULL;
 	char *link = NULL;
@@ -960,13 +961,8 @@ static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv)
 	}
 
 	if (name) {
-		len = strlen(name) + 1;
-		if (len == 1)
-			invarg("\"\" is not a valid device identifier\n",
-			       "name");
-		if (len > IFNAMSIZ)
-			invarg("\"name\" too long\n", name);
-		addattr_l(&req.n, sizeof(req), IFLA_IFNAME, name, len);
+		addattr_l(&req.n, sizeof(req),
+			  IFLA_IFNAME, name, strlen(name) + 1);
 	}
 
 	if (type) {
@@ -1016,7 +1012,6 @@ static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv)
 
 int iplink_get(unsigned int flags, char *name, __u32 filt_mask)
 {
-	int len;
 	struct iplink_req req = {
 		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
 		.n.nlmsg_flags = NLM_F_REQUEST | flags,
@@ -1029,13 +1024,8 @@ int iplink_get(unsigned int flags, char *name, __u32 filt_mask)
 	} answer;
 
 	if (name) {
-		len = strlen(name) + 1;
-		if (len == 1)
-			invarg("\"\" is not a valid device identifier\n",
-				   "name");
-		if (len > IFNAMSIZ)
-			invarg("\"name\" too long\n", name);
-		addattr_l(&req.n, sizeof(req), IFLA_IFNAME, name, len);
+		addattr_l(&req.n, sizeof(req),
+			  IFLA_IFNAME, name, strlen(name) + 1);
 	}
 	addattr32(&req.n, sizeof(req), IFLA_EXT_MASK, filt_mask);
 
@@ -1265,6 +1255,7 @@ static int do_set(int argc, char **argv)
 			flags &= ~IFF_UP;
 		} else if (strcmp(*argv, "name") == 0) {
 			NEXT_ARG();
+			check_ifname("name", newname);
 			newname = *argv;
 		} else if (matches(*argv, "address") == 0) {
 			NEXT_ARG();
@@ -1355,6 +1346,7 @@ static int do_set(int argc, char **argv)
 
 			if (dev)
 				duparg2("dev", *argv);
+			check_ifname("dev", dev);
 			dev = *argv;
 		}
 		argc--; argv++;
@@ -1383,9 +1375,6 @@ static int do_set(int argc, char **argv)
 	}
 
 	if (newname && strcmp(dev, newname)) {
-		if (strlen(newname) == 0)
-			invarg("\"\" is not a valid device identifier\n",
-			       "name");
 		if (do_changename(dev, newname) < 0)
 			return -1;
 		dev = newname;
diff --git a/ip/ipmaddr.c b/ip/ipmaddr.c
index 85a69e779563d..282a06153c79a 100644
--- a/ip/ipmaddr.c
+++ b/ip/ipmaddr.c
@@ -284,6 +284,7 @@ static int multiaddr_modify(int cmd, int argc, char **argv)
 			NEXT_ARG();
 			if (ifr.ifr_name[0])
 				duparg("dev", *argv);
+			check_ifname("dev", *argv);
 			strncpy(ifr.ifr_name, *argv, IFNAMSIZ);
 		} else {
 			if (matches(*argv, "address") == 0) {
diff --git a/ip/iprule.c b/ip/iprule.c
index 8313138db815f..33cfd87195212 100644
--- a/ip/iprule.c
+++ b/ip/iprule.c
@@ -472,10 +472,12 @@ static int iprule_list_flush_or_save(int argc, char **argv, int action)
 		} else if (strcmp(*argv, "dev") == 0 ||
 			   strcmp(*argv, "iif") == 0) {
 			NEXT_ARG();
+			check_ifname("iif", *argv);
 			strncpy(filter.iif, *argv, IFNAMSIZ);
 			filter.iifmask = 1;
 		} else if (strcmp(*argv, "oif") == 0) {
 			NEXT_ARG();
+			check_ifname("oif", *argv);
 			strncpy(filter.oif, *argv, IFNAMSIZ);
 			filter.oifmask = 1;
 		} else if (strcmp(*argv, "l3mdev") == 0) {
@@ -695,10 +697,12 @@ static int iprule_modify(int cmd, int argc, char **argv)
 		} else if (strcmp(*argv, "dev") == 0 ||
 			   strcmp(*argv, "iif") == 0) {
 			NEXT_ARG();
+			check_ifname("dev/iif", *argv);
 			addattr_l(&req.n, sizeof(req), FRA_IFNAME,
 				  *argv, strlen(*argv)+1);
 		} else if (strcmp(*argv, "oif") == 0) {
 			NEXT_ARG();
+			check_ifname("oif", *argv);
 			addattr_l(&req.n, sizeof(req), FRA_OIFNAME,
 				  *argv, strlen(*argv)+1);
 		} else if (strcmp(*argv, "l3mdev") == 0) {
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 0acfd0793d3cd..851a80aad73a9 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -178,7 +178,8 @@ static int parse_args(int argc, char **argv, int cmd, struct ip_tunnel_parm *p)
 
 			if (p->name[0])
 				duparg2("name", *argv);
-			strncpy(p->name, *argv, IFNAMSIZ - 1);
+			check_ifname("name", *argv);
+			strncpy(p->name, *argv, IFNAMSIZ);
 			if (cmd == SIOCCHGTUNNEL && count == 0) {
 				struct ip_tunnel_parm old_p = {};
 
@@ -487,6 +488,7 @@ static int do_prl(int argc, char **argv)
 			count++;
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
+			check_ifname("dev", *argv);
 			medium = *argv;
 		} else {
 			fprintf(stderr,
@@ -534,6 +536,7 @@ static int do_6rd(int argc, char **argv)
 			cmd = SIOCDEL6RD;
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
+			check_ifname("dev", *argv);
 			medium = *argv;
 		} else {
 			fprintf(stderr,
diff --git a/ip/iptuntap.c b/ip/iptuntap.c
index 451f7f0eac6bb..4400dc2fa2a88 100644
--- a/ip/iptuntap.c
+++ b/ip/iptuntap.c
@@ -176,7 +176,8 @@ static int parse_args(int argc, char **argv,
 			ifr->ifr_flags |= IFF_MULTI_QUEUE;
 		} else if (matches(*argv, "dev") == 0) {
 			NEXT_ARG();
-			strncpy(ifr->ifr_name, *argv, IFNAMSIZ-1);
+			check_ifname("dev", *argv);
+			strncpy(ifr->ifr_name, *argv, IFNAMSIZ);
 		} else {
 			if (matches(*argv, "name") == 0) {
 				NEXT_ARG();
@@ -184,6 +185,7 @@ static int parse_args(int argc, char **argv,
 				usage();
 			if (ifr->ifr_name[0])
 				duparg2("name", *argv);
+			check_ifname("name", *argv);
 			strncpy(ifr->ifr_name, *argv, IFNAMSIZ);
 		}
 		count++;
diff --git a/lib/utils.c b/lib/utils.c
index bbd3cbc46a0e5..c4a02b8f9f52a 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -699,6 +699,16 @@ void duparg2(const char *key, const char *arg)
 	exit(-1);
 }
 
+void check_ifname(const char *arg, const char *name)
+{
+	size_t len = strlen(name);
+
+	if (!len)
+		invarg("Empty interface name not allowed.", arg);
+	if (len >= IFNAMSIZ)
+		invarg("Interface name is too long.", name);
+}
+
 int matches(const char *cmd, const char *pattern)
 {
 	int len = strlen(cmd);
diff --git a/misc/arpd.c b/misc/arpd.c
index bfab44544ee1d..d42df9e58a9f1 100644
--- a/misc/arpd.c
+++ b/misc/arpd.c
@@ -664,6 +664,7 @@ int main(int argc, char **argv)
 		struct ifreq ifr = {};
 
 		for (i = 0; i < ifnum; i++) {
+			check_ifname(ifnames[i], ifnames[i]);
 			strncpy(ifr.ifr_name, ifnames[i], IFNAMSIZ);
 			if (ioctl(udp_sock, SIOCGIFINDEX, &ifr)) {
 				perror("ioctl(SIOCGIFINDEX)");
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 99e62a382dec6..ff45ea7af086e 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -630,6 +630,7 @@ static int flower_parse_opt(struct filter_util *qu, char *handle,
 			flags |= TCA_CLS_FLAGS_SKIP_SW;
 		} else if (matches(*argv, "indev") == 0) {
 			NEXT_ARG();
+			check_ifname("indev", *argv);
 			addattrstrz(n, MAX_MSG, TCA_FLOWER_INDEV, *argv);
 		} else if (matches(*argv, "vlan_id") == 0) {
 			__u16 vid;
-- 
2.13.1

^ permalink raw reply related

* [iproute PATCH v2 1/3] ip{6,}tunnel: Avoid copying user-supplied interface name around
From: Phil Sutter @ 2017-09-26 16:35 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20170926163548.24347-1-phil@nwl.cc>

In both files' parse_args() functions as well as in iptunnel's do_prl()
and do_6rd() functions, a user-supplied 'dev' parameter is uselessly
copied into a temporary buffer before passing it to ll_name_to_index()
or copying into a struct ifreq.  Avoid this by just caching the argv
pointer value until the later lookup/strcpy.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 ip/ip6tunnel.c |  6 +++---
 ip/iptunnel.c  | 22 +++++++++-------------
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index b4a7def144226..c12d700e74189 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -136,7 +136,7 @@ static void print_tunnel(struct ip6_tnl_parm2 *p)
 static int parse_args(int argc, char **argv, int cmd, struct ip6_tnl_parm2 *p)
 {
 	int count = 0;
-	char medium[IFNAMSIZ] = {};
+	const char *medium = NULL;
 
 	while (argc > 0) {
 		if (strcmp(*argv, "mode") == 0) {
@@ -180,7 +180,7 @@ static int parse_args(int argc, char **argv, int cmd, struct ip6_tnl_parm2 *p)
 			memcpy(&p->laddr, &laddr.data, sizeof(p->laddr));
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
-			strncpy(medium, *argv, IFNAMSIZ - 1);
+			medium = *argv;
 		} else if (strcmp(*argv, "encaplimit") == 0) {
 			NEXT_ARG();
 			if (strcmp(*argv, "none") == 0) {
@@ -285,7 +285,7 @@ static int parse_args(int argc, char **argv, int cmd, struct ip6_tnl_parm2 *p)
 		count++;
 		argc--; argv++;
 	}
-	if (medium[0]) {
+	if (medium) {
 		p->link = ll_name_to_index(medium);
 		if (p->link == 0) {
 			fprintf(stderr, "Cannot find device \"%s\"\n", medium);
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 105d0f5576f1a..0acfd0793d3cd 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -60,7 +60,7 @@ static void set_tunnel_proto(struct ip_tunnel_parm *p, int proto)
 static int parse_args(int argc, char **argv, int cmd, struct ip_tunnel_parm *p)
 {
 	int count = 0;
-	char medium[IFNAMSIZ] = {};
+	const char *medium = NULL;
 	int isatap = 0;
 
 	memset(p, 0, sizeof(*p));
@@ -139,7 +139,7 @@ static int parse_args(int argc, char **argv, int cmd, struct ip_tunnel_parm *p)
 				p->iph.saddr = htonl(INADDR_ANY);
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
-			strncpy(medium, *argv, IFNAMSIZ - 1);
+			medium = *argv;
 		} else if (strcmp(*argv, "ttl") == 0 ||
 			   strcmp(*argv, "hoplimit") == 0 ||
 			   strcmp(*argv, "hlim") == 0) {
@@ -216,7 +216,7 @@ static int parse_args(int argc, char **argv, int cmd, struct ip_tunnel_parm *p)
 		}
 	}
 
-	if (medium[0]) {
+	if (medium) {
 		p->link = ll_name_to_index(medium);
 		if (p->link == 0) {
 			fprintf(stderr, "Cannot find device \"%s\"\n", medium);
@@ -465,9 +465,8 @@ static int do_prl(int argc, char **argv)
 {
 	struct ip_tunnel_prl p = {};
 	int count = 0;
-	int devname = 0;
 	int cmd = 0;
-	char medium[IFNAMSIZ] = {};
+	const char *medium = NULL;
 
 	while (argc > 0) {
 		if (strcmp(*argv, "prl-default") == 0) {
@@ -488,8 +487,7 @@ static int do_prl(int argc, char **argv)
 			count++;
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
-			strncpy(medium, *argv, IFNAMSIZ-1);
-			devname++;
+			medium = *argv;
 		} else {
 			fprintf(stderr,
 				"Invalid PRL parameter \"%s\"\n", *argv);
@@ -502,7 +500,7 @@ static int do_prl(int argc, char **argv)
 		}
 		argc--; argv++;
 	}
-	if (devname == 0) {
+	if (!medium) {
 		fprintf(stderr, "Must specify device\n");
 		exit(-1);
 	}
@@ -513,9 +511,8 @@ static int do_prl(int argc, char **argv)
 static int do_6rd(int argc, char **argv)
 {
 	struct ip_tunnel_6rd ip6rd = {};
-	int devname = 0;
 	int cmd = 0;
-	char medium[IFNAMSIZ] = {};
+	const char *medium = NULL;
 	inet_prefix prefix;
 
 	while (argc > 0) {
@@ -537,8 +534,7 @@ static int do_6rd(int argc, char **argv)
 			cmd = SIOCDEL6RD;
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
-			strncpy(medium, *argv, IFNAMSIZ-1);
-			devname++;
+			medium = *argv;
 		} else {
 			fprintf(stderr,
 				"Invalid 6RD parameter \"%s\"\n", *argv);
@@ -546,7 +542,7 @@ static int do_6rd(int argc, char **argv)
 		}
 		argc--; argv++;
 	}
-	if (devname == 0) {
+	if (!medium) {
 		fprintf(stderr, "Must specify device\n");
 		exit(-1);
 	}
-- 
2.13.1

^ permalink raw reply related

* [PATCH v3] ebtables: fix race condition in frame_filter_net_init()
From: Artem Savkov @ 2017-09-26 16:35 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Pablo Neira Ayuso, netdev, linux-kernel, netfilter-devel,
	Artem Savkov
In-Reply-To: <20170926153923.30094-1-asavkov@redhat.com>

It is possible for ebt_in_hook to be triggered before ebt_table is assigned
resulting in a NULL-pointer dereference. Make sure hooks are
registered as the last step.

v3: restore errorneously removed ops == NULL case check

Fixes: aee12a0a3727 ebtables: remove nf_hook_register usage
Signed-off-by: Artem Savkov <asavkov@redhat.com>
---
 include/linux/netfilter_bridge/ebtables.h |  7 ++++---
 net/bridge/netfilter/ebtable_broute.c     |  4 ++--
 net/bridge/netfilter/ebtable_filter.c     |  4 ++--
 net/bridge/netfilter/ebtable_nat.c        |  4 ++--
 net/bridge/netfilter/ebtables.c           | 17 +++++++++--------
 5 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index 2c2a5514b0df..528b24c78308 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -108,9 +108,10 @@ struct ebt_table {
 
 #define EBT_ALIGN(s) (((s) + (__alignof__(struct _xt_align)-1)) & \
 		     ~(__alignof__(struct _xt_align)-1))
-extern struct ebt_table *ebt_register_table(struct net *net,
-					    const struct ebt_table *table,
-					    const struct nf_hook_ops *);
+extern int ebt_register_table(struct net *net,
+			      const struct ebt_table *table,
+			      const struct nf_hook_ops *ops,
+			      struct ebt_table **res);
 extern void ebt_unregister_table(struct net *net, struct ebt_table *table,
 				 const struct nf_hook_ops *);
 extern unsigned int ebt_do_table(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 2585b100ebbb..276b60262981 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -65,8 +65,8 @@ static int ebt_broute(struct sk_buff *skb)
 
 static int __net_init broute_net_init(struct net *net)
 {
-	net->xt.broute_table = ebt_register_table(net, &broute_table, NULL);
-	return PTR_ERR_OR_ZERO(net->xt.broute_table);
+	return ebt_register_table(net, &broute_table, NULL,
+				  &net->xt.broute_table);
 }
 
 static void __net_exit broute_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index 45a00dbdbcad..c41da5fac84f 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -93,8 +93,8 @@ static const struct nf_hook_ops ebt_ops_filter[] = {
 
 static int __net_init frame_filter_net_init(struct net *net)
 {
-	net->xt.frame_filter = ebt_register_table(net, &frame_filter, ebt_ops_filter);
-	return PTR_ERR_OR_ZERO(net->xt.frame_filter);
+	return ebt_register_table(net, &frame_filter, ebt_ops_filter,
+				  &net->xt.frame_filter);
 }
 
 static void __net_exit frame_filter_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 57cd5bb154e7..08df7406ecb3 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -93,8 +93,8 @@ static const struct nf_hook_ops ebt_ops_nat[] = {
 
 static int __net_init frame_nat_net_init(struct net *net)
 {
-	net->xt.frame_nat = ebt_register_table(net, &frame_nat, ebt_ops_nat);
-	return PTR_ERR_OR_ZERO(net->xt.frame_nat);
+	return ebt_register_table(net, &frame_nat, ebt_ops_nat,
+				  &net->xt.frame_nat);
 }
 
 static void __net_exit frame_nat_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 83951f978445..3b3dcf719e07 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1169,9 +1169,8 @@ static void __ebt_unregister_table(struct net *net, struct ebt_table *table)
 	kfree(table);
 }
 
-struct ebt_table *
-ebt_register_table(struct net *net, const struct ebt_table *input_table,
-		   const struct nf_hook_ops *ops)
+int ebt_register_table(struct net *net, const struct ebt_table *input_table,
+		       const struct nf_hook_ops *ops, struct ebt_table **res)
 {
 	struct ebt_table_info *newinfo;
 	struct ebt_table *t, *table;
@@ -1183,7 +1182,7 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	    repl->entries == NULL || repl->entries_size == 0 ||
 	    repl->counters != NULL || input_table->private != NULL) {
 		BUGPRINT("Bad table data for ebt_register_table!!!\n");
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	}
 
 	/* Don't add one table to multiple lists. */
@@ -1252,16 +1251,18 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	list_add(&table->list, &net->xt.tables[NFPROTO_BRIDGE]);
 	mutex_unlock(&ebt_mutex);
 
+	WRITE_ONCE(*res, table);
+
 	if (!ops)
-		return table;
+		return 0;
 
 	ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
 	if (ret) {
 		__ebt_unregister_table(net, table);
-		return ERR_PTR(ret);
+		*res = NULL;
 	}
 
-	return table;
+	return ret;
 free_unlock:
 	mutex_unlock(&ebt_mutex);
 free_chainstack:
@@ -1276,7 +1277,7 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
 free_table:
 	kfree(table);
 out:
-	return ERR_PTR(ret);
+	return ret;
 }
 
 void ebt_unregister_table(struct net *net, struct ebt_table *table,
-- 
2.13.5

^ permalink raw reply related

* Re: [PATCH] lib: fix multiple strlcpy definition
From: Baruch Siach @ 2017-09-26 16:27 UTC (permalink / raw)
  To: Phil Sutter, Stephen Hemminger, netdev
In-Reply-To: <20170926155524.GA26610@orbyte.nwl.cc>

Hi Phil,

On Tue, Sep 26, 2017 at 05:55:24PM +0200, Phil Sutter wrote:
> On Tue, Sep 26, 2017 at 02:08:49PM +0300, Baruch Siach wrote:
> [...]
> > diff --git a/configure b/configure
> > index 7be8fb113cc9..787b2e061af9 100755
> > --- a/configure
> > +++ b/configure
> > @@ -326,6 +326,27 @@ EOF
> >      rm -f $TMPDIR/dbtest.c $TMPDIR/dbtest
> >  }
> >  
> > +check_strlcpy()
> > +{
> > +    cat >$TMPDIR/strtest.c <<EOF
> > +#include <string.h>
> > +int main(int argc, char **argv) {
> > +	char dst[10];
> > +	strlcpy("test", dst, sizeof(dst));
> 
> You swapped source and destination here. It's not important for the
> given use-case, but the resulting binary should segfault.

Will fix that in v2.

> Apart from that, LGTM!

Thanks. I'll take this as an ack.

baruch

-- 
     http://baruch.siach.name/blog/                  ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -

^ permalink raw reply

* [PATCH net] packet: only test po->has_vnet_hdr once in packet_snd
From: Willem de Bruijn @ 2017-09-26 16:20 UTC (permalink / raw)
  To: netdev; +Cc: davem, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Packet socket option po->has_vnet_hdr can be updated concurrently with
other operations if no ring is attached.

Do not test the option twice in packet_snd, as the value may change in
between calls. A race on setsockopt disable may cause a packet > mtu
to be sent without having GSO options set.

Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
 net/packet/af_packet.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d288f52c53f7..1da0851f51f2 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2840,6 +2840,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	struct virtio_net_hdr vnet_hdr = { 0 };
 	int offset = 0;
 	struct packet_sock *po = pkt_sk(sk);
+	bool has_vnet_hdr = false;
 	int hlen, tlen, linear;
 	int extra_len = 0;
 
@@ -2883,6 +2884,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		err = packet_snd_vnet_parse(msg, &len, &vnet_hdr);
 		if (err)
 			goto out_unlock;
+		has_vnet_hdr = true;
 	}
 
 	if (unlikely(sock_flag(sk, SOCK_NOFCS))) {
@@ -2941,7 +2943,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	skb->priority = sk->sk_priority;
 	skb->mark = sockc.mark;
 
-	if (po->has_vnet_hdr) {
+	if (has_vnet_hdr) {
 		err = virtio_net_hdr_to_skb(skb, &vnet_hdr, vio_le());
 		if (err)
 			goto out_free;
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* [PATCH net] packet: in packet_do_bind, test fanout with bind_lock held
From: Willem de Bruijn @ 2017-09-26 16:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

Once a socket has po->fanout set, it remains a member of the group
until it is destroyed. The prot_hook must be constant and identical
across sockets in the group.

If fanout_add races with packet_do_bind between the test of po->fanout
and taking the lock, the bind call may make type or dev inconsistent
with that of the fanout group.

Hold po->bind_lock when testing po->fanout to avoid this race.

I had to introduce artificial delay (local_bh_enable) to actually
observe the race.

Fixes: dc99f600698d ("packet: Add fanout support.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
 net/packet/af_packet.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 1da0851f51f2..bec01a3daf5b 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3071,13 +3071,15 @@ static int packet_do_bind(struct sock *sk, const char *name, int ifindex,
 	int ret = 0;
 	bool unlisted = false;

-	if (po->fanout)
-		return -EINVAL;
-
 	lock_sock(sk);
 	spin_lock(&po->bind_lock);
 	rcu_read_lock();

+	if (po->fanout) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
 	if (name) {
 		dev = dev_get_by_name_rcu(sock_net(sk), name);
 		if (!dev) {
-- 
2.14.1.821.g8fa685d3b7-goog

^ permalink raw reply related

* Re: [PATCH] lib: fix multiple strlcpy definition
From: Phil Sutter @ 2017-09-26 15:55 UTC (permalink / raw)
  To: Baruch Siach; +Cc: Stephen Hemminger, netdev
In-Reply-To: <7f5fee59e46bfd3b4acf8ed9a8fbd8c7b4f1cd70.1506424129.git.baruch@tkos.co.il>

On Tue, Sep 26, 2017 at 02:08:49PM +0300, Baruch Siach wrote:
[...]
> diff --git a/configure b/configure
> index 7be8fb113cc9..787b2e061af9 100755
> --- a/configure
> +++ b/configure
> @@ -326,6 +326,27 @@ EOF
>      rm -f $TMPDIR/dbtest.c $TMPDIR/dbtest
>  }
>  
> +check_strlcpy()
> +{
> +    cat >$TMPDIR/strtest.c <<EOF
> +#include <string.h>
> +int main(int argc, char **argv) {
> +	char dst[10];
> +	strlcpy("test", dst, sizeof(dst));

You swapped source and destination here. It's not important for the
given use-case, but the resulting binary should segfault.

Apart from that, LGTM!

Cheers, Phil

^ permalink raw reply

* Re: [PATCH net-next 2/7] nfp: compile flower vxlan tunnel metadata match fields
From: John Hurley @ 2017-09-26 15:39 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
	oss-drivers
In-Reply-To: <CAJ3xEMhio5+5BK3p5mq=TvA8SzAvjSJm75mN0XiChdWkW3C89g@mail.gmail.com>

On Tue, Sep 26, 2017 at 4:33 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Tue, Sep 26, 2017 at 6:11 PM, John Hurley <john.hurley@netronome.com> wrote:
>> On Tue, Sep 26, 2017 at 3:12 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Tue, Sep 26, 2017 at 4:58 PM, John Hurley <john.hurley@netronome.com> wrote:
>>>> On Mon, Sep 25, 2017 at 7:35 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>>> On Mon, Sep 25, 2017 at 1:23 PM, Simon Horman
>>>>> <simon.horman@netronome.com> wrote:
>>>>>> From: John Hurley <john.hurley@netronome.com>
>>>>>>
>>>>>> Compile ovs-tc flower vxlan metadata match fields for offloading. Only
>>>>>
>>>>> anything in the npf kernel bits has direct relation to ovs? what?
>>>>>
>>>>
>>>> Sorry, this is a typo  and should refer to TC.
>>>>
>>>>>> +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
>>>>>> @@ -52,8 +52,25 @@
>>>>>>          BIT(FLOW_DISSECTOR_KEY_PORTS) | \
>>>>>>          BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) | \
>>>>>>          BIT(FLOW_DISSECTOR_KEY_VLAN) | \
>>>>>> +        BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | \
>>>>>> +        BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | \
>>>>>> +        BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | \
>>>>>
>>>>> this series takes care of IPv6 tunnels too?
>>>>
>>>> IPv6 is not included in this set.
>>>> The reason the IPv6 bit is included here is to account for behavior we
>>>> have noticed in TC flower.
>>>> If, for example, I add a filter with the following match fields:
>>>> 'protocol ip flower enc_src_ip 10.0.0.1 enc_dst_ip 10.0.0.2
>>>> enc_dst_port 4789 enc_key_id 123'
>>>> The 'used_keys' value in the dissector marks both IPv4 and IPv6 encap
>>>> addresses as 'used'.
>>>> I am not sure if this is a bug in TC or that we are expected to check
>>>> the enc_control fields to determine if IPv4 or v6 addresses are used.
>>>
>>> you should have your code to check enc_control->addr_type to be
>>> FLOW_DISSECTOR_KEY_IPV4_ADDRS or IPV6_ADDRS
>>>
>>>
>>>> Including the IPv6 used_keys bit in our whitelist approach allows us
>>>> to accept legitimate IPv4 tunnel rules in these situations.
>>>
>>> mmm can please take a look on fl_init_dissector() and tell me if you
>>> see why FLOW_DISSECTOR_KEY_IPV6_ADDRS is set for ipv4 tunnels,
>>> I am not sure.
>>
>>
>> The fl_init_dissector uses the FL_KEY_SET_IF_MASKED macro to set an
>> array of keys which are then translated to the used_keys values.
>> The FL_KEY_SET_IF_MASKED takes a 'struct fl_flow_key' as input and
>> checks if any mask bits are set in a particular field - if so it
>> eventually marks it as used.
>> In struct fl_flow_key, the encap ipv4 and ipv6 addresses are
>> represented as a union of the 2.
>> Therefore, if we have masked bits set for IPv4, they are also being
>> set for the IPv6 field.
>
> I see, do you consider it a bug?

The code seems to insist that, if either IPv4 or IPv6 is in use then a
control encap key is also used:

if (FL_KEY_IS_MASKED(&mask->key, enc_ipv4) ||
   FL_KEY_IS_MASKED(&mask->key, enc_ipv6))
FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_CONTROL,
  enc_control);

Therefore, I think it should be ok to use this to determine the IP
type in use by the tunnel.

^ permalink raw reply

* [PATCH v2] ebtables: fix race condition in frame_filter_net_init()
From: Artem Savkov @ 2017-09-26 15:39 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Pablo Neira Ayuso, netdev, linux-kernel, netfilter-devel,
	Artem Savkov
In-Reply-To: <20170926124211.GA14971@breakpoint.cc>

It is possible for ebt_in_hook to be triggered before ebt_table is assigned
resulting in a NULL-pointer dereference. Make sure hooks are
registered as the last step.

Fixes: aee12a0a3727 ebtables: remove nf_hook_register usage
Signed-off-by: Artem Savkov <asavkov@redhat.com>
---
 include/linux/netfilter_bridge/ebtables.h |  7 ++++---
 net/bridge/netfilter/ebtable_broute.c     |  4 ++--
 net/bridge/netfilter/ebtable_filter.c     |  4 ++--
 net/bridge/netfilter/ebtable_nat.c        |  4 ++--
 net/bridge/netfilter/ebtables.c           | 17 ++++++++---------
 5 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index 2c2a5514b0df..528b24c78308 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -108,9 +108,10 @@ struct ebt_table {
 
 #define EBT_ALIGN(s) (((s) + (__alignof__(struct _xt_align)-1)) & \
 		     ~(__alignof__(struct _xt_align)-1))
-extern struct ebt_table *ebt_register_table(struct net *net,
-					    const struct ebt_table *table,
-					    const struct nf_hook_ops *);
+extern int ebt_register_table(struct net *net,
+			      const struct ebt_table *table,
+			      const struct nf_hook_ops *ops,
+			      struct ebt_table **res);
 extern void ebt_unregister_table(struct net *net, struct ebt_table *table,
 				 const struct nf_hook_ops *);
 extern unsigned int ebt_do_table(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 2585b100ebbb..276b60262981 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -65,8 +65,8 @@ static int ebt_broute(struct sk_buff *skb)
 
 static int __net_init broute_net_init(struct net *net)
 {
-	net->xt.broute_table = ebt_register_table(net, &broute_table, NULL);
-	return PTR_ERR_OR_ZERO(net->xt.broute_table);
+	return ebt_register_table(net, &broute_table, NULL,
+				  &net->xt.broute_table);
 }
 
 static void __net_exit broute_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index 45a00dbdbcad..c41da5fac84f 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -93,8 +93,8 @@ static const struct nf_hook_ops ebt_ops_filter[] = {
 
 static int __net_init frame_filter_net_init(struct net *net)
 {
-	net->xt.frame_filter = ebt_register_table(net, &frame_filter, ebt_ops_filter);
-	return PTR_ERR_OR_ZERO(net->xt.frame_filter);
+	return ebt_register_table(net, &frame_filter, ebt_ops_filter,
+				  &net->xt.frame_filter);
 }
 
 static void __net_exit frame_filter_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 57cd5bb154e7..08df7406ecb3 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -93,8 +93,8 @@ static const struct nf_hook_ops ebt_ops_nat[] = {
 
 static int __net_init frame_nat_net_init(struct net *net)
 {
-	net->xt.frame_nat = ebt_register_table(net, &frame_nat, ebt_ops_nat);
-	return PTR_ERR_OR_ZERO(net->xt.frame_nat);
+	return ebt_register_table(net, &frame_nat, ebt_ops_nat,
+				  &net->xt.frame_nat);
 }
 
 static void __net_exit frame_nat_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 83951f978445..aa81afe81f23 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1169,9 +1169,8 @@ static void __ebt_unregister_table(struct net *net, struct ebt_table *table)
 	kfree(table);
 }
 
-struct ebt_table *
-ebt_register_table(struct net *net, const struct ebt_table *input_table,
-		   const struct nf_hook_ops *ops)
+int ebt_register_table(struct net *net, const struct ebt_table *input_table,
+		       const struct nf_hook_ops *ops, struct ebt_table **res)
 {
 	struct ebt_table_info *newinfo;
 	struct ebt_table *t, *table;
@@ -1183,7 +1182,7 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	    repl->entries == NULL || repl->entries_size == 0 ||
 	    repl->counters != NULL || input_table->private != NULL) {
 		BUGPRINT("Bad table data for ebt_register_table!!!\n");
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	}
 
 	/* Don't add one table to multiple lists. */
@@ -1252,16 +1251,16 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	list_add(&table->list, &net->xt.tables[NFPROTO_BRIDGE]);
 	mutex_unlock(&ebt_mutex);
 
-	if (!ops)
-		return table;
+	WRITE_ONCE(*res, table);
 
 	ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
 	if (ret) {
 		__ebt_unregister_table(net, table);
-		return ERR_PTR(ret);
+		*res = NULL;
+		return ret;
 	}
 
-	return table;
+	return 0;
 free_unlock:
 	mutex_unlock(&ebt_mutex);
 free_chainstack:
@@ -1276,7 +1275,7 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
 free_table:
 	kfree(table);
 out:
-	return ERR_PTR(ret);
+	return ret;
 }
 
 void ebt_unregister_table(struct net *net, struct ebt_table *table,
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next 2/2] tools: bpf: add bpftool
From: Jakub Kicinski @ 2017-09-26 15:35 UTC (permalink / raw)
  To: netdev
  Cc: daniel, alexei.starovoitov, davem, hannes, dsahern, oss-drivers,
	Jakub Kicinski
In-Reply-To: <20170926153522.31500-1-jakub.kicinski@netronome.com>

Add a simple tool for querying and updating BPF objects on the system.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
---
 tools/bpf/Makefile             |  18 +-
 tools/bpf/bpftool/Makefile     |  80 +++++
 tools/bpf/bpftool/common.c     | 214 ++++++++++++
 tools/bpf/bpftool/jit_disasm.c |  83 +++++
 tools/bpf/bpftool/main.c       | 212 ++++++++++++
 tools/bpf/bpftool/main.h       |  99 ++++++
 tools/bpf/bpftool/map.c        | 742 +++++++++++++++++++++++++++++++++++++++++
 tools/bpf/bpftool/prog.c       | 392 ++++++++++++++++++++++
 8 files changed, 1837 insertions(+), 3 deletions(-)
 create mode 100644 tools/bpf/bpftool/Makefile
 create mode 100644 tools/bpf/bpftool/common.c
 create mode 100644 tools/bpf/bpftool/jit_disasm.c
 create mode 100644 tools/bpf/bpftool/main.c
 create mode 100644 tools/bpf/bpftool/main.h
 create mode 100644 tools/bpf/bpftool/map.c
 create mode 100644 tools/bpf/bpftool/prog.c

diff --git a/tools/bpf/Makefile b/tools/bpf/Makefile
index ddf888010652..325a35e1c28e 100644
--- a/tools/bpf/Makefile
+++ b/tools/bpf/Makefile
@@ -3,6 +3,7 @@ prefix = /usr
 CC = gcc
 LEX = flex
 YACC = bison
+MAKE = make
 
 CFLAGS += -Wall -O2
 CFLAGS += -D__EXPORTED_HEADERS__ -I../../include/uapi -I../../include
@@ -13,7 +14,7 @@ CFLAGS += -D__EXPORTED_HEADERS__ -I../../include/uapi -I../../include
 %.lex.c: %.l
 	$(LEX) -o $@ $<
 
-all : bpf_jit_disasm bpf_dbg bpf_asm
+all: bpf_jit_disasm bpf_dbg bpf_asm bpftool
 
 bpf_jit_disasm : CFLAGS += -DPACKAGE='bpf_jit_disasm'
 bpf_jit_disasm : LDLIBS = -lopcodes -lbfd -ldl
@@ -26,10 +27,21 @@ bpf_asm : LDLIBS =
 bpf_asm : bpf_asm.o bpf_exp.yacc.o bpf_exp.lex.o
 bpf_exp.lex.o : bpf_exp.yacc.c
 
-clean :
+clean: bpftool_clean
 	rm -rf *.o bpf_jit_disasm bpf_dbg bpf_asm bpf_exp.yacc.* bpf_exp.lex.*
 
-install :
+install: bpftool_install
 	install bpf_jit_disasm $(prefix)/bin/bpf_jit_disasm
 	install bpf_dbg $(prefix)/bin/bpf_dbg
 	install bpf_asm $(prefix)/bin/bpf_asm
+
+bpftool:
+	$(MAKE) -C bpftool
+
+bpftool_install:
+	$(MAKE) -C bpftool install
+
+bpftool_clean:
+	$(MAKE) -C bpftool clean
+
+.PHONY: bpftool FORCE
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
new file mode 100644
index 000000000000..a7151f47fb40
--- /dev/null
+++ b/tools/bpf/bpftool/Makefile
@@ -0,0 +1,80 @@
+include ../../scripts/Makefile.include
+
+include ../../scripts/utilities.mak
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(CURDIR)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+#$(info Determined 'srctree' to be $(srctree))
+endif
+
+ifneq ($(objtree),)
+#$(info Determined 'objtree' to be $(objtree))
+endif
+
+ifneq ($(OUTPUT),)
+#$(info Determined 'OUTPUT' to be $(OUTPUT))
+# Adding $(OUTPUT) as a directory to look for source files,
+# because use generated output files as sources dependency
+# for flex/bison parsers.
+VPATH += $(OUTPUT)
+export VPATH
+endif
+
+ifeq ($(V),1)
+  Q =
+else
+  Q = @
+endif
+
+BPF_DIR	= $(srctree)/tools/lib/bpf/
+
+ifneq ($(OUTPUT),)
+  BPF_PATH=$(OUTPUT)
+else
+  BPF_PATH=$(BPF_DIR)
+endif
+
+LIBBPF = $(BPF_PATH)libbpf.a
+
+$(LIBBPF): FORCE
+	$(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(OUTPUT) $(OUTPUT)libbpf.a FEATURES_DUMP=$(FEATURE_DUMP_EXPORT)
+
+$(LIBBPF)-clean:
+	$(call QUIET_CLEAN, libbpf)
+	$(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(OUTPUT) clean >/dev/null
+
+prefix = /usr
+
+CC = gcc
+
+CFLAGS += -O2
+CFLAGS += -W -Wall -Wextra -Wno-unused-parameter -Wshadow
+CFLAGS += -D__EXPORTED_HEADERS__ -I$(srctree)/tools/include/uapi -I$(srctree)/tools/include -I$(srctree)/tools/lib/bpf
+LIBS = -lelf -lbfd -lopcodes $(LIBBPF)
+
+include $(wildcard *.d)
+
+all: $(OUTPUT)bpftool
+
+SRCS=$(wildcard *.c)
+OBJS=$(patsubst %.c,$(OUTPUT)%.o,$(SRCS))
+
+$(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
+	$(QUIET_LINK)$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
+$(OUTPUT)%.o: %.c
+	$(QUIET_CC)$(COMPILE.c) -MMD -o $@ $<
+
+clean: $(LIBBPF)-clean
+	$(call QUIET_CLEAN, bpftool)
+	$(Q)rm -rf $(OUTPUT)bpftool $(OUTPUT)*.o $(OUTPUT)*.d
+
+install:
+	install $(OUTPUT)bpftool $(prefix)/sbin/bpftool
+
+FORCE:
+
+.PHONY: all clean FORCE
+.DEFAULT_GOAL := all
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
new file mode 100644
index 000000000000..db7bb966c844
--- /dev/null
+++ b/tools/bpf/bpftool/common.c
@@ -0,0 +1,214 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* Author: Jakub Kicinski <kubakici@wp.pl> */
+
+#include <errno.h>
+#include <libgen.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <linux/limits.h>
+#include <linux/magic.h>
+#include <sys/types.h>
+#include <sys/vfs.h>
+
+#include <bpf.h>
+
+#include "main.h"
+
+static bool is_bpffs(char *path)
+{
+	struct statfs st_fs;
+
+	if (statfs(path, &st_fs) < 0)
+		return false;
+
+	return (unsigned long)st_fs.f_type == BPF_FS_MAGIC;
+}
+
+int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type)
+{
+	enum bpf_obj_type type;
+	int fd;
+
+	fd = bpf_obj_get(path);
+	if (fd < 1) {
+		err("bpf obj get (%s): %s\n", path,
+		    errno == EACCES && !is_bpffs(dirname(path)) ?
+		    "directory not in bpf file system (bpffs)" :
+		    strerror(errno));
+		return -1;
+	}
+
+	type = get_fd_type(fd);
+	if (type < 0) {
+		close(fd);
+		return type;
+	}
+	if (type != exp_type) {
+		err("incorrect object type: %s\n", get_fd_type_name(type));
+		close(fd);
+		return -1;
+	}
+
+	return fd;
+}
+
+int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32))
+{
+	unsigned int id;
+	char *endptr;
+	int err;
+	int fd;
+
+	if (!is_prefix(*argv, "id")) {
+		err("expected 'id' got %s\n", *argv);
+		return -1;
+	}
+	NEXT_ARG();
+
+	id = strtoul(*argv, &endptr, 0);
+	if (*endptr) {
+		err("can't parse %s as ID\n", *argv);
+		return -1;
+	}
+	NEXT_ARG();
+
+	if (argc != 1)
+		usage();
+
+	fd = get_fd_by_id(id);
+	if (fd < 1) {
+		err("can't get prog by id (%u): %s\n", id, strerror(errno));
+		return -1;
+	}
+
+	err = bpf_obj_pin(fd, *argv);
+	close(fd);
+	if (err) {
+		err("can't pin the object (%s): %s\n", *argv,
+		    errno == EACCES && !is_bpffs(dirname(*argv)) ?
+		    "directory not in bpf file system (bpffs)" :
+		    strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+const char *get_fd_type_name(enum bpf_obj_type type)
+{
+	static const char * const names[] = {
+		[BPF_OBJ_PROG]	= "prog",
+		[BPF_OBJ_MAP]	= "map",
+	};
+
+	if (type > 0 && type < ARRAY_SIZE(names) && names[type])
+		return names[type];
+
+	return "unknown";
+}
+
+int get_fd_type(int fd)
+{
+	char path[PATH_MAX];
+	char buf[512];
+	ssize_t n;
+
+	snprintf(path, sizeof(path), "/proc/%d/fd/%d", getpid(), fd);
+
+	n = readlink(path, buf, sizeof(buf));
+	if (n < 0) {
+		err("can't read link type: %s\n", strerror(errno));
+		return -1;
+	}
+	if (n == sizeof(path)) {
+		err("can't read link type: path too long!\n");
+		return -1;
+	}
+
+	if (strstr(buf, "bpf-map"))
+		return BPF_OBJ_MAP;
+	else if (strstr(buf, "bpf-prog"))
+		return BPF_OBJ_PROG;
+
+	return BPF_OBJ_UNKNOWN;
+}
+
+char *get_fdinfo(int fd, const char *key)
+{
+	char path[PATH_MAX];
+	char *line = NULL;
+	size_t line_n = 0;
+	ssize_t n;
+	FILE *fdi;
+
+	snprintf(path, sizeof(path), "/proc/%d/fdinfo/%d", getpid(), fd);
+
+	fdi = fopen(path, "r");
+	if (!fdi) {
+		err("can't open fdinfo: %s\n", strerror(errno));
+		return NULL;
+	}
+
+	while ((n = getline(&line, &line_n, fdi))) {
+		char *value;
+		int len;
+
+		if (!strstr(line, key))
+			continue;
+
+		fclose(fdi);
+
+		value = strchr(line, '\t');
+		if (!value) {
+			err("malformed fdinfo!?\n");
+			free(line);
+			return NULL;
+		}
+		value++;
+
+		len = strlen(value);
+		memmove(line, value, len);
+		line[len - 1] = '\0';
+
+		return line;
+	}
+
+	err("key '%s' not found in fdinfo\n", key);
+	fclose(fdi);
+	return NULL;
+}
diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c
new file mode 100644
index 000000000000..e2bcfbf9b824
--- /dev/null
+++ b/tools/bpf/bpftool/jit_disasm.c
@@ -0,0 +1,83 @@
+/*
+ * Based on:
+ *
+ * Minimal BPF JIT image disassembler
+ *
+ * Disassembles BPF JIT compiler emitted opcodes back to asm insn's for
+ * debugging or verification purposes.
+ *
+ * Copyright 2013 Daniel Borkmann <daniel@iogearbox.net>
+ * Licensed under the GNU General Public License, version 2.0 (GPLv2)
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <string.h>
+#include <bfd.h>
+#include <dis-asm.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+static void get_exec_path(char *tpath, size_t size)
+{
+	ssize_t len;
+	char *path;
+
+	snprintf(tpath, size, "/proc/%d/exe", (int) getpid());
+	tpath[size - 1] = 0;
+
+	path = strdup(tpath);
+	assert(path);
+
+	len = readlink(path, tpath, size);
+	tpath[len] = 0;
+
+	free(path);
+}
+
+void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes)
+{
+	disassembler_ftype disassemble;
+	struct disassemble_info info;
+	int count, i, pc = 0;
+	char tpath[256];
+	bfd *bfdf;
+
+	memset(tpath, 0, sizeof(tpath));
+	get_exec_path(tpath, sizeof(tpath));
+
+	bfdf = bfd_openr(tpath, NULL);
+	assert(bfdf);
+	assert(bfd_check_format(bfdf, bfd_object));
+
+	init_disassemble_info(&info, stdout, (fprintf_ftype) fprintf);
+	info.arch = bfd_get_arch(bfdf);
+	info.mach = bfd_get_mach(bfdf);
+	info.buffer = image;
+	info.buffer_length = len;
+
+	disassemble_init_for_target(&info);
+
+	disassemble = disassembler(bfdf);
+	assert(disassemble);
+
+	do {
+		printf("%4x:\t", pc);
+
+		count = disassemble(pc, &info);
+
+		if (opcodes) {
+			printf("\n\t");
+			for (i = 0; i < count; ++i)
+				printf("%02x ", (uint8_t) image[pc + i]);
+		}
+		printf("\n");
+
+		pc += count;
+	} while (count > 0 && pc < len);
+
+	bfd_close(bfdf);
+}
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
new file mode 100644
index 000000000000..622be3b3a28a
--- /dev/null
+++ b/tools/bpf/bpftool/main.c
@@ -0,0 +1,212 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* Author: Jakub Kicinski <kubakici@wp.pl> */
+
+#include <bfd.h>
+#include <ctype.h>
+#include <errno.h>
+#include <linux/bpf.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <bpf.h>
+
+#include "main.h"
+
+const char *bin_name;
+static int last_argc;
+static char **last_argv;
+static int (*last_do_help)(int argc, char **argv);
+
+void usage(void)
+{
+	last_do_help(last_argc - 1, last_argv + 1);
+
+	exit(-1);
+}
+
+static int do_help(int argc, char **argv)
+{
+	fprintf(stderr,
+		"Usage: %s OBJECT { COMMAND | help }\n"
+		"       %s batch file FILE\n"
+		"\n"
+		"       OBJECT := { prog | map }\n",
+		bin_name, bin_name);
+
+	return 0;
+}
+
+int cmd_select(const struct cmd *cmds, int argc, char **argv,
+	       int (*help)(int argc, char **argv))
+{
+	unsigned int i;
+
+	last_argc = argc;
+	last_argv = argv;
+	last_do_help = help;
+
+	if (argc < 1 && cmds[0].func)
+		return cmds[0].func(argc, argv);
+
+	for (i = 0; cmds[i].func; i++)
+		if (is_prefix(*argv, cmds[i].cmd))
+			return cmds[i].func(argc - 1, argv + 1);
+
+	help(argc - 1, argv + 1);
+
+	return -1;
+}
+
+bool is_prefix(const char *pfx, const char *str)
+{
+	if (!pfx)
+		return false;
+	if (strlen(str) < strlen(pfx))
+		return false;
+
+	return !memcmp(str, pfx, strlen(pfx));
+}
+
+void print_hex(void *arg, unsigned int n, const char *sep)
+{
+	unsigned char *data = arg;
+	unsigned int i;
+
+	for (i = 0; i < n; i++) {
+		const char *pfx = "";
+
+		if (!i)
+			/* nothing */;
+		else if (!(i % 16))
+			printf("\n");
+		else if (!(i % 8))
+			printf("  ");
+		else
+			pfx = sep;
+
+		printf("%s%02hhx", i ? pfx : "", data[i]);
+	}
+}
+
+static int do_batch(int argc, char **argv);
+
+static const struct cmd cmds[] = {
+	{ "help",	do_help },
+	{ "batch",	do_batch },
+	{ "prog",	do_prog },
+	{ "map",	do_map },
+	{ 0 }
+};
+
+static int do_batch(int argc, char **argv)
+{
+	unsigned int lines = 0;
+	char *n_argv[4096];
+	char buf[65536];
+	int n_argc;
+	char *ptr;
+	FILE *fp;
+	int err;
+
+	if (argc < 2) {
+		err("too few parameters for batch\n");
+		return -1;
+	} else if (!is_prefix(*argv, "file")) {
+		err("expected 'file', got: %s\n", *argv);
+		return -1;
+	} else if (argc > 2) {
+		err("too many parameters for batch\n");
+		return -1;
+	}
+	NEXT_ARG();
+
+	fp = fopen(*argv, "r");
+	if (!fp) {
+		err("Can't open file (%s): %s\n", *argv, strerror(errno));
+		return -1;
+	}
+
+	while (fgets(buf, sizeof(buf), fp)) {
+		if (strlen(buf) == sizeof(buf) - 1) {
+			errno = E2BIG;
+			break;
+		}
+
+		ptr = buf;
+		n_argc = 0;
+		while (*ptr) {
+			if (isspace(*ptr)) {
+				ptr++;
+				continue;
+			}
+			n_argv[n_argc++] = ptr;
+
+			ptr += strcspn(ptr, " \t\n");
+			*ptr++ = 0;
+		}
+
+		if (!n_argc)
+			continue;
+
+		err = cmd_select(cmds, n_argc, n_argv, do_help);
+		if (err)
+			goto err_close;
+
+		lines++;
+	}
+
+	if (errno && errno != ENOENT) {
+		perror("reading batch file failed");
+		err = -1;
+	} else {
+		info("processed %d lines\n", lines);
+		err = 0;
+	}
+err_close:
+	fclose(fp);
+
+	return err;
+}
+
+int main(int argc, char **argv)
+{
+	bin_name = argv[0];
+	NEXT_ARG();
+
+	bfd_init();
+
+	return cmd_select(cmds, argc, argv, do_help);
+}
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
new file mode 100644
index 000000000000..85d2d7870a58
--- /dev/null
+++ b/tools/bpf/bpftool/main.h
@@ -0,0 +1,99 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* Author: Jakub Kicinski <kubakici@wp.pl> */
+
+#ifndef __BPF_TOOL_H
+#define __BPF_TOOL_H
+
+#include <stdbool.h>
+#include <stdio.h>
+#include <linux/bpf.h>
+
+#define ARRAY_SIZE(a)	(sizeof(a) / sizeof(a[0]))
+
+#define err(msg...)	fprintf(stderr, "Error: " msg)
+#define warn(msg...)	fprintf(stderr, "Warning: " msg)
+#define info(msg...)	fprintf(stderr, msg)
+
+#define ptr_to_u64(ptr)	((__u64)(unsigned long)(ptr))
+
+#define min(a, b)							\
+	({ typeof(a) _a = (a); typeof(b) _b = (b); _a > _b ? _b : _a; })
+#define max(a, b)							\
+	({ typeof(a) _a = (a); typeof(b) _b = (b); _a < _b ? _b : _a; })
+
+#define NEXT_ARG()	({ argc--; argv++; if (argc < 0) usage(); })
+#define NEXT_ARGP()	({ (*argc)--; (*argv)++; if (*argc < 0) usage(); })
+#define BAD_ARG()	({ err("what is '%s'?\n", *argv); -1; })
+
+#define BPF_TAG_FMT	"%02hhx:%02hhx:%02hhx:%02hhx:"	\
+			"%02hhx:%02hhx:%02hhx:%02hhx"
+
+#define HELP_SPEC_PROGRAM						\
+	"PROG := { id PROG_ID | pinned FILE | tag PROG_TAG }"
+
+enum bpf_obj_type {
+	BPF_OBJ_UNKNOWN,
+	BPF_OBJ_PROG,
+	BPF_OBJ_MAP,
+};
+
+extern const char *bin_name;
+
+bool is_prefix(const char *pfx, const char *str);
+void print_hex(void *arg, unsigned int n, const char *sep);
+void usage(void) __attribute__((noreturn));
+
+struct cmd {
+	const char *cmd;
+	int (*func)(int argc, char **argv);
+};
+
+int cmd_select(const struct cmd *cmds, int argc, char **argv,
+	       int (*help)(int argc, char **argv));
+
+int get_fd_type(int fd);
+const char *get_fd_type_name(enum bpf_obj_type type);
+char *get_fdinfo(int fd, const char *key);
+int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type);
+int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32));
+
+int do_prog(int argc, char **arg);
+int do_map(int argc, char **arg);
+
+int prog_parse_fd(int *argc, char ***argv);
+
+void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes);
+
+#endif
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
new file mode 100644
index 000000000000..db46986fef73
--- /dev/null
+++ b/tools/bpf/bpftool/map.c
@@ -0,0 +1,742 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* Author: Jakub Kicinski <kubakici@wp.pl> */
+
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include <bpf.h>
+
+#include "main.h"
+
+static const char * const map_type_name[] = {
+	[BPF_MAP_TYPE_UNSPEC]		= "unspec",
+	[BPF_MAP_TYPE_HASH]		= "hash",
+	[BPF_MAP_TYPE_ARRAY]		= "array",
+	[BPF_MAP_TYPE_PROG_ARRAY]	= "prog_array",
+	[BPF_MAP_TYPE_PERF_EVENT_ARRAY]	= "perf_event_array",
+	[BPF_MAP_TYPE_PERCPU_HASH]	= "percpu_hash",
+	[BPF_MAP_TYPE_PERCPU_ARRAY]	= "percpu_array",
+	[BPF_MAP_TYPE_STACK_TRACE]	= "stack_trace",
+	[BPF_MAP_TYPE_CGROUP_ARRAY]	= "cgroup_array",
+	[BPF_MAP_TYPE_LRU_HASH]		= "lru_hash",
+	[BPF_MAP_TYPE_LRU_PERCPU_HASH]	= "lru_percpu_hash",
+	[BPF_MAP_TYPE_LPM_TRIE]		= "lpm_trie",
+	[BPF_MAP_TYPE_ARRAY_OF_MAPS]	= "array_of_maps",
+	[BPF_MAP_TYPE_HASH_OF_MAPS]	= "hash_of_maps",
+	[BPF_MAP_TYPE_DEVMAP]		= "devmap",
+	[BPF_MAP_TYPE_SOCKMAP]		= "sockmap",
+};
+
+static unsigned int get_possible_cpus(void)
+{
+	static unsigned int result;
+	char buf[128];
+	long int n;
+	char *ptr;
+	int fd;
+
+	if (result)
+		return result;
+
+	fd = open("/sys/devices/system/cpu/possible", O_RDONLY);
+	if (fd < 1) {
+		err("can't open sysfs possible cpus\n");
+		exit(-1);
+	}
+
+	n = read(fd, buf, sizeof(buf));
+	if (n < 2) {
+		err("can't read sysfs possible cpus\n");
+		exit(-1);
+	}
+	close(fd);
+
+	if (n == sizeof(buf)) {
+		err("read sysfs possible cpus overflow\n");
+		exit(-1);
+	}
+
+	ptr = buf;
+	n = 0;
+	while (*ptr && *ptr != '\n') {
+		unsigned int a, b;
+
+		if (sscanf(ptr, "%u-%u", &a, &b) == 2) {
+			n += b - a + 1;
+
+			ptr = strchr(ptr, '-') + 1;
+		} else if (sscanf(ptr, "%u", &a) == 1) {
+			n++;
+		}
+
+		while (isdigit(*ptr))
+			ptr++;
+		if (*ptr == ',')
+			ptr++;
+	}
+
+	result = n;
+
+	return result;
+}
+
+static bool map_is_per_cpu(__u32 type)
+{
+	return type == BPF_MAP_TYPE_PERCPU_HASH ||
+	       type == BPF_MAP_TYPE_PERCPU_ARRAY ||
+	       type == BPF_MAP_TYPE_LRU_PERCPU_HASH;
+}
+
+static bool map_is_map_of_maps(__u32 type)
+{
+	return type == BPF_MAP_TYPE_ARRAY_OF_MAPS ||
+	       type == BPF_MAP_TYPE_HASH_OF_MAPS;
+}
+
+static bool map_is_map_of_progs(__u32 type)
+{
+	return type == BPF_MAP_TYPE_PROG_ARRAY;
+}
+
+static void *alloc_value(struct bpf_map_info *info)
+{
+	if (map_is_per_cpu(info->type))
+		return malloc(info->value_size * get_possible_cpus());
+	else
+		return malloc(info->value_size);
+}
+
+static int map_parse_fd(int *argc, char ***argv)
+{
+	int fd;
+
+	if (is_prefix(**argv, "id")) {
+		unsigned int id;
+		char *endptr;
+
+		NEXT_ARGP();
+
+		id = strtoul(**argv, &endptr, 0);
+		if (*endptr) {
+			err("can't parse %s as ID\n", **argv);
+			return -1;
+		}
+		NEXT_ARGP();
+
+		fd = bpf_map_get_fd_by_id(id);
+		if (fd < 1)
+			err("get map by id (%u): %s\n", id, strerror(errno));
+		return fd;
+	} else if (is_prefix(**argv, "pinned")) {
+		char *path;
+
+		NEXT_ARGP();
+
+		path = **argv;
+		NEXT_ARGP();
+
+		return open_obj_pinned_any(path, BPF_OBJ_MAP);
+	}
+
+	err("expected 'id' or 'pinned', got: '%s'?\n", **argv);
+	return -1;
+}
+
+static int
+map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len)
+{
+	int err;
+	int fd;
+
+	fd = map_parse_fd(argc, argv);
+	if (fd < 1)
+		return -1;
+
+	err = bpf_obj_get_info_by_fd(fd, info, info_len);
+	if (err) {
+		err("can't get map info: %s\n", strerror(errno));
+		close(fd);
+		return err;
+	}
+
+	return fd;
+}
+
+static void print_entry(struct bpf_map_info *info, unsigned char *key,
+			unsigned char *value)
+{
+	unsigned int i, n;
+
+	if (!map_is_per_cpu(info->type)) {
+		/* Single line print */
+		if (info->key_size + info->value_size <= 24 &&
+		    max(info->key_size, info->value_size) <= 16) {
+			printf("key: ");
+			print_hex(key, info->key_size, " ");
+			printf("  value: ");
+			print_hex(value, info->value_size, " ");
+			printf("\n");
+			return;
+		}
+
+		printf("key: ");
+		print_hex(key, info->key_size, " ");
+		printf("\nvalue: ");
+		print_hex(value, info->value_size, " ");
+		printf("\n");
+
+		return;
+	}
+
+	n = get_possible_cpus();
+
+	printf("key:\n");
+	print_hex(key, info->key_size, " ");
+	printf("\n");
+	for (i = 0; i < n; i++) {
+		printf("value (CPU %02d):%c",
+		       i, info->value_size > 16 ? '\n' : ' ');
+		print_hex(value + i * info->value_size, info->value_size, " ");
+		printf("\n");
+	}
+}
+
+static char **parse_val(char **argv, const char *name, unsigned char *val,
+			unsigned int n)
+{
+	unsigned int i = 0;
+	char *endptr;
+
+	while (i < n && argv[i]) {
+		val[i] = strtoul(argv[i], &endptr, 0);
+		if (*endptr) {
+			err("error parsing byte: %s\n", argv[i]);
+			break;
+		}
+		i++;
+	}
+
+	if (i != n) {
+		err("%s expected %d bytes got %d\n", name, n, i);
+		return NULL;
+	}
+
+	return argv + i;
+}
+
+static int parse_elem(char **argv, struct bpf_map_info *info,
+		      void *key, void *value, __u32 key_size, __u32 value_size,
+		      __u32 *flags, __u32 **value_fd)
+{
+	if (!*argv) {
+		if (!key && !value)
+			return 0;
+		err("did not find %s\n", key ? "key" : "value");
+		return -1;
+	}
+
+	if (is_prefix(*argv, "key")) {
+		if (!key) {
+			if (key_size)
+				err("duplicate key\n");
+			else
+				err("unnecessary key\n");
+			return -1;
+		}
+
+		argv = parse_val(argv + 1, "key", key, key_size);
+		if (!argv)
+			return -1;
+
+		return parse_elem(argv, info, NULL, value, key_size, value_size,
+				  flags, value_fd);
+	} else if (is_prefix(*argv, "value")) {
+		int fd;
+
+		if (!value) {
+			if (value_size)
+				err("duplicate value\n");
+			else
+				err("unnecessary value\n");
+			return -1;
+		}
+
+		argv++;
+
+		if (map_is_map_of_maps(info->type)) {
+			int argc = 2;
+
+			if (value_size != 4) {
+				err("value smaller than 4B for map in map?\n");
+				return -1;
+			}
+			if (!argv[0] || !argv[1]) {
+				err("not enough value arguments for map in map\n");
+				return -1;
+			}
+
+			fd = map_parse_fd(&argc, &argv);
+			if (fd < 1)
+				return -1;
+
+			*value_fd = value;
+			**value_fd = fd;
+		} else if (map_is_map_of_progs(info->type)) {
+			int argc = 2;
+
+			if (value_size != 4) {
+				err("value smaller than 4B for map of progs?\n");
+				return -1;
+			}
+			if (!argv[0] || !argv[1]) {
+				err("not enough value arguments for map of progs\n");
+				return -1;
+			}
+
+			fd = prog_parse_fd(&argc, &argv);
+			if (fd < 1)
+				return -1;
+
+			*value_fd = value;
+			**value_fd = fd;
+		} else {
+			argv = parse_val(argv, "value", value, value_size);
+			if (!argv)
+				return -1;
+		}
+
+		return parse_elem(argv, info, key, NULL, key_size, value_size,
+				  flags, NULL);
+	} else if (is_prefix(*argv, "any") || is_prefix(*argv, "noexist") ||
+		   is_prefix(*argv, "exist")) {
+		if (!flags) {
+			err("flags specified multiple times: %s\n", *argv);
+			return -1;
+		}
+
+		if (is_prefix(*argv, "any"))
+			*flags = BPF_ANY;
+		else if (is_prefix(*argv, "noexist"))
+			*flags = BPF_NOEXIST;
+		else if (is_prefix(*argv, "exist"))
+			*flags = BPF_EXIST;
+
+		return parse_elem(argv + 1, info, key, value, key_size,
+				  value_size, NULL, value_fd);
+	}
+
+	err("expected key or value, got: %s\n", *argv);
+	return -1;
+}
+
+static int show_map_close(int fd, struct bpf_map_info *info)
+{
+	char *memlock;
+
+	memlock = get_fdinfo(fd, "memlock");
+	close(fd);
+
+	printf("   %u: ", info->id);
+	if (info->type < ARRAY_SIZE(map_type_name))
+		printf("%s  ", map_type_name[info->type]);
+	else
+		printf("type:%u  ", info->type);
+
+	printf("flags:0x%x  key:%uB  value:%uB  max_entries:%u",
+	       info->map_flags, info->key_size, info->value_size,
+	       info->max_entries);
+
+	if (memlock)
+		printf("  memlock:%sB", memlock);
+	free(memlock);
+
+	printf("\n");
+
+	return 0;
+}
+
+static int do_show(int argc, char **argv)
+{
+	struct bpf_map_info info = {};
+	__u32 len = sizeof(info);
+	__u32 id = 0;
+	int err;
+	int fd;
+
+	if (argc == 2) {
+		fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+		if (fd < 0)
+			return -1;
+
+		return show_map_close(fd, &info);
+	}
+
+	if (argc)
+		return BAD_ARG();
+
+	while (true) {
+		err = bpf_map_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT)
+				break;
+			err("can't get next map: %s\n", strerror(errno));
+			if (errno == EINVAL)
+				err("kernel too old?\n");
+			return -1;
+		}
+
+		fd = bpf_map_get_fd_by_id(id);
+		if (fd < 1) {
+			err("can't get map by id (%u): %s\n",
+			    id, strerror(errno));
+			return -1;
+		}
+
+		err = bpf_obj_get_info_by_fd(fd, &info, &len);
+		if (err) {
+			err("can't get map info: %s\n", strerror(errno));
+			close(fd);
+			return -1;
+		}
+
+		show_map_close(fd, &info);
+	}
+
+	return errno == ENOENT ? 0 : -1;
+}
+
+static int do_dump(int argc, char **argv)
+{
+	void *key, *value, *prev_key;
+	unsigned int num_elems = 0;
+	struct bpf_map_info info = {};
+	__u32 len = sizeof(info);
+	int err;
+	int fd;
+
+	if (argc != 2)
+		usage();
+
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	if (fd < 0)
+		return -1;
+
+	if (map_is_map_of_maps(info.type) || map_is_map_of_progs(info.type)) {
+		err("Dumping maps of maps and program maps not supported\n");
+		close(fd);
+		return -1;
+	}
+
+	key = malloc(info.key_size);
+	value = alloc_value(&info);
+	if (!key || !value) {
+		err("mem alloc failed\n");
+		err = -1;
+		goto exit_free;
+	}
+
+	prev_key = NULL;
+	while (true) {
+		err = bpf_map_get_next_key(fd, prev_key, key);
+		if (err) {
+			if (errno == ENOENT)
+				err = 0;
+			break;
+		}
+
+		err = bpf_map_lookup_elem(fd, key, value);
+		if (err) {
+			info("can't lookup element with key: ");
+			print_hex(key, info.key_size, " ");
+			printf("\n");
+			goto next_key;
+		}
+
+		print_entry(&info, key, value);
+next_key:
+		prev_key = key;
+		num_elems++;
+	}
+
+	printf("Found %u element%s\n", num_elems, num_elems != 1 ? "s" : "");
+
+exit_free:
+	free(key);
+	free(value);
+	close(fd);
+
+	return err;
+}
+
+static int do_update(int argc, char **argv)
+{
+	struct bpf_map_info info = {};
+	__u32 len = sizeof(info);
+	__u32 *value_fd = NULL;
+	__u32 flags = BPF_ANY;
+	void *key, *value;
+	int fd, err;
+
+	if (argc < 2)
+		usage();
+
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	if (fd < 0)
+		return -1;
+
+	key = malloc(info.key_size);
+	value = alloc_value(&info);
+	if (!key || !value) {
+		err("mem alloc failed");
+		err = -1;
+		goto exit_free;
+	}
+
+	err = parse_elem(argv, &info, key, value, info.key_size,
+			 info.value_size, &flags, &value_fd);
+	if (err)
+		goto exit_free;
+
+	err = bpf_map_update_elem(fd, key, value, flags);
+	if (err) {
+		err("update failed: %s\n", strerror(errno));
+		goto exit_free;
+	}
+
+exit_free:
+	if (value_fd)
+		close(*value_fd);
+	free(key);
+	free(value);
+	close(fd);
+
+	return err;
+}
+
+static int do_lookup(int argc, char **argv)
+{
+	struct bpf_map_info info = {};
+	__u32 len = sizeof(info);
+	void *key, *value;
+	int err;
+	int fd;
+
+	if (argc < 2)
+		usage();
+
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	if (fd < 0)
+		return -1;
+
+	key = malloc(info.key_size);
+	value = alloc_value(&info);
+	if (!key || !value) {
+		err("mem alloc failed");
+		err = -1;
+		goto exit_free;
+	}
+
+	err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL, NULL);
+	if (err)
+		goto exit_free;
+
+	err = bpf_map_lookup_elem(fd, key, value);
+	if (!err) {
+		print_entry(&info, key, value);
+	} else if (errno == ENOENT) {
+		printf("key:\n");
+		print_hex(key, info.key_size, " ");
+		printf("\n\nNot found\n");
+	} else {
+		err("lookup failed: %s\n", strerror(errno));
+	}
+
+exit_free:
+	free(key);
+	free(value);
+	close(fd);
+
+	return err;
+}
+
+static int do_getnext(int argc, char **argv)
+{
+	struct bpf_map_info info = {};
+	__u32 len = sizeof(info);
+	void *key, *nextkey;
+	int err;
+	int fd;
+
+	if (argc < 2)
+		usage();
+
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	if (fd < 0)
+		return -1;
+
+	key = malloc(info.key_size);
+	nextkey = malloc(info.key_size);
+	if (!key || !nextkey) {
+		err("mem alloc failed");
+		err = -1;
+		goto exit_free;
+	}
+
+	if (argc) {
+		err = parse_elem(argv, &info, key, NULL, info.key_size, 0,
+				 NULL, NULL);
+		if (err)
+			goto exit_free;
+	} else {
+		free(key);
+		key = NULL;
+	}
+
+	err = bpf_map_get_next_key(fd, key, nextkey);
+	if (err) {
+		err("can't get next key: %s\n", strerror(errno));
+		goto exit_free;
+	}
+
+	if (key) {
+		printf("key:\n");
+		print_hex(key, info.key_size, " ");
+		printf("\n");
+	} else {
+		printf("key: None\n");
+	}
+
+	printf("next key:\n");
+	print_hex(nextkey, info.key_size, " ");
+	printf("\n");
+
+exit_free:
+	free(nextkey);
+	free(key);
+	close(fd);
+
+	return err;
+}
+
+static int do_delete(int argc, char **argv)
+{
+	struct bpf_map_info info = {};
+	__u32 len = sizeof(info);
+	void *key;
+	int err;
+	int fd;
+
+	if (argc < 2)
+		usage();
+
+	fd = map_parse_fd_and_info(&argc, &argv, &info, &len);
+	if (fd < 0)
+		return -1;
+
+	key = malloc(info.key_size);
+	if (!key) {
+		err("mem alloc failed");
+		err = -1;
+		goto exit_free;
+	}
+
+	err = parse_elem(argv, &info, key, NULL, info.key_size, 0, NULL, NULL);
+	if (err)
+		goto exit_free;
+
+	err = bpf_map_delete_elem(fd, key);
+	if (err)
+		err("delete failed: %s\n", strerror(errno));
+
+exit_free:
+	free(key);
+	close(fd);
+
+	return err;
+}
+
+static int do_pin(int argc, char **argv)
+{
+	return do_pin_any(argc, argv, bpf_map_get_fd_by_id);
+}
+
+static int do_help(int argc, char **argv)
+{
+	fprintf(stderr,
+		"Usage: %s %s show   [MAP]\n"
+		"       %s %s dump    MAP\n"
+		"       %s %s update  MAP  key BYTES value VALUE [UPDATE_FLAGS]\n"
+		"       %s %s lookup  MAP  key BYTES\n"
+		"       %s %s getnext MAP [key BYTES]\n"
+		"       %s %s delete  MAP  key BYTES\n"
+		"       %s %s pin     MAP  FILE\n"
+		"       %s %s help\n"
+		"\n"
+		"       MAP := { id MAP_ID | pinned FILE }\n"
+		"       " HELP_SPEC_PROGRAM "\n"
+		"       VALUE := { BYTES | MAP | PROG }\n"
+		"       UPDATE_FLAGS := { any | exist | noexist }\n"
+		"",
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2], bin_name, argv[-2]);
+
+	return 0;
+}
+
+static const struct cmd cmds[] = {
+	{ "show",	do_show },
+	{ "help",	do_help },
+	{ "dump",	do_dump },
+	{ "update",	do_update },
+	{ "lookup",	do_lookup },
+	{ "getnext",	do_getnext },
+	{ "delete",	do_delete },
+	{ "pin",	do_pin },
+	{ 0 }
+};
+
+int do_map(int argc, char **argv)
+{
+	return cmd_select(cmds, argc, argv, do_help);
+}
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
new file mode 100644
index 000000000000..3129159c593e
--- /dev/null
+++ b/tools/bpf/bpftool/prog.c
@@ -0,0 +1,392 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* Author: Jakub Kicinski <kubakici@wp.pl> */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include <bpf.h>
+
+#include "main.h"
+
+static const char * const prog_type_name[] = {
+	[BPF_PROG_TYPE_UNSPEC]		= "unspec",
+	[BPF_PROG_TYPE_SOCKET_FILTER]	= "socket_filter",
+	[BPF_PROG_TYPE_KPROBE]		= "kprobe",
+	[BPF_PROG_TYPE_SCHED_CLS]	= "sched_cls",
+	[BPF_PROG_TYPE_SCHED_ACT]	= "sched_act",
+	[BPF_PROG_TYPE_TRACEPOINT]	= "tracepoint",
+	[BPF_PROG_TYPE_XDP]		= "xdp",
+	[BPF_PROG_TYPE_PERF_EVENT]	= "perf_event",
+	[BPF_PROG_TYPE_CGROUP_SKB]	= "cgroup_skb",
+	[BPF_PROG_TYPE_CGROUP_SOCK]	= "cgroup_sock",
+	[BPF_PROG_TYPE_LWT_IN]		= "lwt_in",
+	[BPF_PROG_TYPE_LWT_OUT]		= "lwt_out",
+	[BPF_PROG_TYPE_LWT_XMIT]	= "lwt_xmit",
+	[BPF_PROG_TYPE_SOCK_OPS]	= "sock_ops",
+	[BPF_PROG_TYPE_SK_SKB]		= "sk_skb",
+};
+
+static int prog_fd_by_tag(unsigned char *tag)
+{
+	struct bpf_prog_info info = {};
+	__u32 len = sizeof(info);
+	unsigned int id = 0;
+	int err;
+	int fd;
+
+	while (true) {
+		err = bpf_prog_get_next_id(id, &id);
+		if (err) {
+			err("%s\n", strerror(errno));
+			return -1;
+		}
+
+		fd = bpf_prog_get_fd_by_id(id);
+		if (fd < 1) {
+			err("can't get prog by id (%u): %s\n",
+			    id, strerror(errno));
+			return -1;
+		}
+
+		err = bpf_obj_get_info_by_fd(fd, &info, &len);
+		if (err) {
+			err("can't get prog info (%u): %s\n",
+			    id, strerror(errno));
+			close(fd);
+			return -1;
+		}
+
+		if (!memcmp(tag, info.tag, BPF_TAG_SIZE))
+			return fd;
+
+		close(fd);
+	}
+}
+
+int prog_parse_fd(int *argc, char ***argv)
+{
+	int fd;
+
+	if (is_prefix(**argv, "id")) {
+		unsigned int id;
+		char *endptr;
+
+		NEXT_ARGP();
+
+		id = strtoul(**argv, &endptr, 0);
+		if (*endptr) {
+			err("can't parse %s as ID\n", **argv);
+			return -1;
+		}
+		NEXT_ARGP();
+
+		fd = bpf_prog_get_fd_by_id(id);
+		if (fd < 1)
+			err("get by id (%u): %s\n", id, strerror(errno));
+		return fd;
+	} else if (is_prefix(**argv, "tag")) {
+		unsigned char tag[BPF_TAG_SIZE];
+
+		NEXT_ARGP();
+
+		if (sscanf(**argv, BPF_TAG_FMT, tag, tag + 1, tag + 2,
+			   tag + 3, tag + 4, tag + 5, tag + 6, tag + 7)
+		    != BPF_TAG_SIZE) {
+			err("can't parse tag\n");
+			return -1;
+		}
+		NEXT_ARGP();
+
+		return prog_fd_by_tag(tag);
+	} else if (is_prefix(**argv, "pinned")) {
+		char *path;
+
+		NEXT_ARGP();
+
+		path = **argv;
+		NEXT_ARGP();
+
+		return open_obj_pinned_any(path, BPF_OBJ_PROG);
+	}
+
+	err("expected 'id', 'tag' or 'pinned', got: '%s'?\n", **argv);
+	return -1;
+}
+
+static int show_prog(int fd)
+{
+	struct bpf_prog_info info = {};
+	__u32 len = sizeof(info);
+	char *memlock;
+	int err;
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (err) {
+		err("can't get prog info: %s\n", strerror(errno));
+		return -1;
+	}
+
+	printf("   %u: ", info.id);
+	if (info.type < ARRAY_SIZE(prog_type_name))
+		printf("%s  ", prog_type_name[info.type]);
+	else
+		printf("type:%u  ", info.type);
+
+	printf("tag ");
+	print_hex(info.tag, BPF_TAG_SIZE, ":");
+
+	printf("  xlated:%uB", info.xlated_prog_len);
+
+	if (info.jited_prog_len)
+		printf("  jited:%uB", info.jited_prog_len);
+	else
+		printf("  not jited");
+
+	memlock = get_fdinfo(fd, "memlock");
+	if (memlock)
+		printf("  memlock:%sB", memlock);
+	free(memlock);
+
+	printf("\n");
+
+	return 0;
+}
+
+static int do_show(int argc, char **argv)
+{	__u32 id = 0;
+	int err;
+	int fd;
+
+	if (argc == 2) {
+		fd = prog_parse_fd(&argc, &argv);
+		if (fd < 1)
+			return -1;
+
+		return show_prog(fd);
+	}
+
+	if (argc)
+		return BAD_ARG();
+
+	while (true) {
+		err = bpf_prog_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT)
+				break;
+			err("can't get next program: %s\n", strerror(errno));
+			if (errno == EINVAL)
+				err("kernel too old?\n");
+			return -1;
+		}
+
+		fd = bpf_prog_get_fd_by_id(id);
+		if (fd < 1) {
+			err("can't get prog by id (%u): %s\n",
+			    id, strerror(errno));
+			return -1;
+		}
+
+		err = show_prog(fd);
+		close(fd);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int do_dump(int argc, char **argv)
+{
+	struct bpf_prog_info info = {};
+	__u32 len = sizeof(info);
+	bool can_disasm = false;
+	unsigned int buf_size;
+	char *filepath = NULL;
+	bool opcodes = false;
+	unsigned char *buf;
+	__u32 *member_len;
+	__u64 *member_ptr;
+	ssize_t n;
+	int err;
+	int fd;
+
+	if (is_prefix(*argv, "jited")) {
+		member_len = &info.jited_prog_len;
+		member_ptr = &info.jited_prog_insns;
+		can_disasm = true;
+	} else if (is_prefix(*argv, "xlated")) {
+		member_len = &info.xlated_prog_len;
+		member_ptr = &info.xlated_prog_insns;
+	} else {
+		err("expected 'xlated' or 'jited', got: %s\n", *argv);
+		return -1;
+	}
+	NEXT_ARG();
+
+	if (argc < 2)
+		usage();
+
+	fd = prog_parse_fd(&argc, &argv);
+	if (fd < 0)
+		return -1;
+
+	if (is_prefix(*argv, "file")) {
+		NEXT_ARG();
+		if (!argc) {
+			err("expected file path\n");
+			return -1;
+		}
+
+		filepath = *argv;
+		NEXT_ARG();
+	} else if (is_prefix(*argv, "opcodes")) {
+		opcodes = true;
+		NEXT_ARG();
+	}
+
+	if (!filepath && !can_disasm) {
+		err("expected 'file' got %s\n", *argv);
+		return -1;
+	}
+	if (argc) {
+		usage();
+		return -1;
+	}
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (err) {
+		err("can't get prog info: %s\n", strerror(errno));
+		return -1;
+	}
+
+	if (!*member_len) {
+		info("no instructions returned\n");
+		close(fd);
+		return 0;
+	}
+
+	buf_size = *member_len;
+
+	buf = malloc(buf_size);
+	if (!buf) {
+		err("mem alloc failed\n");
+		close(fd);
+		return -1;
+	}
+
+	memset(&info, 0, sizeof(info));
+
+	*member_ptr = ptr_to_u64(buf);
+	*member_len = buf_size;
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &len);
+	close(fd);
+	if (err) {
+		err("can't get prog info: %s\n", strerror(errno));
+		goto err_free;
+	}
+
+	if (*member_len > buf_size) {
+		info("too many instructions returned\n");
+		goto err_free;
+	}
+
+	if (filepath) {
+		fd = open(filepath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
+		if (fd < 1) {
+			err("can't open file %s: %s\n", filepath,
+			    strerror(errno));
+			goto err_free;
+		}
+
+		n = write(fd, buf, *member_len);
+		close(fd);
+		if (n != *member_len) {
+			err("error writing output file: %s\n",
+			    n < 0 ? strerror(errno) : "short write");
+			goto err_free;
+		}
+	} else {
+		disasm_print_insn(buf, *member_len, opcodes);
+	}
+
+	free(buf);
+
+	return 0;
+
+err_free:
+	free(buf);
+	return -1;
+}
+
+static int do_pin(int argc, char **argv)
+{
+	return do_pin_any(argc, argv, bpf_prog_get_fd_by_id);
+}
+
+static int do_help(int argc, char **argv)
+{
+	fprintf(stderr,
+		"Usage: %s %s show [PROG]\n"
+		"       %s %s dump xlated PROG  file FILE\n"
+		"       %s %s dump jited  PROG [file FILE] [opcodes]\n"
+		"       %s %s pin   PROG FILE\n"
+		"       %s %s help\n"
+		"\n"
+		"       " HELP_SPEC_PROGRAM "\n"
+		"",
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2], bin_name, argv[-2]);
+
+	return 0;
+}
+
+static const struct cmd cmds[] = {
+	{ "show",	do_show },
+	{ "dump",	do_dump },
+	{ "pin",	do_pin },
+	{ 0 }
+};
+
+int do_prog(int argc, char **argv)
+{
+	return cmd_select(cmds, argc, argv, do_help);
+}
-- 
2.14.1

^ permalink raw reply related

* [PATCH v2 net-next 2/2] bpf/verifier: improve disassembly of BPF_NEG instructions
From: Edward Cree @ 2017-09-26 15:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, daniel, alexei.starovoitov, ys114321
In-Reply-To: <52270348-67f1-4e7a-cd2f-9d611ae94064@solarflare.com>

BPF_NEG takes only one operand, unlike the bulk of BPF_ALU[64] which are
 compound-assignments.  So give it its own format in print_bpf_insn().

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 kernel/bpf/verifier.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3aaa3262..04e0508 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -344,6 +344,11 @@ static void print_bpf_insn(const struct bpf_verifier_env *env,
 				verbose("BUG_alu64_%02x\n", insn->code);
 			else
 				print_bpf_end_insn(env, insn);
+		} else if (BPF_OP(insn->code) == BPF_NEG) {
+			verbose("(%02x) r%d = %s-r%d\n",
+				insn->code, insn->dst_reg,
+				class == BPF_ALU ? "(u32) " : "",
+				insn->dst_reg);
 		} else if (BPF_SRC(insn->code) == BPF_X) {
 			verbose("(%02x) %sr%d %s %sr%d\n",
 				insn->code, class == BPF_ALU ? "(u32) " : "",

^ permalink raw reply related

* [PATCH net-next 1/2] tools: rename tools/net directory to tools/bpf
From: Jakub Kicinski @ 2017-09-26 15:35 UTC (permalink / raw)
  To: netdev
  Cc: daniel, alexei.starovoitov, davem, hannes, dsahern, oss-drivers,
	Jakub Kicinski
In-Reply-To: <20170926153522.31500-1-jakub.kicinski@netronome.com>

We currently only have BPF tools in the tools/net directory.
We are about to add more BPF tools there, not necessarily
networking related, rename the directory and related Makefile
targets to bpf.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
---
 MAINTAINERS                         |  3 +--
 tools/Makefile                      | 14 +++++++-------
 tools/{net => bpf}/Makefile         |  0
 tools/{net => bpf}/bpf_asm.c        |  0
 tools/{net => bpf}/bpf_dbg.c        |  0
 tools/{net => bpf}/bpf_exp.l        |  0
 tools/{net => bpf}/bpf_exp.y        |  0
 tools/{net => bpf}/bpf_jit_disasm.c |  0
 8 files changed, 8 insertions(+), 9 deletions(-)
 rename tools/{net => bpf}/Makefile (100%)
 rename tools/{net => bpf}/bpf_asm.c (100%)
 rename tools/{net => bpf}/bpf_dbg.c (100%)
 rename tools/{net => bpf}/bpf_exp.l (100%)
 rename tools/{net => bpf}/bpf_exp.y (100%)
 rename tools/{net => bpf}/bpf_jit_disasm.c (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6671f375f7fc..2f79b94a41ec 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2725,7 +2725,7 @@ F:	net/core/filter.c
 F:	net/sched/act_bpf.c
 F:	net/sched/cls_bpf.c
 F:	samples/bpf/
-F:	tools/net/bpf*
+F:	tools/bpf/
 F:	tools/testing/selftests/bpf/
 
 BROADCOM B44 10/100 ETHERNET DRIVER
@@ -9416,7 +9416,6 @@ F:	include/uapi/linux/in.h
 F:	include/uapi/linux/net.h
 F:	include/uapi/linux/netdevice.h
 F:	include/uapi/linux/net_namespace.h
-F:	tools/net/
 F:	tools/testing/selftests/net/
 F:	lib/random32.c
 
diff --git a/tools/Makefile b/tools/Makefile
index 9dfede37c8ff..df6fcb293fbc 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -19,7 +19,7 @@ include scripts/Makefile.include
 	@echo '  kvm_stat               - top-like utility for displaying kvm statistics'
 	@echo '  leds                   - LEDs  tools'
 	@echo '  liblockdep             - user-space wrapper for kernel locking-validator'
-	@echo '  net                    - misc networking tools'
+	@echo '  bpf                    - misc BPF tools'
 	@echo '  perf                   - Linux performance measurement and analysis tool'
 	@echo '  selftests              - various kernel selftests'
 	@echo '  spi                    - spi tools'
@@ -57,7 +57,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-cgroup firewire hv guest spi usb virtio vm net iio gpio objtool leds: FORCE
+cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds: FORCE
 	$(call descend,$@)
 
 liblockdep: FORCE
@@ -91,7 +91,7 @@ kvm_stat: FORCE
 
 all: acpi cgroup cpupower gpio hv firewire liblockdep \
 		perf selftests spi turbostat usb \
-		virtio vm net x86_energy_perf_policy \
+		virtio vm bpf x86_energy_perf_policy \
 		tmon freefall iio objtool kvm_stat
 
 acpi_install:
@@ -100,7 +100,7 @@ all: acpi cgroup cpupower gpio hv firewire liblockdep \
 cpupower_install:
 	$(call descend,power/$(@:_install=),install)
 
-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install net_install objtool_install:
+cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install:
 	$(call descend,$(@:_install=),install)
 
 liblockdep_install:
@@ -124,7 +124,7 @@ all: acpi cgroup cpupower gpio hv firewire liblockdep \
 install: acpi_install cgroup_install cpupower_install gpio_install \
 		hv_install firewire_install iio_install liblockdep_install \
 		perf_install selftests_install turbostat_install usb_install \
-		virtio_install vm_install net_install x86_energy_perf_policy_install \
+		virtio_install vm_install bpf_install x86_energy_perf_policy_install \
 		tmon_install freefall_install objtool_install kvm_stat_install
 
 acpi_clean:
@@ -133,7 +133,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean net_clean iio_clean gpio_clean objtool_clean leds_clean:
+cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean:
 	$(call descend,$(@:_clean=),clean)
 
 liblockdep_clean:
@@ -169,7 +169,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
 
 clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
 		perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
-		vm_clean net_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
+		vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
 		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
 		gpio_clean objtool_clean leds_clean
 
diff --git a/tools/net/Makefile b/tools/bpf/Makefile
similarity index 100%
rename from tools/net/Makefile
rename to tools/bpf/Makefile
diff --git a/tools/net/bpf_asm.c b/tools/bpf/bpf_asm.c
similarity index 100%
rename from tools/net/bpf_asm.c
rename to tools/bpf/bpf_asm.c
diff --git a/tools/net/bpf_dbg.c b/tools/bpf/bpf_dbg.c
similarity index 100%
rename from tools/net/bpf_dbg.c
rename to tools/bpf/bpf_dbg.c
diff --git a/tools/net/bpf_exp.l b/tools/bpf/bpf_exp.l
similarity index 100%
rename from tools/net/bpf_exp.l
rename to tools/bpf/bpf_exp.l
diff --git a/tools/net/bpf_exp.y b/tools/bpf/bpf_exp.y
similarity index 100%
rename from tools/net/bpf_exp.y
rename to tools/bpf/bpf_exp.y
diff --git a/tools/net/bpf_jit_disasm.c b/tools/bpf/bpf_jit_disasm.c
similarity index 100%
rename from tools/net/bpf_jit_disasm.c
rename to tools/bpf/bpf_jit_disasm.c
-- 
2.14.1

^ permalink raw reply related

* [PATCH net-next 0/2] tools: add bpftool
From: Jakub Kicinski @ 2017-09-26 15:35 UTC (permalink / raw)
  To: netdev
  Cc: daniel, alexei.starovoitov, davem, hannes, dsahern, oss-drivers,
	Jakub Kicinski

Hi!

I'm looking for a home for bpftool, Daniel suggested that 
tools/net could be a good place, since there are only BPF
utilities there already.

The tool should be complete for simple use cases and we
will continue extending it as we go along.  E.g. providing
disassembly of loaded programs directly using LLVM library
and JSON output are high on the priority list.

The first patch renames tools/net to tools/bpf, while the
second one adds the new code.


Jakub Kicinski (2):
  tools: rename tools/net directory to tools/bpf
  tools: bpf: add bpftool

 MAINTAINERS                         |   3 +-
 tools/Makefile                      |  20 +-
 tools/{net => bpf}/Makefile         |  18 +-
 tools/{net => bpf}/bpf_asm.c        |   0
 tools/{net => bpf}/bpf_dbg.c        |   0
 tools/{net => bpf}/bpf_exp.l        |   0
 tools/{net => bpf}/bpf_exp.y        |   0
 tools/{net => bpf}/bpf_jit_disasm.c |   0
 tools/bpf/bpftool/Makefile          |  80 ++++
 tools/bpf/bpftool/common.c          | 214 +++++++++++
 tools/bpf/bpftool/jit_disasm.c      |  83 ++++
 tools/bpf/bpftool/main.c            | 212 +++++++++++
 tools/bpf/bpftool/main.h            |  99 +++++
 tools/bpf/bpftool/map.c             | 728 ++++++++++++++++++++++++++++++++++++
 tools/bpf/bpftool/prog.c            | 392 +++++++++++++++++++
 15 files changed, 1834 insertions(+), 15 deletions(-)
 rename tools/{net => bpf}/Makefile (73%)
 rename tools/{net => bpf}/bpf_asm.c (100%)
 rename tools/{net => bpf}/bpf_dbg.c (100%)
 rename tools/{net => bpf}/bpf_exp.l (100%)
 rename tools/{net => bpf}/bpf_exp.y (100%)
 rename tools/{net => bpf}/bpf_jit_disasm.c (100%)
 create mode 100644 tools/bpf/bpftool/Makefile
 create mode 100644 tools/bpf/bpftool/common.c
 create mode 100644 tools/bpf/bpftool/jit_disasm.c
 create mode 100644 tools/bpf/bpftool/main.c
 create mode 100644 tools/bpf/bpftool/main.h
 create mode 100644 tools/bpf/bpftool/map.c
 create mode 100644 tools/bpf/bpftool/prog.c

-- 
2.14.1

^ permalink raw reply

* [PATCH v2 net-next 1/2] bpf/verifier: improve disassembly of BPF_END instructions
From: Edward Cree @ 2017-09-26 15:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, daniel, alexei.starovoitov, ys114321
In-Reply-To: <52270348-67f1-4e7a-cd2f-9d611ae94064@solarflare.com>

print_bpf_insn() was treating all BPF_ALU[64] the same, but BPF_END has a
 different structure: it has a size in insn->imm (even if it's BPF_X) and
 uses the BPF_SRC (X or K) to indicate which endianness to use.  So it
 needs different code to print it.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 kernel/bpf/verifier.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 799b245..3aaa3262 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -325,26 +325,40 @@ static const char *const bpf_jmp_string[16] = {
 	[BPF_EXIT >> 4] = "exit",
 };
 
+static void print_bpf_end_insn(const struct bpf_verifier_env *env,
+			       const struct bpf_insn *insn)
+{
+	verbose("(%02x) r%d = %s%d r%d\n", insn->code, insn->dst_reg,
+		BPF_SRC(insn->code) == BPF_TO_BE ? "be" : "le",
+		insn->imm, insn->dst_reg);
+}
+
 static void print_bpf_insn(const struct bpf_verifier_env *env,
 			   const struct bpf_insn *insn)
 {
 	u8 class = BPF_CLASS(insn->code);
 
 	if (class == BPF_ALU || class == BPF_ALU64) {
-		if (BPF_SRC(insn->code) == BPF_X)
+		if (BPF_OP(insn->code) == BPF_END) {
+			if (class == BPF_ALU64)
+				verbose("BUG_alu64_%02x\n", insn->code);
+			else
+				print_bpf_end_insn(env, insn);
+		} else if (BPF_SRC(insn->code) == BPF_X) {
 			verbose("(%02x) %sr%d %s %sr%d\n",
 				insn->code, class == BPF_ALU ? "(u32) " : "",
 				insn->dst_reg,
 				bpf_alu_string[BPF_OP(insn->code) >> 4],
 				class == BPF_ALU ? "(u32) " : "",
 				insn->src_reg);
-		else
+		} else {
 			verbose("(%02x) %sr%d %s %s%d\n",
 				insn->code, class == BPF_ALU ? "(u32) " : "",
 				insn->dst_reg,
 				bpf_alu_string[BPF_OP(insn->code) >> 4],
 				class == BPF_ALU ? "(u32) " : "",
 				insn->imm);
+		}
 	} else if (class == BPF_STX) {
 		if (BPF_MODE(insn->code) == BPF_MEM)
 			verbose("(%02x) *(%s *)(r%d %+d) = r%d\n",

^ permalink raw reply related

* Re: [PATCH net-next 2/7] nfp: compile flower vxlan tunnel metadata match fields
From: Or Gerlitz @ 2017-09-26 15:33 UTC (permalink / raw)
  To: John Hurley
  Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
	oss-drivers
In-Reply-To: <CAK+XE=kBG9761tN6uC3ziiqP-jrAjLJ_3J7M-cyfKvTqHVBk7A@mail.gmail.com>

On Tue, Sep 26, 2017 at 6:11 PM, John Hurley <john.hurley@netronome.com> wrote:
> On Tue, Sep 26, 2017 at 3:12 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Tue, Sep 26, 2017 at 4:58 PM, John Hurley <john.hurley@netronome.com> wrote:
>>> On Mon, Sep 25, 2017 at 7:35 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>> On Mon, Sep 25, 2017 at 1:23 PM, Simon Horman
>>>> <simon.horman@netronome.com> wrote:
>>>>> From: John Hurley <john.hurley@netronome.com>
>>>>>
>>>>> Compile ovs-tc flower vxlan metadata match fields for offloading. Only
>>>>
>>>> anything in the npf kernel bits has direct relation to ovs? what?
>>>>
>>>
>>> Sorry, this is a typo  and should refer to TC.
>>>
>>>>> +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
>>>>> @@ -52,8 +52,25 @@
>>>>>          BIT(FLOW_DISSECTOR_KEY_PORTS) | \
>>>>>          BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) | \
>>>>>          BIT(FLOW_DISSECTOR_KEY_VLAN) | \
>>>>> +        BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | \
>>>>> +        BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | \
>>>>> +        BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | \
>>>>
>>>> this series takes care of IPv6 tunnels too?
>>>
>>> IPv6 is not included in this set.
>>> The reason the IPv6 bit is included here is to account for behavior we
>>> have noticed in TC flower.
>>> If, for example, I add a filter with the following match fields:
>>> 'protocol ip flower enc_src_ip 10.0.0.1 enc_dst_ip 10.0.0.2
>>> enc_dst_port 4789 enc_key_id 123'
>>> The 'used_keys' value in the dissector marks both IPv4 and IPv6 encap
>>> addresses as 'used'.
>>> I am not sure if this is a bug in TC or that we are expected to check
>>> the enc_control fields to determine if IPv4 or v6 addresses are used.
>>
>> you should have your code to check enc_control->addr_type to be
>> FLOW_DISSECTOR_KEY_IPV4_ADDRS or IPV6_ADDRS
>>
>>
>>> Including the IPv6 used_keys bit in our whitelist approach allows us
>>> to accept legitimate IPv4 tunnel rules in these situations.
>>
>> mmm can please take a look on fl_init_dissector() and tell me if you
>> see why FLOW_DISSECTOR_KEY_IPV6_ADDRS is set for ipv4 tunnels,
>> I am not sure.
>
>
> The fl_init_dissector uses the FL_KEY_SET_IF_MASKED macro to set an
> array of keys which are then translated to the used_keys values.
> The FL_KEY_SET_IF_MASKED takes a 'struct fl_flow_key' as input and
> checks if any mask bits are set in a particular field - if so it
> eventually marks it as used.
> In struct fl_flow_key, the encap ipv4 and ipv6 addresses are
> represented as a union of the 2.
> Therefore, if we have masked bits set for IPv4, they are also being
> set for the IPv6 field.

I see, do you consider it a bug?

^ permalink raw reply

* [PATCH v2 net-next 0/2] bpf/verifier: disassembly improvements
From: Edward Cree @ 2017-09-26 15:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, daniel, alexei.starovoitov, ys114321

Fix the output of print_bpf_insn() for ALU ops that don't look like
 compound assignment (i.e. BPF_END and BPF_NEG).

Sample output for a short test program:
0: (b4) (u32) r0 = (u32) 0
1: (dc) r0 = be32 r0
2: (84) r0 = (u32) -r0
3: (95) exit
processed 4 insns, stack depth 0

Edward Cree (2):
  bpf/verifier: improve disassembly of BPF_END instructions
  bpf/verifier: improve disassembly of BPF_NEG instructions

 kernel/bpf/verifier.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

^ permalink raw reply

* Re: [PATCH v2 net-next 0/7] net: speedup netns create/delete time
From: Eric Dumazet @ 2017-09-26 15:30 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Tariq Toukan, David S . Miller, netdev, Eric W . Biederman,
	Eric Dumazet, Majd Dibbiny, Yonatan Cohen, Eran Ben Elisha
In-Reply-To: <33E69A03-6594-423A-86E6-02029046BE7D@gmail.com>

On Tue, Sep 26, 2017 at 8:22 AM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:

> It is in Greg's tree where all kobject patches should go through as far as I know.

Yes, I will fix this, adding a second memmove()

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox