Netdev List

* Re: [PATCH net-next] net: enable RPS on vlan devices
From: Eric Dumazet @ 2018-10-10 18:32 UTC (permalink / raw)
  To: Shannon Nelson, davem, netdev; +Cc: silviu.smarandache
In-Reply-To: <13b1a780-ebde-3f0c-65f3-72157bc23690@oracle.com>

On 10/10/2018 11:23 AM, Shannon Nelson wrote:
> For that matter, lots of features are "default off" until someone configures them, and there are lots of features that are only used by a subset of users.  In this case, we're trying to use something that already exists and arguably is broken for the vlan case.  Are there some technical reasons why this is not a good solution?

Simply code maintenance, and added cost for non vlan users.

Adding extra tests might help your use case, but it slows down other cases.

Since the need for new stuff will never end, eBPF really allows for arbitrary usage,
without the need to review kernel patches and maintain them for the next 10 years.

* Re: [PATCH net-next] net: enable RPS on vlan devices
From: Shannon Nelson @ 2018-10-10 18:25 UTC (permalink / raw)
  To: John Fastabend, Eric Dumazet, davem, netdev; +Cc: silviu.smarandache
In-Reply-To: <ed80d9bb-fad3-a0b7-9473-ad2cdf0c1e2e@gmail.com>

On 10/10/2018 10:37 AM, John Fastabend wrote:
> Latest tree has a sk_lookup() helper supported in 'tc' layer now
> to lookup the socket. And XDP has support for a "cpumap" object
> that allows redirect to remote CPUs. Neither was specifically
> designed for this but I suspect with some extra work these might
> be what is needed.
> 
> I would start by looking at bpf_sk_lookup() in filter.c and the
> cpumap type in ./kernel/bpf/cpumap.c, also in general sk_lookup
> from XDP layer will likely be needed shortly anyways.

Thanks, John, for pointing to something I can start looking at.  My 
customer still wants to use the RPS that they already know, but I'll 
start looking into how this also might work for us.

sln

* Re: [PATCH net-next] net: enable RPS on vlan devices
From: Shannon Nelson @ 2018-10-10 18:23 UTC (permalink / raw)
  To: Eric Dumazet, davem, netdev; +Cc: silviu.smarandache
In-Reply-To: <ff43dd14-37e8-115a-15b2-27fa4bbbfd28@gmail.com>

On 10/10/2018 10:14 AM, Eric Dumazet wrote:
> 
> On 10/10/2018 09:18 AM, Shannon Nelson wrote:
>> On 10/9/2018 7:17 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 10/09/2018 07:11 PM, Shannon Nelson wrote:
>>>>
>>>> Hence the reason we sent this as an RFC a couple of weeks ago.  We got no response, so followed up with this patch in order to get some input. Do you have any suggestions for how we might accomplish this in a less ugly way?
>>>
>>> I dunno, maybe a modern way for all these very specific needs would be to use an eBPF
>>> hook to implement whatever combination of RPS/RFS/what_have_you
>>>
>>> Then, we no longer have to review what various strategies are used by users.
>>
>> We're trying to make use of an existing useful feature that was designed for exactly this kind of problem.  It is already there and no new user training is needed.  We're actually fixing what could arguably be called a bug since the /sys/class/net/<dev>/queues/rx-0/rps_cpus entry exists for vlan devices but currently doesn't do anything.  We're also addressing a security concern related to the recent L1TF excitement.
>>
>> For this case, we want to target the network stack processing to happen on a certain subset of CPUs.  With admittedly only a cursory look through eBPF, I don't see an obvious way to target the packet processing to an alternative CPU, unless we add yet another field to the skb that eBPF/XDP could fill and then query that field in the same time as we currently check get_rps_cpu().  But adding to the skb is usually frowned upon unless absolutely necessary, and this seems like a duplication of what we already have with RPS, so why add a competing feature?
>>
>> Back to my earlier question: are there any suggestions for how we might accomplish this in a less ugly way?
> 
> 
> What if you want to have efficient multi queue processing ?
> The Vlan device could have multiple RX queues, but you forced queue_mapping=0

This would be easy enough to change to a simple modulus of a recorded 
queue mapping with the number of queues in the vlan device.  We can fix 
that.
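
A minimal sketch of that modulus, written as userspace C rather than as
kernel code. The helper name and its types are hypothetical and are not
taken from the posted patch; it only illustrates folding a recorded rx
queue into the range of queues the vlan device has.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the posted patch): fold the rx queue
 * recorded by the lower device into the range of rx queues that the
 * vlan device actually exposes.
 */
static uint16_t vlan_fold_rx_queue(uint16_t recorded_queue,
				   uint16_t vlan_num_rx_queues)
{
	/* A device reporting no queues falls back to queue 0. */
	if (vlan_num_rx_queues == 0)
		return 0;

	return recorded_queue % vlan_num_rx_queues;
}
```

With four queues on the vlan device, a recorded queue of 5 would land on
queue 1, and a single-queue vlan device always gets queue 0.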

> 
> Honestly, RPS & RFS show their age and complexity (look at net/core/net-sysfs.c ...)
> 
> We should not expand it, we should put in place a new infrastructure, fully expandable.
> With socket lookups, we even can avoid having a hashtable for flow information, removing
> one cache miss, and removing flow collisions.
> 
> eBPF seems perfect to me.

And yet specifying CPUs for processing is exactly what RPS was designed for.
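
For reference, rps_cpus takes a hexadecimal CPU bitmap, one bit per CPU.
The sketch below builds such a mask from a list of CPU numbers. The
function names are hypothetical, and it assumes at most 32 CPUs so that
the mask fits in a single hex group.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: set one bit per listed CPU.
 * Returns 0 on success, -1 if a CPU number does not fit in 32 bits.
 */
static int rps_mask_from_cpus(const unsigned int *cpus, size_t n,
			      uint32_t *mask)
{
	uint32_t m = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cpus[i] >= 32)
			return -1;
		m |= UINT32_C(1) << cpus[i];
	}

	*mask = m;
	return 0;
}

/* Format the mask as the hex text that would be written to rps_cpus. */
static int rps_mask_to_hex(uint32_t mask, char *buf, size_t len)
{
	return snprintf(buf, len, "%x", (unsigned int)mask);
}
```

Steering processing to CPUs 0, 2 and 3 gives the mask 0xd, which would be
written to the device's queues/rx-0/rps_cpus entry as "d".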

> 
> It is time that we stop adding core infra that most users do not need/use.
> (RPS and RFS are default off)

For that matter, lots of features are "default off" until someone 
configures them, and there are lots of features that are only used by a 
subset of users.  In this case, we're trying to use something that 
already exists and arguably is broken for the vlan case.  Are there some 
technical reasons why this is not a good solution?

sln

* [net-next] tipc: support binding to specific ip address when activating UDP bearer
From: Hoang Le @ 2018-10-11  1:43 UTC (permalink / raw)
  To: jon.maloy, maloy, ying.xue, netdev, tipc-discussion

INADDR_ANY is hard-coded when activating a UDP bearer, so we cannot bind
to a specific IP address, even in replicast mode, where a remote IP
address is given instead of a multicast IP address.

This commit fixes that by checking the remote address and binding to the
appropriate local IP address: INADDR_ANY for a multicast remote, the
configured local address otherwise.

before:
$netstat -plu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address
udp        0      0 **0.0.0.0:6118**            0.0.0.0:*

after:
$netstat -plu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address
udp        0      0 **10.0.0.2:6118**           0.0.0.0:*

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
---
 net/tipc/udp_media.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index 9783101bc4a9..10dc59ce9c82 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -650,6 +650,7 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
 	struct udp_tunnel_sock_cfg tuncfg = {NULL};
 	struct nlattr *opts[TIPC_NLA_UDP_MAX + 1];
 	u8 node_id[NODE_ID_LEN] = {0,};
+	int rmcast = 0;
 
 	ub = kzalloc(sizeof(*ub), GFP_ATOMIC);
 	if (!ub)
@@ -680,6 +681,9 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
 	if (err)
 		goto err;
 
+	/* Checking remote ip address */
+	rmcast = tipc_udp_is_mcast_addr(&remote);
+
 	/* Autoconfigure own node identity if needed */
 	if (!tipc_own_id(net)) {
 		memcpy(node_id, local.ipv6.in6_u.u6_addr8, 16);
@@ -705,7 +709,12 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
 			goto err;
 		}
 		udp_conf.family = AF_INET;
-		udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
+
+		/* Switch to use ANY to receive packets from group */
+		if (rmcast)
+			udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
+		else
+			udp_conf.local_ip.s_addr = local.ipv4.s_addr;
 		udp_conf.use_udp_checksums = false;
 		ub->ifindex = dev->ifindex;
 		if (tipc_mtu_bad(dev, sizeof(struct iphdr) +
@@ -719,7 +728,10 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
 		udp_conf.family = AF_INET6;
 		udp_conf.use_udp6_tx_checksums = true;
 		udp_conf.use_udp6_rx_checksums = true;
-		udp_conf.local_ip6 = in6addr_any;
+		if (rmcast)
+			udp_conf.local_ip6 = in6addr_any;
+		else
+			udp_conf.local_ip6 = local.ipv6;
 		b->mtu = 1280;
 #endif
 	} else {
@@ -741,7 +753,7 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
 	 * is used if it's a multicast address.
 	 */
 	memcpy(&b->bcast_addr.value, &remote, sizeof(remote));
-	if (tipc_udp_is_mcast_addr(&remote))
+	if (rmcast)
 		err = enable_mcast(ub, &remote);
 	else
 		err = tipc_udp_rcast_add(b, &remote);
-- 
2.17.1
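
The IPv4 address selection in the patch above can be sketched in
userspace C as follows. This is an illustration only: the function names
are hypothetical, and IN_MULTICAST() from <netinet/in.h> stands in for
the kernel's tipc_udp_is_mcast_addr().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Userspace stand-in for the multicast test on the remote address. */
static bool remote_is_mcast_v4(struct in_addr remote)
{
	return IN_MULTICAST(ntohl(remote.s_addr));
}

/*
 * Pick the address to bind to: the wildcard address when the remote is
 * a multicast group (so group traffic is received), otherwise the
 * configured local address.
 */
static struct in_addr pick_bind_addr_v4(struct in_addr local,
					struct in_addr remote)
{
	struct in_addr any = { .s_addr = htonl(INADDR_ANY) };

	return remote_is_mcast_v4(remote) ? any : local;
}
```

With a unicast remote the socket is bound to the local address, which
matches the "after" netstat output in the commit message; with a
multicast remote it stays on 0.0.0.0.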

* [PATCH net-next v5] net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
From: Justin.Lee1 @ 2018-10-10 18:11 UTC (permalink / raw)
  To: sam, joel; +Cc: linux-aspeed, netdev, openbmc, amithash, christian, vijaykhemka

Add a new command (NCSI_CMD_SEND_CMD) that allows a user space application
to send NC-SI commands to the network card.
Also add a new attribute (NCSI_ATTR_DATA) for transferring the request and response.

The workflow is as follows.

Request:
User space application
	-> Netlink interface (msg)
	-> new Netlink handler - ncsi_send_cmd_nl()
	-> ncsi_xmit_cmd()

Response:
Response received - ncsi_rcv_rsp()
	-> internal response handler - ncsi_rsp_handler_xxx()
	-> ncsi_rsp_handler_netlink()
	-> ncsi_send_netlink_rsp ()
	-> Netlink interface (msg)
	-> user space application

Command timeout - ncsi_request_timeout()
	-> ncsi_send_netlink_timeout ()
	-> Netlink interface (msg with zero data length)
	-> user space application

Error:
Error detected
	-> ncsi_send_netlink_err ()
	-> Netlink interface (err msg)
	-> user space application
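
As an illustration of the request side of this flow, here is a minimal
userspace sketch of validating a NCSI_ATTR_DATA buffer before handing it
on. The struct is an assumed mirror of the 16-byte struct ncsi_pkt_hdr,
and the function name is hypothetical. The check of the declared payload
length against the buffer size is an extra precaution in this sketch; it
is not something the handler in this patch is known to do.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the layout of struct ncsi_pkt_hdr (16 bytes). */
struct ncsi_hdr_sketch {
	uint8_t  mc_id;        /* Management controller ID    */
	uint8_t  revision;     /* NC-SI header revision       */
	uint8_t  reserved;
	uint8_t  id;           /* Packet sequence number      */
	uint8_t  type;         /* Packet type                 */
	uint8_t  channel;      /* Package and channel         */
	uint16_t length;       /* Payload length, big endian  */
	uint32_t reserved1[2];
};

/*
 * Hypothetical check: the buffer must hold a full header, and the
 * payload length declared in the header must fit in what follows it.
 */
static int ncsi_check_cmd(const unsigned char *data, size_t len,
			  uint8_t *type, uint16_t *payload)
{
	struct ncsi_hdr_sketch hdr;

	if (!data || len < sizeof(hdr))
		return -EINVAL;

	/* Copy out instead of casting, to avoid alignment assumptions. */
	memcpy(&hdr, data, sizeof(hdr));

	*type = hdr.type;
	*payload = ntohs(hdr.length);
	if (*payload > len - sizeof(hdr))
		return -EINVAL;

	return 0;
}
```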


Signed-off-by: Justin Lee <justin.lee1@dell.com> 


---
V5: Update comments and debug message.
V4: Update comments and remove some debug message.
V3: Based on http://patchwork.ozlabs.org/patch/979688/ to remove the duplicated code.
V2: Remove non-related debug message and clean up the code.

 include/uapi/linux/ncsi.h |   6 ++
 net/ncsi/internal.h       |   7 ++
 net/ncsi/ncsi-cmd.c       |   8 ++
 net/ncsi/ncsi-manage.c    |  16 ++++
 net/ncsi/ncsi-netlink.c   | 200 ++++++++++++++++++++++++++++++++++++++++++++++
 net/ncsi/ncsi-netlink.h   |  12 +++
 net/ncsi/ncsi-rsp.c       |  67 ++++++++++++++--
 7 files changed, 311 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
index 4c292ec..0a26a55 100644
--- a/include/uapi/linux/ncsi.h
+++ b/include/uapi/linux/ncsi.h
@@ -23,6 +23,9 @@
  *	optionally the preferred NCSI_ATTR_CHANNEL_ID.
  * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
  *	Requires NCSI_ATTR_IFINDEX.
+ * @NCSI_CMD_SEND_CMD: send NC-SI command to network card.
+ *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID
+ *	and NCSI_ATTR_CHANNEL_ID.
  * @NCSI_CMD_MAX: highest command number
  */
 enum ncsi_nl_commands {
@@ -30,6 +33,7 @@ enum ncsi_nl_commands {
 	NCSI_CMD_PKG_INFO,
 	NCSI_CMD_SET_INTERFACE,
 	NCSI_CMD_CLEAR_INTERFACE,
+	NCSI_CMD_SEND_CMD,
 
 	__NCSI_CMD_AFTER_LAST,
 	NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
@@ -43,6 +47,7 @@ enum ncsi_nl_commands {
  * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
  * @NCSI_ATTR_PACKAGE_ID: package ID
  * @NCSI_ATTR_CHANNEL_ID: channel ID
+ * @NCSI_ATTR_DATA: command payload
  * @NCSI_ATTR_MAX: highest attribute number
  */
 enum ncsi_nl_attrs {
@@ -51,6 +56,7 @@ enum ncsi_nl_attrs {
 	NCSI_ATTR_PACKAGE_LIST,
 	NCSI_ATTR_PACKAGE_ID,
 	NCSI_ATTR_CHANNEL_ID,
+	NCSI_ATTR_DATA,
 
 	__NCSI_ATTR_AFTER_LAST,
 	NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
index 3d0a33b..13c9b5e 100644
--- a/net/ncsi/internal.h
+++ b/net/ncsi/internal.h
@@ -175,6 +175,8 @@ struct ncsi_package;
 #define NCSI_RESERVED_CHANNEL	0x1f
 #define NCSI_CHANNEL_INDEX(c)	((c) & ((1 << NCSI_PACKAGE_SHIFT) - 1))
 #define NCSI_TO_CHANNEL(p, c)	(((p) << NCSI_PACKAGE_SHIFT) | (c))
+#define NCSI_MAX_PACKAGE	8
+#define NCSI_MAX_CHANNEL	32
 
 struct ncsi_channel {
 	unsigned char               id;
@@ -220,11 +222,15 @@ struct ncsi_request {
 	bool                 used;    /* Request that has been assigned  */
 	unsigned int         flags;   /* NCSI request property           */
 #define NCSI_REQ_FLAG_EVENT_DRIVEN	1
+#define NCSI_REQ_FLAG_NETLINK_DRIVEN	2
 	struct ncsi_dev_priv *ndp;    /* Associated NCSI device          */
 	struct sk_buff       *cmd;    /* Associated NCSI command packet  */
 	struct sk_buff       *rsp;    /* Associated NCSI response packet */
 	struct timer_list    timer;   /* Timer on waiting for response   */
 	bool                 enabled; /* Time has been enabled or not    */
+	u32                  snd_seq;     /* netlink sending sequence number */
+	u32                  snd_portid;  /* netlink portid of sender        */
+	struct nlmsghdr      nlhdr;       /* netlink message header          */
 };
 
 enum {
@@ -310,6 +316,7 @@ struct ncsi_cmd_arg {
 		unsigned int   dwords[4];
 	};
 	unsigned char        *data;       /* NCSI OEM data                 */
+	struct genl_info     *info;       /* Netlink information           */
 };
 
 extern struct list_head ncsi_dev_list;
diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c
index 82b7d92..356af47 100644
--- a/net/ncsi/ncsi-cmd.c
+++ b/net/ncsi/ncsi-cmd.c
@@ -17,6 +17,7 @@
 #include <net/ncsi.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
+#include <net/genetlink.h>
 
 #include "internal.h"
 #include "ncsi-pkt.h"
@@ -346,6 +347,13 @@ int ncsi_xmit_cmd(struct ncsi_cmd_arg *nca)
 	if (!nr)
 		return -ENOMEM;
 
+	/* track netlink information */
+	if (nca->req_flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+		nr->snd_seq = nca->info->snd_seq;
+		nr->snd_portid = nca->info->snd_portid;
+		nr->nlhdr = *nca->info->nlhdr;
+	}
+
 	/* Prepare the packet */
 	nca->id = nr->id;
 	ret = nch->handler(nr->cmd, nca);
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index 0912847..76a4bcb 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -19,6 +19,7 @@
 #include <net/addrconf.h>
 #include <net/ipv6.h>
 #include <net/if_inet6.h>
+#include <net/genetlink.h>
 
 #include "internal.h"
 #include "ncsi-pkt.h"
@@ -406,6 +407,9 @@ static void ncsi_request_timeout(struct timer_list *t)
 {
 	struct ncsi_request *nr = from_timer(nr, t, timer);
 	struct ncsi_dev_priv *ndp = nr->ndp;
+	struct ncsi_package *np;
+	struct ncsi_channel *nc;
+	struct ncsi_cmd_pkt *cmd;
 	unsigned long flags;
 
 	/* If the request already had associated response,
@@ -419,6 +423,18 @@ static void ncsi_request_timeout(struct timer_list *t)
 	}
 	spin_unlock_irqrestore(&ndp->lock, flags);
 
+	if (nr->flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+		if (nr->cmd) {
+			/* Find the package */
+			cmd = (struct ncsi_cmd_pkt *)
+			      skb_network_header(nr->cmd);
+			ncsi_find_package_and_channel(ndp,
+						      cmd->cmd.common.channel,
+						      &np, &nc);
+			ncsi_send_netlink_timeout(nr, np, nc);
+		}
+	}
+
 	/* Release the request */
 	ncsi_free_request(nr);
 }
diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
index 45f33d6..eaee570 100644
--- a/net/ncsi/ncsi-netlink.c
+++ b/net/ncsi/ncsi-netlink.c
@@ -20,6 +20,7 @@
 #include <uapi/linux/ncsi.h>
 
 #include "internal.h"
+#include "ncsi-pkt.h"
 #include "ncsi-netlink.h"
 
 static struct genl_family ncsi_genl_family;
@@ -29,6 +30,7 @@ static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
 	[NCSI_ATTR_PACKAGE_LIST] =	{ .type = NLA_NESTED },
 	[NCSI_ATTR_PACKAGE_ID] =	{ .type = NLA_U32 },
 	[NCSI_ATTR_CHANNEL_ID] =	{ .type = NLA_U32 },
+	[NCSI_ATTR_DATA] =		{ .type = NLA_BINARY, .len = 2048 },
 };
 
 static struct ncsi_dev_priv *ndp_from_ifindex(struct net *net, u32 ifindex)
@@ -366,6 +368,198 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
 	return 0;
 }
 
+static int ncsi_send_cmd_nl(struct sk_buff *msg, struct genl_info *info)
+{
+	struct ncsi_dev_priv *ndp;
+
+	struct ncsi_cmd_arg nca;
+	struct ncsi_pkt_hdr *hdr;
+
+	u32 package_id, channel_id;
+	unsigned char *data;
+	int len, ret;
+
+	if (!info || !info->attrs) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_IFINDEX]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
+			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
+	if (!ndp) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
+	channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
+
+	if (package_id >= NCSI_MAX_PACKAGE || channel_id >= NCSI_MAX_CHANNEL) {
+		ret = -ERANGE;
+		goto out_netlink;
+	}
+
+	len = nla_len(info->attrs[NCSI_ATTR_DATA]);
+	if (len < sizeof(struct ncsi_pkt_hdr)) {
+		netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n",
+			    package_id);
+		ret = -EINVAL;
+		goto out_netlink;
+	} else {
+		data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]);
+	}
+
+	hdr = (struct ncsi_pkt_hdr *)data;
+
+	nca.ndp = ndp;
+	nca.package = (unsigned char)package_id;
+	nca.channel = (unsigned char)channel_id;
+	nca.type = hdr->type;
+	nca.req_flags = NCSI_REQ_FLAG_NETLINK_DRIVEN;
+	nca.info = info;
+	nca.payload = ntohs(hdr->length);
+	nca.data = data + sizeof(*hdr);
+
+	ret = ncsi_xmit_cmd(&nca);
+out_netlink:
+	if (ret != 0) {
+		netdev_err(ndp->ndev.dev,
+			   "NCSI: Error %d sending command\n",
+			   ret);
+		ncsi_send_netlink_err(ndp->ndev.dev,
+				      info->snd_seq,
+				      info->snd_portid,
+				      info->nlhdr,
+				      ret);
+	}
+out:
+	return ret;
+}
+
+int ncsi_send_netlink_rsp(struct ncsi_request *nr,
+			  struct ncsi_package *np,
+			  struct ncsi_channel *nc)
+{
+	struct sk_buff *skb;
+	struct net *net;
+	void *hdr;
+	int rc;
+
+	net = dev_net(nr->rsp->dev);
+
+	skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!skb)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(skb, nr->snd_portid, nr->snd_seq,
+			  &ncsi_genl_family, 0, NCSI_CMD_SEND_CMD);
+	if (!hdr) {
+		kfree_skb(skb);
+		return -EMSGSIZE;
+	}
+
+	nla_put_u32(skb, NCSI_ATTR_IFINDEX, nr->rsp->dev->ifindex);
+	if (np)
+		nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, np->id);
+	if (nc)
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, nc->id);
+	else
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, NCSI_RESERVED_CHANNEL);
+
+	rc = nla_put(skb, NCSI_ATTR_DATA, nr->rsp->len, (void *)nr->rsp->data);
+	if (rc)
+		goto err;
+
+	genlmsg_end(skb, hdr);
+	return genlmsg_unicast(net, skb, nr->snd_portid);
+
+err:
+	kfree_skb(skb);
+	return rc;
+}
+
+int ncsi_send_netlink_timeout(struct ncsi_request *nr,
+			      struct ncsi_package *np,
+			      struct ncsi_channel *nc)
+{
+	struct sk_buff *skb;
+	struct net *net;
+	void *hdr;
+
+	skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!skb)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(skb, nr->snd_portid, nr->snd_seq,
+			  &ncsi_genl_family, 0, NCSI_CMD_SEND_CMD);
+	if (!hdr) {
+		kfree_skb(skb);
+		return -EMSGSIZE;
+	}
+
+	net = dev_net(nr->cmd->dev);
+
+	nla_put_u32(skb, NCSI_ATTR_IFINDEX, nr->cmd->dev->ifindex);
+
+	if (np)
+		nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, np->id);
+	else
+		nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID,
+			    NCSI_PACKAGE_INDEX((((struct ncsi_pkt_hdr *)
+						 nr->cmd->data)->channel)));
+
+	if (nc)
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, nc->id);
+	else
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, NCSI_RESERVED_CHANNEL);
+
+	genlmsg_end(skb, hdr);
+	return genlmsg_unicast(net, skb, nr->snd_portid);
+}
+
+int ncsi_send_netlink_err(struct net_device *dev,
+			  u32 snd_seq,
+			  u32 snd_portid,
+			  struct nlmsghdr *nlhdr,
+			  int err)
+{
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh;
+	struct nlmsgerr *nle;
+	struct net *net;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!skb)
+		return -ENOMEM;
+
+	net = dev_net(dev);
+
+	nlh = nlmsg_put(skb, snd_portid, snd_seq,
+			NLMSG_ERROR, sizeof(*nle), 0);
+	nle = (struct nlmsgerr *)nlmsg_data(nlh);
+	nle->error = err;
+	memcpy(&nle->msg, nlhdr, sizeof(*nlh));
+
+	nlmsg_end(skb, nlh);
+
+	return nlmsg_unicast(net->genl_sock, skb, snd_portid);
+}
+
 static const struct genl_ops ncsi_ops[] = {
 	{
 		.cmd = NCSI_CMD_PKG_INFO,
@@ -386,6 +580,12 @@ static const struct genl_ops ncsi_ops[] = {
 		.doit = ncsi_clear_interface_nl,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = NCSI_CMD_SEND_CMD,
+		.policy = ncsi_genl_policy,
+		.doit = ncsi_send_cmd_nl,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family ncsi_genl_family __ro_after_init = {
diff --git a/net/ncsi/ncsi-netlink.h b/net/ncsi/ncsi-netlink.h
index 91a5c25..c4a4688 100644
--- a/net/ncsi/ncsi-netlink.h
+++ b/net/ncsi/ncsi-netlink.h
@@ -14,6 +14,18 @@
 
 #include "internal.h"
 
+int ncsi_send_netlink_rsp(struct ncsi_request *nr,
+			  struct ncsi_package *np,
+			  struct ncsi_channel *nc);
+int ncsi_send_netlink_timeout(struct ncsi_request *nr,
+			      struct ncsi_package *np,
+			      struct ncsi_channel *nc);
+int ncsi_send_netlink_err(struct net_device *dev,
+			  u32 snd_seq,
+			  u32 snd_portid,
+			  struct nlmsghdr *nlhdr,
+			  int err);
+
 int ncsi_init_netlink(struct net_device *dev);
 int ncsi_unregister_netlink(struct net_device *dev);
 
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index d66b347..dd931d2 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -16,9 +16,11 @@
 #include <net/ncsi.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
+#include <net/genetlink.h>
 
 #include "internal.h"
 #include "ncsi-pkt.h"
+#include "ncsi-netlink.h"
 
 static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 				 unsigned short payload)
@@ -32,15 +34,25 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 	 * before calling this function.
 	 */
 	h = (struct ncsi_rsp_pkt_hdr *)skb_network_header(nr->rsp);
-	if (h->common.revision != NCSI_PKT_REVISION)
+
+	if (h->common.revision != NCSI_PKT_REVISION) {
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: unsupported header revision\n");
 		return -EINVAL;
-	if (ntohs(h->common.length) != payload)
+	}
+	if (ntohs(h->common.length) != payload) {
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: payload length mismatched\n");
 		return -EINVAL;
+	}
 
 	/* Check on code and reason */
 	if (ntohs(h->code) != NCSI_PKT_RSP_C_COMPLETED ||
-	    ntohs(h->reason) != NCSI_PKT_RSP_R_NO_ERROR)
-		return -EINVAL;
+	    ntohs(h->reason) != NCSI_PKT_RSP_R_NO_ERROR) {
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: non zero response/reason code\n");
+		return -EPERM;
+	}
 
 	/* Validate checksum, which might be zeroes if the
 	 * sender doesn't support checksum according to NCSI
@@ -52,8 +64,11 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 
 	checksum = ncsi_calculate_checksum((unsigned char *)h,
 					   sizeof(*h) + payload - 4);
-	if (*pchecksum != htonl(checksum))
+
+	if (*pchecksum != htonl(checksum)) {
+		netdev_dbg(nr->ndp->ndev.dev, "NCSI: checksum mismatched\n");
 		return -EINVAL;
+	}
 
 	return 0;
 }
@@ -941,6 +956,26 @@ static int ncsi_rsp_handler_gpuuid(struct ncsi_request *nr)
 	return 0;
 }
 
+static int ncsi_rsp_handler_netlink(struct ncsi_request *nr)
+{
+	struct ncsi_rsp_pkt *rsp;
+	struct ncsi_dev_priv *ndp = nr->ndp;
+	struct ncsi_package *np;
+	struct ncsi_channel *nc;
+	int ret;
+
+	/* Find the package */
+	rsp = (struct ncsi_rsp_pkt *)skb_network_header(nr->rsp);
+	ncsi_find_package_and_channel(ndp, rsp->rsp.common.channel,
+				      &np, &nc);
+	if (!np)
+		return -ENODEV;
+
+	ret = ncsi_send_netlink_rsp(nr, np, nc);
+
+	return ret;
+}
+
 static struct ncsi_rsp_handler {
 	unsigned char	type;
 	int             payload;
@@ -1043,6 +1078,17 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
 		netdev_warn(ndp->ndev.dev,
 			    "NCSI: 'bad' packet ignored for type 0x%x\n",
 			    hdr->type);
+
+		if (nr->flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+			if (ret == -EPERM)
+				goto out_netlink;
+			else
+				ncsi_send_netlink_err(ndp->ndev.dev,
+						      nr->snd_seq,
+						      nr->snd_portid,
+						      &nr->nlhdr,
+						      ret);
+		}
 		goto out;
 	}
 
@@ -1052,6 +1098,17 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
 		netdev_err(ndp->ndev.dev,
 			   "NCSI: Handler for packet type 0x%x returned %d\n",
 			   hdr->type, ret);
+
+out_netlink:
+	if (nr->flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+		ret = ncsi_rsp_handler_netlink(nr);
+		if (ret) {
+			netdev_err(ndp->ndev.dev,
+				   "NCSI: Netlink handler for packet type 0x%x returned %d\n",
+				   hdr->type, ret);
+		}
+	}
+
 out:
 	ncsi_free_request(nr);
 	return ret;
-- 
2.9.3
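
The package/channel range check in ncsi_send_cmd_nl() and the
NCSI_TO_CHANNEL() encoding can be sketched together in userspace C. The
SK_* names and the function are hypothetical, and the shift value of 5
is an assumption about NCSI_PACKAGE_SHIFT (it is consistent with 8
packages of 32 channels fitting in one byte).

```c
#include <errno.h>
#include <stdint.h>

#define SK_PACKAGE_SHIFT 5	/* assumed value of NCSI_PACKAGE_SHIFT */
#define SK_MAX_PACKAGE   8	/* NCSI_MAX_PACKAGE in the patch       */
#define SK_MAX_CHANNEL   32	/* NCSI_MAX_CHANNEL in the patch       */

/*
 * Hypothetical helper: reject out-of-range IDs with -ERANGE, as the
 * netlink handler does, then pack both IDs into the one-byte channel
 * field the way NCSI_TO_CHANNEL(p, c) does.
 */
static int ncsi_encode_channel(uint32_t package_id, uint32_t channel_id,
			       uint8_t *out)
{
	if (package_id >= SK_MAX_PACKAGE || channel_id >= SK_MAX_CHANNEL)
		return -ERANGE;

	*out = (uint8_t)((package_id << SK_PACKAGE_SHIFT) | channel_id);
	return 0;
}
```

Under these assumptions, package 1 with the reserved channel 0x1f
encodes to 0x3f, and package 8 or channel 32 is rejected.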



* [PATCH] qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface
From: Giacinto Cifelli @ 2018-10-10 18:05 UTC (permalink / raw)
  To: netdev; +Cc: Giacinto Cifelli

Added support for Gemalto's Cinterion ALASxx WWAN interfaces
by adding QMI_FIXED_INTF with Cinterion's VID and PID.

Signed-off-by: Giacinto Cifelli <gciofono@gmail.com>
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 533b6fb8d923..72a55b6b4211 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1241,6 +1241,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x0b3c, 0xc00b, 4)},	/* Olivetti Olicard 500 */
 	{QMI_FIXED_INTF(0x1e2d, 0x0060, 4)},	/* Cinterion PLxx */
 	{QMI_FIXED_INTF(0x1e2d, 0x0053, 4)},	/* Cinterion PHxx,PXxx */
+	{QMI_FIXED_INTF(0x1e2d, 0x0063, 10)},	/* Cinterion ALASxx (1 RmNet) */
 	{QMI_FIXED_INTF(0x1e2d, 0x0082, 4)},	/* Cinterion PHxx,PXxx (2 RmNet) */
 	{QMI_FIXED_INTF(0x1e2d, 0x0082, 5)},	/* Cinterion PHxx,PXxx (2 RmNet) */
 	{QMI_FIXED_INTF(0x1e2d, 0x0083, 4)},	/* Cinterion PHxx,PXxx (1 RmNet + USB Audio)*/
-- 
2.17.1
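
To illustrate what a QMI_FIXED_INTF entry expresses, the sketch below
matches a device on vendor ID, product ID and a fixed interface number.
It is a simplified userspace model with hypothetical names, not the USB
core's matching code; the table holds only the Cinterion entries visible
in the hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one QMI_FIXED_INTF(vid, pid, ifnum) entry. */
struct fixed_intf {
	uint16_t vid;
	uint16_t pid;
	uint8_t  ifnum;
};

static const struct fixed_intf cinterion_ids[] = {
	{ 0x1e2d, 0x0060, 4 },	/* Cinterion PLxx */
	{ 0x1e2d, 0x0053, 4 },	/* Cinterion PHxx,PXxx */
	{ 0x1e2d, 0x0063, 10 },	/* Cinterion ALASxx (1 RmNet) */
	{ 0x1e2d, 0x0082, 4 },	/* Cinterion PHxx,PXxx (2 RmNet) */
	{ 0x1e2d, 0x0082, 5 },	/* Cinterion PHxx,PXxx (2 RmNet) */
};

/* Return 1 only if all three fields match one table entry. */
static int fixed_intf_match(uint16_t vid, uint16_t pid, uint8_t ifnum)
{
	size_t i;

	for (i = 0; i < sizeof(cinterion_ids) / sizeof(cinterion_ids[0]); i++) {
		if (cinterion_ids[i].vid == vid &&
		    cinterion_ids[i].pid == pid &&
		    cinterion_ids[i].ifnum == ifnum)
			return 1;
	}
	return 0;
}
```

In this model the ALASxx matches only on interface 10; the same VID and
PID on interface 4 would not match.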

* Re: [PATCH] bpf: bpftool, add support for attaching programs to maps
From: John Fastabend @ 2018-10-10 17:53 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: ast, daniel, netdev
In-Reply-To: <20181010101146.18e7d3ff@cakuba.netronome.com>

On 10/10/2018 10:11 AM, Jakub Kicinski wrote:
> On Wed, 10 Oct 2018 09:44:26 -0700, John Fastabend wrote:
>> Sock map/hash introduce support for attaching programs to maps. To
>> date I have been doing this with custom tooling but this is less than
>> ideal as we shift to using bpftool as the single CLI for our BPF uses.
>> This patch adds new sub commands 'attach' and 'detach' to the 'prog'
>> command to attach programs to maps and then detach them.
>>
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> ---
>>  tools/bpf/bpftool/main.h |    1 +
>>  tools/bpf/bpftool/prog.c |   92 ++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 92 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
>> index 40492cd..9ceb2b6 100644
>> --- a/tools/bpf/bpftool/main.h
>> +++ b/tools/bpf/bpftool/main.h
>> @@ -137,6 +137,7 @@ int cmd_select(const struct cmd *cmds, int argc, char **argv,
>>  int do_cgroup(int argc, char **arg);
>>  int do_perf(int argc, char **arg);
>>  int do_net(int argc, char **arg);
>> +int do_attach_cmd(int argc, char **arg);
> 
> Looks like a leftover?
> 

Yeah, originally I made attach/detach its own top-level command,
but it seems a better fit under prog.

[..]

>> +	if (!REQ_ARGS(4)) {
> 
> Hm, 4 or 5?  id $prog $type id $map ?
> 

Yep thanks.
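
To make the count concrete, here is a small userspace sketch of checking
for the five-word form "id PROG TYPE id MAP". The helper names are
hypothetical stand-ins and not bpftool code, and the sketch only handles
the "id" form of the program and map references.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a "have at least N arguments" check. */
static bool have_args(int argc, int want)
{
	return argc >= want;
}

/*
 * Accept only the five-word shape: id PROG TYPE id MAP.
 * With a minimum of 4, the map ID could be missing and argv[4]
 * would be read past the end of the supplied arguments.
 */
static bool attach_args_ok(int argc, char **argv)
{
	if (!have_args(argc, 5))
		return false;

	return strcmp(argv[0], "id") == 0 && strcmp(argv[3], "id") == 0;
}
```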

[...]

>> +
>> +	NEXT_ARG();
> 
> nit: maybe NEXT_ARG() should be grouped with the code that consumes the
>      parameter, i.e. new line after not before?

sure.

> 
>> +	mapfd = map_parse_fd(&argc, &argv);
>> +	if (mapfd < 0)
>> +		return mapfd;
>> +
>> +	err = bpf_prog_attach(progfd, mapfd, attach_type, 0);
>> +	if (err) {
>> +		p_err("failed prog attach to map");
>> +		return -EINVAL;
>> +	}
> 
> Could you plunk a
> 
> if (json_output)
> 	jsonw_null(json_wtr);
> 
> here to always produce valid JSON even for commands with no output
> today?
> 

Makes sense.

> Same comments for detach.

[...]

> Would you mind updating the man page and the bash completions?
> 

Will do this. Thanks.

* RE: [PATCH net-next v4] net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
From: Justin.Lee1 @ 2018-10-10 17:50 UTC (permalink / raw)
  To: vijaykhemka, sam, joel; +Cc: linux-aspeed, netdev, openbmc, amithash, christian
In-Reply-To: <57729D0A-A210-41F4-BAF8-3BDB2FCBF1B5@fb.com>

Hi Vijay,

I will address the comment and send out the v5.

Thanks,
Justin


> Hi Justin,
> Please see minor comments below.
> 
> Regards
> -Vijay
> 
> On 10/10/18, 9:01 AM, "Justin.Lee1@Dell.com" <Justin.Lee1@Dell.com> wrote:
> 
>     The new command (NCSI_CMD_SEND_CMD) is added to allow user space application
>     to send NC-SI command to the network card.
>     Also, add a new attribute (NCSI_ATTR_DATA) for transferring request and response.
>     
>     The work flow is as below. 
>     
>     Request:
>     User space application
>     	-> Netlink interface (msg)
>     	-> new Netlink handler - ncsi_send_cmd_nl()
>     	-> ncsi_xmit_cmd()
>     
>     Response:
>     Response received - ncsi_rcv_rsp()
>     	-> internal response handler - ncsi_rsp_handler_xxx()
>     	-> ncsi_rsp_handler_netlink()
>     	-> ncsi_send_netlink_rsp ()
>     	-> Netlink interface (msg)
>     	-> user space application
>     
>     Command timeout - ncsi_request_timeout()
>     	-> ncsi_send_netlink_timeout ()
>     	-> Netlink interface (msg with zero data length)
>     	-> user space application
>     
>     Error:
>     Error detected
>     	-> ncsi_send_netlink_err ()
>     	-> Netlink interface (err msg)
>     	-> user space application
>     
>     
>     Signed-off-by: Justin Lee <justin.lee1@dell.com> 
>     
>     
>     ---
>     V4: Update comments and remove some debug message.
>     V3: Based on http://patchwork.ozlabs.org/patch/979688/ to remove the duplicated code.
>     V2: Remove non-related debug message and clean up the code.
>     
>     
>     --- a/net/ncsi/internal.h
>     +++ b/net/ncsi/internal.h
>     @@ -175,6 +175,8 @@ struct ncsi_package;
>      #define NCSI_RESERVED_CHANNEL	0x1f
>      #define NCSI_CHANNEL_INDEX(c)	((c) & ((1 << NCSI_PACKAGE_SHIFT) - 1))
>      #define NCSI_TO_CHANNEL(p, c)	(((p) << NCSI_PACKAGE_SHIFT) | (c))
>     +#define NCSI_MAX_PACKAGE	8
>     +#define NCSI_MAX_CHANNEL	32
>      
>      struct ncsi_channel {
>      	unsigned char               id;
>     @@ -219,12 +221,17 @@ struct ncsi_request {
>      	unsigned char        id;      /* Request ID - 0 to 255           */
>      	bool                 used;    /* Request that has been assigned  */
>      	unsigned int         flags;   /* NCSI request property           */
>     -#define NCSI_REQ_FLAG_EVENT_DRIVEN	1
>     +#define NCSI_REQ_FLAG_EVENT_DRIVEN		1
>   
>   Why extra space added above?
> 
>     +#define NCSI_REQ_FLAG_NETLINK_DRIVEN	2
>      	struct ncsi_dev_priv *ndp;    /* Associated NCSI device          */
>      	struct sk_buff       *cmd;    /* Associated NCSI command packet  */
>      	struct sk_buff       *rsp;    /* Associated NCSI response packet */
>      	struct timer_list    timer;   /* Timer on waiting for response   */
>      	bool                 enabled; /* Time has been enabled or not    */
>     +
>     +	u32                  snd_seq;     /* netlink sending sequence number */
>     +	u32                  snd_portid;  /* netlink portid of sender        */
>     +	struct nlmsghdr      nlhdr;       /* netlink message header          */
>      };
>      
>   Extra line inside structure may not be required.
> 
>      enum {
>     @@ -310,6 +317,7 @@ struct ncsi_cmd_arg {
>      		unsigned int   dwords[4];
>      	};
>      	unsigned char        *data;       /* NCSI OEM data                 */
>     +	struct genl_info     *info;       /* Netlink information           */
>      };
>      
>     diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
>     index 45f33d6..62e191f 100644
>     --- a/net/ncsi/ncsi-netlink.c
>     +++ b/net/ncsi/ncsi-netlink.c
>     @@ -20,6 +20,7 @@
>      #include <uapi/linux/ncsi.h>
>      
>      #include "internal.h"
>     +#include "ncsi-pkt.h"
>      #include "ncsi-netlink.h"
>      
>      static struct genl_family ncsi_genl_family;
>     @@ -29,6 +30,7 @@ static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
>      	[NCSI_ATTR_PACKAGE_LIST] =	{ .type = NLA_NESTED },
>      	[NCSI_ATTR_PACKAGE_ID] =	{ .type = NLA_U32 },
>      	[NCSI_ATTR_CHANNEL_ID] =	{ .type = NLA_U32 },
>     +	[NCSI_ATTR_DATA] =		{ .type = NLA_BINARY, .len = 2048 },
>      };
>      
>      static struct ncsi_dev_priv *ndp_from_ifindex(struct net *net, u32 ifindex)
>     @@ -366,6 +368,198 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
>      	return 0;
>      }
>      
>     +static int ncsi_send_cmd_nl(struct sk_buff *msg, struct genl_info *info)
>     +{
>     +	struct ncsi_dev_priv *ndp;
>     +
>     +	struct ncsi_cmd_arg nca;
>     +	struct ncsi_pkt_hdr *hdr;
>     +
>     +	u32 package_id, channel_id;
>     +	unsigned char *data;
>     +	int len, ret;
>     +
>     +	if (!info || !info->attrs) {
>     +		ret = -EINVAL;
>     +		goto out;
>     +	}
>     +
>     +	if (!info->attrs[NCSI_ATTR_IFINDEX]) {
>     +		ret = -EINVAL;
>     +		goto out;
>     +	}
>     +
>     +	if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) {
>     +		ret = -EINVAL;
>     +		goto out;
>     +	}
>     +
>     +	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
>     +		ret = -EINVAL;
>     +		goto out;
>     +	}
>     +
>     +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
>     +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
>     +	if (!ndp) {
>     +		ret = -ENODEV;
>     +		goto out;
>     +	}
>     +
>     +	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
>     +	channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
>     +
>     +	if (package_id >= NCSI_MAX_PACKAGE || channel_id >= NCSI_MAX_CHANNEL) {
>     +		ret = -ERANGE;
>     +		goto out_netlink;
>     +	}
>     +
>     +	len = nla_len(info->attrs[NCSI_ATTR_DATA]);
>     +	if (len < sizeof(struct ncsi_pkt_hdr)) {
>     +		netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n",
>     +			    package_id);
>     +		ret = -EINVAL;
>     +		goto out_netlink;
>     +	} else {
>     +		data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]);
>     +	}
>     +
>     +	hdr = (struct ncsi_pkt_hdr *)data;
>     +
>     +	nca.ndp = ndp;
>     +	nca.package = (unsigned char)package_id;
>     +	nca.channel = (unsigned char)channel_id;
>     +	nca.type = hdr->type;
>     +	nca.req_flags = NCSI_REQ_FLAG_NETLINK_DRIVEN;
>     +	nca.info = info;
>     +	nca.payload = ntohs(hdr->length);
>     +	nca.data = data + sizeof(*hdr);
>     +
>     +	ret = ncsi_xmit_cmd(&nca);
>     +out_netlink:
>     +	if (ret != 0) {
>     +		netdev_err(ndp->ndev.dev,
>     +			   "NCSI: Error %d sending OEM command\n",
>     +			   ret);
>     +		ncsi_send_netlink_err(ndp->ndev.dev,
>     +				      info->snd_seq,
>     +				      info->snd_portid,
>     +				      info->nlhdr,
>     +				      ret);
>     +	}
> 
>   OEM should be removed from above message.
> 
>     +out:
>     +	return ret;
>     +}
>     +



^ permalink raw reply

* Re: [PATCH bpf-next v2 3/7] bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
From: Mauricio Vasquez @ 2018-10-10 17:47 UTC (permalink / raw)
  To: Song Liu; +Cc: Alexei Starovoitov, Daniel Borkmann, Networking
In-Reply-To: <CAPhsuW5Hs3yEpgy61FHkkXQzkgD320RWOeA5n9PVpqRGsOLM6w@mail.gmail.com>



On 10/10/2018 11:48 AM, Song Liu wrote:
> On Wed, Oct 10, 2018 at 7:06 AM Mauricio Vasquez B
> <mauricio.vasquez@polito.it> wrote:
>> The following patch implements a bpf queue/stack maps that
>> provides the peek/pop/push functions.  There is not a direct
>> relationship between those functions and the current maps
>> syscalls, hence a new MAP_LOOKUP_AND_DELETE_ELEM syscall is added,
>> this is mapped to the pop operation in the queue/stack maps
>> and it is still to implement in other kind of maps.
>>
>> Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
>> ---
>>   include/linux/bpf.h      |    1 +
>>   include/uapi/linux/bpf.h |    1 +
>>   kernel/bpf/syscall.c     |   82 ++++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 84 insertions(+)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index 9b558713447f..5793f0c7fbb5 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -39,6 +39,7 @@ struct bpf_map_ops {
>>          void *(*map_lookup_elem)(struct bpf_map *map, void *key);
>>          int (*map_update_elem)(struct bpf_map *map, void *key, void *value, u64 flags);
>>          int (*map_delete_elem)(struct bpf_map *map, void *key);
>> +       void *(*map_lookup_and_delete_elem)(struct bpf_map *map, void *key);
>>
>>          /* funcs called by prog_array and perf_event_array map */
>>          void *(*map_fd_get_ptr)(struct bpf_map *map, struct file *map_file,
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index f9187b41dff6..3bb94aa2d408 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -103,6 +103,7 @@ enum bpf_cmd {
>>          BPF_BTF_LOAD,
>>          BPF_BTF_GET_FD_BY_ID,
>>          BPF_TASK_FD_QUERY,
>> +       BPF_MAP_LOOKUP_AND_DELETE_ELEM,
>>   };
>>
>>   enum bpf_map_type {
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index f36c080ad356..6907d661dea5 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -980,6 +980,85 @@ static int map_get_next_key(union bpf_attr *attr)
>>          return err;
>>   }
>>
>> +#define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
>> +
>> +static int map_lookup_and_delete_elem(union bpf_attr *attr)
>> +{
>> +       void __user *ukey = u64_to_user_ptr(attr->key);
>> +       void __user *uvalue = u64_to_user_ptr(attr->value);
>> +       int ufd = attr->map_fd;
>> +       struct bpf_map *map;
>> +       void *key, *value, *ptr;
>> +       u32 value_size;
>> +       struct fd f;
>> +       int err;
>> +
>> +       if (CHECK_ATTR(BPF_MAP_LOOKUP_AND_DELETE_ELEM))
>> +               return -EINVAL;
>> +
>> +       f = fdget(ufd);
>> +       map = __bpf_map_get(f);
>> +       if (IS_ERR(map))
>> +               return PTR_ERR(map);
>> +
>> +       if (!(f.file->f_mode & FMODE_CAN_WRITE)) {
>> +               err = -EPERM;
>> +               goto err_put;
>> +       }
>> +
>> +       key = __bpf_copy_key(ukey, map->key_size);
>> +       if (IS_ERR(key)) {
>> +               err = PTR_ERR(key);
>> +               goto err_put;
>> +       }
>> +
>> +       value_size = map->value_size;
>> +
>> +       err = -ENOMEM;
>> +       value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
>> +       if (!value)
>> +               goto free_key;
>> +
>> +       err = -EFAULT;
>> +       if (copy_from_user(value, uvalue, value_size) != 0)
>> +               goto free_value;
>> +
>> +       /* must increment bpf_prog_active to avoid kprobe+bpf triggering from
>> +        * inside bpf map update or delete otherwise deadlocks are possible
>> +        */
>> +       preempt_disable();
>> +       __this_cpu_inc(bpf_prog_active);
>> +       if (map->ops->map_lookup_and_delete_elem) {
>> +               rcu_read_lock();
>> +               ptr = map->ops->map_lookup_and_delete_elem(map, key);
>> +               if (ptr)
>> +                       memcpy(value, ptr, value_size);
> I think we are exposed to race condition with push and pop in parallel.
> map_lookup_and_delete_elem() only updates the head/tail, so it gives
> no protection for the buffer pointed by ptr.

The queue/stack maps do not use this 'ptr'; the pop operation copies the
value directly into the buffer allocated in map_lookup_and_delete_elem().
Both the copy from the queue/stack buffer into 'value' and the head/tail
update are protected by a spinlock in the queue/stack maps implementation.

On the other hand, any future implementation of the map_lookup_and_delete
operation in other kinds of maps should guarantee that the returned ptr is
RCU protected.
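
To illustrate the locking pattern described above, here is a minimal
userspace sketch. It is not the kernel implementation: struct queue,
queue_push(), queue_pop(), QCAP and VSIZE are hypothetical names, and a C11
atomic_flag spinlock stands in for the kernel spinlock. The point it shows is
that pop copies the element into the caller's buffer and advances the head
under the same lock, so no pointer into the ring buffer ever escapes.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define QCAP  4	/* hypothetical queue capacity */
#define VSIZE 8	/* hypothetical fixed value size in bytes */

struct queue {
	atomic_flag lock;	/* stands in for the kernel spinlock */
	uint32_t head;		/* index of the next element to pop */
	uint32_t count;		/* number of stored elements */
	uint8_t slots[QCAP][VSIZE];
};

#define QUEUE_INITIALIZER { .lock = ATOMIC_FLAG_INIT, .head = 0, .count = 0 }

static void queue_lock(struct queue *q)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Returns 0 on success, -1 if the queue is full. */
static int queue_push(struct queue *q, const void *value)
{
	int ret = -1;

	queue_lock(q);
	if (q->count < QCAP) {
		uint32_t tail = (q->head + q->count) % QCAP;

		memcpy(q->slots[tail], value, VSIZE);
		q->count++;
		ret = 0;
	}
	queue_unlock(q);
	return ret;
}

/*
 * Returns 0 on success, -1 if the queue is empty. The copy into the
 * caller's buffer and the head update happen under the same lock.
 */
static int queue_pop(struct queue *q, void *value)
{
	int ret = -1;

	queue_lock(q);
	if (q->count > 0) {
		memcpy(value, q->slots[q->head], VSIZE);
		q->head = (q->head + 1) % QCAP;
		q->count--;
		ret = 0;
	}
	queue_unlock(q);
	return ret;
}
```

A concurrent push can only run before or after the whole copy-and-advance
step, which is why the returned data cannot be overwritten mid-copy in this
model.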

Does it make sense to you?

Thanks,
Mauricio.
> Thanks,
> Song
>
>> +               rcu_read_unlock();
>> +               err = ptr ? 0 : -ENOENT;
>> +       } else {
>> +               err = -ENOTSUPP;
>> +       }
>> +
>> +       __this_cpu_dec(bpf_prog_active);
>> +       preempt_enable();
>> +
>> +       if (err)
>> +               goto free_value;
>> +
>> +       if (copy_to_user(uvalue, value, value_size) != 0)
>> +               goto free_value;
>> +
>> +       err = 0;
>> +
>> +free_value:
>> +       kfree(value);
>> +free_key:
>> +       kfree(key);
>> +err_put:
>> +       fdput(f);
>> +       return err;
>> +}
>> +
>>   static const struct bpf_prog_ops * const bpf_prog_types[] = {
>>   #define BPF_PROG_TYPE(_id, _name) \
>>          [_id] = & _name ## _prog_ops,
>> @@ -2453,6 +2532,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>>          case BPF_TASK_FD_QUERY:
>>                  err = bpf_task_fd_query(&attr, uattr);
>>                  break;
>> +       case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
>> +               err = map_lookup_and_delete_elem(&attr);
>> +               break;
>>          default:
>>                  err = -EINVAL;
>>                  break;
>>

^ permalink raw reply

* Re: [PATCH net-next] net: enable RPS on vlan devices
From: John Fastabend @ 2018-10-10 17:37 UTC (permalink / raw)
  To: Eric Dumazet, Shannon Nelson, davem, netdev; +Cc: silviu.smarandache
In-Reply-To: <ff43dd14-37e8-115a-15b2-27fa4bbbfd28@gmail.com>

On 10/10/2018 10:14 AM, Eric Dumazet wrote:
> 
> 
> On 10/10/2018 09:18 AM, Shannon Nelson wrote:
>> On 10/9/2018 7:17 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 10/09/2018 07:11 PM, Shannon Nelson wrote:
>>>>
>>>> Hence the reason we sent this as an RFC a couple of weeks ago.  We got no response, so followed up with this patch in order to get some input. Do you have any suggestions for how we might accomplish this in a less ugly way?
>>>
>>> I dunno, maybe a modern way for all these very specific needs would be to use an eBPF
>>> hook to implement whatever combination of RPS/RFS/what_have_you
>>>
>>> Then, we no longer have to review what various strategies are used by users.
>>
>> We're trying to make use of an existing useful feature that was designed for exactly this kind of problem.  It is already there and no new user training is needed.  We're actually fixing what could arguably be called a bug since the /sys/class/net/<dev>/queues/rx-0/rps_cpus entry exists for vlan devices but currently doesn't do anything.  We're also addressing a security concern related to the recent L1TF excitement.
>>
>> For this case, we want to target the network stack processing to happen on a certain subset of CPUs.  With admittedly only a cursory look through eBPF, I don't see an obvious way to target the packet processing to an alternative CPU, unless we add yet another field to the skb that eBPF/XDP could fill and then query that field in the same time as we currently check get_rps_cpu().  But adding to the skb is usually frowned upon unless absolutely necessary, and this seems like a duplication of what we already have with RPS, so why add a competing feature?
>>
>> Back to my earlier question: are there any suggestions for how we might accomplish this in a less ugly way?
> 
> 
> What if you want to have efficient multi queue processing ?
> The Vlan device could have multiple RX queues, but you forced queue_mapping=0
> 
> Honestly, RPS & RFS show their age and complexity (look at net/core/net-sysfs.c ...)
> 
> We should not expand it, we should put in place a new infrastructure, fully expandable.
> With socket lookups, we even can avoid having a hashtable for flow information, removing
> one cache miss, and removing flow collisions.
> 
> eBPF seems perfect to me.
> 

The latest tree now has an sk_lookup() helper, supported in the 'tc' layer,
to look up the socket. XDP also has support for a "cpumap" object
that allows redirecting to remote CPUs. Neither was specifically
designed for this, but I suspect that with some extra work these might
be what is needed.

I would start by looking at bpf_sk_lookup() in filter.c and the
cpumap type in ./kernel/bpf/cpumap.c. Also, in general, sk_lookup
from the XDP layer will likely be needed shortly anyway.

> It is time that we stop adding core infra that most users do not need/use.
> (RPS and RFS are default off)
> 

^ permalink raw reply

* Re: [PATCH] r8152: limit MAC pass-through to one device
From: David Miller @ 2018-10-10 17:30 UTC (permalink / raw)
  To: oneukum; +Cc: Mario.Limonciello, jkohoutek, netdev
In-Reply-To: <1539191825.691.1.camel@suse.com>

From: Oliver Neukum <oneukum@suse.com>
Date: Wed, 10 Oct 2018 19:17:05 +0200

> On Mi, 2018-10-10 at 17:18 +0000, Mario.Limonciello@dell.com wrote:
>> > 
>> > MAC address having to be unique, a MAC coming from the host
>> > must be used at most once at a time. Hence the users must
>> > be recorded and additional users must fall back to conventional
>> > methods.
>> 
>> I checked with the internal team, and pass-through MAC is actually applied by both the
>> Windows Realtek driver and the UEFI network stack; this applies to ALL supported
>> Realtek devices (R8153-AD).
> 
> I may have formulated this badly. What happens if you attach two
> devices of this type to the same host?

If the devices are on different physical network segments it will
work.

This was common back in the day even.  Old Sparc machines had a host
defined MAC address, and even with multi-port cards they would all use
that same MAC address unless the firmware nodes had individual device
MAC addresses defined.

I think it's a valid configuration.

But I can see that elements of userspace such as udev will likely not
be happy because their device naming schemes are built on the
assumption that all MAC addresses are unique.

^ permalink raw reply

* Re: [PATCH] r8152: limit MAC pass-through to one device
From: Oliver Neukum @ 2018-10-10 17:17 UTC (permalink / raw)
  To: Mario.Limonciello, davem, Jan Kohoutek, netdev
In-Reply-To: <ddf3edbee9754d3684adfec6bb0f220a@ausx13mpc120.AMER.DELL.COM>

On Mi, 2018-10-10 at 17:18 +0000, Mario.Limonciello@dell.com wrote:
> > 
> > MAC address having to be unique, a MAC coming from the host
> > must be used at most once at a time. Hence the users must
> > be recorded and additional users must fall back to conventional
> > methods.
> 
> I checked with the internal team, and pass-through MAC is actually applied by both the
> Windows Realtek driver and the UEFI network stack; this applies to ALL supported
> Realtek devices (R8153-AD).

I may have formulated this badly. What happens if you attach two
devices of this type to the same host?

	Regards
		Oliver

^ permalink raw reply

* Re: [PATCH net] net: make skb_partial_csum_set() more robust against overflows
From: David Miller @ 2018-10-10 17:22 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, herbert
In-Reply-To: <20181010135935.21533-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Wed, 10 Oct 2018 06:59:35 -0700

> syzbot managed to crash in skb_checksum_help() [1] :
> 
>         BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
> 
> Root cause is the following check in skb_partial_csum_set()
> 
> 	if (unlikely(start > skb_headlen(skb)) ||
> 	    unlikely((int)start + off > skb_headlen(skb) - 2))
> 		return false;
> 
> If skb_headlen(skb) is 1, then (skb_headlen(skb) - 2) becomes 0xffffffff
> and the check fails to detect that ((int)start + off) is off the limit,
> since the compare is unsigned.
> 
> When we fix that, then the first condition (start > skb_headlen(skb))
> becomes obsolete.
> 
> Then we should also check that (skb_headroom(skb) + start) wont
> overflow 16bit field.
> 
> [1]
> kernel BUG at net/core/dev.c:2880!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 7330 Comm: syz-executor4 Not tainted 4.19.0-rc6+ #253
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:skb_checksum_help+0x9e3/0xbb0 net/core/dev.c:2880
> Code: 85 00 ff ff ff 48 c1 e8 03 42 80 3c 28 00 0f 84 09 fb ff ff 48 8b bd 00 ff ff ff e8 97 a8 b9 fb e9 f8 fa ff ff e8 2d 09 76 fb <0f> 0b 48 8b bd 28 ff ff ff e8 1f a8 b9 fb e9 b1 f6 ff ff 48 89 cf
> RSP: 0018:ffff8801d83a6f60 EFLAGS: 00010293
> RAX: ffff8801b9834380 RBX: ffff8801b9f8d8c0 RCX: ffffffff8608c6d7
> RDX: 0000000000000000 RSI: ffffffff8608cc63 RDI: 0000000000000006
> RBP: ffff8801d83a7068 R08: ffff8801b9834380 R09: 0000000000000000
> R10: ffff8801d83a76d8 R11: 0000000000000000 R12: 0000000000000001
> R13: 0000000000010001 R14: 000000000000ffff R15: 00000000000000a8
> FS:  00007f1a66db5700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f7d77f091b0 CR3: 00000001ba252000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  skb_csum_hwoffload_help+0x8f/0xe0 net/core/dev.c:3269
>  validate_xmit_skb+0xa2a/0xf30 net/core/dev.c:3312
>  __dev_queue_xmit+0xc2f/0x3950 net/core/dev.c:3797
>  dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
>  packet_snd net/packet/af_packet.c:2928 [inline]
>  packet_sendmsg+0x422d/0x64c0 net/packet/af_packet.c:2953
> 
> Fixes: 5ff8dda3035d ("net: Ensure partial checksum offset is inside the skb head")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Reported-by: syzbot <syzkaller@googlegroups.com>

Applied and queued up for -stable, thanks Eric.
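
The unsigned wraparound described in the quoted commit message can be
reproduced in a small userspace model. This is only a sketch under stated
assumptions: old_check_ok() and new_check_ok() are hypothetical names, plain
integers stand in for the skb fields, and new_check_ok() is one possible
overflow-safe shape based on the description above, not necessarily the patch
that was applied.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the original check. headlen is unsigned, so (headlen - 2)
 * wraps to a huge value when headlen < 2, and the int sum is converted
 * to unsigned for the comparison (the cast makes that explicit).
 * Returns true if the offsets are accepted.
 */
static bool old_check_ok(unsigned int headlen, uint16_t start, uint16_t off)
{
	if (start > headlen ||
	    (unsigned int)((int)start + off) > headlen - 2)
		return false;
	return true;
}

/*
 * One possible overflow-safe shape (an assumption): compute the end of
 * the 2-byte checksum field in 32 bits and compare it against headlen
 * directly, then make sure headroom + start still fits in 16 bits.
 */
static bool new_check_ok(unsigned int headroom, unsigned int headlen,
			 uint16_t start, uint16_t off)
{
	uint32_t csum_end = (uint32_t)start + (uint32_t)off + sizeof(uint16_t);
	uint32_t csum_start = headroom + (uint32_t)start;

	if (csum_start > UINT16_MAX || csum_end > headlen)
		return false;
	return true;
}
```

With headlen equal to 1, the old model accepts start = 0 and off = 0 even
though a 2-byte checksum cannot fit, while the second model rejects it. The
separate start > headlen test is no longer needed in the second model because
csum_end > headlen already covers it.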

^ permalink raw reply

* Re: [PATCH net-next v4] net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
From: Vijay Khemka @ 2018-10-10 17:19 UTC (permalink / raw)
  To: Justin.Lee1@Dell.com, sam@mendozajonas.com, joel@jms.id.au
  Cc: linux-aspeed@lists.ozlabs.org, netdev@vger.kernel.org,
	openbmc@lists.ozlabs.org, Amithash Prasad, christian@cmd.nu
In-Reply-To: <7c7f95baba3a467e89aabf726a5230f7@AUSX13MPS302.AMER.DELL.COM>

Hi Justin,
Please see minor comments below.

Regards
-Vijay

On 10/10/18, 9:01 AM, "Justin.Lee1@Dell.com" <Justin.Lee1@Dell.com> wrote:

    A new command (NCSI_CMD_SEND_CMD) is added to allow a user space application
    to send NC-SI commands to the network card.
    Also, add a new attribute (NCSI_ATTR_DATA) for transferring the request and response.
    
    The workflow is as follows.
    
    Request:
    User space application
    	-> Netlink interface (msg)
    	-> new Netlink handler - ncsi_send_cmd_nl()
    	-> ncsi_xmit_cmd()
    
    Response:
    Response received - ncsi_rcv_rsp()
    	-> internal response handler - ncsi_rsp_handler_xxx()
    	-> ncsi_rsp_handler_netlink()
    	-> ncsi_send_netlink_rsp ()
    	-> Netlink interface (msg)
    	-> user space application
    
    Command timeout - ncsi_request_timeout()
    	-> ncsi_send_netlink_timeout ()
    	-> Netlink interface (msg with zero data length)
    	-> user space application
    
    Error:
    Error detected
    	-> ncsi_send_netlink_err ()
    	-> Netlink interface (err msg)
    	-> user space application
    
    
    Signed-off-by: Justin Lee <justin.lee1@dell.com> 
    
    
    ---
    V4: Update comments and remove some debug message.
    V3: Based on http://patchwork.ozlabs.org/patch/979688/ to remove the duplicated code.
    V2: Remove non-related debug message and clean up the code.
    
    
    --- a/net/ncsi/internal.h
    +++ b/net/ncsi/internal.h
    @@ -175,6 +175,8 @@ struct ncsi_package;
     #define NCSI_RESERVED_CHANNEL	0x1f
     #define NCSI_CHANNEL_INDEX(c)	((c) & ((1 << NCSI_PACKAGE_SHIFT) - 1))
     #define NCSI_TO_CHANNEL(p, c)	(((p) << NCSI_PACKAGE_SHIFT) | (c))
    +#define NCSI_MAX_PACKAGE	8
    +#define NCSI_MAX_CHANNEL	32
     
     struct ncsi_channel {
     	unsigned char               id;
    @@ -219,12 +221,17 @@ struct ncsi_request {
     	unsigned char        id;      /* Request ID - 0 to 255           */
     	bool                 used;    /* Request that has been assigned  */
     	unsigned int         flags;   /* NCSI request property           */
    -#define NCSI_REQ_FLAG_EVENT_DRIVEN	1
    +#define NCSI_REQ_FLAG_EVENT_DRIVEN		1
  
  Why was extra whitespace added above?

    +#define NCSI_REQ_FLAG_NETLINK_DRIVEN	2
     	struct ncsi_dev_priv *ndp;    /* Associated NCSI device          */
     	struct sk_buff       *cmd;    /* Associated NCSI command packet  */
     	struct sk_buff       *rsp;    /* Associated NCSI response packet */
     	struct timer_list    timer;   /* Timer on waiting for response   */
     	bool                 enabled; /* Time has been enabled or not    */
    +
    +	u32                  snd_seq;     /* netlink sending sequence number */
    +	u32                  snd_portid;  /* netlink portid of sender        */
    +	struct nlmsghdr      nlhdr;       /* netlink message header          */
     };
     
  Extra line inside structure may not be required.

     enum {
    @@ -310,6 +317,7 @@ struct ncsi_cmd_arg {
     		unsigned int   dwords[4];
     	};
     	unsigned char        *data;       /* NCSI OEM data                 */
    +	struct genl_info     *info;       /* Netlink information           */
     };
     
    diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
    index 45f33d6..62e191f 100644
    --- a/net/ncsi/ncsi-netlink.c
    +++ b/net/ncsi/ncsi-netlink.c
    @@ -20,6 +20,7 @@
     #include <uapi/linux/ncsi.h>
     
     #include "internal.h"
    +#include "ncsi-pkt.h"
     #include "ncsi-netlink.h"
     
     static struct genl_family ncsi_genl_family;
    @@ -29,6 +30,7 @@ static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
     	[NCSI_ATTR_PACKAGE_LIST] =	{ .type = NLA_NESTED },
     	[NCSI_ATTR_PACKAGE_ID] =	{ .type = NLA_U32 },
     	[NCSI_ATTR_CHANNEL_ID] =	{ .type = NLA_U32 },
    +	[NCSI_ATTR_DATA] =		{ .type = NLA_BINARY, .len = 2048 },
     };
     
     static struct ncsi_dev_priv *ndp_from_ifindex(struct net *net, u32 ifindex)
    @@ -366,6 +368,198 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
     	return 0;
     }
     
    +static int ncsi_send_cmd_nl(struct sk_buff *msg, struct genl_info *info)
    +{
    +	struct ncsi_dev_priv *ndp;
    +
    +	struct ncsi_cmd_arg nca;
    +	struct ncsi_pkt_hdr *hdr;
    +
    +	u32 package_id, channel_id;
    +	unsigned char *data;
    +	int len, ret;
    +
    +	if (!info || !info->attrs) {
    +		ret = -EINVAL;
    +		goto out;
    +	}
    +
    +	if (!info->attrs[NCSI_ATTR_IFINDEX]) {
    +		ret = -EINVAL;
    +		goto out;
    +	}
    +
    +	if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) {
    +		ret = -EINVAL;
    +		goto out;
    +	}
    +
    +	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
    +		ret = -EINVAL;
    +		goto out;
    +	}
    +
    +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
    +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
    +	if (!ndp) {
    +		ret = -ENODEV;
    +		goto out;
    +	}
    +
    +	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
    +	channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
    +
    +	if (package_id >= NCSI_MAX_PACKAGE || channel_id >= NCSI_MAX_CHANNEL) {
    +		ret = -ERANGE;
    +		goto out_netlink;
    +	}
    +
    +	len = nla_len(info->attrs[NCSI_ATTR_DATA]);
    +	if (len < sizeof(struct ncsi_pkt_hdr)) {
    +		netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n",
    +			    package_id);
    +		ret = -EINVAL;
    +		goto out_netlink;
    +	} else {
    +		data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]);
    +	}
    +
    +	hdr = (struct ncsi_pkt_hdr *)data;
    +
    +	nca.ndp = ndp;
    +	nca.package = (unsigned char)package_id;
    +	nca.channel = (unsigned char)channel_id;
    +	nca.type = hdr->type;
    +	nca.req_flags = NCSI_REQ_FLAG_NETLINK_DRIVEN;
    +	nca.info = info;
    +	nca.payload = ntohs(hdr->length);
    +	nca.data = data + sizeof(*hdr);
    +
    +	ret = ncsi_xmit_cmd(&nca);
    +out_netlink:
    +	if (ret != 0) {
    +		netdev_err(ndp->ndev.dev,
    +			   "NCSI: Error %d sending OEM command\n",
    +			   ret);
    +		ncsi_send_netlink_err(ndp->ndev.dev,
    +				      info->snd_seq,
    +				      info->snd_portid,
    +				      info->nlhdr,
    +				      ret);
    +	}

  "OEM" should be removed from the above message.

    +out:
    +	return ret;
    +}
    +
     
    
    


^ permalink raw reply

* RE: [PATCH] r8152: limit MAC pass-through to one device
From: Mario.Limonciello @ 2018-10-10 17:18 UTC (permalink / raw)
  To: oneukum, netdev, davem, jkohoutek
In-Reply-To: <20181010142933.31051-1-oneukum@suse.com>

> 
> MAC address having to be unique, a MAC coming from the host
> must be used at most once at a time. Hence the users must
> be recorded and additional users must fall back to conventional
> methods.

I checked with the internal team, and pass-through MAC is actually applied by both the
Windows Realtek driver and the UEFI network stack; this applies to ALL supported
Realtek devices (R8153-AD).

> 
> Signed-off-by: Oliver Neukum <oneukum@suse.com>
> Fixes: 34ee32c9a5696 ("r8152: Add support for setting pass through MAC address on RTL8153-AD")
> ---
>  drivers/net/usb/r8152.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index f1b5201cc320..7345a2258ee4 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -766,6 +766,9 @@ enum tx_csum_stat {
>  	TX_CSUM_NONE
>  };
> 
> +/* pass through MACs are per host, hence concurrent use is forbidden */
> +static struct r8152 *pass_through_user = NULL;
> +
>  /* Maximum number of multicast addresses to filter (vs. Rx-all-multicast).
>   * The RTL chips use a 64 element hash table based on the Ethernet CRC.
>   */
> @@ -1221,7 +1224,14 @@ static int set_ethernet_addr(struct r8152 *tp)
>  		 * or system doesn't provide valid _SB.AMAC this will be
>  		 * be expected to non-zero
>  		 */
> -		ret = vendor_mac_passthru_addr_read(tp, &sa);
> +		if (!pass_through_user) {
> +			ret = vendor_mac_passthru_addr_read(tp, &sa);
> +			if (ret >= 0)
> +				/* we must record the user against concurrent use */
> +				pass_through_user = tp;
> +		} else {
> +			ret = -EBUSY;
> +		}
>  		if (ret < 0)
>  			ret = pla_ocp_read(tp, PLA_BACKUP, 8, sa.sa_data);
>  	}
> @@ -5304,6 +5314,8 @@ static void rtl8152_disconnect(struct usb_interface *intf)
>  		cancel_delayed_work_sync(&tp->hw_phy_work);
>  		tp->rtl_ops.unload(tp);
>  		free_netdev(tp->netdev);
> +		if (pass_through_user == tp)
> +			pass_through_user = NULL;
>  	}
>  }
> 
> --
> 2.16.4
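
The bookkeeping in the quoted patch can be modelled in isolation as a single
ownership slot. The sketch below is a userspace illustration only: struct nic,
claim_pass_through() and release_pass_through() are hypothetical names that do
not exist in the driver, and, like the quoted patch, the model does no locking,
so it assumes claim and release never run concurrently.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-device structure. */
struct nic {
	int id;
};

/* The single device currently using the host-provided MAC, or NULL. */
static struct nic *pass_through_user;

/*
 * Try to become the pass-through user. Returns 0 on success, or -EBUSY
 * if another device already holds the slot, in which case the caller is
 * expected to fall back to its conventional MAC address source.
 */
static int claim_pass_through(struct nic *tp)
{
	if (pass_through_user)
		return -EBUSY;
	pass_through_user = tp;
	return 0;
}

/* Release the slot, but only if this device is the one holding it. */
static void release_pass_through(struct nic *tp)
{
	if (pass_through_user == tp)
		pass_through_user = NULL;
}
```

In this model a second device gets -EBUSY until the first one is released,
which mirrors the fallback to pla_ocp_read() in the quoted diff.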

^ permalink raw reply

* Re: [PATCH net-next] net: enable RPS on vlan devices
From: Eric Dumazet @ 2018-10-10 17:14 UTC (permalink / raw)
  To: Shannon Nelson, Eric Dumazet, davem, netdev; +Cc: silviu.smarandache
In-Reply-To: <6d7227d4-a7e0-0979-42bc-c7af61b35192@oracle.com>



On 10/10/2018 09:18 AM, Shannon Nelson wrote:
> On 10/9/2018 7:17 PM, Eric Dumazet wrote:
>>
>>
>> On 10/09/2018 07:11 PM, Shannon Nelson wrote:
>>>
>>> Hence the reason we sent this as an RFC a couple of weeks ago.  We got no response, so followed up with this patch in order to get some input. Do you have any suggestions for how we might accomplish this in a less ugly way?
>>
>> I dunno, maybe a modern way for all these very specific needs would be to use an eBPF
>> hook to implement whatever combination of RPS/RFS/what_have_you
>>
>> Then, we no longer have to review what various strategies are used by users.
> 
> We're trying to make use of an existing useful feature that was designed for exactly this kind of problem.  It is already there and no new user training is needed.  We're actually fixing what could arguably be called a bug since the /sys/class/net/<dev>/queues/rx-0/rps_cpus entry exists for vlan devices but currently doesn't do anything.  We're also addressing a security concern related to the recent L1TF excitement.
> 
> For this case, we want to target the network stack processing to happen on a certain subset of CPUs.  With admittedly only a cursory look through eBPF, I don't see an obvious way to target the packet processing to an alternative CPU, unless we add yet another field to the skb that eBPF/XDP could fill and then query that field in the same time as we currently check get_rps_cpu().  But adding to the skb is usually frowned upon unless absolutely necessary, and this seems like a duplication of what we already have with RPS, so why add a competing feature?
> 
> Back to my earlier question: are there any suggestions for how we might accomplish this in a less ugly way?


What if you want to have efficient multi-queue processing?
The VLAN device could have multiple RX queues, but you forced queue_mapping=0.

Honestly, RPS and RFS show their age and complexity (look at net/core/net-sysfs.c ...)

We should not expand it; we should put in place a new, fully expandable infrastructure.
With socket lookups, we can even avoid having a hashtable for flow information, removing
one cache miss and removing flow collisions.

eBPF seems perfect to me.

It is time that we stop adding core infra that most users do not need/use.
(RPS and RFS are default off)

^ permalink raw reply

* Re: [PATCH] bpf: bpftool, add support for attaching programs to maps
From: Jakub Kicinski @ 2018-10-10 17:11 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev
In-Reply-To: <20181010164426.9321.42364.stgit@john-Precision-Tower-5810>

On Wed, 10 Oct 2018 09:44:26 -0700, John Fastabend wrote:
> Sock map/hash introduce support for attaching programs to maps. To
> date I have been doing this with custom tooling but this is less than
> ideal as we shift to using bpftool as the single CLI for our BPF uses.
> This patch adds new sub commands 'attach' and 'detach' to the 'prog'
> command to attach programs to maps and then detach them.
> 
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  tools/bpf/bpftool/main.h |    1 +
>  tools/bpf/bpftool/prog.c |   92 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 92 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
> index 40492cd..9ceb2b6 100644
> --- a/tools/bpf/bpftool/main.h
> +++ b/tools/bpf/bpftool/main.h
> @@ -137,6 +137,7 @@ int cmd_select(const struct cmd *cmds, int argc, char **argv,
>  int do_cgroup(int argc, char **arg);
>  int do_perf(int argc, char **arg);
>  int do_net(int argc, char **arg);
> +int do_attach_cmd(int argc, char **arg);

Looks like a leftover?

>  int prog_parse_fd(int *argc, char ***argv);
>  int map_parse_fd(int *argc, char ***argv);
> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
> index b1cd3bc..280881d 100644
> --- a/tools/bpf/bpftool/prog.c
> +++ b/tools/bpf/bpftool/prog.c
> @@ -77,6 +77,26 @@
>  	[BPF_PROG_TYPE_FLOW_DISSECTOR]	= "flow_dissector",
>  };
>  
> +static const char * const attach_type_strings[] = {
> +	[BPF_SK_SKB_STREAM_PARSER] = "stream_parser",
> +	[BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict",
> +	[BPF_SK_MSG_VERDICT] = "msg_verdict",
> +	[__MAX_BPF_ATTACH_TYPE] = NULL,
> +};
> +
> +enum bpf_attach_type parse_attach_type(const char *str)
> +{
> +	enum bpf_attach_type type;
> +
> +	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
> +		if (attach_type_strings[type] &&
> +		    is_prefix(str, attach_type_strings[type]))
> +			return type;
> +	}
> +
> +	return __MAX_BPF_ATTACH_TYPE;
> +}
> +
>  static void print_boot_time(__u64 nsecs, char *buf, unsigned int size)
>  {
>  	struct timespec real_time_ts, boot_time_ts;
> @@ -697,6 +717,71 @@ int map_replace_compar(const void *p1, const void *p2)
>  	return a->idx - b->idx;
>  }
>  
> +static int do_attach(int argc, char **argv)
> +{
> +	enum bpf_attach_type attach_type;
> +	int err, mapfd, progfd;
> +
> +	if (!REQ_ARGS(4)) {

Hm, 4 or 5?  id $prog $type id $map ?

> +		p_err("too few parameters for map attach");
> +		return -EINVAL;
> +	}
> +
> +	progfd = prog_parse_fd(&argc, &argv);
> +	if (progfd < 0)
> +		return progfd;
> +
> +	attach_type = parse_attach_type(*argv);
> +	if (attach_type == __MAX_BPF_ATTACH_TYPE) {
> +		p_err("invalid attach type");
> +		return -EINVAL;
> +	}
> +
> +	NEXT_ARG();

nit: maybe NEXT_ARG() should be grouped with the code that consumes the
     parameter, i.e. new line after not before?

> +	mapfd = map_parse_fd(&argc, &argv);
> +	if (mapfd < 0)
> +		return mapfd;
> +
> +	err = bpf_prog_attach(progfd, mapfd, attach_type, 0);
> +	if (err) {
> +		p_err("failed prog attach to map");
> +		return -EINVAL;
> +	}

Could you plunk a

if (json_output)
	jsonw_null(json_wtr);

here to always produce valid JSON even for commands with no output
today?

Same comments for detach.

> +	return 0;
> +}
> +
> +static int do_detach(int argc, char **argv)
> +{

> +}
>  static int do_load(int argc, char **argv)
>  {
>  	enum bpf_attach_type expected_attach_type;
> @@ -942,6 +1027,7 @@ static int do_help(int argc, char **argv)
>  		"       %s %s pin   PROG FILE\n"
>  		"       %s %s load  OBJ  FILE [type TYPE] [dev NAME] \\\n"
>  		"                         [map { idx IDX | name NAME } MAP]\n"
> +		"       %s %s attach PROG ATTACH_TYPE MAP\n"
>  		"       %s %s help\n"
>  		"\n"
>  		"       " HELP_SPEC_MAP "\n"
> @@ -953,10 +1039,12 @@ static int do_help(int argc, char **argv)
>  		"                 cgroup/bind4 | cgroup/bind6 | cgroup/post_bind4 |\n"
>  		"                 cgroup/post_bind6 | cgroup/connect4 | cgroup/connect6 |\n"
>  		"                 cgroup/sendmsg4 | cgroup/sendmsg6 }\n"
> +		"       ATTACH_TYPE := { msg_verdict | skb_verdict | skb_parse }\n"
>  		"       " HELP_SPEC_OPTIONS "\n"
>  		"",
>  		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
> -		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
> +		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
> +		bin_name, argv[-2]);
>  
>  	return 0;
>  }
> @@ -968,6 +1056,8 @@ static int do_help(int argc, char **argv)
>  	{ "dump",	do_dump },
>  	{ "pin",	do_pin },
>  	{ "load",	do_load },
> +	{ "attach",	do_attach },
> +	{ "detach",	do_detach },
>  	{ 0 }
>  };

Would you mind updating the man page and the bash completions?
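
As a rough illustration of how the keyword lookup in the quoted patch behaves, here is a standalone userspace sketch of prefix-based attach-type parsing. The enum values are hypothetical stand-ins for the UAPI constants, and is_prefix() is a simplified re-implementation (it also rejects an empty prefix), not bpftool's own helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the BPF_SK_* attach type constants. */
enum attach_type {
	STREAM_PARSER,
	STREAM_VERDICT,
	MSG_VERDICT,
	MAX_ATTACH_TYPE,
};

static const char * const attach_type_strings[] = {
	[STREAM_PARSER]   = "stream_parser",
	[STREAM_VERDICT]  = "stream_verdict",
	[MSG_VERDICT]     = "msg_verdict",
	[MAX_ATTACH_TYPE] = NULL,
};

/* Return 1 if 'pfx' is a non-empty prefix of 'str', else 0. */
static int is_prefix(const char *pfx, const char *str)
{
	assert(pfx != NULL && str != NULL);

	if (*pfx == '\0' || strlen(str) < strlen(pfx))
		return 0;
	return strncmp(str, pfx, strlen(pfx)) == 0;
}

/* Return the first attach type whose keyword starts with 'str',
 * or MAX_ATTACH_TYPE if nothing matches.
 */
static enum attach_type parse_attach_type(const char *str)
{
	int type;

	for (type = 0; type < MAX_ATTACH_TYPE; type++) {
		if (attach_type_strings[type] &&
		    is_prefix(str, attach_type_strings[type]))
			return (enum attach_type)type;
	}
	return MAX_ATTACH_TYPE;
}
```

Under these assumptions, "msg" resolves to MSG_VERDICT, an ambiguous prefix such as "stream" resolves to the first table entry (STREAM_PARSER), and a keyword that is not in the table, such as "skb_verdict", is rejected.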

^ permalink raw reply

* Re: sparc64 mystery with Cheetah+ D-cache parity error (n_tty_set_termios, bpf_check, cheetah_copy_page_insn)
From: David Miller @ 2018-10-10 17:08 UTC (permalink / raw)
  To: mroos; +Cc: sparclinux, netdev, gregkh, jslaby
In-Reply-To: <a4391645-0c4f-bd99-7371-b0e2cb505542@linux.ee>

From: Meelis Roos <mroos@linux.ee>
Date: Wed, 10 Oct 2018 16:24:59 +0300

> I have seen multiple strange messages like this, on multiple sparc64
> machines:
> 
> [ 55.523882] CPU[1]: Cheetah+ D-cache parity error at
> TPC[0000000000707e8c]
> [   55.626033] TPC<n_tty_set_termios+0x2c/0x3c0>
> 
> This specific one is from n_tty_set_termios and it is currently
> repeatable on a Sun V210.
> I have seen these on V245 and V445 too, with different addresses. On
> V445, the same address caused
> errors on multiple CPUs so it does not seem like a hardware problem,
> rather something software related,
> that's why I am reporting it here.
> 
> On V445 it is gone with my current custom kernels but was there with
> 4.16.0-1-sparc64-smp Debian kernel package,
> probably because I do not have bpfilter compiled in:

I just started getting my older machines up again and I get these
kinds of errors sometimes too, usually in the TSB copy routine.

I'll definitely be looking into this and thanks for the report and
data.

^ permalink raw reply

* Re: [PATCH v9 00/15] octeontx2-af: Add RVU Admin Function driver
From: David Miller @ 2018-10-10 17:07 UTC (permalink / raw)
  To: sunil.kovvuri; +Cc: netdev, arnd, linux-soc, sgoutham
In-Reply-To: <1539175475-5351-1-git-send-email-sunil.kovvuri@gmail.com>

From: sunil.kovvuri@gmail.com
Date: Wed, 10 Oct 2018 18:14:20 +0530

> Resource virtualization unit (RVU) on Marvell's OcteonTX2 SOC maps HW
> resources from the network, crypto and other functional blocks into
> PCI-compatible physical and virtual functions. Each functional block
> again has multiple local functions (LFs) for provisioning to PCI devices.
> RVU supports multiple PCIe SRIOV physical functions (PFs) and virtual
> functions (VFs). PF0 is called the administrative / admin function (AF)
> and has privileges to provision RVU functional block's LFs to each of the
> PF/VF.
> 
> RVU managed networking functional blocks
>  - Network pool allocator (NPA)
>  - Network interface controller (NIX)
>  - Network parser CAM (NPC)
>  - Schedule/Synchronize/Order unit (SSO)
> 
> RVU managed non-networking functional blocks
>  - Crypto accelerator (CPT)
>  - Scheduled timers unit (TIM)
>  - Schedule/Synchronize/Order unit (SSO)
>    Used for both networking and non networking usecases
>  - Compression (upcoming in future variants of the silicons)
> 
> Resource provisioning examples
>  - A PF/VF with NIX-LF & NPA-LF resources works as a pure network device
>  - A PF/VF with CPT-LF resource works as a pure crypto offload device.
> 
> This admin function driver neither receives any data nor processes it, i.e.
> no I/O; it is a configuration-only driver.
> 
> PF/VFs communicate with AF via a shared memory region (mailbox). Upon
> receiving requests from PF/VF, AF does resource provisioning and other
> HW configuration. AF is always attached to host, but PF/VFs may be used
> by host kernel itself, or attached to VMs or to userspace applications
> like DPDK etc. So AF has to handle provisioning/configuration requests
> sent by any device from any domain.
> 
> This patch series adds logic for the following
>  - RVU AF driver with functional blocks provisioning support.
>  - Mailbox infrastructure for communication between AF and PFs.
>  - CGX (MAC controller) driver which communicates with firmware for
>    managing physical ethernet interfaces. AF collects info from this
>    driver and forwards the same to the PF/VFs using these interfaces.
> 
> This is the first set of patches out of 80+ patches.

Series applied.

Please address Arnd's feedback about your usleep based timeout loops
as a follow-on.

Thank you.

^ permalink raw reply

* Re: [PATCH net-next 1/1] qed: Add support for virtual link.
From: David Miller @ 2018-10-10 16:58 UTC (permalink / raw)
  To: sudarsana.kalluru; +Cc: netdev, Tomer.Tayar, Michal.Kalderon
In-Reply-To: <20181010120012.9848-1-sudarsana.kalluru@cavium.com>

From: Sudarsana Reddy Kalluru <sudarsana.kalluru@cavium.com>
Date: Wed, 10 Oct 2018 05:00:12 -0700

> Currently driver registers to physical link notifications (of the device)
> from Management firmware (MFW). Driver doesn't get notified if there's a
> change in the virtual link e.g., link-flap on the peer PF interface.
> Virtual link indication from MFW reflects the per PF link status instead
> of the physical link.
> 
> The patch adds driver support for,
>   - Advertising the virtual link support to MFW.
>   - Handling the virtual link notification from MFW.
> 
> Please consider applying it to 'net-next'.
> 
> Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH bpf-next v2 3/7] bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
From: Song Liu @ 2018-10-10 16:48 UTC (permalink / raw)
  To: mauricio.vasquez; +Cc: Alexei Starovoitov, Daniel Borkmann, Networking
In-Reply-To: <153918037098.8915.14965935582566410782.stgit@kernel>

On Wed, Oct 10, 2018 at 7:06 AM Mauricio Vasquez B
<mauricio.vasquez@polito.it> wrote:
>
> The following patch implements bpf queue/stack maps that
> provide the peek/pop/push functions.  There is no direct
> relationship between those functions and the current map
> syscalls, hence a new MAP_LOOKUP_AND_DELETE_ELEM syscall is added.
> It is mapped to the pop operation in the queue/stack maps
> and is still to be implemented in other kinds of maps.
>
> Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
> ---
>  include/linux/bpf.h      |    1 +
>  include/uapi/linux/bpf.h |    1 +
>  kernel/bpf/syscall.c     |   82 ++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 84 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 9b558713447f..5793f0c7fbb5 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -39,6 +39,7 @@ struct bpf_map_ops {
>         void *(*map_lookup_elem)(struct bpf_map *map, void *key);
>         int (*map_update_elem)(struct bpf_map *map, void *key, void *value, u64 flags);
>         int (*map_delete_elem)(struct bpf_map *map, void *key);
> +       void *(*map_lookup_and_delete_elem)(struct bpf_map *map, void *key);
>
>         /* funcs called by prog_array and perf_event_array map */
>         void *(*map_fd_get_ptr)(struct bpf_map *map, struct file *map_file,
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index f9187b41dff6..3bb94aa2d408 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -103,6 +103,7 @@ enum bpf_cmd {
>         BPF_BTF_LOAD,
>         BPF_BTF_GET_FD_BY_ID,
>         BPF_TASK_FD_QUERY,
> +       BPF_MAP_LOOKUP_AND_DELETE_ELEM,
>  };
>
>  enum bpf_map_type {
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index f36c080ad356..6907d661dea5 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -980,6 +980,85 @@ static int map_get_next_key(union bpf_attr *attr)
>         return err;
>  }
>
> +#define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
> +
> +static int map_lookup_and_delete_elem(union bpf_attr *attr)
> +{
> +       void __user *ukey = u64_to_user_ptr(attr->key);
> +       void __user *uvalue = u64_to_user_ptr(attr->value);
> +       int ufd = attr->map_fd;
> +       struct bpf_map *map;
> +       void *key, *value, *ptr;
> +       u32 value_size;
> +       struct fd f;
> +       int err;
> +
> +       if (CHECK_ATTR(BPF_MAP_LOOKUP_AND_DELETE_ELEM))
> +               return -EINVAL;
> +
> +       f = fdget(ufd);
> +       map = __bpf_map_get(f);
> +       if (IS_ERR(map))
> +               return PTR_ERR(map);
> +
> +       if (!(f.file->f_mode & FMODE_CAN_WRITE)) {
> +               err = -EPERM;
> +               goto err_put;
> +       }
> +
> +       key = __bpf_copy_key(ukey, map->key_size);
> +       if (IS_ERR(key)) {
> +               err = PTR_ERR(key);
> +               goto err_put;
> +       }
> +
> +       value_size = map->value_size;
> +
> +       err = -ENOMEM;
> +       value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
> +       if (!value)
> +               goto free_key;
> +
> +       err = -EFAULT;
> +       if (copy_from_user(value, uvalue, value_size) != 0)
> +               goto free_value;
> +
> +       /* must increment bpf_prog_active to avoid kprobe+bpf triggering from
> +        * inside bpf map update or delete otherwise deadlocks are possible
> +        */
> +       preempt_disable();
> +       __this_cpu_inc(bpf_prog_active);
> +       if (map->ops->map_lookup_and_delete_elem) {
> +               rcu_read_lock();
> +               ptr = map->ops->map_lookup_and_delete_elem(map, key);
> +               if (ptr)
> +                       memcpy(value, ptr, value_size);
I think we are exposed to a race condition with push and pop running in
parallel. map_lookup_and_delete_elem() only updates the head/tail, so it
gives no protection for the buffer pointed to by ptr.

Thanks,
Song

> +               rcu_read_unlock();
> +               err = ptr ? 0 : -ENOENT;
> +       } else {
> +               err = -ENOTSUPP;
> +       }
> +
> +       __this_cpu_dec(bpf_prog_active);
> +       preempt_enable();
> +
> +       if (err)
> +               goto free_value;
> +
> +       if (copy_to_user(uvalue, value, value_size) != 0)
> +               goto free_value;
> +
> +       err = 0;
> +
> +free_value:
> +       kfree(value);
> +free_key:
> +       kfree(key);
> +err_put:
> +       fdput(f);
> +       return err;
> +}
> +
>  static const struct bpf_prog_ops * const bpf_prog_types[] = {
>  #define BPF_PROG_TYPE(_id, _name) \
>         [_id] = & _name ## _prog_ops,
> @@ -2453,6 +2532,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>         case BPF_TASK_FD_QUERY:
>                 err = bpf_task_fd_query(&attr, uattr);
>                 break;
> +       case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
> +               err = map_lookup_and_delete_elem(&attr);
> +               break;
>         default:
>                 err = -EINVAL;
>                 break;
>
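
The race raised in the review above comes from returning a pointer into the ring and copying from it only after head/tail have moved. One way to avoid that, sketched here in userspace C under assumptions, is to have the pop operation copy the element out while the lock is still held. The `struct queue` layout, the `q_*` function names, the fixed capacity, and the C11 atomic_flag spinlock are all hypothetical; this is not the code from the patch series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define QCAP 4u	/* hypothetical fixed capacity */

struct queue {
	atomic_flag lock;	/* minimal spinlock standing in for a kernel lock */
	uint32_t head;		/* next slot to push into */
	uint32_t tail;		/* next slot to pop from */
	uint32_t count;		/* number of stored elements */
	uint64_t slots[QCAP];
};

static void q_init(struct queue *q)
{
	memset(q, 0, sizeof(*q));
	atomic_flag_clear(&q->lock);
}

static void q_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void q_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Push one value; returns 0 on success, -1 if the queue is full. */
static int q_push(struct queue *q, uint64_t v)
{
	int ret = -1;

	assert(q != NULL);
	q_lock(q);
	if (q->count < QCAP) {
		q->slots[q->head] = v;
		q->head = (q->head + 1) % QCAP;
		q->count++;
		ret = 0;
	}
	q_unlock(q);
	return ret;
}

/* Pop one value into *out; returns 0 on success, -1 if the queue is empty.
 * The copy happens before the lock is dropped, so a concurrent push cannot
 * overwrite the slot while it is being read.
 */
static int q_pop_copy(struct queue *q, uint64_t *out)
{
	int ret = -1;

	assert(q != NULL && out != NULL);
	q_lock(q);
	if (q->count > 0) {
		*out = q->slots[q->tail];
		q->tail = (q->tail + 1) % QCAP;
		q->count--;
		ret = 0;
	}
	q_unlock(q);
	return ret;
}
```

The design point is that the caller never sees a pointer into the ring, only a private copy, so the slot can be reused by a later push as soon as the lock is released.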

^ permalink raw reply

* [PATCH] bpf: bpftool, add support for attaching programs to maps
From: John Fastabend @ 2018-10-10 16:44 UTC (permalink / raw)
  To: jakub.kicinski, ast, daniel; +Cc: netdev

Sock map/hash introduce support for attaching programs to maps. To
date I have been doing this with custom tooling but this is less than
ideal as we shift to using bpftool as the single CLI for our BPF uses.
This patch adds new sub commands 'attach' and 'detach' to the 'prog'
command to attach programs to maps and then detach them.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/bpf/bpftool/main.h |    1 +
 tools/bpf/bpftool/prog.c |   92 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 40492cd..9ceb2b6 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -137,6 +137,7 @@ int cmd_select(const struct cmd *cmds, int argc, char **argv,
 int do_cgroup(int argc, char **arg);
 int do_perf(int argc, char **arg);
 int do_net(int argc, char **arg);
+int do_attach_cmd(int argc, char **arg);
 
 int prog_parse_fd(int *argc, char ***argv);
 int map_parse_fd(int *argc, char ***argv);
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index b1cd3bc..280881d 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -77,6 +77,26 @@
 	[BPF_PROG_TYPE_FLOW_DISSECTOR]	= "flow_dissector",
 };
 
+static const char * const attach_type_strings[] = {
+	[BPF_SK_SKB_STREAM_PARSER] = "stream_parser",
+	[BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict",
+	[BPF_SK_MSG_VERDICT] = "msg_verdict",
+	[__MAX_BPF_ATTACH_TYPE] = NULL,
+};
+
+enum bpf_attach_type parse_attach_type(const char *str)
+{
+	enum bpf_attach_type type;
+
+	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
+		if (attach_type_strings[type] &&
+		    is_prefix(str, attach_type_strings[type]))
+			return type;
+	}
+
+	return __MAX_BPF_ATTACH_TYPE;
+}
+
 static void print_boot_time(__u64 nsecs, char *buf, unsigned int size)
 {
 	struct timespec real_time_ts, boot_time_ts;
@@ -697,6 +717,71 @@ int map_replace_compar(const void *p1, const void *p2)
 	return a->idx - b->idx;
 }
 
+static int do_attach(int argc, char **argv)
+{
+	enum bpf_attach_type attach_type;
+	int err, mapfd, progfd;
+
+	if (!REQ_ARGS(4)) {
+		p_err("too few parameters for map attach");
+		return -EINVAL;
+	}
+
+	progfd = prog_parse_fd(&argc, &argv);
+	if (progfd < 0)
+		return progfd;
+
+	attach_type = parse_attach_type(*argv);
+	if (attach_type == __MAX_BPF_ATTACH_TYPE) {
+		p_err("invalid attach type");
+		return -EINVAL;
+	}
+
+	NEXT_ARG();
+	mapfd = map_parse_fd(&argc, &argv);
+	if (mapfd < 0)
+		return mapfd;
+
+	err = bpf_prog_attach(progfd, mapfd, attach_type, 0);
+	if (err) {
+		p_err("failed prog attach to map");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int do_detach(int argc, char **argv)
+{
+	enum bpf_attach_type attach_type;
+	int err, mapfd, progfd;
+
+	if (!REQ_ARGS(4)) {
+		p_err("too few parameters for map detach");
+		return -EINVAL;
+	}
+
+	progfd = prog_parse_fd(&argc, &argv);
+	if (progfd < 0)
+		return progfd;
+
+	attach_type = parse_attach_type(*argv);
+	if (attach_type == __MAX_BPF_ATTACH_TYPE) {
+		p_err("invalid attach type");
+		return -EINVAL;
+	}
+
+	NEXT_ARG();
+	mapfd = map_parse_fd(&argc, &argv);
+	if (mapfd < 0)
+		return mapfd;
+
+	err = bpf_prog_detach2(progfd, mapfd, attach_type);
+	if (err) {
+		p_err("failed prog detach from map");
+		return -EINVAL;
+	}
+	return 0;
+}
 static int do_load(int argc, char **argv)
 {
 	enum bpf_attach_type expected_attach_type;
@@ -942,6 +1027,7 @@ static int do_help(int argc, char **argv)
 		"       %s %s pin   PROG FILE\n"
 		"       %s %s load  OBJ  FILE [type TYPE] [dev NAME] \\\n"
 		"                         [map { idx IDX | name NAME } MAP]\n"
+		"       %s %s attach PROG ATTACH_TYPE MAP\n"
 		"       %s %s help\n"
 		"\n"
 		"       " HELP_SPEC_MAP "\n"
@@ -953,10 +1039,12 @@ static int do_help(int argc, char **argv)
 		"                 cgroup/bind4 | cgroup/bind6 | cgroup/post_bind4 |\n"
 		"                 cgroup/post_bind6 | cgroup/connect4 | cgroup/connect6 |\n"
 		"                 cgroup/sendmsg4 | cgroup/sendmsg6 }\n"
+		"       ATTACH_TYPE := { msg_verdict | skb_verdict | skb_parse }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
-		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2]);
 
 	return 0;
 }
@@ -968,6 +1056,8 @@ static int do_help(int argc, char **argv)
 	{ "dump",	do_dump },
 	{ "pin",	do_pin },
 	{ "load",	do_load },
+	{ "attach",	do_attach },
+	{ "detach",	do_detach },
 	{ 0 }
 };
 

^ permalink raw reply related

* Re: Possible bug in traffic control?
From: Cong Wang @ 2018-10-10 16:39 UTC (permalink / raw)
  To: jcoombs; +Cc: Linux Kernel Network Developers
In-Reply-To: <CACcUnf-HWq0U5Vw4sXkx6wn+mApxm5y2-WeN6=HX7Am0+Bb=3Q@mail.gmail.com>

On Wed, Oct 10, 2018 at 8:54 AM Josh Coombs <jcoombs@staff.gwi.net> wrote:
>
> 2.3 billion 1 byte packets failed to re-create the bug.  To try and
> simplify the setup I removed macsec from the equation, using a single
> host in the middle as the bridge.  Interestingly, rather than 1.3Gbits
> a second in both directions, it ran around 8Mbits a second.  Switching
> the filter from u32 to matchall didn't change the performance.  Going
> back to the four machine test bed, again removing macsec and just
> bridging through radically decreased the throughput to around 8Mbits.
> Flip on macsec for the bridge and 1.3Gbits?

This narrows it down nicely! We can rule out macsec as the culprit.

Can you share a minimum reproducer for this problem? If so I can take
a look.

^ permalink raw reply

* Re: Re: BUG: corrupted list in p9_read_work
From: Dmitry Vyukov @ 2018-10-10 16:36 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: syzbot, David Miller, Eric Van Hensbergen, LKML, Latchesar Ionkov,
	netdev, Ron Minnich, syzkaller-bugs, v9fs-developer
In-Reply-To: <20181010161003.GA5371@nautica>

On Wed, Oct 10, 2018 at 6:10 PM, Dominique Martinet
<asmadeus@codewreck.org> wrote:
> Dominique Martinet wrote on Wed, Oct 10, 2018:
>> It works though, is it just picky because I didn't end it in .git? let's
>> try again, sorry for the noise...
>>
>> #syz test: git://github.com/martinetd/linux.git e4ca13f7d075e551dc158df6af18fb412a1dba0a
>
> And I guess the commit hash needs to be in the default clone branch to
> work ?
> ('git fetch <repo> <hash>' happily fetches the commit in a new clone for
> me... But that feels like a github specific behaviour maybe)

yeeeep, this is a bug:
https://github.com/google/syzkaller/issues/728

Turns out git fetch of a named remote and just a tree work
differently. The latter only fetches the main branch.

Is 'git fetch <repo> <hash>' a thing? Does it require special server
configuration? I remember something similar that wasn't able to fetch
a random commit hash all the time...

The plan was to make a named remote and then fetch it, this should
fetch everything.


> Oh, well; made a branch for it, last try for me.
>
> #syz test: git://github.com/martinetd/linux.git for-syzbot
>
> --
> Dominique

^ permalink raw reply
