Netdev List


* Re: [PATCH iproute2-next 3/9] rdma: Add filtering infrastructure
From: David Ahern @ 2018-01-03  2:47 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford, Jason Gunthorpe
  Cc: RDMA mailing list, Leon Romanovsky, netdev, Stephen Hemminger
In-Reply-To: <20180102093725.6172-4-leon@kernel.org>

On 1/2/18 2:37 AM, Leon Romanovsky wrote:
> +/*
> + * Check if string entry is filtered:
> + *  * key doesn't exist -> user didn't request -> not filtered
> + */
> +bool rd_check_is_string_filtered(struct rd *rd, const char *key, char *val)
> +{
> +	bool key_is_filtered = false;
> +	struct filter_entry *fe;
> +	char *p = NULL;
> +	char *str;
> +
> +	list_for_each_entry(fe, &rd->filter_list, list) {
> +		if (!strcmpx(fe->key, key)) {
> +			/* We found the key */
> +			p = strdup(fe->value);

if (p == NULL) ...

> +
> +			/*
> +			 * Need to check if value in range
> +			 * It can come in the following formats
> +			 * and their permutations:
> +			 * str
> +			 * str1,str2
> +			 */
> +			str = strtok(p, ",");
> +			while (str) {
> +				if (!strcmpx(str, val)) {
> +					key_is_filtered = true;
> +					goto out;
> +				}
> +				str = strtok(NULL, ",");
> +			}
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	free(p);
> +	return key_is_filtered;
> +}
> +
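For illustration, a minimal stand-alone sketch of the lookup with the strdup() result checked. This is not the patch's code: value_in_filter() is a hypothetical helper, plain strcmp() stands in for strcmpx(), and returning -1 on allocation failure is just one possible policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: "filter" is the comma-separated value stored
 * for a key, "val" is the candidate string.
 * Returns 1 on match, 0 on no match, -1 if strdup() fails.
 */
static int value_in_filter(const char *filter, const char *val)
{
	char *p, *str;
	int found = 0;

	p = strdup(filter);
	if (p == NULL)
		return -1;	/* allocation failed, caller decides what to do */

	/* strtok() modifies its argument, hence the private copy */
	for (str = strtok(p, ","); str; str = strtok(NULL, ",")) {
		if (strcmp(str, val) == 0) {
			found = 1;
			break;
		}
	}

	free(p);
	return found;
}
```

With the check in place, an out-of-memory condition is reported to the caller instead of handing a NULL pointer to strtok().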


* Re: [PATCH] 3c59x: fix missing dma_mapping_error check
From: David Miller @ 2018-01-03  2:48 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, nhorman, klassert
In-Reply-To: <20171229164010.1991-1-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Fri, 29 Dec 2017 11:40:10 -0500

> @@ -2067,6 +2072,9 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  		int len = (skb->len + 3) & ~3;
>  		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
>  						PCI_DMA_TODEVICE);
> +		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma))
> +			return NETDEV_TX_OK;
> +

This leaks the SKB, right?

And for the RX cases, it allows the RX ring to deplete to empty which
tends to hang most chips.  You need to make the DMA failure detection
early and recycle the RX buffer back to the chip instead of passing
it up to the stack.


* Re: [patch iproute2 v4 3/3] man: Add -bs option to tc manpage
From: Chris Mi @ 2018-01-03  2:48 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: netdev, gerlitz.or, stephen, dsahern
In-Reply-To: <20180102200718.GB725@localhost.localdomain>

On 2018/1/3 4:07, Marcelo Ricardo Leitner wrote:
> On Tue, Jan 02, 2018 at 11:28:04PM +0900, Chris Mi wrote:
>> Signed-off-by: Chris Mi <chrism@mellanox.com>
>> ---
>>   man/man8/tc.8 | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/man/man8/tc.8 b/man/man8/tc.8
>> index ff071b33..de137e16 100644
>> --- a/man/man8/tc.8
>> +++ b/man/man8/tc.8
>> @@ -601,6 +601,11 @@ must exist already.
>>   read commands from provided file or standard input and invoke them.
>>   First failure will cause termination of tc.
>>   
>> +.TP
>> +.BR "\-bs", " \-bs size", " \-batchsize", " \-batchsize size"
>> +How many commands are accumulated before sending to kernel.
>> +By default, it is 1. It only takes effect in batch mode.
>> +
> You should also describe the limitations it has. Like, it only works
> for action and filter and that it shouldn't be mixed with other
> commands.
Done.
> And maybe even do such check in the code: refuse to do other commands
> if batch_size > 1.
I didn't add it because I'm afraid the benefit would be lost if I added
the check. Instead, I added a warning to the man page.
>
>>   .TP
>>   .BR "\-force"
>>   don't terminate tc on errors in batch mode.
>> -- 
>> 2.14.3
>>


* Re: [PATCH net] ethtool: do not print warning for applications using legacy API
From: David Miller @ 2018-01-03  2:50 UTC (permalink / raw)
  To: stephen; +Cc: decot, netdev, linux-kernel
In-Reply-To: <20171229180252.6981-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri, 29 Dec 2017 10:02:52 -0800

> From: Stephen Hemminger <stephen@networkplumber.org>
> 
> In the kernel log this message appears on every boot:
>  "warning: `NetworkChangeNo' uses legacy ethtool link settings API,
>   link modes are only partially reported"
> 
> When the ethtool link settings API changed, the kernel started complaining
> about uses of the old API. Ironically, the original patch was from Google
> but the application using the legacy API is Chrome.

Chrome on my machine doesn't do this, FWIW...

> Linux ABI is fixed as much as possible. The kernel must not break it
> and should not complain about applications using legacy APIs.
> This patch just removes the warning since using legacy APIs
> in Linux is perfectly acceptable.
> 
> Fixes: 3f1ac7a700d0 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Applied and queued up for -stable.


* Re: [PATCH iproute2-next 6/9] rdma: Update kernel header file
From: David Ahern @ 2018-01-03  2:50 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford, Jason Gunthorpe
  Cc: RDMA mailing list, Leon Romanovsky, netdev, Stephen Hemminger
In-Reply-To: <20180102093725.6172-7-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

On 1/2/18 2:37 AM, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> Synchronize iproute2 package with latest kernel
> RDMA netlink header file.
> 
> Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  include/uapi/rdma/rdma_netlink.h | 58 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 56 insertions(+), 2 deletions(-)

FYI: uapi headers are updated separately. Once the patches hit net-next,
I will update the headers and drop this patch.


* Re: [PATCH net] cxgb4: Fix FW flash errors
From: David Miller @ 2018-01-03  2:51 UTC (permalink / raw)
  To: ganeshgr; +Cc: netdev, nirranjan, indranil, venkatesh, arjun, leedom
In-Reply-To: <1514572569-28462-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Sat, 30 Dec 2017 00:06:09 +0530

> From: Arjun Vynipadath <arjun@chelsio.com>
> 
> Initialize adapter->params.sf_fw_start to fix firmware flash
> issues. Use existing macros defined for FW flash addresses.
> 
> Fixes: 96ac18f14a5a ("cxgb4: Add support for new flash parts")
> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

This commit log message doesn't match the patch.

You say "Initialize adapter->params.sf_fw_start", but this patch
is removing that struct member altogether.


* Re: [net 1/1] tipc: fix problems with multipoint-to-point flow control
From: David Miller @ 2018-01-03  2:52 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, mohan.krishna.ghanta.krishnamurthy, tung.q.nguyen,
	hoang.h.le, canh.d.luu, ying.xue, tipc-discussion
In-Reply-To: <1514573282-4355-1-git-send-email-jon.maloy@ericsson.com>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Fri, 29 Dec 2017 19:48:02 +0100

> In commit 04d7b574b245 ("tipc: add multipoint-to-point flow control") we
> introduced a protocol for preventing buffer overflow when many group
> members try to simultaneously send messages to the same receiving member.
> 
> Stress test of this mechanism has revealed a couple of related bugs:
> 
> - When the receiving member receives an advertisement REMIT message from
>   one of the senders, it will sometimes prematurely activate a pending
>   member and send it the remitted advertisement, although the upper
>   limit for active senders has been reached. This leads to accumulation
>   of illegal advertisements, and eventually to messages being dropped
>   because of receive buffer overflow.
> 
> - When the receiving member leaves REMITTED state while a received
>   message is being read, we fail to check the pending queue and
>   activate the oldest pending peer. This leads to some pending senders
>   being starved out, and never getting the opportunity to profit from
>   the remitted advertisement.
> 
> We fix the former in the function tipc_group_proto_rcv() by returning
> directly from the function once it becomes clear that the remitting
> peer cannot leave REMITTED state at that point.
> 
> We fix the latter in the function tipc_group_update_rcv_win() by looking
> up and activating the longest pending peer when it becomes clear that the
> remitting peer now can leave REMITTED state.
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied, thanks Jon.


* Re: [PATCH net-next 2/2] tun: allow to attach ebpf socket filter
From: Jason Wang @ 2018-01-03  2:53 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, LKML, Michael S. Tsirkin, Willem de Bruijn
In-Reply-To: <CAF=yD-KydM326DErG5XaafCN=R=p2V48MgUDH0LRpPrXCUeOdA@mail.gmail.com>



On 2018-01-02 17:19, Willem de Bruijn wrote:
>>> More importantly, should this program just return a boolean pass or
>>> drop. Taking a length and trimming may introduce bugs later on if the
>>> stack parses the packet unconditionally, expecting a minimum size
>>> to be present.
>>>
>>> This was the reason for introducing sk_filter_trim_cap and using that
>>> in other sk_filter sites.
>>>
>>> A quick scan shows that tun_put_user expects a full vlan tag to exist
>>> if skb_vlan_tag_present(skb), for instance. If trimmed to below this
>>> length the final call to skb_copy_datagram_iter may have negative
>>> length.
>>>
>>> This is an issue with the existing sk_filter call as much as with the
>>> new run_ebpf_filter call.
>> Good point. Since sk_filter is affected too, we need to fix it
>> anyway. Actually, I considered a boolean return value but in the end
>> decided to follow the style of sk_filter. Maybe the trimming has a real
>> user, e.g. high speed header recoding/analysis? Since it's not hard to
>> fix, how about just keeping it?
> I don't see an obvious use case, but sure. We'll just need to look
> at what the minimum trim length needs to be.

It looks to me that the minimum length is:

skb_vlan_tag_present(skb) ? offsetof(struct vlan_ethhdr, h_vlan_proto) : 0

And considering the vlan tag insertion done in tun_put_user(), we need to
trim 4 more bytes if a vlan tag is present.
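
To put a number on the expression above, here is a small userspace sketch. The struct is a stand-in that mirrors the field layout of the kernel's struct vlan_ethhdr, and min_trim_len() is a hypothetical helper name; how the extra 4 bytes should enter the final trim cap is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the layout of the kernel's struct vlan_ethhdr */
struct vlan_ethhdr_sketch {
	unsigned char h_dest[6];		/* destination MAC */
	unsigned char h_source[6];		/* source MAC */
	uint16_t h_vlan_proto;			/* VLAN TPID */
	uint16_t h_vlan_TCI;			/* priority and VLAN ID */
	uint16_t h_vlan_encapsulated_proto;	/* inner EtherType */
};

/* Hypothetical helper: bytes that must survive trimming */
static size_t min_trim_len(bool vlan_tag_present)
{
	return vlan_tag_present ?
	       offsetof(struct vlan_ethhdr_sketch, h_vlan_proto) : 0;
}
```

With this layout the offset is the two 6-byte MAC addresses, i.e. 12 bytes when a tag is present and 0 otherwise.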

Thanks


* Re: [PATCH net-next] net: dsa: Fix dsa_legacy_register() return value
From: David Miller @ 2018-01-03  2:53 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, privat, andrew, vivien.didelot, linux-kernel
In-Reply-To: <20171229190545.21109-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Fri, 29 Dec 2017 11:05:45 -0800

> We need to make the dsa_legacy_register() stub return 0 in order for
> dsa_init_module() to successfully register and continue registering the
> ETH_P_XDSA packet handler.
> 
> Fixes: 2a93c1a3651f ("net: dsa: Allow compiling out legacy support")
> Reported-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied, thanks Florian.


* [patch iproute2 v5 1/3] lib/libnetlink: Add a function rtnl_talk_msg
From: Chris Mi @ 2018-01-03  2:55 UTC (permalink / raw)
  To: netdev; +Cc: gerlitz.or, stephen, dsahern, marcelo.leitner
In-Reply-To: <20180103025517.3767-1-chrism@mellanox.com>

rtnl_talk can only send a single message to the kernel. Add a new function
rtnl_talk_msg that can send multiple messages to the kernel.

Signed-off-by: Chris Mi <chrism@mellanox.com>
---
 include/libnetlink.h |  3 +++
 lib/libnetlink.c     | 59 ++++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/include/libnetlink.h b/include/libnetlink.h
index a4d83b9e..01d98b16 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -96,6 +96,9 @@ int rtnl_dump_filter_nc(struct rtnl_handle *rth,
 int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 	      struct nlmsghdr **answer)
 	__attribute__((warn_unused_result));
+int rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
+		  struct nlmsghdr **answer)
+	__attribute__((warn_unused_result));
 int rtnl_talk_extack(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 	      struct nlmsghdr **answer, nl_ext_ack_fn_t errfn)
 	__attribute__((warn_unused_result));
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index 00e6ce0c..cc02a139 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -581,32 +581,34 @@ static void rtnl_talk_error(struct nlmsghdr *h, struct nlmsgerr *err,
 		strerror(-err->error));
 }
 
-static int __rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
-		       struct nlmsghdr **answer,
-		       bool show_rtnl_err, nl_ext_ack_fn_t errfn)
+static int __rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
+			   struct nlmsghdr **answer,
+			   bool show_rtnl_err, nl_ext_ack_fn_t errfn)
 {
-	int status;
-	unsigned int seq;
-	struct nlmsghdr *h;
+	int iovlen = m->msg_iovlen;
+	unsigned int seq = 0;
+	int i, status;
+	char *buf;
+
 	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
-	struct iovec iov = {
-		.iov_base = n,
-		.iov_len = n->nlmsg_len
-	};
+	struct iovec iov, *v;
+	struct nlmsghdr *h;
 	struct msghdr msg = {
 		.msg_name = &nladdr,
 		.msg_namelen = sizeof(nladdr),
 		.msg_iov = &iov,
 		.msg_iovlen = 1,
 	};
-	char *buf;
 
-	n->nlmsg_seq = seq = ++rtnl->seq;
-
-	if (answer == NULL)
-		n->nlmsg_flags |= NLM_F_ACK;
+	for (i = 0; i < iovlen; i++) {
+		v = &m->msg_iov[i];
+		h = v->iov_base;
+		h->nlmsg_seq = seq = ++rtnl->seq;
+		if (answer == NULL)
+			h->nlmsg_flags |= NLM_F_ACK;
+	}
 
-	status = sendmsg(rtnl->fd, &msg, 0);
+	status = sendmsg(rtnl->fd, m, 0);
 	if (status < 0) {
 		perror("Cannot talk to rtnetlink");
 		return -1;
@@ -698,12 +700,37 @@ static int __rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 	}
 }
 
+static int __rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
+		       struct nlmsghdr **answer,
+		       bool show_rtnl_err, nl_ext_ack_fn_t errfn)
+{
+	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
+	struct iovec iov = {
+		.iov_base = n,
+		.iov_len = n->nlmsg_len
+	};
+	struct msghdr msg = {
+		.msg_name = &nladdr,
+		.msg_namelen = sizeof(nladdr),
+		.msg_iov = &iov,
+		.msg_iovlen = 1,
+	};
+
+	return __rtnl_talk_msg(rtnl, &msg, answer, show_rtnl_err, errfn);
+}
+
 int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 	      struct nlmsghdr **answer)
 {
 	return __rtnl_talk(rtnl, n, answer, true, NULL);
 }
 
+int rtnl_talk_msg(struct rtnl_handle *rtnl, struct msghdr *m,
+	      struct nlmsghdr **answer)
+{
+	return __rtnl_talk_msg(rtnl, m, answer, true, NULL);
+}
+
 int rtnl_talk_extack(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 		     struct nlmsghdr **answer,
 		     nl_ext_ack_fn_t errfn)
-- 
2.14.3


* [patch iproute2 v5 0/3] tc: Add -bs option to batch mode
From: Chris Mi @ 2018-01-03  2:55 UTC (permalink / raw)
  To: netdev; +Cc: gerlitz.or, stephen, dsahern, marcelo.leitner

Currently in tc batch mode, only one command at a time is read from the
batch file and sent to the kernel. With this patchset, we can accumulate
several commands before sending them to the kernel. The batch size is
specified using option -bs or -batchsize.

To accumulate the commands in tc, the client allocates an array of
struct iovec. If batchsize is bigger than 1, the client calls
rtnl_talk_msg to send the message that includes the iov array only
after it has accumulated batchsize commands. The one exception is when
there are no more commands in the batch file: the partial batch is
sent immediately.
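
A minimal sketch of the accumulate-then-send idea, under stated assumptions: batch_should_send() restates the flush condition used in patch 2/3, while send_accumulated() and demo_roundtrip() are hypothetical helpers that use a local AF_UNIX datagram socketpair instead of a netlink socket, purely so the example runs without privileges.

```c
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Flush when batching is off, the batch is full, or no command follows */
static bool batch_should_send(int batch_size, int index, bool lastline)
{
	return batch_size <= 1 || index + 1 == batch_size || lastline;
}

/* Hand n accumulated buffers to the kernel with a single sendmsg() */
static ssize_t send_accumulated(int fd, struct iovec *iov, int n)
{
	struct msghdr msg = {
		.msg_iov = iov,
		.msg_iovlen = n,
	};

	return sendmsg(fd, &msg, 0);
}

/* Demo over a socketpair (not netlink): returns the bytes received */
static ssize_t demo_roundtrip(void)
{
	char a[] = "cmd1", b[] = "cmd2", c[] = "cmd3";
	struct iovec iov[3] = {
		{ .iov_base = a, .iov_len = 4 },
		{ .iov_base = b, .iov_len = 4 },
		{ .iov_base = c, .iov_len = 4 },
	};
	char rbuf[64];
	ssize_t got = -1;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	/* three buffers leave user space in one system call */
	if (send_accumulated(sv[0], iov, 3) == 12)
		got = recv(sv[1], rbuf, sizeof(rbuf), 0);
	close(sv[0]);
	close(sv[1]);
	return got;
}
```

The point of the sketch is that the three buffers cross the user/kernel boundary in one sendmsg() call rather than three.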

Please note that the kernel still processes the requests one by one.
Processing the requests in parallel in the kernel is a separate effort.
The time saved by this patchset comes from fewer context switches between
user mode and kernel mode, so it works on top of the current kernel.

Using the following script in the kernel tree, we can generate 1,000,000 rules.
        tools/testing/selftests/tc-testing/tdc_batch.py

Without this patchset, 'tc -b $file' execution time is:

real    0m15.125s
user    0m6.982s
sys     0m8.080s

With this patchset, 'tc -b $file -bs 10' execution time is:

real    0m12.772s
user    0m5.984s
sys     0m6.723s

The insertion rate is improved by more than 10%.
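
For reference, the percentages can be derived from the "real" times above with two small helpers. Both function names are hypothetical, and the rule count is assumed to be the same in both runs.

```c
/* Percentage of wall-clock time saved going from t_before to t_after */
static double time_saved_pct(double t_before, double t_after)
{
	return (t_before - t_after) / t_before * 100.0;
}

/* Percentage gain in rules inserted per second for a fixed rule count */
static double rate_gain_pct(double t_before, double t_after)
{
	return (t_before / t_after - 1.0) * 100.0;
}
```

Plugging in 15.125s and 12.772s gives roughly 15.6% less wall-clock time, or about 18.4% more rules per second.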

In this patchset, we still ack for every rule. If we don't ack at all,

'tc -b $file' execution time is:

real    0m14.484s
user    0m6.919s
sys     0m7.498s

'tc -b $file -bs 10' execution time is:

real    0m11.664s
user    0m6.017s
sys     0m5.578s

We can see that the performance win comes from sending multiple messages
at once rather than from skipping the acks. I think that's because tc
doesn't spend much time processing the ack messages.

v3
==
1. Instead of hacking function rtnl_talk directly, add a new function
   rtnl_talk_msg.
2. Remove most global variables in favor of parameter passing.
3. Divide the previous patch into 4 patches.

v4
==
1. Remove function setcmdlinetotal. Now in function batch, we read one
   more line to determine if we are reaching the end of file.
2. Remove function __rtnl_check_ack. Now __rtnl_talk calls __rtnl_talk_msg
   directly.
3. if (batch_size < 1)
        batch_size = 1;

v5
==
1. Fix a bug where a batch file containing a blank line was not handled.
2. Describe the limitation in the man page.
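
As a sketch only, the batch size handling described in the v4 notes (nothing below 1) combined with the MSG_IOV_MAX cap from patch 2/3 could be expressed as one helper. The function name is hypothetical, and strtol() is used here in place of the patch's atoi().

```c
#include <stdlib.h>

#define MSG_IOV_MAX 256	/* same cap as in tc/tc_common.h of patch 2/3 */

/* Hypothetical helper: parse the -bs argument and clamp it to 1..MSG_IOV_MAX */
static int parse_batch_size(const char *arg)
{
	long v = strtol(arg, NULL, 10);

	if (v < 1)
		return 1;		/* zero, negative or non-numeric input */
	if (v > MSG_IOV_MAX)
		return MSG_IOV_MAX;	/* iovec array is statically sized */
	return (int)v;
}
```

Clamping at 1 rather than 0 matters because the batch loop uses batch_size as a modulus.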

Chris Mi (3):
  lib/libnetlink: Add a function rtnl_talk_msg
  tc: Add -bs option to batch mode
  man: Add -bs option to tc manpage

 include/libnetlink.h |   3 ++
 lib/libnetlink.c     |  59 ++++++++++++++++++-------
 man/man8/tc.8        |   9 ++++
 tc/m_action.c        |  90 +++++++++++++++++++++++++++++---------
 tc/tc.c              |  70 +++++++++++++++++++++++------
 tc/tc_common.h       |   8 +++-
 tc/tc_filter.c       | 121 +++++++++++++++++++++++++++++++++++++--------------
 7 files changed, 276 insertions(+), 84 deletions(-)

-- 
2.14.3


* [patch iproute2 v5 2/3] tc: Add -bs option to batch mode
From: Chris Mi @ 2018-01-03  2:55 UTC (permalink / raw)
  To: netdev; +Cc: gerlitz.or, stephen, dsahern, marcelo.leitner
In-Reply-To: <20180103025517.3767-1-chrism@mellanox.com>

Signed-off-by: Chris Mi <chrism@mellanox.com>
---
 tc/m_action.c  |  90 ++++++++++++++++++++++++++++++++----------
 tc/tc.c        |  70 ++++++++++++++++++++++++++-------
 tc/tc_common.h |   8 +++-
 tc/tc_filter.c | 121 +++++++++++++++++++++++++++++++++++++++++----------------
 4 files changed, 221 insertions(+), 68 deletions(-)

diff --git a/tc/m_action.c b/tc/m_action.c
index fc422364..2e79034d 100644
--- a/tc/m_action.c
+++ b/tc/m_action.c
@@ -23,6 +23,7 @@
 #include <arpa/inet.h>
 #include <string.h>
 #include <dlfcn.h>
+#include <errno.h>
 
 #include "utils.h"
 #include "tc_common.h"
@@ -546,40 +547,87 @@ bad_val:
 	return ret;
 }
 
+typedef struct {
+	struct nlmsghdr		n;
+	struct tcamsg		t;
+	char			buf[MAX_MSG];
+} tc_action_req;
+
+static tc_action_req *action_reqs;
+static struct iovec msg_iov[MSG_IOV_MAX];
+
+void free_action_reqs(void)
+{
+	free(action_reqs);
+}
+
+static tc_action_req *get_action_req(int batch_size, int index)
+{
+	tc_action_req *req;
+
+	if (action_reqs == NULL) {
+		action_reqs = malloc(batch_size * sizeof (tc_action_req));
+		if (action_reqs == NULL)
+			return NULL;
+	}
+	req = &action_reqs[index];
+	memset(req, 0, sizeof (*req));
+
+	return req;
+}
+
 static int tc_action_modify(int cmd, unsigned int flags,
-			    int *argc_p, char ***argv_p)
+			    int *argc_p, char ***argv_p,
+			    int batch_size, int index, bool send)
 {
 	int argc = *argc_p;
 	char **argv = *argv_p;
 	int ret = 0;
-	struct {
-		struct nlmsghdr         n;
-		struct tcamsg           t;
-		char                    buf[MAX_MSG];
-	} req = {
-		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcamsg)),
-		.n.nlmsg_flags = NLM_F_REQUEST | flags,
-		.n.nlmsg_type = cmd,
-		.t.tca_family = AF_UNSPEC,
+	tc_action_req *req;
+	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
+	struct iovec *iov = &msg_iov[index];
+
+	req = get_action_req(batch_size, index);
+	if (req == NULL) {
+		fprintf(stderr, "get_action_req error: not enough buffer\n");
+		return -ENOMEM;
+	}
+
+	req->n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcamsg));
+	req->n.nlmsg_flags = NLM_F_REQUEST | flags;
+	req->n.nlmsg_type = cmd;
+	req->t.tca_family = AF_UNSPEC;
+	struct rtattr *tail = NLMSG_TAIL(&req->n);
+
+	struct msghdr msg = {
+		.msg_name = &nladdr,
+		.msg_namelen = sizeof(nladdr),
+		.msg_iov = msg_iov,
+		.msg_iovlen = index + 1,
 	};
-	struct rtattr *tail = NLMSG_TAIL(&req.n);
 
 	argc -= 1;
 	argv += 1;
-	if (parse_action(&argc, &argv, TCA_ACT_TAB, &req.n)) {
+	if (parse_action(&argc, &argv, TCA_ACT_TAB, &req->n)) {
 		fprintf(stderr, "Illegal \"action\"\n");
 		return -1;
 	}
-	tail->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail;
+	tail->rta_len = (void *) NLMSG_TAIL(&req->n) - (void *) tail;
+
+	*argc_p = argc;
+	*argv_p = argv;
+
+	iov->iov_base = &req->n;
+	iov->iov_len = req->n.nlmsg_len;
+
+	if (!send)
+		return 0;
 
-	if (rtnl_talk(&rth, &req.n, NULL) < 0) {
+	if (rtnl_talk_msg(&rth, &msg, NULL) < 0) {
 		fprintf(stderr, "We have an error talking to the kernel\n");
 		ret = -1;
 	}
 
-	*argc_p = argc;
-	*argv_p = argv;
-
 	return ret;
 }
 
@@ -679,7 +727,7 @@ bad_val:
 	return ret;
 }
 
-int do_action(int argc, char **argv)
+int do_action(int argc, char **argv, int batch_size, int index, bool send)
 {
 
 	int ret = 0;
@@ -689,12 +737,14 @@ int do_action(int argc, char **argv)
 		if (matches(*argv, "add") == 0) {
 			ret =  tc_action_modify(RTM_NEWACTION,
 						NLM_F_EXCL | NLM_F_CREATE,
-						&argc, &argv);
+						&argc, &argv, batch_size,
+						index, send);
 		} else if (matches(*argv, "change") == 0 ||
 			  matches(*argv, "replace") == 0) {
 			ret = tc_action_modify(RTM_NEWACTION,
 					       NLM_F_CREATE | NLM_F_REPLACE,
-					       &argc, &argv);
+					       &argc, &argv, batch_size,
+					       index, send);
 		} else if (matches(*argv, "delete") == 0) {
 			argc -= 1;
 			argv += 1;
diff --git a/tc/tc.c b/tc/tc.c
index ad9f07e9..90ce4ce2 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -189,20 +189,20 @@ static void usage(void)
 	fprintf(stderr, "Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }\n"
 			"       tc [-force] -batch filename\n"
 			"where  OBJECT := { qdisc | class | filter | action | monitor | exec }\n"
-	                "       OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] | -n[etns] name |\n"
+	                "       OPTIONS := { -s[tatistics] | -d[etails] | -r[aw] | -p[retty] | -b[atch] [filename] | -bs | -batchsize [size] | -n[etns] name |\n"
 			"                    -nm | -nam[es] | { -cf | -conf } path } | -j[son]\n");
 }
 
-static int do_cmd(int argc, char **argv)
+static int do_cmd(int argc, char **argv, int batch_size, int index, bool send)
 {
 	if (matches(*argv, "qdisc") == 0)
 		return do_qdisc(argc-1, argv+1);
 	if (matches(*argv, "class") == 0)
 		return do_class(argc-1, argv+1);
 	if (matches(*argv, "filter") == 0)
-		return do_filter(argc-1, argv+1);
+		return do_filter(argc-1, argv+1, batch_size, index, send);
 	if (matches(*argv, "actions") == 0)
-		return do_action(argc-1, argv+1);
+		return do_action(argc-1, argv+1, batch_size, index, send);
 	if (matches(*argv, "monitor") == 0)
 		return do_tcmonitor(argc-1, argv+1);
 	if (matches(*argv, "exec") == 0)
@@ -217,11 +217,15 @@ static int do_cmd(int argc, char **argv)
 	return -1;
 }
 
-static int batch(const char *name)
+static int batch(const char *name, int batch_size)
 {
+	bool lastline = false;
+	int msg_iov_index = 0;
+	char *line2 = NULL;
 	char *line = NULL;
 	size_t len = 0;
 	int ret = 0;
+	bool send;
 
 	batch_mode = 1;
 	if (name && strcmp(name, "-") != 0) {
@@ -240,23 +244,50 @@ static int batch(const char *name)
 	}
 
 	cmdlineno = 0;
-	while (getcmdline(&line, &len, stdin) != -1) {
+	if (getcmdline(&line, &len, stdin) == -1)
+		goto Exit;
+	do {
 		char *largv[100];
 		int largc;
 
+		if (getcmdline(&line2, &len, stdin) == -1)
+			lastline = true;
+
 		largc = makeargs(line, largv, 100);
+
+		line = line2;
+		line2 = NULL;
+		len = 0;
+
 		if (largc == 0)
 			continue;	/* blank line */
 
-		if (do_cmd(largc, largv)) {
-			fprintf(stderr, "Command failed %s:%d\n", name, cmdlineno);
+		/*
+		 * In batch mode, if we haven't accumulated enough commands
+		 * and this is not the last command, don't send the message
+		 * immediately.
+		 */
+		if (batch_size > 1 && msg_iov_index + 1 != batch_size
+		    && !lastline)
+			send = false;
+		else
+			send = true;
+
+		ret = do_cmd(largc, largv, batch_size, msg_iov_index++, send);
+		if (ret < 0) {
+			fprintf(stderr, "Command failed %s:%d\n", name,
+				cmdlineno);
 			ret = 1;
 			if (!force)
 				break;
 		}
-	}
-	if (line)
-		free(line);
+		msg_iov_index %= batch_size;
+	} while (!lastline);
+
+	free_filter_reqs();
+	free_action_reqs();
+Exit:
+	free(line);
 
 	rtnl_close(&rth);
 	return ret;
@@ -267,6 +298,7 @@ int main(int argc, char **argv)
 {
 	int ret;
 	char *batch_file = NULL;
+	int batch_size = 1;
 
 	while (argc > 1) {
 		if (argv[1][0] != '-')
@@ -297,6 +329,16 @@ int main(int argc, char **argv)
 			if (argc <= 1)
 				usage();
 			batch_file = argv[1];
+		} else if (matches(argv[1], "-batchsize") == 0 ||
+				matches(argv[1], "-bs") == 0) {
+			argc--;	argv++;
+			if (argc <= 1)
+				usage();
+			batch_size = atoi(argv[1]);
+			if (batch_size > MSG_IOV_MAX)
+				batch_size = MSG_IOV_MAX;
+			else if (batch_size < 1)
+				batch_size = 1;
 		} else if (matches(argv[1], "-netns") == 0) {
 			NEXT_ARG();
 			if (netns_switch(argv[1]))
@@ -323,7 +365,7 @@ int main(int argc, char **argv)
 	}
 
 	if (batch_file)
-		return batch(batch_file);
+		return batch(batch_file, batch_size);
 
 	if (argc <= 1) {
 		usage();
@@ -341,7 +383,9 @@ int main(int argc, char **argv)
 		goto Exit;
 	}
 
-	ret = do_cmd(argc-1, argv+1);
+	ret = do_cmd(argc-1, argv+1, 1, 0, true);
+	free_filter_reqs();
+	free_action_reqs();
 Exit:
 	rtnl_close(&rth);
 
diff --git a/tc/tc_common.h b/tc/tc_common.h
index 264fbdac..8a82439f 100644
--- a/tc/tc_common.h
+++ b/tc/tc_common.h
@@ -1,13 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #define TCA_BUF_MAX	(64*1024)
+#define MSG_IOV_MAX	256
 
 extern struct rtnl_handle rth;
 
 extern int do_qdisc(int argc, char **argv);
 extern int do_class(int argc, char **argv);
-extern int do_filter(int argc, char **argv);
-extern int do_action(int argc, char **argv);
+extern int do_filter(int argc, char **argv, int batch_size, int index, bool send);
+extern int do_action(int argc, char **argv, int batch_size, int index, bool send);
 extern int do_tcmonitor(int argc, char **argv);
 extern int do_exec(int argc, char **argv);
 
@@ -24,5 +25,8 @@ struct tc_sizespec;
 extern int parse_size_table(int *p_argc, char ***p_argv, struct tc_sizespec *s);
 extern int check_size_table_opts(struct tc_sizespec *s);
 
+extern void free_filter_reqs(void);
+extern void free_action_reqs(void);
+
 extern int show_graph;
 extern bool use_names;
diff --git a/tc/tc_filter.c b/tc/tc_filter.c
index 545cc3a1..6fecbb45 100644
--- a/tc/tc_filter.c
+++ b/tc/tc_filter.c
@@ -19,6 +19,7 @@
 #include <arpa/inet.h>
 #include <string.h>
 #include <linux/if_ether.h>
+#include <errno.h>
 
 #include "rt_names.h"
 #include "utils.h"
@@ -42,18 +43,44 @@ static void usage(void)
 		"OPTIONS := ... try tc filter add <desired FILTER_KIND> help\n");
 }
 
-static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
+typedef struct {
+	struct nlmsghdr		n;
+	struct tcmsg		t;
+	char			buf[MAX_MSG];
+} tc_filter_req;
+
+static tc_filter_req *filter_reqs;
+static struct iovec msg_iov[MSG_IOV_MAX];
+
+void free_filter_reqs(void)
 {
-	struct {
-		struct nlmsghdr	n;
-		struct tcmsg		t;
-		char			buf[MAX_MSG];
-	} req = {
-		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
-		.n.nlmsg_flags = NLM_F_REQUEST | flags,
-		.n.nlmsg_type = cmd,
-		.t.tcm_family = AF_UNSPEC,
-	};
+	free(filter_reqs);
+}
+
+static tc_filter_req *get_filter_req(int batch_size, int index)
+{
+	tc_filter_req *req;
+
+	if (filter_reqs == NULL) {
+		filter_reqs = malloc(batch_size * sizeof (tc_filter_req));
+		if (filter_reqs == NULL)
+			return NULL;
+	}
+	req = &filter_reqs[index];
+	memset(req, 0, sizeof (*req));
+
+	return req;
+}
+
+static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv,
+			    int batch_size, int index, bool send)
+{
+	tc_filter_req *req;
+	int ret;
+
+	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
+	struct iovec *iov = &msg_iov[index];
+
 	struct filter_util *q = NULL;
 	__u32 prio = 0;
 	__u32 protocol = 0;
@@ -65,6 +92,24 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 	char  k[FILTER_NAMESZ] = {};
 	struct tc_estimator est = {};
 
+	req = get_filter_req(batch_size, index);
+	if (req == NULL) {
+		fprintf(stderr, "get_filter_req error: not enough buffer\n");
+		return -ENOMEM;
+	}
+
+	req->n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
+	req->n.nlmsg_flags = NLM_F_REQUEST | flags;
+	req->n.nlmsg_type = cmd;
+	req->t.tcm_family = AF_UNSPEC;
+
+	struct msghdr msg = {
+		.msg_name = &nladdr,
+		.msg_namelen = sizeof(nladdr),
+		.msg_iov = msg_iov,
+		.msg_iovlen = index + 1,
+	};
+
 	if (cmd == RTM_NEWTFILTER && flags & NLM_F_CREATE)
 		protocol = htons(ETH_P_ALL);
 
@@ -75,37 +120,37 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 				duparg("dev", *argv);
 			strncpy(d, *argv, sizeof(d)-1);
 		} else if (strcmp(*argv, "root") == 0) {
-			if (req.t.tcm_parent) {
+			if (req->t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"root\" is duplicate parent ID\n");
 				return -1;
 			}
-			req.t.tcm_parent = TC_H_ROOT;
+			req->t.tcm_parent = TC_H_ROOT;
 		} else if (strcmp(*argv, "ingress") == 0) {
-			if (req.t.tcm_parent) {
+			if (req->t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"ingress\" is duplicate parent ID\n");
 				return -1;
 			}
-			req.t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
+			req->t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
 						     TC_H_MIN_INGRESS);
 		} else if (strcmp(*argv, "egress") == 0) {
-			if (req.t.tcm_parent) {
+			if (req->t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"egress\" is duplicate parent ID\n");
 				return -1;
 			}
-			req.t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
+			req->t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
 						     TC_H_MIN_EGRESS);
 		} else if (strcmp(*argv, "parent") == 0) {
 			__u32 handle;
 
 			NEXT_ARG();
-			if (req.t.tcm_parent)
+			if (req->t.tcm_parent)
 				duparg("parent", *argv);
 			if (get_tc_classid(&handle, *argv))
 				invarg("Invalid parent ID", *argv);
-			req.t.tcm_parent = handle;
+			req->t.tcm_parent = handle;
 		} else if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
 			if (fhandle)
@@ -152,26 +197,26 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 		argc--; argv++;
 	}
 
-	req.t.tcm_info = TC_H_MAKE(prio<<16, protocol);
+	req->t.tcm_info = TC_H_MAKE(prio<<16, protocol);
 
 	if (chain_index_set)
-		addattr32(&req.n, sizeof(req), TCA_CHAIN, chain_index);
+		addattr32(&req->n, sizeof(*req), TCA_CHAIN, chain_index);
 
 	if (k[0])
-		addattr_l(&req.n, sizeof(req), TCA_KIND, k, strlen(k)+1);
+		addattr_l(&req->n, sizeof(*req), TCA_KIND, k, strlen(k)+1);
 
 	if (d[0])  {
 		ll_init_map(&rth);
 
-		req.t.tcm_ifindex = ll_name_to_index(d);
-		if (req.t.tcm_ifindex == 0) {
+		req->t.tcm_ifindex = ll_name_to_index(d);
+		if (req->t.tcm_ifindex == 0) {
 			fprintf(stderr, "Cannot find device \"%s\"\n", d);
 			return 1;
 		}
 	}
 
 	if (q) {
-		if (q->parse_fopt(q, fhandle, argc, argv, &req.n))
+		if (q->parse_fopt(q, fhandle, argc, argv, &req->n))
 			return 1;
 	} else {
 		if (fhandle) {
@@ -190,10 +235,17 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 	}
 
 	if (est.ewma_log)
-		addattr_l(&req.n, sizeof(req), TCA_RATE, &est, sizeof(est));
+		addattr_l(&req->n, sizeof(*req), TCA_RATE, &est, sizeof(est));
 
-	if (rtnl_talk(&rth, &req.n, NULL) < 0) {
-		fprintf(stderr, "We have an error talking to the kernel\n");
+	iov->iov_base = &req->n;
+	iov->iov_len = req->n.nlmsg_len;
+
+	if (!send)
+		return 0;
+
+	ret = rtnl_talk_msg(&rth, &msg, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "We have an error talking to the kernel, %d\n", ret);
 		return 2;
 	}
 
@@ -636,20 +688,23 @@ static int tc_filter_list(int argc, char **argv)
 	return 0;
 }
 
-int do_filter(int argc, char **argv)
+int do_filter(int argc, char **argv, int batch_size, int index, bool send)
 {
 	if (argc < 1)
 		return tc_filter_list(0, NULL);
 	if (matches(*argv, "add") == 0)
 		return tc_filter_modify(RTM_NEWTFILTER, NLM_F_EXCL|NLM_F_CREATE,
-					argc-1, argv+1);
+					argc-1, argv+1,
+					batch_size, index, send);
 	if (matches(*argv, "change") == 0)
-		return tc_filter_modify(RTM_NEWTFILTER, 0, argc-1, argv+1);
+		return tc_filter_modify(RTM_NEWTFILTER, 0, argc-1, argv+1,
+					batch_size, index, send);
 	if (matches(*argv, "replace") == 0)
 		return tc_filter_modify(RTM_NEWTFILTER, NLM_F_CREATE, argc-1,
-					argv+1);
+					argv+1, batch_size, index, send);
 	if (matches(*argv, "delete") == 0)
-		return tc_filter_modify(RTM_DELTFILTER, 0,  argc-1, argv+1);
+		return tc_filter_modify(RTM_DELTFILTER, 0, argc-1, argv+1,
+					batch_size, index, send);
 	if (matches(*argv, "get") == 0)
 		return tc_filter_get(RTM_GETTFILTER, 0,  argc-1, argv+1);
 	if (matches(*argv, "list") == 0 || matches(*argv, "show") == 0
-- 
2.14.3

^ permalink raw reply related

* [patch iproute2 v5 3/3] man: Add -bs option to tc manpage
From: Chris Mi @ 2018-01-03  2:55 UTC (permalink / raw)
  To: netdev; +Cc: gerlitz.or, stephen, dsahern, marcelo.leitner
In-Reply-To: <20180103025517.3767-1-chrism@mellanox.com>

Signed-off-by: Chris Mi <chrism@mellanox.com>
---
 man/man8/tc.8 | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/man/man8/tc.8 b/man/man8/tc.8
index ff071b33..7338ed3b 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -601,6 +601,15 @@ must exist already.
 read commands from provided file or standard input and invoke them.
 First failure will cause termination of tc.
 
+.TP
+.BR "\-bs", " \-bs size", " \-batchsize", " \-batchsize size"
+How many commands are accumulated before they are sent to the kernel.
+The default is 1. The option only takes effect in batch mode.
+Currently, only filter add and actions add are supported.
+If the batch file mixes in other commands, the result is unpredictable.
+The last line of the batch file must not be blank;
+otherwise up to batchsize - 1 rules will be lost.
+
 .TP
 .BR "\-force"
 don't terminate tc on errors in batch mode.
-- 
2.14.3
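The accumulate-then-send behaviour described for -bs can be pictured with a small userspace sketch. This is not the tc implementation: `struct batch`, `batch_init`, `batch_add`, `batch_flush` and `BATCH_MAX` are hypothetical names, and a plain file descriptor with writev() stands in for the netlink socket.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH_MAX 16	/* hypothetical upper bound for this sketch */

struct batch {
	struct iovec iov[BATCH_MAX];
	int size;	/* flush threshold, playing the role of -bs */
	int used;	/* buffers queued so far */
};

static void batch_init(struct batch *b, int size)
{
	assert(size >= 1 && size <= BATCH_MAX);
	b->size = size;
	b->used = 0;
}

/* Write all queued buffers with a single writev() call. */
static ssize_t batch_flush(struct batch *b, int fd)
{
	ssize_t n = 0;

	if (b->used) {
		n = writev(fd, b->iov, b->used);
		b->used = 0;
	}
	return n;
}

/*
 * Queue one buffer and flush once the threshold is reached.
 * Returns the bytes written by the flush, 0 if the buffer was only
 * queued, or -1 on a write error. The buffer must stay valid until
 * the flush happens.
 */
static ssize_t batch_add(struct batch *b, int fd, void *buf, size_t len)
{
	b->iov[b->used].iov_base = buf;
	b->iov[b->used].iov_len = len;
	b->used++;

	if (b->used < b->size)
		return 0;
	return batch_flush(b, fd);
}
```

In this sketch a final batch_flush() at end of input sends whatever is still queued, which is the step that matters when the number of commands is not a multiple of the batch size.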

^ permalink raw reply related

* Re: [PATCH net-next 0/2] net: stmmac: Couple of debug prints improvements
From: David Miller @ 2018-01-03  2:55 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, peppe.cavallaro, alexandre.torgue, linux-kernel
In-Reply-To: <20171230035633.29514-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Fri, 29 Dec 2017 19:56:31 -0800

> While working on a particular problem, I had to turn on debug prints and found
> them to be useful, but could deserve some improvements in order to help debug
> situations.

Series applied, thanks Florian.

^ permalink raw reply

* Re: [PATCH] qed: Use zeroing memory allocator than allocator/memset
From: David Miller @ 2018-01-03  2:56 UTC (permalink / raw)
  To: himanshujha199640
  Cc: Ariel.Elior, everest-linux-l2, netdev, linux-kernel, mcgrof
In-Reply-To: <1514648224-6820-1-git-send-email-himanshujha199640@gmail.com>

From: Himanshu Jha <himanshujha199640@gmail.com>
Date: Sat, 30 Dec 2017 21:07:04 +0530

> Use dma_zalloc_coherent and vzalloc for allocating zeroed
> memory and remove unnecessary memset function.
> 
> Done using Coccinelle.
> Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
> 0-day tested with no failures.
> 
> Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] ethernet/broadcom: Use zeroing memory allocator than allocator/memset
From: David Miller @ 2018-01-03  2:56 UTC (permalink / raw)
  To: himanshujha199640
  Cc: michael.chan, f.fainelli, bcm-kernel-feedback-list, netdev,
	linux-arm-kernel, linux-kernel, mcgrof
In-Reply-To: <1514648697-7148-1-git-send-email-himanshujha199640@gmail.com>

From: Himanshu Jha <himanshujha199640@gmail.com>
Date: Sat, 30 Dec 2017 21:14:57 +0530

> Use dma_zalloc_coherent for allocating zeroed
> memory and remove unnecessary memset function.
> 
> Done using Coccinelle.
> Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
> 0-day tested with no failures.
> 
> Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>

Applied.

^ permalink raw reply

* Re: [patch iproute2 v5 0/3] tc: Add -bs option to batch mode
From: David Ahern @ 2018-01-03  2:57 UTC (permalink / raw)
  To: Chris Mi, netdev; +Cc: gerlitz.or, stephen, marcelo.leitner
In-Reply-To: <20180103025517.3767-1-chrism@mellanox.com>

I have a day job outside of iproute2 patches; give a day or 2 for review
by many people.

^ permalink raw reply

* [PATCH] nl80211: Check for the required netlink attribute presence
From: Hao Chen @ 2018-01-03  3:00 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David S. Miller, linux-wireless, netdev, linux-kernel, Hao Chen

nl80211_nan_add_func() does not check if the required attribute
NL80211_NAN_FUNC_FOLLOW_UP_DEST is present when processing
NL80211_CMD_ADD_NAN_FUNCTION request. This request can be issued
by users with CAP_NET_ADMIN privilege and may result in NULL dereference
and a system crash. Add a check for the required attribute presence.

Signed-off-by: Hao Chen <flank3rsky@gmail.com>
---
 net/wireless/nl80211.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 213d0c4..2b3dbcd 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -11361,7 +11361,8 @@ static int nl80211_nan_add_func(struct sk_buff *skb,
 		break;
 	case NL80211_NAN_FUNC_FOLLOW_UP:
 		if (!tb[NL80211_NAN_FUNC_FOLLOW_UP_ID] ||
-		    !tb[NL80211_NAN_FUNC_FOLLOW_UP_REQ_ID]) {
+		    !tb[NL80211_NAN_FUNC_FOLLOW_UP_REQ_ID] ||
+		    !tb[NL80211_NAN_FUNC_FOLLOW_UP_DEST]) {
 			err = -EINVAL;
 			goto out;
 		}
-- 
1.9.1
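The pattern behind the fix, rejecting a request before any required attribute is dereferenced, can be sketched outside the kernel as follows. The enum values and `check_required` are hypothetical stand-ins for the NL80211_NAN_FUNC_FOLLOW_UP_* attributes and the open-coded test in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute IDs standing in for NL80211_NAN_FUNC_FOLLOW_UP_*. */
enum {
	ATTR_FOLLOW_UP_ID,
	ATTR_FOLLOW_UP_REQ_ID,
	ATTR_FOLLOW_UP_DEST,
	ATTR_MAX
};

/*
 * Return 0 if every required slot of the parsed attribute table is
 * non-NULL, -EINVAL otherwise. Callers only dereference tb[] entries
 * after this check has passed.
 */
static int check_required(const void *const *tb, const int *required, size_t n)
{
	size_t i;

	assert(tb && required);
	for (i = 0; i < n; i++)
		if (!tb[required[i]])
			return -EINVAL;
	return 0;
}
```

With only the first two attributes listed as required, a request lacking the destination attribute would pass the check and fail later on the NULL entry, which is the situation the patch closes.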

^ permalink raw reply related

* Re: [PATCH] liquidio: Use zeroing memory allocator than allocator/memset
From: David Miller @ 2018-01-03  3:01 UTC (permalink / raw)
  To: himanshujha199640
  Cc: derek.chickles, satananda.burla, felix.manlunas, raghu.vatsavayi,
	netdev, linux-kernel
In-Reply-To: <1514723249-5736-1-git-send-email-himanshujha199640@gmail.com>

From: Himanshu Jha <himanshujha199640@gmail.com>
Date: Sun, 31 Dec 2017 17:57:29 +0530

> Use vzalloc for allocating zeroed memory and remove unnecessary
> memset function.
> 
> Done using Coccinelle.
> Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
> 0-day tested with no failures.
> 
> Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH iproute2-next 8/9] rdma: Add QP resource tracking information
From: David Ahern @ 2018-01-03  3:03 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford, Jason Gunthorpe
  Cc: RDMA mailing list, Leon Romanovsky, netdev, Stephen Hemminger
In-Reply-To: <20180102093725.6172-9-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

On 1/2/18 2:37 AM, Leon Romanovsky wrote:
> +	mnl_attr_for_each_nested(nla_entry, nla_table) {
> +		struct nlattr *nla_line[RDMA_NLDEV_ATTR_MAX] = {};
> +		uint32_t lqpn, rqpn = 0, rq_psn = 0, sq_psn;
> +		uint8_t type, state, path_mig_state = 0;
> +		uint32_t port = 0, pid = 0;
> +		bool ignore_value = false;
> +		char port_name[32];
> +		const char *comm;
> +		int err;
> +
> +		err = mnl_attr_parse_nested(nla_entry, rd_attr_cb, nla_line);
> +		if (err != MNL_CB_OK)
> +			return -EINVAL;
> +
> +		if (!nla_line[RDMA_NLDEV_ATTR_RES_LQPN] ||
> +		    !nla_line[RDMA_NLDEV_ATTR_RES_SQ_PSN] ||
> +		    !nla_line[RDMA_NLDEV_ATTR_RES_TYPE] ||
> +		    !nla_line[RDMA_NLDEV_ATTR_RES_STATE] ||
> +		    !nla_line[RDMA_NLDEV_ATTR_RES_PID_COMM]) {
> +			return -EINVAL;
> +		}
> +
> +		if (nla_line[RDMA_NLDEV_ATTR_PORT_INDEX])
> +			port = mnl_attr_get_u32(nla_line[RDMA_NLDEV_ATTR_PORT_INDEX]);
> +
> +		if (port != rd->port_idx)
> +			continue;
> +
> +		if (nla_line[RDMA_NLDEV_ATTR_PORT_INDEX])
> +			snprintf(port_name, 32, "%s/%u", name, port);
> +		else
> +			snprintf(port_name, 32, "%s/-", name);
> +
> +		lqpn = mnl_attr_get_u32(nla_line[RDMA_NLDEV_ATTR_RES_LQPN]);
> +		if (rd_check_is_filtered(rd, "lqpn", lqpn, false))
> +			continue;
> +
> +		if (nla_line[RDMA_NLDEV_ATTR_RES_RQPN])
> +			rqpn = mnl_attr_get_u32(nla_line[RDMA_NLDEV_ATTR_RES_RQPN]);
> +		else
> +			ignore_value = true;
> +
> +		if (rd_check_is_filtered(rd, "rqpn", rqpn, ignore_value))
> +			continue;
> +
> +		if (nla_line[RDMA_NLDEV_ATTR_RES_RQ_PSN])
> +			rq_psn = mnl_attr_get_u32(nla_line[RDMA_NLDEV_ATTR_RES_RQ_PSN]);
> +
> +		sq_psn = mnl_attr_get_u32(nla_line[RDMA_NLDEV_ATTR_RES_SQ_PSN]);
> +		if (nla_line[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE])
> +			path_mig_state = mnl_attr_get_u8(nla_line[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE]);
> +		type = mnl_attr_get_u8(nla_line[RDMA_NLDEV_ATTR_RES_TYPE]);
> +		state = mnl_attr_get_u8(nla_line[RDMA_NLDEV_ATTR_RES_STATE]);
> +
> +		if (nla_line[RDMA_NLDEV_ATTR_RES_PID])
> +			pid = mnl_attr_get_u32(nla_line[RDMA_NLDEV_ATTR_RES_PID]);
> +
> +		if (rd_check_is_filtered(rd, "pid", pid, false))
> +			continue;
> +
> +		comm = mnl_attr_get_str(nla_line[RDMA_NLDEV_ATTR_RES_PID_COMM]);
> +
> +		if (rd->json_output) {
> +			jsonw_start_array(rd->jw);
> +			jsonw_uint_field(rd->jw, "ifindex", idx);
> +			if (nla_line[RDMA_NLDEV_ATTR_PORT_INDEX])
> +				jsonw_uint_field(rd->jw, "port", port);
> +			jsonw_string_field(rd->jw, "ifname", port_name);
> +			jsonw_uint_field(rd->jw, "lqpn", lqpn);
> +			if (nla_line[RDMA_NLDEV_ATTR_RES_RQPN])
> +				jsonw_uint_field(rd->jw, "rqpn", rqpn);
> +			if (nla_line[RDMA_NLDEV_ATTR_RES_RQ_PSN])
> +				jsonw_uint_field(rd->jw, "rq-psn", rq_psn);
> +			if (nla_line[RDMA_NLDEV_ATTR_RES_SQ_PSN])
> +				jsonw_uint_field(rd->jw, "sq-psn", sq_psn);
> +			if (nla_line[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE])
> +				jsonw_string_field(rd->jw, "path-mig-state",
> +						   path_mig_to_str(path_mig_state));
> +
> +			jsonw_string_field(rd->jw, "type", qp_types_to_str(type));
> +			jsonw_string_field(rd->jw, "state", qp_states_to_str(state));
> +			if (nla_line[RDMA_NLDEV_ATTR_RES_PID])
> +				jsonw_uint_field(rd->jw, "pid", pid);
> +			jsonw_string_field(rd->jw, "comm", comm);
> +			jsonw_end_array(rd->jw);

Some newlines separating common fields go a long way toward making that
readable.

> +		} else {
> +			pr_out("%-10s", port_name);
> +			for (cidx = 0; cidx < ARRAY_SIZE(c); cidx++)
> +				if (show_column(rd, &c[cidx])) {
> +					if (!strcmpx(c[cidx].filter_name, "lqpn"))
> +						pr_out("%-11u", lqpn);
> +					if (!strcmpx(c[cidx].filter_name, "rqpn")) {
> +						if (nla_line[RDMA_NLDEV_ATTR_RES_RQPN])
> +							pr_out("%-11u", rqpn);
> +						else
> +							pr_out("%-11s", "---");
> +					}
> +					if (!strcmpx(c[cidx].filter_name, "type"))
> +						pr_out("%-6s", qp_types_to_str(type));
> +					if (!strcmpx(c[cidx].filter_name, "state"))
> +						pr_out("%-7s", qp_states_to_str(state));
> +					if (!strcmpx(c[cidx].filter_name, "rq-psn")) {
> +						if (nla_line[RDMA_NLDEV_ATTR_RES_RQ_PSN])
> +							pr_out("%-11d", rq_psn);
> +						else
> +							pr_out("%-11s", "---");
> +					}
> +					if (!strcmpx(c[cidx].filter_name, "sq-psn"))
> +						pr_out("%-11d", sq_psn);
> +					if (!strcmpx(c[cidx].filter_name, "path-mig")) {
> +						if (nla_line[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE])
> +							pr_out("%-16s", path_mig_to_str(path_mig_state));
> +						else
> +							pr_out("%-16s", "---");
> +					}
> +					if (!strcmpx(c[cidx].filter_name, "pid"))
> +						pr_out("%-11d", pid);
> +					if (!strcmpx(c[cidx].filter_name, "comm")) {
> +						if (nla_line[RDMA_NLDEV_ATTR_RES_PID]) {
> +							pr_out("%-16s ", comm);
> +						} else {
> +							char tmp[18];
> +
> +							snprintf(tmp, sizeof(tmp), "[%s]", comm);
> +							pr_out("%-16s", tmp);
> +						}
> +					}

Same with this block. Can each be put into helpers to decrease the
indentation and then some newlines to put some whitespace on all the
strcmp's?
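One possible shape for such helpers, as a rough sketch only: `fmt_opt_u32` and `fmt_comm` are hypothetical names, and they format into a caller-supplied buffer so the example stays self-contained instead of calling pr_out().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Format an optional u32 column; print "---" when the attribute is absent. */
static int fmt_opt_u32(char *buf, size_t len, bool present, uint32_t val)
{
	assert(buf && len > 0);

	if (present)
		return snprintf(buf, len, "%-11u", (unsigned int)val);

	return snprintf(buf, len, "%-11s", "---");
}

/* Format the comm column; names without a PID are wrapped in brackets. */
static int fmt_comm(char *buf, size_t len, bool has_pid, const char *comm)
{
	char tmp[18];

	assert(buf && len > 0 && comm);

	if (has_pid)
		return snprintf(buf, len, "%-16s ", comm);

	snprintf(tmp, sizeof(tmp), "[%s]", comm);
	return snprintf(buf, len, "%-16s", tmp);
}
```

Each branch of the column loop would then shrink to a single call, with the rqpn and rq-psn columns sharing the first helper.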

^ permalink raw reply

* [PATCH net-next 1/3] nfp: flower: obtain repr link state only from firmware
From: Jakub Kicinski @ 2018-01-03  3:18 UTC (permalink / raw)
  To: netdev; +Cc: oss-drivers, Dirk van der Merwe
In-Reply-To: <20180103031901.4165-1-jakub.kicinski@netronome.com>

From: Dirk van der Merwe <dirk.vandermerwe@netronome.com>

Instead of starting up reprs assuming that there is link, only respond
to the link state reported by firmware.

Furthermore, ensure link is down after repr netdevs are created.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/main.c  | 2 --
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index 63160e9754d4..252d19236ad8 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -110,7 +110,6 @@ nfp_flower_repr_netdev_open(struct nfp_app *app, struct nfp_repr *repr)
 	if (err)
 		return err;
 
-	netif_carrier_on(repr->netdev);
 	netif_tx_wake_all_queues(repr->netdev);
 
 	return 0;
@@ -119,7 +118,6 @@ nfp_flower_repr_netdev_open(struct nfp_app *app, struct nfp_repr *repr)
 static int
 nfp_flower_repr_netdev_stop(struct nfp_app *app, struct nfp_repr *repr)
 {
-	netif_carrier_off(repr->netdev);
 	netif_tx_disable(repr->netdev);
 
 	return nfp_flower_cmsg_portmod(repr, false);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
index 78b36c67c232..3c6cb381385d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
@@ -336,6 +336,8 @@ struct net_device *nfp_repr_alloc(struct nfp_app *app)
 	if (!netdev)
 		return NULL;
 
+	netif_carrier_off(netdev);
+
 	repr = netdev_priv(netdev);
 	repr->netdev = netdev;
 	repr->app = app;
-- 
2.15.1

^ permalink raw reply related

* [PATCH net-next 0/3] nfp: flower: repr link state
From: Jakub Kicinski @ 2018-01-03  3:18 UTC (permalink / raw)
  To: netdev; +Cc: oss-drivers, Jakub Kicinski

Dirk says:

This series provides two updates towards the link state of reprs in
the flower nfp app.

Patch #1 improves the way link state is reported for reprs. Instead of
starting with an assumed 'UP' state, always assume the link state is
'DOWN' and then modify this only on events received from firmware.

Patch #2 adds a new nfp_app hook, repr_preclean. This callback is
executed before reprs are removed from the app context and is executed
per repr.

Patch #3 implements the new REIFY control message, used to indicate
when reprs are created and destroyed. Firmware uses these messages
to prevent communication about any particular port when the driver
doesn't know about the repr yet or anymore.

Dirk van der Merwe (3):
  nfp: flower: obtain repr link state only from firmware
  nfp: add repr_preclean callback
  nfp: flower: implement the PORT_REIFY message

 drivers/net/ethernet/netronome/nfp/flower/cmsg.c  |  46 ++++++++++
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h  |  11 +++
 drivers/net/ethernet/netronome/nfp/flower/main.c  | 105 ++++++++++++++++++++--
 drivers/net/ethernet/netronome/nfp/flower/main.h  |   5 ++
 drivers/net/ethernet/netronome/nfp/nfp_app.h      |  10 +++
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c |  20 ++++-
 6 files changed, 189 insertions(+), 8 deletions(-)

-- 
2.15.1

^ permalink raw reply

* [PATCH net-next 2/3] nfp: add repr_preclean callback
From: Jakub Kicinski @ 2018-01-03  3:19 UTC (permalink / raw)
  To: netdev; +Cc: oss-drivers, Dirk van der Merwe
In-Reply-To: <20180103031901.4165-1-jakub.kicinski@netronome.com>

From: Dirk van der Merwe <dirk.vandermerwe@netronome.com>

Just before a repr is cleaned up, we give the app a chance to perform
some preclean configuration while the reprs pointer is still configured
for the app.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_app.h      | 10 ++++++++++
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c | 18 +++++++++++++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app.h b/drivers/net/ethernet/netronome/nfp/nfp_app.h
index 0e5e0305ad1c..3af1943a8521 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_app.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_app.h
@@ -77,6 +77,8 @@ extern const struct nfp_app_type app_flower;
  * @vnic_init:	vNIC netdev was registered
  * @vnic_clean:	vNIC netdev about to be unregistered
  * @repr_init:	representor about to be registered
+ * @repr_preclean:	representor about to be unregistered, executed before
+ *			the app reference to it is removed
  * @repr_clean:	representor about to be unregistered
  * @repr_open:	representor netdev open callback
  * @repr_stop:	representor netdev stop callback
@@ -112,6 +114,7 @@ struct nfp_app_type {
 	void (*vnic_clean)(struct nfp_app *app, struct nfp_net *nn);
 
 	int (*repr_init)(struct nfp_app *app, struct net_device *netdev);
+	void (*repr_preclean)(struct nfp_app *app, struct net_device *netdev);
 	void (*repr_clean)(struct nfp_app *app, struct net_device *netdev);
 
 	int (*repr_open)(struct nfp_app *app, struct nfp_repr *repr);
@@ -225,6 +228,13 @@ nfp_app_repr_init(struct nfp_app *app, struct net_device *netdev)
 	return app->type->repr_init(app, netdev);
 }
 
+static inline void
+nfp_app_repr_preclean(struct nfp_app *app, struct net_device *netdev)
+{
+	if (app->type->repr_preclean)
+		app->type->repr_preclean(app, netdev);
+}
+
 static inline void
 nfp_app_repr_clean(struct nfp_app *app, struct net_device *netdev)
 {
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
index 3c6cb381385d..f50aa119570a 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
@@ -377,11 +377,22 @@ nfp_reprs_clean_and_free_by_type(struct nfp_app *app,
 				 enum nfp_repr_type type)
 {
 	struct nfp_reprs *reprs;
+	int i;
 
-	reprs = nfp_app_reprs_set(app, type, NULL);
+	reprs = rcu_dereference_protected(app->reprs[type],
+					  lockdep_is_held(&app->pf->lock));
 	if (!reprs)
 		return;
 
+	/* Preclean must happen before we remove the reprs reference from the
+	 * app below.
+	 */
+	for (i = 0; i < reprs->num_reprs; i++)
+		if (reprs->reprs[i])
+			nfp_app_repr_preclean(app, reprs->reprs[i]);
+
+	reprs = nfp_app_reprs_set(app, type, NULL);
+
 	synchronize_rcu();
 	nfp_reprs_clean_and_free(reprs);
 }
@@ -420,8 +431,10 @@ int nfp_reprs_resync_phys_ports(struct nfp_app *app)
 			continue;
 
 		repr = netdev_priv(old_reprs->reprs[i]);
-		if (repr->port->type == NFP_PORT_INVALID)
+		if (repr->port->type == NFP_PORT_INVALID) {
+			nfp_app_repr_preclean(app, old_reprs->reprs[i]);
 			continue;
+		}
 
 		reprs->reprs[i] = old_reprs->reprs[i];
 	}
@@ -438,7 +451,6 @@ int nfp_reprs_resync_phys_ports(struct nfp_app *app)
 		if (repr->port->type != NFP_PORT_INVALID)
 			continue;
 
-		nfp_app_repr_stop(app, repr);
 		nfp_repr_clean(repr);
 	}
 
-- 
2.15.1
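The ordering this patch relies on, running an optional hook for each repr while the app still holds the reprs reference and only then dropping it, can be illustrated with a trimmed-down userspace analogue. All types and names below are hypothetical; RCU and locking are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct app;

/* Hypothetical, trimmed-down analogue of struct nfp_app_type. */
struct app_type {
	void (*repr_preclean)(struct app *app, int repr);	/* optional */
};

struct app {
	const struct app_type *type;
	const int *reprs;	/* NULL once the reference is dropped */
	size_t num_reprs;
};

static int preclean_calls;

/* Example hook: count only calls made while the reference is still set. */
static void count_preclean(struct app *app, int repr)
{
	(void)repr;
	if (app->reprs)
		preclean_calls++;
}

/* Invoke the hook only if the app type provides one. */
static void app_repr_preclean(struct app *app, int repr)
{
	if (app->type->repr_preclean)
		app->type->repr_preclean(app, repr);
}

/* Run preclean for every repr, then detach the table from the app. */
static const int *reprs_detach(struct app *app)
{
	const int *old;
	size_t i;

	assert(app && app->type);
	old = app->reprs;
	if (!old)
		return NULL;

	for (i = 0; i < app->num_reprs; i++)
		app_repr_preclean(app, old[i]);

	app->reprs = NULL;
	return old;
}
```

Swapping the loop and the assignment in reprs_detach() would make the example hook see a NULL table, which is the case the comment in the patch warns about.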

^ permalink raw reply related

* [PATCH net-next 3/3] nfp: flower: implement the PORT_REIFY message
From: Jakub Kicinski @ 2018-01-03  3:19 UTC (permalink / raw)
  To: netdev; +Cc: oss-drivers, Dirk van der Merwe
In-Reply-To: <20180103031901.4165-1-jakub.kicinski@netronome.com>

From: Dirk van der Merwe <dirk.vandermerwe@netronome.com>

The PORT_REIFY message indicates whether reprs have been created or
when they are about to be destroyed. This is necessary so firmware
can know which state the driver is in, e.g. the firmware will not send
any control messages related to ports when the reprs are destroyed.

This prevents nuisance warning messages printed whenever the firmware
sends updates for non-existent reprs.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/cmsg.c |  46 ++++++++++
 drivers/net/ethernet/netronome/nfp/flower/cmsg.h |  11 +++
 drivers/net/ethernet/netronome/nfp/flower/main.c | 103 ++++++++++++++++++++++-
 drivers/net/ethernet/netronome/nfp/flower/main.h |   5 ++
 4 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
index e98bb9cdb6a3..615314d9e7c6 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
@@ -125,6 +125,27 @@ int nfp_flower_cmsg_portmod(struct nfp_repr *repr, bool carrier_ok)
 	return 0;
 }
 
+int nfp_flower_cmsg_portreify(struct nfp_repr *repr, bool exists)
+{
+	struct nfp_flower_cmsg_portreify *msg;
+	struct sk_buff *skb;
+
+	skb = nfp_flower_cmsg_alloc(repr->app, sizeof(*msg),
+				    NFP_FLOWER_CMSG_TYPE_PORT_REIFY,
+				    GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	msg = nfp_flower_cmsg_get_data(skb);
+	msg->portnum = cpu_to_be32(repr->dst->u.port_info.port_id);
+	msg->reserved = 0;
+	msg->info = cpu_to_be16(exists);
+
+	nfp_ctrl_tx(repr->app->ctrl, skb);
+
+	return 0;
+}
+
 static void
 nfp_flower_cmsg_portmod_rx(struct nfp_app *app, struct sk_buff *skb)
 {
@@ -160,6 +181,28 @@ nfp_flower_cmsg_portmod_rx(struct nfp_app *app, struct sk_buff *skb)
 	rtnl_unlock();
 }
 
+static void
+nfp_flower_cmsg_portreify_rx(struct nfp_app *app, struct sk_buff *skb)
+{
+	struct nfp_flower_priv *priv = app->priv;
+	struct nfp_flower_cmsg_portreify *msg;
+	bool exists;
+
+	msg = nfp_flower_cmsg_get_data(skb);
+
+	rcu_read_lock();
+	exists = !!nfp_app_repr_get(app, be32_to_cpu(msg->portnum));
+	rcu_read_unlock();
+	if (!exists) {
+		nfp_flower_cmsg_warn(app, "ctrl msg for unknown port 0x%08x\n",
+				     be32_to_cpu(msg->portnum));
+		return;
+	}
+
+	atomic_inc(&priv->reify_replies);
+	wake_up_interruptible(&priv->reify_wait_queue);
+}
+
 static void
 nfp_flower_cmsg_process_one_rx(struct nfp_app *app, struct sk_buff *skb)
 {
@@ -176,6 +219,9 @@ nfp_flower_cmsg_process_one_rx(struct nfp_app *app, struct sk_buff *skb)
 
 	type = cmsg_hdr->type;
 	switch (type) {
+	case NFP_FLOWER_CMSG_TYPE_PORT_REIFY:
+		nfp_flower_cmsg_portreify_rx(app, skb);
+		break;
 	case NFP_FLOWER_CMSG_TYPE_PORT_MOD:
 		nfp_flower_cmsg_portmod_rx(app, skb);
 		break;
diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
index 992d2eec1019..adfe474c2cf0 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h
@@ -350,6 +350,7 @@ struct nfp_flower_cmsg_hdr {
 enum nfp_flower_cmsg_type_port {
 	NFP_FLOWER_CMSG_TYPE_FLOW_ADD =		0,
 	NFP_FLOWER_CMSG_TYPE_FLOW_DEL =		2,
+	NFP_FLOWER_CMSG_TYPE_PORT_REIFY =	6,
 	NFP_FLOWER_CMSG_TYPE_MAC_REPR =		7,
 	NFP_FLOWER_CMSG_TYPE_PORT_MOD =		8,
 	NFP_FLOWER_CMSG_TYPE_NO_NEIGH =		10,
@@ -386,6 +387,15 @@ struct nfp_flower_cmsg_portmod {
 
 #define NFP_FLOWER_CMSG_PORTMOD_INFO_LINK	BIT(0)
 
+/* NFP_FLOWER_CMSG_TYPE_PORT_REIFY */
+struct nfp_flower_cmsg_portreify {
+	__be32 portnum;
+	u16 reserved;
+	__be16 info;
+};
+
+#define NFP_FLOWER_CMSG_PORTREIFY_INFO_EXIST	BIT(0)
+
 enum nfp_flower_cmsg_port_type {
 	NFP_FLOWER_CMSG_PORT_TYPE_UNSPEC =	0x0,
 	NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT =	0x1,
@@ -444,6 +454,7 @@ nfp_flower_cmsg_mac_repr_add(struct sk_buff *skb, unsigned int idx,
 			     unsigned int nbi, unsigned int nbi_port,
 			     unsigned int phys_port);
 int nfp_flower_cmsg_portmod(struct nfp_repr *repr, bool carrier_ok);
+int nfp_flower_cmsg_portreify(struct nfp_repr *repr, bool exists);
 void nfp_flower_cmsg_process_rx(struct work_struct *work);
 void nfp_flower_cmsg_rx(struct nfp_app *app, struct sk_buff *skb);
 struct sk_buff *
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index 252d19236ad8..67c406815365 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -32,6 +32,7 @@
  */
 
 #include <linux/etherdevice.h>
+#include <linux/lockdep.h>
 #include <linux/pci.h>
 #include <linux/skbuff.h>
 #include <linux/vmalloc.h>
@@ -101,6 +102,52 @@ nfp_flower_repr_get(struct nfp_app *app, u32 port_id)
 	return reprs->reprs[port];
 }
 
+static int
+nfp_flower_reprs_reify(struct nfp_app *app, enum nfp_repr_type type,
+		       bool exists)
+{
+	struct nfp_reprs *reprs;
+	int i, err, count = 0;
+
+	reprs = rcu_dereference_protected(app->reprs[type],
+					  lockdep_is_held(&app->pf->lock));
+	if (!reprs)
+		return 0;
+
+	for (i = 0; i < reprs->num_reprs; i++)
+		if (reprs->reprs[i]) {
+			struct nfp_repr *repr = netdev_priv(reprs->reprs[i]);
+
+			err = nfp_flower_cmsg_portreify(repr, exists);
+			if (err)
+				return err;
+			count++;
+		}
+
+	return count;
+}
+
+static int
+nfp_flower_wait_repr_reify(struct nfp_app *app, atomic_t *replies, int tot_repl)
+{
+	struct nfp_flower_priv *priv = app->priv;
+	int err;
+
+	if (!tot_repl)
+		return 0;
+
+	lockdep_assert_held(&app->pf->lock);
+	err = wait_event_interruptible_timeout(priv->reify_wait_queue,
+					       atomic_read(replies) >= tot_repl,
+					       msecs_to_jiffies(10));
+	if (err <= 0) {
+		nfp_warn(app->cpp, "Not all reprs responded to reify\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
 static int
 nfp_flower_repr_netdev_open(struct nfp_app *app, struct nfp_repr *repr)
 {
@@ -138,6 +185,24 @@ nfp_flower_repr_netdev_clean(struct nfp_app *app, struct net_device *netdev)
 				     netdev_priv(netdev));
 }
 
+static void
+nfp_flower_repr_netdev_preclean(struct nfp_app *app, struct net_device *netdev)
+{
+	struct nfp_repr *repr = netdev_priv(netdev);
+	struct nfp_flower_priv *priv = app->priv;
+	atomic_t *replies = &priv->reify_replies;
+	int err;
+
+	atomic_set(replies, 0);
+	err = nfp_flower_cmsg_portreify(repr, false);
+	if (err) {
+		nfp_warn(app->cpp, "Failed to notify firmware about repr destruction\n");
+		return;
+	}
+
+	nfp_flower_wait_repr_reify(app, replies, 1);
+}
+
 static void nfp_flower_sriov_disable(struct nfp_app *app)
 {
 	struct nfp_flower_priv *priv = app->priv;
@@ -155,10 +220,11 @@ nfp_flower_spawn_vnic_reprs(struct nfp_app *app,
 {
 	u8 nfp_pcie = nfp_cppcore_pcie_unit(app->pf->cpp);
 	struct nfp_flower_priv *priv = app->priv;
+	atomic_t *replies = &priv->reify_replies;
 	enum nfp_port_type port_type;
 	struct nfp_reprs *reprs;
+	int i, err, reify_cnt;
 	const u8 queue = 0;
-	int i, err;
 
 	port_type = repr_type == NFP_REPR_TYPE_PF ? NFP_PORT_PF_PORT :
 						    NFP_PORT_VF_PORT;
@@ -209,7 +275,21 @@ nfp_flower_spawn_vnic_reprs(struct nfp_app *app,
 
 	nfp_app_reprs_set(app, repr_type, reprs);
 
+	atomic_set(replies, 0);
+	reify_cnt = nfp_flower_reprs_reify(app, repr_type, true);
+	if (reify_cnt < 0) {
+		err = reify_cnt;
+		nfp_warn(app->cpp, "Failed to notify firmware about repr creation\n");
+		goto err_reprs_remove;
+	}
+
+	err = nfp_flower_wait_repr_reify(app, replies, reify_cnt);
+	if (err)
+		goto err_reprs_remove;
+
 	return 0;
+err_reprs_remove:
+	reprs = nfp_app_reprs_set(app, repr_type, NULL);
 err_reprs_clean:
 	nfp_reprs_clean_and_free(reprs);
 	return err;
@@ -231,10 +311,11 @@ static int
 nfp_flower_spawn_phy_reprs(struct nfp_app *app, struct nfp_flower_priv *priv)
 {
 	struct nfp_eth_table *eth_tbl = app->pf->eth_tbl;
+	atomic_t *replies = &priv->reify_replies;
 	struct sk_buff *ctrl_skb;
 	struct nfp_reprs *reprs;
+	int err, reify_cnt;
 	unsigned int i;
-	int err;
 
 	ctrl_skb = nfp_flower_cmsg_mac_repr_start(app, eth_tbl->count);
 	if (!ctrl_skb)
@@ -291,16 +372,30 @@ nfp_flower_spawn_phy_reprs(struct nfp_app *app, struct nfp_flower_priv *priv)
 
 	nfp_app_reprs_set(app, NFP_REPR_TYPE_PHYS_PORT, reprs);
 
-	/* The MAC_REPR control message should be sent after the MAC
+	/* The REIFY/MAC_REPR control messages should be sent after the MAC
 	 * representors are registered using nfp_app_reprs_set().  This is
 	 * because the firmware may respond with control messages for the
 	 * MAC representors, f.e. to provide the driver with information
 	 * about their state, and without registration the driver will drop
 	 * any such messages.
 	 */
+	atomic_set(replies, 0);
+	reify_cnt = nfp_flower_reprs_reify(app, NFP_REPR_TYPE_PHYS_PORT, true);
+	if (reify_cnt < 0) {
+		err = reify_cnt;
+		nfp_warn(app->cpp, "Failed to notify firmware about repr creation\n");
+		goto err_reprs_remove;
+	}
+
+	err = nfp_flower_wait_repr_reify(app, replies, reify_cnt);
+	if (err)
+		goto err_reprs_remove;
+
 	nfp_ctrl_tx(app->ctrl, ctrl_skb);
 
 	return 0;
+err_reprs_remove:
+	reprs = nfp_app_reprs_set(app, NFP_REPR_TYPE_PHYS_PORT, NULL);
 err_reprs_clean:
 	nfp_reprs_clean_and_free(reprs);
 err_free_ctrl_skb:
@@ -417,6 +512,7 @@ static int nfp_flower_init(struct nfp_app *app)
 	app_priv->app = app;
 	skb_queue_head_init(&app_priv->cmsg_skbs);
 	INIT_WORK(&app_priv->cmsg_work, nfp_flower_cmsg_process_rx);
+	init_waitqueue_head(&app_priv->reify_wait_queue);
 
 	err = nfp_flower_metadata_init(app);
 	if (err)
@@ -474,6 +570,7 @@ const struct nfp_app_type app_flower = {
 	.vnic_clean	= nfp_flower_vnic_clean,
 
 	.repr_init	= nfp_flower_repr_netdev_init,
+	.repr_preclean	= nfp_flower_repr_netdev_preclean,
 	.repr_clean	= nfp_flower_repr_netdev_clean,
 
 	.repr_open	= nfp_flower_repr_netdev_open,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index 6e3937a0b708..332ff0fdc038 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -102,6 +102,9 @@ struct nfp_fl_stats_id {
  * @nfp_mac_off_count:	Number of MACs in address list
  * @nfp_tun_mac_nb:	Notifier to monitor link state
  * @nfp_tun_neigh_nb:	Notifier to monitor neighbour state
+ * @reify_replies:	atomically stores the number of replies received
+ *			from firmware for repr reify
+ * @reify_wait_queue:	wait queue for repr reify response counting
  */
 struct nfp_flower_priv {
 	struct nfp_app *app;
@@ -127,6 +130,8 @@ struct nfp_flower_priv {
 	int nfp_mac_off_count;
 	struct notifier_block nfp_tun_mac_nb;
 	struct notifier_block nfp_tun_neigh_nb;
+	atomic_t reify_replies;
+	wait_queue_head_t reify_wait_queue;
 };
 
 struct nfp_fl_key_ls {
-- 
2.15.1
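For readers who want to see the PORT_REIFY payload on the wire, here is a userspace sketch of the 8-byte layout added in cmsg.h. `struct portreify_msg` and `portreify_fill` are hypothetical names, htonl()/htons() stand in for cpu_to_be32()/cpu_to_be16(), and the control message header that precedes the payload is not shown.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the payload layout: 8 bytes, big-endian fields. */
struct portreify_msg {
	uint32_t portnum;	/* big endian */
	uint16_t reserved;
	uint16_t info;		/* big endian, bit 0 set when the repr exists */
};

_Static_assert(sizeof(struct portreify_msg) == 8, "unexpected padding");

#define PORTREIFY_INFO_EXIST 0x1	/* stands in for BIT(0) */

/* Serialize one reify payload into out[]. */
static void portreify_fill(uint8_t out[8], uint32_t port_id, bool exists)
{
	struct portreify_msg msg;

	assert(out);
	msg.portnum = htonl(port_id);
	msg.reserved = 0;
	msg.info = htons(exists ? PORTREIFY_INFO_EXIST : 0);
	memcpy(out, &msg, sizeof(msg));
}
```

For port ID 0x12345678 with exists set, the sketch produces the bytes 12 34 56 78 00 00 00 01.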

^ permalink raw reply related
