Netdev List

Netdev List
 help / color / mirror / Atom feed

* [RFC iproute2 0/4] Revert tc batchsize feature
From: Stephen Hemminger @ 2019-08-01  0:45 UTC (permalink / raw)
  To: jiri, chrism; +Cc: netdev, Stephen Hemminger

The batchsize feature of tc might save a few cycles but it
is a maintaince nightmare, it has uninitialized variables and
poor error handling. 

This patch set reverts back to the original state.
Please don't resubmit original code. Go back to the drawing
board and do something generic.  For example, the routing
daemons have figured out that by using multiple threads and
turning off the netlink ACK they can update millions of routes
quickly.

Stephen Hemminger (4):
  Revert "tc: Remove pointless assignments in batch()"
  Revert "tc: flush after each command in batch mode"
  Revert "tc: fix batch force option"
  Revert "tc: Add batchsize feature for filter and actions"

 tc/m_action.c  |  65 ++++++----------
 tc/tc.c        | 201 ++++---------------------------------------------
 tc/tc_common.h |   7 +-
 tc/tc_filter.c | 129 ++++++++++++-------------------
 4 files changed, 87 insertions(+), 315 deletions(-)

^ permalink raw reply

* [RFC iproute2 1/4] Revert "tc: Remove pointless assignments in batch()"
From: Stephen Hemminger @ 2019-08-01  0:45 UTC (permalink / raw)
  To: jiri, chrism; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20190801004506.9049-1-stephen@networkplumber.org>

This reverts commit 6358bbc381c6e38465838370bcbbdeb77ec3565a.
---
 tc/tc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tc/tc.c b/tc/tc.c
index 64e342dd85bf..1f23971ae4b9 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -326,11 +326,11 @@ static int batch(const char *name)
 	struct batch_buf *head = NULL, *tail = NULL, *buf_pool = NULL;
 	char *largv[100], *largv_next[100];
 	char *line, *line_next = NULL;
+	bool bs_enabled_next = false;
 	bool bs_enabled = false;
 	bool lastline = false;
 	int largc, largc_next;
 	bool bs_enabled_saved;
-	bool bs_enabled_next;
 	int batchsize = 0;
 	size_t len = 0;
 	int ret = 0;
@@ -359,6 +359,7 @@ static int batch(const char *name)
 		goto Exit;
 	largc = makeargs(line, largv, 100);
 	bs_enabled = batchsize_enabled(largc, largv);
+	bs_enabled_saved = bs_enabled;
 	do {
 		if (getcmdline(&line_next, &len, stdin) == -1)
 			lastline = true;
@@ -394,6 +395,7 @@ static int batch(const char *name)
 		len = 0;
 		bs_enabled_saved = bs_enabled;
 		bs_enabled = bs_enabled_next;
+		bs_enabled_next = false;
 
 		if (largc == 0) {
 			largc = largc_next;
-- 
2.20.1


^ permalink raw reply related

* [RFC iproute2 3/4] Revert "tc: fix batch force option"
From: Stephen Hemminger @ 2019-08-01  0:45 UTC (permalink / raw)
  To: jiri, chrism; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20190801004506.9049-1-stephen@networkplumber.org>

This reverts commit b133392468d1f404077a8f3554d1f63d48bb45e8.
---
 tc/tc.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/tc/tc.c b/tc/tc.c
index c115155b2234..b7b6bd288897 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -334,7 +334,6 @@ static int batch(const char *name)
 	int batchsize = 0;
 	size_t len = 0;
 	int ret = 0;
-	int err;
 	bool send;
 
 	batch_mode = 1;
@@ -403,9 +402,9 @@ static int batch(const char *name)
 			continue;	/* blank line */
 		}
 
-		err = do_cmd(largc, largv, tail == NULL ? NULL : tail->buf,
+		ret = do_cmd(largc, largv, tail == NULL ? NULL : tail->buf,
 			     tail == NULL ? 0 : sizeof(tail->buf));
-		if (err != 0) {
+		if (ret != 0) {
 			fprintf(stderr, "Command failed %s:%d\n", name,
 				cmdlineno - 1);
 			ret = 1;
@@ -427,17 +426,15 @@ static int batch(const char *name)
 				iov->iov_len = n->nlmsg_len;
 			}
 
-			err = rtnl_talk_iov(&rth, iovs, batchsize, NULL);
-			put_batch_bufs(&buf_pool, &head, &tail);
-			free(iovs);
-			if (err < 0) {
+			ret = rtnl_talk_iov(&rth, iovs, batchsize, NULL);
+			if (ret < 0) {
 				fprintf(stderr, "Command failed %s:%d\n", name,
-					cmdlineno - (batchsize + err) - 1);
-				ret = 1;
-				if (!force)
-					break;
+					cmdlineno - (batchsize + ret) - 1);
+				return 2;
 			}
+			put_batch_bufs(&buf_pool, &head, &tail);
 			batchsize = 0;
+			free(iovs);
 		}
 	} while (!lastline);
 
-- 
2.20.1


^ permalink raw reply related

* [RFC iproute2 2/4] Revert "tc: flush after each command in batch mode"
From: Stephen Hemminger @ 2019-08-01  0:45 UTC (permalink / raw)
  To: jiri, chrism; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20190801004506.9049-1-stephen@networkplumber.org>

This reverts commit d66fdfda71e4a30c1ca0ddb7b1a048bef30fe79e.
---
 tc/tc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tc/tc.c b/tc/tc.c
index 1f23971ae4b9..c115155b2234 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -405,7 +405,6 @@ static int batch(const char *name)
 
 		err = do_cmd(largc, largv, tail == NULL ? NULL : tail->buf,
 			     tail == NULL ? 0 : sizeof(tail->buf));
-		fflush(stdout);
 		if (err != 0) {
 			fprintf(stderr, "Command failed %s:%d\n", name,
 				cmdlineno - 1);
-- 
2.20.1


^ permalink raw reply related

* [RFC iproute2 4/4] Revert "tc: Add batchsize feature for filter and actions"
From: Stephen Hemminger @ 2019-08-01  0:45 UTC (permalink / raw)
  To: jiri, chrism; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20190801004506.9049-1-stephen@networkplumber.org>

This reverts commit 485d0c6001c4aa134b99c86913d6a7089b7b2ab0.
---
 tc/m_action.c  |  65 ++++++----------
 tc/tc.c        | 199 ++++---------------------------------------------
 tc/tc_common.h |   7 +-
 tc/tc_filter.c | 129 ++++++++++++--------------------
 4 files changed, 87 insertions(+), 313 deletions(-)

diff --git a/tc/m_action.c b/tc/m_action.c
index ab6bc0ad28ff..bdc62720879c 100644
--- a/tc/m_action.c
+++ b/tc/m_action.c
@@ -556,61 +556,40 @@ bad_val:
 	return ret;
 }
 
-struct tc_action_req {
-	struct nlmsghdr		n;
-	struct tcamsg		t;
-	char			buf[MAX_MSG];
-};
-
 static int tc_action_modify(int cmd, unsigned int flags,
-			    int *argc_p, char ***argv_p,
-			    void *buf, size_t buflen)
+			    int *argc_p, char ***argv_p)
 {
-	struct tc_action_req *req, action_req;
-	char **argv = *argv_p;
-	struct rtattr *tail;
 	int argc = *argc_p;
-	struct iovec iov;
+	char **argv = *argv_p;
 	int ret = 0;
-
-	if (buf) {
-		req = buf;
-		if (buflen < sizeof (struct tc_action_req)) {
-			fprintf(stderr, "buffer is too small: %zu\n", buflen);
-			return -1;
-		}
-	} else {
-		memset(&action_req, 0, sizeof (struct tc_action_req));
-		req = &action_req;
-	}
-
-	req->n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcamsg));
-	req->n.nlmsg_flags = NLM_F_REQUEST | flags;
-	req->n.nlmsg_type = cmd;
-	req->t.tca_family = AF_UNSPEC;
-	tail = NLMSG_TAIL(&req->n);
+	struct {
+		struct nlmsghdr         n;
+		struct tcamsg           t;
+		char                    buf[MAX_MSG];
+	} req = {
+		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcamsg)),
+		.n.nlmsg_flags = NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.t.tca_family = AF_UNSPEC,
+	};
+	struct rtattr *tail = NLMSG_TAIL(&req.n);
 
 	argc -= 1;
 	argv += 1;
-	if (parse_action(&argc, &argv, TCA_ACT_TAB, &req->n)) {
+	if (parse_action(&argc, &argv, TCA_ACT_TAB, &req.n)) {
 		fprintf(stderr, "Illegal \"action\"\n");
 		return -1;
 	}
-	tail->rta_len = (void *) NLMSG_TAIL(&req->n) - (void *) tail;
+	tail->rta_len = (void *) NLMSG_TAIL(&req.n) - (void *) tail;
 
-	*argc_p = argc;
-	*argv_p = argv;
-
-	if (buf)
-		return 0;
-
-	iov.iov_base = &req->n;
-	iov.iov_len = req->n.nlmsg_len;
-	if (rtnl_talk_iov(&rth, &iov, 1, NULL) < 0) {
+	if (rtnl_talk(&rth, &req.n, NULL) < 0) {
 		fprintf(stderr, "We have an error talking to the kernel\n");
 		ret = -1;
 	}
 
+	*argc_p = argc;
+	*argv_p = argv;
+
 	return ret;
 }
 
@@ -711,7 +690,7 @@ bad_val:
 	return ret;
 }
 
-int do_action(int argc, char **argv, void *buf, size_t buflen)
+int do_action(int argc, char **argv)
 {
 
 	int ret = 0;
@@ -721,12 +700,12 @@ int do_action(int argc, char **argv, void *buf, size_t buflen)
 		if (matches(*argv, "add") == 0) {
 			ret =  tc_action_modify(RTM_NEWACTION,
 						NLM_F_EXCL | NLM_F_CREATE,
-						&argc, &argv, buf, buflen);
+						&argc, &argv);
 		} else if (matches(*argv, "change") == 0 ||
 			  matches(*argv, "replace") == 0) {
 			ret = tc_action_modify(RTM_NEWACTION,
 					       NLM_F_CREATE | NLM_F_REPLACE,
-					       &argc, &argv, buf, buflen);
+					       &argc, &argv);
 		} else if (matches(*argv, "delete") == 0) {
 			argc -= 1;
 			argv += 1;
diff --git a/tc/tc.c b/tc/tc.c
index b7b6bd288897..a0a18f380b46 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -205,18 +205,18 @@ static void usage(void)
 		"		     -nm | -nam[es] | { -cf | -conf } path }\n");
 }
 
-static int do_cmd(int argc, char **argv, void *buf, size_t buflen)
+static int do_cmd(int argc, char **argv)
 {
 	if (matches(*argv, "qdisc") == 0)
 		return do_qdisc(argc-1, argv+1);
 	if (matches(*argv, "class") == 0)
 		return do_class(argc-1, argv+1);
 	if (matches(*argv, "filter") == 0)
-		return do_filter(argc-1, argv+1, buf, buflen);
+		return do_filter(argc-1, argv+1);
 	if (matches(*argv, "chain") == 0)
-		return do_chain(argc-1, argv+1, buf, buflen);
+		return do_chain(argc-1, argv+1);
 	if (matches(*argv, "actions") == 0)
-		return do_action(argc-1, argv+1, buf, buflen);
+		return do_action(argc-1, argv+1);
 	if (matches(*argv, "monitor") == 0)
 		return do_tcmonitor(argc-1, argv+1);
 	if (matches(*argv, "exec") == 0)
@@ -231,110 +231,11 @@ static int do_cmd(int argc, char **argv, void *buf, size_t buflen)
 	return -1;
 }
 
-#define TC_MAX_SUBC	10
-static bool batchsize_enabled(int argc, char *argv[])
-{
-	struct {
-		char *c;
-		char *subc[TC_MAX_SUBC];
-	} table[] = {
-		{ "filter", { "add", "delete", "change", "replace", NULL} },
-		{ "actions", { "add", "change", "replace", NULL} },
-		{ NULL },
-	}, *iter;
-	char *s;
-	int i;
-
-	if (argc < 2)
-		return false;
-
-	for (iter = table; iter->c; iter++) {
-		if (matches(argv[0], iter->c))
-			continue;
-		for (i = 0; i < TC_MAX_SUBC; i++) {
-			s = iter->subc[i];
-			if (s && matches(argv[1], s) == 0)
-				return true;
-		}
-	}
-
-	return false;
-}
-
-struct batch_buf {
-	struct batch_buf	*next;
-	char			buf[16420];	/* sizeof (struct nlmsghdr) +
-						   max(sizeof (struct tcmsg) +
-						   sizeof (struct tcamsg)) +
-						   MAX_MSG */
-};
-
-static struct batch_buf *get_batch_buf(struct batch_buf **pool,
-				       struct batch_buf **head,
-				       struct batch_buf **tail)
-{
-	struct batch_buf *buf;
-
-	if (*pool == NULL)
-		buf = calloc(1, sizeof(struct batch_buf));
-	else {
-		buf = *pool;
-		*pool = (*pool)->next;
-		memset(buf, 0, sizeof(struct batch_buf));
-	}
-
-	if (*head == NULL)
-		*head = *tail = buf;
-	else {
-		(*tail)->next = buf;
-		(*tail) = buf;
-	}
-
-	return buf;
-}
-
-static void put_batch_bufs(struct batch_buf **pool,
-			   struct batch_buf **head,
-			   struct batch_buf **tail)
-{
-	if (*head == NULL || *tail == NULL)
-		return;
-
-	if (*pool == NULL)
-		*pool = *head;
-	else {
-		(*tail)->next = *pool;
-		*pool = *head;
-	}
-	*head = NULL;
-	*tail = NULL;
-}
-
-static void free_batch_bufs(struct batch_buf **pool)
-{
-	struct batch_buf *buf;
-
-	for (buf = *pool; buf != NULL; buf = *pool) {
-		*pool = buf->next;
-		free(buf);
-	}
-	*pool = NULL;
-}
-
 static int batch(const char *name)
 {
-	struct batch_buf *head = NULL, *tail = NULL, *buf_pool = NULL;
-	char *largv[100], *largv_next[100];
-	char *line, *line_next = NULL;
-	bool bs_enabled_next = false;
-	bool bs_enabled = false;
-	bool lastline = false;
-	int largc, largc_next;
-	bool bs_enabled_saved;
-	int batchsize = 0;
+	char *line = NULL;
 	size_t len = 0;
 	int ret = 0;
-	bool send;
 
 	batch_mode = 1;
 	if (name && strcmp(name, "-") != 0) {
@@ -354,95 +255,25 @@ static int batch(const char *name)
 	}
 
 	cmdlineno = 0;
-	if (getcmdline(&line, &len, stdin) == -1)
-		goto Exit;
-	largc = makeargs(line, largv, 100);
-	bs_enabled = batchsize_enabled(largc, largv);
-	bs_enabled_saved = bs_enabled;
-	do {
-		if (getcmdline(&line_next, &len, stdin) == -1)
-			lastline = true;
-
-		largc_next = makeargs(line_next, largv_next, 100);
-		bs_enabled_next = batchsize_enabled(largc_next, largv_next);
-		if (bs_enabled) {
-			struct batch_buf *buf;
-
-			buf = get_batch_buf(&buf_pool, &head, &tail);
-			if (!buf) {
-				fprintf(stderr,
-					"failed to allocate batch_buf\n");
-				return -1;
-			}
-			++batchsize;
-		}
+	while (getcmdline(&line, &len, stdin) != -1) {
+		char *largv[100];
+		int largc;
 
-		/*
-		 * In batch mode, if we haven't accumulated enough commands
-		 * and this is not the last command and this command & next
-		 * command both support the batchsize feature, don't send the
-		 * message immediately.
-		 */
-		if (!lastline && bs_enabled && bs_enabled_next
-		    && batchsize != MSG_IOV_MAX)
-			send = false;
-		else
-			send = true;
-
-		line = line_next;
-		line_next = NULL;
-		len = 0;
-		bs_enabled_saved = bs_enabled;
-		bs_enabled = bs_enabled_next;
-		bs_enabled_next = false;
-
-		if (largc == 0) {
-			largc = largc_next;
-			memcpy(largv, largv_next, largc * sizeof(char *));
+		largc = makeargs(line, largv, 100);
+		if (largc == 0)
 			continue;	/* blank line */
-		}
 
-		ret = do_cmd(largc, largv, tail == NULL ? NULL : tail->buf,
-			     tail == NULL ? 0 : sizeof(tail->buf));
-		if (ret != 0) {
-			fprintf(stderr, "Command failed %s:%d\n", name,
-				cmdlineno - 1);
+		if (do_cmd(largc, largv)) {
+			fprintf(stderr, "Command failed %s:%d\n",
+				name, cmdlineno);
 			ret = 1;
 			if (!force)
 				break;
 		}
-		largc = largc_next;
-		memcpy(largv, largv_next, largc * sizeof(char *));
-
-		if (send && bs_enabled_saved) {
-			struct iovec *iov, *iovs;
-			struct batch_buf *buf;
-			struct nlmsghdr *n;
-
-			iov = iovs = malloc(batchsize * sizeof(struct iovec));
-			for (buf = head; buf != NULL; buf = buf->next, ++iov) {
-				n = (struct nlmsghdr *)&buf->buf;
-				iov->iov_base = n;
-				iov->iov_len = n->nlmsg_len;
-			}
-
-			ret = rtnl_talk_iov(&rth, iovs, batchsize, NULL);
-			if (ret < 0) {
-				fprintf(stderr, "Command failed %s:%d\n", name,
-					cmdlineno - (batchsize + ret) - 1);
-				return 2;
-			}
-			put_batch_bufs(&buf_pool, &head, &tail);
-			batchsize = 0;
-			free(iovs);
-		}
-	} while (!lastline);
+	}
 
-	free_batch_bufs(&buf_pool);
-Exit:
 	free(line);
 	rtnl_close(&rth);
-
 	return ret;
 }
 
@@ -536,7 +367,7 @@ int main(int argc, char **argv)
 		goto Exit;
 	}
 
-	ret = do_cmd(argc-1, argv+1, NULL, 0);
+	ret = do_cmd(argc-1, argv+1);
 Exit:
 	rtnl_close(&rth);
 
diff --git a/tc/tc_common.h b/tc/tc_common.h
index d8a6dfdeabd4..802fb7f01fe4 100644
--- a/tc/tc_common.h
+++ b/tc/tc_common.h
@@ -1,15 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #define TCA_BUF_MAX	(64*1024)
-#define MSG_IOV_MAX	128
 
 extern struct rtnl_handle rth;
 
 int do_qdisc(int argc, char **argv);
 int do_class(int argc, char **argv);
-int do_filter(int argc, char **argv, void *buf, size_t buflen);
-int do_chain(int argc, char **argv, void *buf, size_t buflen);
-int do_action(int argc, char **argv, void *buf, size_t buflen);
+int do_filter(int argc, char **argv);
+int do_chain(int argc, char **argv);
+int do_action(int argc, char **argv);
 int do_tcmonitor(int argc, char **argv);
 int do_exec(int argc, char **argv);
 
diff --git a/tc/tc_filter.c b/tc/tc_filter.c
index cd78c2441efa..53759a7a8876 100644
--- a/tc/tc_filter.c
+++ b/tc/tc_filter.c
@@ -58,42 +58,30 @@ struct tc_filter_req {
 	char			buf[MAX_MSG];
 };
 
-static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv,
-			    void *buf, size_t buflen)
+static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv)
 {
-	struct tc_filter_req *req, filter_req;
+	struct {
+		struct nlmsghdr	n;
+		struct tcmsg		t;
+		char			buf[MAX_MSG];
+	} req = {
+		.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg)),
+		.n.nlmsg_flags = NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.t.tcm_family = AF_UNSPEC,
+	};
 	struct filter_util *q = NULL;
-	struct tc_estimator est = {};
-	char k[FILTER_NAMESZ] = {};
-	int chain_index_set = 0;
-	char d[IFNAMSIZ] = {};
-	int protocol_set = 0;
-	__u32 block_index = 0;
-	char *fhandle = NULL;
+	__u32 prio = 0;
 	__u32 protocol = 0;
+	int protocol_set = 0;
 	__u32 chain_index;
-	struct iovec iov;
-	__u32 prio = 0;
-	int ret;
-
-	if (buf) {
-		req = buf;
-		if (buflen < sizeof (struct tc_filter_req)) {
-			fprintf(stderr, "buffer is too small: %zu\n", buflen);
-			return -1;
-		}
-	} else {
-		memset(&filter_req, 0, sizeof (struct tc_filter_req));
-		req = &filter_req;
-	}
-
-	req->n.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
-	req->n.nlmsg_flags = NLM_F_REQUEST | flags;
-	req->n.nlmsg_type = cmd;
-	req->t.tcm_family = AF_UNSPEC;
+	int chain_index_set = 0;
+	char *fhandle = NULL;
+	char  d[IFNAMSIZ] = {};
+	char  k[FILTER_NAMESZ] = {};
+	struct tc_estimator est = {};
 
-	if ((cmd == RTM_NEWTFILTER || cmd == RTM_NEWCHAIN) &&
-	    flags & NLM_F_CREATE)
+	if (cmd == RTM_NEWTFILTER && flags & NLM_F_CREATE)
 		protocol = htons(ETH_P_ALL);
 
 	while (argc > 0) {
@@ -101,53 +89,39 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv,
 			NEXT_ARG();
 			if (d[0])
 				duparg("dev", *argv);
-			if (block_index) {
-				fprintf(stderr, "Error: \"dev\" and \"block\" are mutually exlusive\n");
-				return -1;
-			}
 			strncpy(d, *argv, sizeof(d)-1);
-		} else if (matches(*argv, "block") == 0) {
-			NEXT_ARG();
-			if (block_index)
-				duparg("block", *argv);
-			if (d[0]) {
-				fprintf(stderr, "Error: \"dev\" and \"block\" are mutually exlusive\n");
-				return -1;
-			}
-			if (get_u32(&block_index, *argv, 0) || !block_index)
-				invarg("invalid block index value", *argv);
 		} else if (strcmp(*argv, "root") == 0) {
-			if (req->t.tcm_parent) {
+			if (req.t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"root\" is duplicate parent ID\n");
 				return -1;
 			}
-			req->t.tcm_parent = TC_H_ROOT;
+			req.t.tcm_parent = TC_H_ROOT;
 		} else if (strcmp(*argv, "ingress") == 0) {
-			if (req->t.tcm_parent) {
+			if (req.t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"ingress\" is duplicate parent ID\n");
 				return -1;
 			}
-			req->t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
+			req.t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
 						     TC_H_MIN_INGRESS);
 		} else if (strcmp(*argv, "egress") == 0) {
-			if (req->t.tcm_parent) {
+			if (req.t.tcm_parent) {
 				fprintf(stderr,
 					"Error: \"egress\" is duplicate parent ID\n");
 				return -1;
 			}
-			req->t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
+			req.t.tcm_parent = TC_H_MAKE(TC_H_CLSACT,
 						     TC_H_MIN_EGRESS);
 		} else if (strcmp(*argv, "parent") == 0) {
 			__u32 handle;
 
 			NEXT_ARG();
-			if (req->t.tcm_parent)
+			if (req.t.tcm_parent)
 				duparg("parent", *argv);
 			if (get_tc_classid(&handle, *argv))
 				invarg("Invalid parent ID", *argv);
-			req->t.tcm_parent = handle;
+			req.t.tcm_parent = handle;
 		} else if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
 			if (fhandle)
@@ -194,27 +168,26 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv,
 		argc--; argv++;
 	}
 
-	req->t.tcm_info = TC_H_MAKE(prio<<16, protocol);
+	req.t.tcm_info = TC_H_MAKE(prio<<16, protocol);
 
 	if (chain_index_set)
-		addattr32(&req->n, sizeof(*req), TCA_CHAIN, chain_index);
+		addattr32(&req.n, sizeof(req), TCA_CHAIN, chain_index);
 
 	if (k[0])
-		addattr_l(&req->n, sizeof(*req), TCA_KIND, k, strlen(k)+1);
+		addattr_l(&req.n, sizeof(req), TCA_KIND, k, strlen(k)+1);
 
 	if (d[0])  {
 		ll_init_map(&rth);
 
-		req->t.tcm_ifindex = ll_name_to_index(d);
-		if (!req->t.tcm_ifindex)
-			return -nodev(d);
-	} else if (block_index) {
-		req->t.tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK;
-		req->t.tcm_block_index = block_index;
+		req.t.tcm_ifindex = ll_name_to_index(d);
+		if (req.t.tcm_ifindex == 0) {
+			fprintf(stderr, "Cannot find device \"%s\"\n", d);
+			return 1;
+		}
 	}
 
 	if (q) {
-		if (q->parse_fopt(q, fhandle, argc, argv, &req->n))
+		if (q->parse_fopt(q, fhandle, argc, argv, &req.n))
 			return 1;
 	} else {
 		if (fhandle) {
@@ -233,16 +206,10 @@ static int tc_filter_modify(int cmd, unsigned int flags, int argc, char **argv,
 	}
 
 	if (est.ewma_log)
-		addattr_l(&req->n, sizeof(*req), TCA_RATE, &est, sizeof(est));
+		addattr_l(&req.n, sizeof(req), TCA_RATE, &est, sizeof(est));
 
-	if (buf)
-		return 0;
-
-	iov.iov_base = &req->n;
-	iov.iov_len = req->n.nlmsg_len;
-	ret = rtnl_talk_iov(&rth, &iov, 1, NULL);
-	if (ret < 0) {
-		fprintf(stderr, "We have an error talking to the kernel, %d\n", ret);
+	if (rtnl_talk(&rth, &req.n, NULL) < 0) {
+		fprintf(stderr, "We have an error talking to the kernel\n");
 		return 2;
 	}
 
@@ -751,22 +718,20 @@ static int tc_filter_list(int cmd, int argc, char **argv)
 	return 0;
 }
 
-int do_filter(int argc, char **argv, void *buf, size_t buflen)
+int do_filter(int argc, char **argv)
 {
 	if (argc < 1)
 		return tc_filter_list(RTM_GETTFILTER, 0, NULL);
 	if (matches(*argv, "add") == 0)
 		return tc_filter_modify(RTM_NEWTFILTER, NLM_F_EXCL|NLM_F_CREATE,
-					argc-1, argv+1, buf, buflen);
+					argc-1, argv+1);
 	if (matches(*argv, "change") == 0)
-		return tc_filter_modify(RTM_NEWTFILTER, 0, argc-1, argv+1,
-					buf, buflen);
+		return tc_filter_modify(RTM_NEWTFILTER, 0, argc-1, argv+1);
 	if (matches(*argv, "replace") == 0)
 		return tc_filter_modify(RTM_NEWTFILTER, NLM_F_CREATE, argc-1,
-					argv+1, buf, buflen);
+					argv+1);
 	if (matches(*argv, "delete") == 0)
-		return tc_filter_modify(RTM_DELTFILTER, 0, argc-1, argv+1,
-					buf, buflen);
+		return tc_filter_modify(RTM_DELTFILTER, 0,  argc-1, argv+1);
 	if (matches(*argv, "get") == 0)
 		return tc_filter_get(RTM_GETTFILTER, 0,  argc-1, argv+1);
 	if (matches(*argv, "list") == 0 || matches(*argv, "show") == 0
@@ -781,16 +746,16 @@ int do_filter(int argc, char **argv, void *buf, size_t buflen)
 	return -1;
 }
 
-int do_chain(int argc, char **argv, void *buf, size_t buflen)
+int do_chain(int argc, char **argv)
 {
 	if (argc < 1)
 		return tc_filter_list(RTM_GETCHAIN, 0, NULL);
 	if (matches(*argv, "add") == 0) {
 		return tc_filter_modify(RTM_NEWCHAIN, NLM_F_EXCL | NLM_F_CREATE,
-					argc - 1, argv + 1, buf, buflen);
+					argc - 1, argv + 1);
 	} else if (matches(*argv, "delete") == 0) {
 		return tc_filter_modify(RTM_DELCHAIN, 0,
-					argc - 1, argv + 1, buf, buflen);
+					argc - 1, argv + 1);
 	} else if (matches(*argv, "get") == 0) {
 		return tc_filter_get(RTM_GETCHAIN, 0,
 				     argc - 1, argv + 1);
-- 
2.20.1


^ permalink raw reply related

* RE: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
From: Sunil Muthuswamy @ 2019-08-01  0:50 UTC (permalink / raw)
  To: Dexuan Cui, David Miller, netdev@vger.kernel.org
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01696DDD3A3F601370701DD2BFDF0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Tuesday, July 30, 2019 6:26 PM
> To: Sunil Muthuswamy <sunilmut@microsoft.com>; David Miller <davem@davemloft.net>; netdev@vger.kernel.org
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; linux-hyperv@vger.kernel.org; linux-
> kernel@vger.kernel.org; olaf@aepfle.de; apw@canonical.com; jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> marcelo.cerri@canonical.com
> Subject: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
> 
> 
> There is a race condition for an established connection that is being closed
> by the guest: the refcnt is 4 at the end of hvs_release() (Note: here the
> 'remove_sock' is false):
> 
> 1 for the initial value;
> 1 for the sk being in the bound list;
> 1 for the sk being in the connected list;
> 1 for the delayed close_work.
> 
> After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
> decrease the refcnt to 3.
> 
> Concurrently, hvs_close_connection() runs in another thread:
>   calls vsock_remove_sock() to decrease the refcnt by 2;
>   call sock_put() to decrease the refcnt to 0, and free the sk;
>   next, the "release_sock(sk)" may hang due to use-after-free.
> 
> In the above, after hvs_release() finishes, if hvs_close_connection() runs
> faster than "__vsock_release() -> sock_put(sk)", then there is not any issue,
> because at the beginning of hvs_close_connection(), the refcnt is still 4.
> 
> The issue can be resolved if an extra reference is taken when the
> connection is established.
> 
> Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Cc: stable@vger.kernel.org
> 
> ---
> 
> Changes in v2:
>   Changed the location of the sock_hold() call.
>   Updated the changelog accordingly.
> 
>   Thanks Sunil for the suggestion!
> 
> 
> With the proper kernel debugging options enabled, first a warning can
> appear:
> 
> kworker/1:0/4467 is freeing memory ..., with a lock still held there!
> stack backtrace:
> Workqueue: events vmbus_onmessage_work [hv_vmbus]
> Call Trace:
>  dump_stack+0x67/0x90
>  debug_check_no_locks_freed.cold.52+0x78/0x7d
>  slab_free_freelist_hook+0x85/0x140
>  kmem_cache_free+0xa5/0x380
>  __sk_destruct+0x150/0x260
>  hvs_close_connection+0x24/0x30 [hv_sock]
>  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
>  process_one_work+0x241/0x600
>  worker_thread+0x3c/0x390
>  kthread+0x11b/0x140
>  ret_from_fork+0x24/0x30
> 
> and then the following release_sock(sk) can hang:
> 
> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:4467]
> ...
> irq event stamp: 62890
> CPU: 1 PID: 4467 Comm: kworker/1:0 Tainted: G        W         5.2.0+ #39
> Workqueue: events vmbus_onmessage_work [hv_vmbus]
> RIP: 0010:queued_spin_lock_slowpath+0x2b/0x1e0
> ...
> Call Trace:
>  do_raw_spin_lock+0xab/0xb0
>  release_sock+0x19/0xb0
>  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
>  process_one_work+0x241/0x600
>  worker_thread+0x3c/0x390
>  kthread+0x11b/0x140
>  ret_from_fork+0x24/0x30
> 
>  net/vmw_vsock/hyperv_transport.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index f2084e3f7aa4..9d864ebeb7b3 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -312,6 +312,11 @@ static void hvs_close_connection(struct vmbus_channel *chan)
>  	lock_sock(sk);
>  	hvs_do_close_lock_held(vsock_sk(sk), true);
>  	release_sock(sk);
> +
> +	/* Release the refcnt for the channel that's opened in
> +	 * hvs_open_connection().
> +	 */
> +	sock_put(sk);
>  }
> 
>  static void hvs_open_connection(struct vmbus_channel *chan)
> @@ -407,6 +412,9 @@ static void hvs_open_connection(struct vmbus_channel *chan)
>  	}
> 
>  	set_per_channel_state(chan, conn_from_host ? new : sk);
> +
> +	/* This reference will be dropped by hvs_close_connection(). */
> +	sock_hold(conn_from_host ? new : sk);
>  	vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
> 
>  	/* Set the pending send size to max packet size to always get
> --
> 2.19.1

Reviewed-by: Sunil Muthuswamy <sunilmut@microsoft.com>

^ permalink raw reply

* Re: [RFC iproute2 0/4] Revert tc batchsize feature
From: Dave Taht @ 2019-08-01  1:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jiří Pírko, chrism,
	Linux Kernel Network Developers
In-Reply-To: <20190801004506.9049-1-stephen@networkplumber.org>

Does this fix my longstanding issue with piping commands into it? :P

( https://github.com/tohojo/flent/issues/146 )

On Wed, Jul 31, 2019 at 5:46 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> The batchsize feature of tc might save a few cycles but it
> is a maintaince nightmare, it has uninitialized variables and
> poor error handling.
>
> This patch set reverts back to the original state.
> Please don't resubmit original code. Go back to the drawing
> board and do something generic.  For example, the routing
> daemons have figured out that by using multiple threads and
> turning off the netlink ACK they can update millions of routes
> quickly.
>
> Stephen Hemminger (4):
>   Revert "tc: Remove pointless assignments in batch()"
>   Revert "tc: flush after each command in batch mode"
>   Revert "tc: fix batch force option"
>   Revert "tc: Add batchsize feature for filter and actions"
>
>  tc/m_action.c  |  65 ++++++----------
>  tc/tc.c        | 201 ++++---------------------------------------------
>  tc/tc_common.h |   7 +-
>  tc/tc_filter.c | 129 ++++++++++++-------------------
>  4 files changed, 87 insertions(+), 315 deletions(-)
>


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply

* Re: [PATCH -next] iwlwifi: dbg: work around clang bug by marking debug strings static
From: Michael Ellerman @ 2019-08-01  1:21 UTC (permalink / raw)
  To: Nick Desaulniers, kvalo, Luca Coelho
  Cc: Nick Desaulniers, Arnd Bergmann, Nathan Chancellor, Johannes Berg,
	Emmanuel Grumbach, Intel Linux Wireless, David S. Miller,
	Shahar S Matityahu, Sara Sharon, linux-wireless, netdev,
	linux-kernel, clang-built-linux
In-Reply-To: <20190712001708.170259-1-ndesaulniers@google.com>

Nick Desaulniers <ndesaulniers@google.com> writes:
> Commit r353569 in prerelease Clang-9 is producing a linkage failure:
>
> ld: drivers/net/wireless/intel/iwlwifi/fw/dbg.o:
> in function `_iwl_fw_dbg_apply_point':
> dbg.c:(.text+0x827a): undefined reference to `__compiletime_assert_2387'

This breakage is also seen in older GCC versions (v4.6.3 at least):

  drivers/net/wireless/intel/iwlwifi/fw/dbg.c: In function 'iwl_fw_dbg_info_apply.isra.10':
  drivers/net/wireless/intel/iwlwifi/fw/dbg.c:2445:3: error: call to '__compiletime_assert_2446' declared with attribute error: BUILD_BUG_ON failed: err_str[sizeof(err_str) - 2] != '\n'
  drivers/net/wireless/intel/iwlwifi/fw/dbg.c:2451:3: error: call to '__compiletime_assert_2452' declared with attribute error: BUILD_BUG_ON failed: err_str[sizeof(err_str) - 2] != '\n'
  drivers/net/wireless/intel/iwlwifi/fw/dbg.c: In function '_iwl_fw_dbg_apply_point':
  drivers/net/wireless/intel/iwlwifi/fw/dbg.c:2789:5: error: call to '__compiletime_assert_2790' declared with attribute error: BUILD_BUG_ON failed: invalid_ap_str[sizeof(invalid_ap_str) - 2] != '\n'
  drivers/net/wireless/intel/iwlwifi/fw/dbg.c:2800:5: error: call to '__compiletime_assert_2801' declared with attribute error: BUILD_BUG_ON failed: invalid_ap_str[sizeof(invalid_ap_str) - 2] != '\n'
 
  http://kisskb.ellerman.id.au/kisskb/buildresult/13902155/


Luca, you said this was already fixed in your internal tree, and the fix
would appear soon in next, but I don't see anything in linux-next?

cheers

^ permalink raw reply

* [PATCH] enetc: Select PHYLIB while CONFIG_FSL_ENETC_VF is set
From: YueHaibing @ 2019-08-01  1:24 UTC (permalink / raw)
  To: claudiu.manoil, davem; +Cc: linux-kernel, netdev, YueHaibing

Like FSL_ENETC, when CONFIG_FSL_ENETC_VF is set,
we should select PHYLIB, otherwise building still fails:

drivers/net/ethernet/freescale/enetc/enetc.o: In function `enetc_open':
enetc.c:(.text+0x2744): undefined reference to `phy_start'
enetc.c:(.text+0x282c): undefined reference to `phy_disconnect'
drivers/net/ethernet/freescale/enetc/enetc.o: In function `enetc_close':
enetc.c:(.text+0x28f8): undefined reference to `phy_stop'
enetc.c:(.text+0x2904): undefined reference to `phy_disconnect'
drivers/net/ethernet/freescale/enetc/enetc_ethtool.o:(.rodata+0x3f8): undefined reference to `phy_ethtool_get_link_ksettings'
drivers/net/ethernet/freescale/enetc/enetc_ethtool.o:(.rodata+0x400): undefined reference to `phy_ethtool_set_link_ksettings'

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/ethernet/freescale/enetc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/freescale/enetc/Kconfig b/drivers/net/ethernet/freescale/enetc/Kconfig
index 46fdf36b..04a59db 100644
--- a/drivers/net/ethernet/freescale/enetc/Kconfig
+++ b/drivers/net/ethernet/freescale/enetc/Kconfig
@@ -13,6 +13,7 @@ config FSL_ENETC
 config FSL_ENETC_VF
 	tristate "ENETC VF driver"
 	depends on PCI && PCI_MSI && (ARCH_LAYERSCAPE || COMPILE_TEST)
+	select PHYLIB
 	help
 	  This driver supports NXP ENETC gigabit ethernet controller PCIe
 	  virtual function (VF) devices enabled by the ENETC PF driver.
-- 
2.7.4



^ permalink raw reply related

* Re: [PATCH 2/2] cnic: Use refcount_t for refcount
From: Chuhong Yuan @ 2019-08-01  2:22 UTC (permalink / raw)
  To: Michael Chan; +Cc: David S . Miller, Netdev, open list
In-Reply-To: <CACKFLinuFebDgJN=BgK5e-bNaFqNpk61teva0=2xMH6R_iT39g@mail.gmail.com>

Michael Chan <michael.chan@broadcom.com> 于2019年8月1日周四 上午1:58写道：
>
> On Wed, Jul 31, 2019 at 5:22 AM Chuhong Yuan <hslester96@gmail.com> wrote:
>
> >  static void cnic_ctx_wr(struct cnic_dev *dev, u32 cid_addr, u32 off, u32 val)
> > @@ -494,7 +494,7 @@ int cnic_register_driver(int ulp_type, struct cnic_ulp_ops *ulp_ops)
> >         }
> >         read_unlock(&cnic_dev_lock);
> >
> > -       atomic_set(&ulp_ops->ref_count, 0);
> > +       refcount_set(&ulp_ops->ref_count, 0);
> >         rcu_assign_pointer(cnic_ulp_tbl[ulp_type], ulp_ops);
> >         mutex_unlock(&cnic_lock);
> >
>
> Willem's comment applies here too.  The driver needs to be modified to
> count from 1 to use refcount_t.
>
> Thanks.

I have revised this problem but find the other two refcounts -
cnic_dev::ref_count
and cnic_sock::ref_count have no set.
I am not sure where to initialize them to 1.

Besides, should ulp_ops->ref_count be set to 0 when unregistered?

^ permalink raw reply

* [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev

From: wenxu <wenxu@ucloud.cn>

This series patch make nftables offload support the vlan and
tunnel device offload through indr-block architecture.

The first four patches mv tc indr block to flow offload and
rename to flow-indr-block.
Because the new flow-indr-block can't get the tcf_block
directly. The fifthe patch provide a callback list to get 
flow_block of each subsystem immediately when the device
register and contain a block.
The last patch make nf_tables_offload support flow-indr-block.

wenxu (6):
  cls_api: modify the tc_indr_block_ing_cmd parameters.
  cls_api: replace block with flow_block in tc_indr_block_dev
  cls_api: add flow_indr_block_call function
  flow_offload: move tc indirect block to flow offload
  flow_offload: support get flow_block immediately
  netfilter: nf_tables_offload: support indr block call

 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  10 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  11 +-
 include/net/flow_offload.h                         |  48 ++++
 include/net/netfilter/nf_tables_offload.h          |   2 +
 include/net/pkt_cls.h                              |  35 ---
 include/net/sch_generic.h                          |   3 -
 net/core/flow_offload.c                            | 251 ++++++++++++++++++++
 net/netfilter/nf_tables_api.c                      |   7 +
 net/netfilter/nf_tables_offload.c                  | 156 +++++++++++--
 net/sched/cls_api.c                                | 255 ++++-----------------
 10 files changed, 497 insertions(+), 281 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* [PATCH net-next v5 1/6] cls_api: modify the tc_indr_block_ing_cmd parameters.
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
In-Reply-To: <1564628627-10021-1-git-send-email-wenxu@ucloud.cn>

From: wenxu <wenxu@ucloud.cn>

This patch make tc_indr_block_ing_cmd can't access struct
tc_indr_block_dev and tc_indr_block_cb.

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: new patch

 net/sched/cls_api.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3565d9a..2e3b58d 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -677,26 +677,28 @@ static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
 static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
-static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
-				  struct tc_indr_block_cb *indr_block_cb,
+static void tc_indr_block_ing_cmd(struct net_device *dev,
+				  struct tcf_block *block,
+				  tc_indr_block_bind_cb_t *cb,
+				  void *cb_priv,
 				  enum flow_block_command command)
 {
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
-		.net		= dev_net(indr_dev->dev),
-		.block_shared	= tcf_block_non_null_shared(indr_dev->block),
+		.net		= dev_net(dev),
+		.block_shared	= tcf_block_non_null_shared(block),
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	if (!indr_dev->block)
+	if (!block)
 		return;
 
-	bo.block = &indr_dev->block->flow_block;
+	bo.block = &block->flow_block;
 
-	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-			  &bo);
-	tcf_block_setup(indr_dev->block, &bo);
+	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
+
+	tcf_block_setup(block, &bo);
 }
 
 int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
@@ -715,7 +717,8 @@ int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 	if (err)
 		goto err_dev_put;
 
-	tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_BIND);
+	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, cb_priv,
+			      FLOW_BLOCK_BIND);
 	return 0;
 
 err_dev_put:
@@ -752,7 +755,8 @@ void __tc_indr_block_cb_unregister(struct net_device *dev,
 		return;
 
 	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_UNBIND);
+	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, indr_block_cb->cb_priv,
+			      FLOW_BLOCK_UNBIND);
 	tc_indr_block_cb_del(indr_block_cb);
 	tc_indr_block_dev_put(indr_dev);
 }
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net-next v5 4/6] flow_offload: move tc indirect block to flow offload
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
In-Reply-To: <1564628627-10021-1-git-send-email-wenxu@ucloud.cn>

From: wenxu <wenxu@ucloud.cn>

move tc indirect block to flow_offload and rename
it to flow indirect block.The nf_tables can use the
indr block architecture.

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: make flow_indr_block_cb/dev in c file
 
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  10 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  11 +-
 include/net/flow_offload.h                         |  31 +++
 include/net/pkt_cls.h                              |  35 ---
 include/net/sch_generic.h                          |   3 -
 net/core/flow_offload.c                            | 218 +++++++++++++++++++
 net/sched/cls_api.c                                | 241 +--------------------
 7 files changed, 265 insertions(+), 284 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 7f747cb..074573b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -785,9 +785,9 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
 {
 	int err;
 
-	err = __tc_indr_block_cb_register(netdev, rpriv,
-					  mlx5e_rep_indr_setup_tc_cb,
-					  rpriv);
+	err = __flow_indr_block_cb_register(netdev, rpriv,
+					    mlx5e_rep_indr_setup_tc_cb,
+					    rpriv);
 	if (err) {
 		struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
 
@@ -800,8 +800,8 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
 static void mlx5e_rep_indr_unregister_block(struct mlx5e_rep_priv *rpriv,
 					    struct net_device *netdev)
 {
-	__tc_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
-				      rpriv);
+	__flow_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
+					rpriv);
 }
 
 static int mlx5e_nic_rep_netdevice_event(struct notifier_block *nb,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index e209f15..7b490db 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1479,16 +1479,17 @@ int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
 		return NOTIFY_OK;
 
 	if (event == NETDEV_REGISTER) {
-		err = __tc_indr_block_cb_register(netdev, app,
-						  nfp_flower_indr_setup_tc_cb,
-						  app);
+		err = __flow_indr_block_cb_register(netdev, app,
+						    nfp_flower_indr_setup_tc_cb,
+						    app);
 		if (err)
 			nfp_flower_cmsg_warn(app,
 					     "Indirect block reg failed - %s\n",
 					     netdev->name);
 	} else if (event == NETDEV_UNREGISTER) {
-		__tc_indr_block_cb_unregister(netdev,
-					      nfp_flower_indr_setup_tc_cb, app);
+		__flow_indr_block_cb_unregister(netdev,
+						nfp_flower_indr_setup_tc_cb,
+						app);
 	}
 
 	return NOTIFY_OK;
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 00b9aab..c8d60a6 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -4,6 +4,7 @@
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <net/flow_dissector.h>
+#include <linux/rhashtable.h>
 
 struct flow_match {
 	struct flow_dissector	*dissector;
@@ -366,4 +367,34 @@ static inline void flow_block_init(struct flow_block *flow_block)
 	INIT_LIST_HEAD(&flow_block->cb_list);
 }
 
+typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
+				      enum tc_setup_type type, void *type_data);
+
+typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
+				       struct flow_block *flow_block,
+				       flow_indr_block_bind_cb_t *cb,
+				       void *cb_priv,
+				       enum flow_block_command command);
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				  flow_indr_block_bind_cb_t *cb,
+				  void *cb_ident);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_ident);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_ident);
+
+void flow_indr_block_call(struct flow_block *flow_block,
+			  struct net_device *dev,
+			  flow_indr_block_ing_cmd_t *ing_cmd_cb,
+			  struct flow_block_offload *bo,
+			  enum flow_block_command command);
+
 #endif /* _NET_FLOW_OFFLOAD_H */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809..0790a4e 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -70,15 +70,6 @@ static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 	return block->q;
 }
 
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident);
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident);
-
 int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 		 struct tcf_result *res, bool compat_mode);
 
@@ -137,32 +128,6 @@ void tc_setup_cb_block_unregister(struct tcf_block *block, flow_setup_cb_t *cb,
 {
 }
 
-static inline
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	return 0;
-}
-
-static inline
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	return 0;
-}
-
-static inline
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
-static inline
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
 static inline int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			       struct tcf_result *res, bool compat_mode)
 {
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 6b6b012..d9f359a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -23,9 +23,6 @@
 struct module;
 struct bpf_flow_keys;
 
-typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
-				    enum tc_setup_type type, void *type_data);
-
 struct qdisc_rate_table {
 	struct tc_ratespec rate;
 	u32		data[256];
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index d63b970..a1fdfa4 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -2,6 +2,7 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <net/flow_offload.h>
+#include <linux/rtnetlink.h>
 
 struct flow_rule *flow_rule_alloc(unsigned int num_actions)
 {
@@ -280,3 +281,220 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 	}
 }
 EXPORT_SYMBOL(flow_block_cb_setup_simple);
+
+static struct rhashtable indr_setup_block_ht;
+
+struct flow_indr_block_cb {
+	struct list_head list;
+	void *cb_priv;
+	flow_indr_block_bind_cb_t *cb;
+	void *cb_ident;
+};
+
+struct flow_indr_block_dev {
+	struct rhash_head ht_node;
+	struct net_device *dev;
+	unsigned int refcnt;
+	struct list_head cb_list;
+	flow_indr_block_ing_cmd_t *ing_cmd_cb;
+	struct flow_block *flow_block;
+};
+
+static const struct rhashtable_params flow_indr_setup_block_ht_params = {
+	.key_offset	= offsetof(struct flow_indr_block_dev, dev),
+	.head_offset	= offsetof(struct flow_indr_block_dev, ht_node),
+	.key_len	= sizeof(struct net_device *),
+};
+
+static struct flow_indr_block_dev *
+flow_indr_block_dev_lookup(struct net_device *dev)
+{
+	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
+				      flow_indr_setup_block_ht_params);
+}
+
+static struct flow_indr_block_dev *
+flow_indr_block_dev_get(struct net_device *dev)
+{
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (indr_dev)
+		goto inc_ref;
+
+	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
+	if (!indr_dev)
+		return NULL;
+
+	INIT_LIST_HEAD(&indr_dev->cb_list);
+	indr_dev->dev = dev;
+	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+				   flow_indr_setup_block_ht_params)) {
+		kfree(indr_dev);
+		return NULL;
+	}
+
+inc_ref:
+	indr_dev->refcnt++;
+	return indr_dev;
+}
+
+static void flow_indr_block_dev_put(struct flow_indr_block_dev *indr_dev)
+{
+	if (--indr_dev->refcnt)
+		return;
+
+	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+			       flow_indr_setup_block_ht_params);
+	kfree(indr_dev);
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_lookup(struct flow_indr_block_dev *indr_dev,
+			  flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		if (indr_block_cb->cb == cb &&
+		    indr_block_cb->cb_ident == cb_ident)
+			return indr_block_cb;
+	return NULL;
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_add(struct flow_indr_block_dev *indr_dev, void *cb_priv,
+		       flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+
+	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	if (indr_block_cb)
+		return ERR_PTR(-EEXIST);
+
+	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
+	if (!indr_block_cb)
+		return ERR_PTR(-ENOMEM);
+
+	indr_block_cb->cb_priv = cb_priv;
+	indr_block_cb->cb = cb;
+	indr_block_cb->cb_ident = cb_ident;
+	list_add(&indr_block_cb->list, &indr_dev->cb_list);
+
+	return indr_block_cb;
+}
+
+static void flow_indr_block_cb_del(struct flow_indr_block_cb *indr_block_cb)
+{
+	list_del(&indr_block_cb->list);
+	kfree(indr_block_cb);
+}
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				  flow_indr_block_bind_cb_t *cb,
+				  void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+	int err;
+
+	indr_dev = flow_indr_block_dev_get(dev);
+	if (!indr_dev)
+		return -ENOMEM;
+
+	indr_block_cb = flow_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
+	err = PTR_ERR_OR_ZERO(indr_block_cb);
+	if (err)
+		goto err_dev_put;
+
+	if (indr_dev->ing_cmd_cb)
+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
+				     indr_block_cb->cb, indr_block_cb->cb_priv,
+				     FLOW_BLOCK_BIND);
+
+	return 0;
+
+err_dev_put:
+	flow_indr_block_dev_put(indr_dev);
+	return err;
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_register);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				flow_indr_block_bind_cb_t *cb,
+				void *cb_ident)
+{
+	int err;
+
+	rtnl_lock();
+	err = __flow_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
+	rtnl_unlock();
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_register);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	if (!indr_block_cb)
+		return;
+
+	/* Send unbind message if required to free any block cbs. */
+	if (indr_dev->ing_cmd_cb)
+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
+				     indr_block_cb->cb, indr_block_cb->cb_priv,
+				     FLOW_BLOCK_UNBIND);
+
+	flow_indr_block_cb_del(indr_block_cb);
+	flow_indr_block_dev_put(indr_dev);
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_unregister);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_ident)
+{
+	rtnl_lock();
+	__flow_indr_block_cb_unregister(dev, cb, cb_ident);
+	rtnl_unlock();
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_unregister);
+
+void flow_indr_block_call(struct flow_block *flow_block,
+			  struct net_device *dev,
+			  flow_indr_block_ing_cmd_t *ing_cmd_cb,
+			  struct flow_block_offload *bo,
+			  enum flow_block_command command)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? flow_block : NULL;
+	indr_dev->ing_cmd_cb = command == FLOW_BLOCK_BIND ? ing_cmd_cb : NULL;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+				  bo);
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_call);
+
+static int __init init_flow_indr_rhashtable(void)
+{
+	return rhashtable_init(&indr_setup_block_ht,
+			       &flow_indr_setup_block_ht_params);
+}
+subsys_initcall(init_flow_indr_rhashtable);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 617b098..bd5e591 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -37,6 +37,7 @@
 #include <net/tc_act/tc_skbedit.h>
 #include <net/tc_act/tc_ct.h>
 #include <net/tc_act/tc_mpls.h>
+#include <net/flow_offload.h>
 
 extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
 
@@ -545,149 +546,12 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
 	}
 }
 
-static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
-{
-	const struct Qdisc_class_ops *cops;
-	struct Qdisc *qdisc;
-
-	if (!dev_ingress_queue(dev))
-		return NULL;
-
-	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
-	if (!qdisc)
-		return NULL;
-
-	cops = qdisc->ops->cl_ops;
-	if (!cops)
-		return NULL;
-
-	if (!cops->tcf_block)
-		return NULL;
-
-	return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
-}
-
-static struct rhashtable indr_setup_block_ht;
-
-struct tc_indr_block_dev {
-	struct rhash_head ht_node;
-	struct net_device *dev;
-	unsigned int refcnt;
-	struct list_head cb_list;
-	struct flow_block *flow_block;
-};
-
-struct tc_indr_block_cb {
-	struct list_head list;
-	void *cb_priv;
-	tc_indr_block_bind_cb_t *cb;
-	void *cb_ident;
-};
-
-static const struct rhashtable_params tc_indr_setup_block_ht_params = {
-	.key_offset	= offsetof(struct tc_indr_block_dev, dev),
-	.head_offset	= offsetof(struct tc_indr_block_dev, ht_node),
-	.key_len	= sizeof(struct net_device *),
-};
-
-static struct tc_indr_block_dev *
-tc_indr_block_dev_lookup(struct net_device *dev)
-{
-	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
-				      tc_indr_setup_block_ht_params);
-}
-
-static void tc_indr_get_default_block(struct tc_indr_block_dev *indr_dev)
-{
-	struct tcf_block *block = tc_dev_ingress_block(indr_dev->dev);
-
-	if (block)
-		indr_dev->flow_block = &block->flow_block;
-}
-
-static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
-{
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (indr_dev)
-		goto inc_ref;
-
-	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
-	if (!indr_dev)
-		return NULL;
-
-	INIT_LIST_HEAD(&indr_dev->cb_list);
-	indr_dev->dev = dev;
-	tc_indr_get_default_block(indr_dev);
-	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-				   tc_indr_setup_block_ht_params)) {
-		kfree(indr_dev);
-		return NULL;
-	}
-
-inc_ref:
-	indr_dev->refcnt++;
-	return indr_dev;
-}
-
-static void tc_indr_block_dev_put(struct tc_indr_block_dev *indr_dev)
-{
-	if (--indr_dev->refcnt)
-		return;
-
-	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-			       tc_indr_setup_block_ht_params);
-	kfree(indr_dev);
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_lookup(struct tc_indr_block_dev *indr_dev,
-			tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		if (indr_block_cb->cb == cb &&
-		    indr_block_cb->cb_ident == cb_ident)
-			return indr_block_cb;
-	return NULL;
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_add(struct tc_indr_block_dev *indr_dev, void *cb_priv,
-		     tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
-	if (indr_block_cb)
-		return ERR_PTR(-EEXIST);
-
-	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
-	if (!indr_block_cb)
-		return ERR_PTR(-ENOMEM);
-
-	indr_block_cb->cb_priv = cb_priv;
-	indr_block_cb->cb = cb;
-	indr_block_cb->cb_ident = cb_ident;
-	list_add(&indr_block_cb->list, &indr_dev->cb_list);
-
-	return indr_block_cb;
-}
-
-static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
-{
-	list_del(&indr_block_cb->list);
-	kfree(indr_block_cb);
-}
-
 static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
 static void tc_indr_block_ing_cmd(struct net_device *dev,
 				  struct flow_block *flow_block,
-				  tc_indr_block_bind_cb_t *cb,
+				  flow_indr_block_bind_cb_t *cb,
 				  void *cb_priv,
 				  enum flow_block_command command)
 {
@@ -712,96 +576,6 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	tcf_block_setup(block, &bo);
 }
 
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-	int err;
-
-	indr_dev = tc_indr_block_dev_get(dev);
-	if (!indr_dev)
-		return -ENOMEM;
-
-	indr_block_cb = tc_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
-	err = PTR_ERR_OR_ZERO(indr_block_cb);
-	if (err)
-		goto err_dev_put;
-
-	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, cb, cb_priv,
-			      FLOW_BLOCK_BIND);
-	return 0;
-
-err_dev_put:
-	tc_indr_block_dev_put(indr_dev);
-	return err;
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_register);
-
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	int err;
-
-	rtnl_lock();
-	err = __tc_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
-	rtnl_unlock();
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_register);
-
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, indr_block_cb->cb,
-						indr_block_cb->cb_ident);
-	if (!indr_block_cb)
-		return;
-
-	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, indr_block_cb->cb,
-			      indr_block_cb->cb_priv, FLOW_BLOCK_UNBIND);
-	tc_indr_block_cb_del(indr_block_cb);
-	tc_indr_block_dev_put(indr_dev);
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_unregister);
-
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	rtnl_lock();
-	__tc_indr_block_cb_unregister(dev, cb, cb_ident);
-	rtnl_unlock();
-}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
-
-static void flow_indr_block_call(struct flow_block *flow_block,
-				 struct net_device *dev,
-				 struct flow_block_offload *bo,
-				 enum flow_block_command command)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? flow_block : NULL;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-				  bo);
-}
-
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
@@ -817,7 +591,9 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	flow_indr_block_call(&block->flow_block, dev, &bo, command);
+	flow_indr_block_call(&block->flow_block, dev, tc_indr_block_ing_cmd,
+			     &bo, command);
+
 	tcf_block_setup(block, &bo);
 }
 
@@ -3382,11 +3158,6 @@ static int __init tc_filter_init(void)
 	if (err)
 		goto err_register_pernet_subsys;
 
-	err = rhashtable_init(&indr_setup_block_ht,
-			      &tc_indr_setup_block_ht_params);
-	if (err)
-		goto err_rhash_setup_block_ht;
-
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
@@ -3400,8 +3171,6 @@ static int __init tc_filter_init(void)
 
 	return 0;
 
-err_rhash_setup_block_ht:
-	unregister_pernet_subsys(&tcf_net_ops);
 err_register_pernet_subsys:
 	destroy_workqueue(tc_filter_wq);
 	return err;
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
In-Reply-To: <1564628627-10021-1-git-send-email-wenxu@ucloud.cn>

From: wenxu <wenxu@ucloud.cn>

The new flow-indr-block can't get the tcf_block
directly. It provide a callback list to find the flow_block immediately
when the device register and contain a ingress block.

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: add get_block_cb_list for both nft and tc

 include/net/flow_offload.h | 17 +++++++++++++++++
 net/core/flow_offload.c    | 33 +++++++++++++++++++++++++++++++++
 net/sched/cls_api.c        | 44 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index c8d60a6..db04e3f 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -376,6 +376,23 @@ typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
 				       void *cb_priv,
 				       enum flow_block_command command);
 
+struct flow_indr_block_info {
+	struct flow_block *flow_block;
+	flow_indr_block_ing_cmd_t *ing_cmd_cb;
+};
+
+typedef bool flow_indr_get_default_block_t(struct net_device *dev,
+					    struct flow_indr_block_info *info);
+
+struct flow_indr_get_block_entry {
+	flow_indr_get_default_block_t *get_block_cb;
+	struct list_head	list;
+};
+
+void flow_indr_add_default_block_cb(struct flow_indr_get_block_entry *entry);
+
+void flow_indr_del_default_block_cb(struct flow_indr_get_block_entry *entry);
+
 int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 				  flow_indr_block_bind_cb_t *cb,
 				  void *cb_ident);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index a1fdfa4..8ff7a75b 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -282,6 +282,8 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 }
 EXPORT_SYMBOL(flow_block_cb_setup_simple);
 
+static LIST_HEAD(get_default_block_cb_list);
+
 static struct rhashtable indr_setup_block_ht;
 
 struct flow_indr_block_cb {
@@ -313,6 +315,24 @@ struct flow_indr_block_dev {
 				      flow_indr_setup_block_ht_params);
 }
 
+static void flow_get_default_block(struct flow_indr_block_dev *indr_dev)
+{
+	struct flow_indr_get_block_entry *entry_cb;
+	struct flow_indr_block_info info;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(entry_cb, &get_default_block_cb_list, list) {
+		if (entry_cb->get_block_cb(indr_dev->dev, &info)) {
+			indr_dev->flow_block = info.flow_block;
+			indr_dev->ing_cmd_cb = info.ing_cmd_cb;
+			break;
+		}
+	}
+
+	rcu_read_unlock();
+}
+
 static struct flow_indr_block_dev *
 flow_indr_block_dev_get(struct net_device *dev)
 {
@@ -328,6 +348,7 @@ struct flow_indr_block_dev {
 
 	INIT_LIST_HEAD(&indr_dev->cb_list);
 	indr_dev->dev = dev;
+	flow_get_default_block(indr_dev);
 	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
 				   flow_indr_setup_block_ht_params)) {
 		kfree(indr_dev);
@@ -492,6 +513,18 @@ void flow_indr_block_call(struct flow_block *flow_block,
 }
 EXPORT_SYMBOL_GPL(flow_indr_block_call);
 
+void flow_indr_add_default_block_cb(struct flow_indr_get_block_entry *entry)
+{
+	list_add_tail_rcu(&entry->list, &get_default_block_cb_list);
+}
+EXPORT_SYMBOL_GPL(flow_indr_add_default_block_cb);
+
+void flow_indr_del_default_block_cb(struct flow_indr_get_block_entry *entry)
+{
+	list_del_rcu(&entry->list);
+}
+EXPORT_SYMBOL_GPL(flow_indr_del_default_block_cb);
+
 static int __init init_flow_indr_rhashtable(void)
 {
 	return rhashtable_init(&indr_setup_block_ht,
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index bd5e591..8bf918c 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -576,6 +576,43 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	tcf_block_setup(block, &bo);
 }
 
+static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
+{
+	const struct Qdisc_class_ops *cops;
+	struct Qdisc *qdisc;
+
+	if (!dev_ingress_queue(dev))
+		return NULL;
+
+	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
+	if (!qdisc)
+		return NULL;
+
+	cops = qdisc->ops->cl_ops;
+	if (!cops)
+		return NULL;
+
+	if (!cops->tcf_block)
+		return NULL;
+
+	return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
+}
+
+static bool tc_indr_get_default_block(struct net_device *dev,
+				      struct flow_indr_block_info *info)
+{
+	struct tcf_block *block = tc_dev_ingress_block(dev);
+
+	if (block) {
+		info->flow_block = &block->flow_block;
+		info->ing_cmd_cb = tc_indr_block_ing_cmd;
+
+		return true;
+	}
+
+	return false;
+}
+
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
@@ -3146,6 +3183,11 @@ static void __net_exit tcf_net_exit(struct net *net)
 	.size = sizeof(struct tcf_net),
 };
 
+static struct flow_indr_get_block_entry get_block_entry = {
+	.get_block_cb = tc_indr_get_default_block,
+	.list = LIST_HEAD_INIT(get_block_entry.list),
+};
+
 static int __init tc_filter_init(void)
 {
 	int err;
@@ -3158,6 +3200,8 @@ static int __init tc_filter_init(void)
 	if (err)
 		goto err_register_pernet_subsys;
 
+	flow_indr_add_default_block_cb(&get_block_entry);
+
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net-next 3/6] cls_api: add flow_indr_block_call function
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
In-Reply-To: <1564628627-10021-1-git-send-email-wenxu@ucloud.cn>

From: wenxu <wenxu@ucloud.cn>

This patch make indr_block_call don't access struct tc_indr_block_cb
and tc_indr_block_dev directly

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: new patch

 net/sched/cls_api.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index f9643fa..617b098 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -783,13 +783,30 @@ void tc_indr_block_cb_unregister(struct net_device *dev,
 }
 EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
 
+static void flow_indr_block_call(struct flow_block *flow_block,
+				 struct net_device *dev,
+				 struct flow_block_offload *bo,
+				 enum flow_block_command command)
+{
+	struct tc_indr_block_cb *indr_block_cb;
+	struct tc_indr_block_dev *indr_dev;
+
+	indr_dev = tc_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? flow_block : NULL;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+				  bo);
+}
+
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
 			       struct netlink_ext_ack *extack)
 {
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= ei->binder_type,
@@ -800,17 +817,7 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_dev->flow_block = command == FLOW_BLOCK_BIND ?
-					  &block->flow_block : NULL;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-				  &bo);
-
+	flow_indr_block_call(&block->flow_block, dev, &bo, command);
 	tcf_block_setup(block, &bo);
 }
 
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
In-Reply-To: <1564628627-10021-1-git-send-email-wenxu@ucloud.cn>

From: wenxu <wenxu@ucloud.cn>

nftable support indr-block call. It makes nftable an offload vlan
and tunnel device.

nft add table netdev firewall
nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: add nft_get_default_block

 include/net/netfilter/nf_tables_offload.h |   2 +
 net/netfilter/nf_tables_api.c             |   7 ++
 net/netfilter/nf_tables_offload.c         | 156 +++++++++++++++++++++++++-----
 3 files changed, 141 insertions(+), 24 deletions(-)

diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index 3196663..ac69087 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -63,6 +63,8 @@ struct nft_flow_rule {
 struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule);
 void nft_flow_rule_destroy(struct nft_flow_rule *flow);
 int nft_flow_rule_offload_commit(struct net *net);
+bool nft_indr_get_default_block(struct net_device *dev,
+				struct flow_indr_block_info *info);
 
 #define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg)		\
 	(__reg)->base_offset	=					\
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 605a7cf..6a1d0b2 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7593,6 +7593,11 @@ static void __net_exit nf_tables_exit_net(struct net *net)
 	.exit	= nf_tables_exit_net,
 };
 
+static struct flow_indr_get_block_entry get_block_entry = {
+	.get_block_cb = nft_indr_get_default_block,
+	.list = LIST_HEAD_INIT(get_block_entry.list),
+};
+
 static int __init nf_tables_module_init(void)
 {
 	int err;
@@ -7624,6 +7629,7 @@ static int __init nf_tables_module_init(void)
 		goto err5;
 
 	nft_chain_route_init();
+	flow_indr_add_default_block_cb(&get_block_entry);
 	return err;
 err5:
 	rhltable_destroy(&nft_objname_ht);
@@ -7640,6 +7646,7 @@ static int __init nf_tables_module_init(void)
 
 static void __exit nf_tables_module_exit(void)
 {
+	flow_indr_del_default_block_cb(&get_block_entry);
 	nfnetlink_subsys_unregister(&nf_tables_subsys);
 	unregister_netdevice_notifier(&nf_tables_flowtable_notifier);
 	nft_chain_filter_fini();
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5..59c9629 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -171,24 +171,114 @@ static int nft_flow_offload_unbind(struct flow_block_offload *bo,
 	return 0;
 }
 
+static int nft_block_setup(struct nft_base_chain *basechain,
+			   struct flow_block_offload *bo,
+			   enum flow_block_command cmd)
+{
+	int err;
+
+	switch (cmd) {
+	case FLOW_BLOCK_BIND:
+		err = nft_flow_offload_bind(bo, basechain);
+		break;
+	case FLOW_BLOCK_UNBIND:
+		err = nft_flow_offload_unbind(bo, basechain);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		err = -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+static int nft_block_offload_cmd(struct nft_base_chain *chain,
+				 struct net_device *dev,
+				 enum flow_block_command cmd)
+{
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo = {};
+	int err;
+
+	bo.net = dev_net(dev);
+	bo.block = &chain->flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	if (err < 0)
+		return err;
+
+	return nft_block_setup(chain, &bo, cmd);
+}
+
+static void nft_indr_block_ing_cmd(struct net_device *dev,
+				   struct flow_block *flow_block,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_priv,
+				   enum flow_block_command cmd)
+{
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo = {};
+	struct nft_base_chain *chain;
+
+	if (flow_block)
+		return;
+
+	chain = container_of(flow_block, struct nft_base_chain, flow_block);
+
+	bo.net = dev_net(dev);
+	bo.block = flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
+
+	nft_block_setup(chain, &bo, cmd);
+}
+
+static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
+				      struct net_device *dev,
+				      enum flow_block_command cmd)
+{
+	struct flow_block_offload bo = {};
+	struct netlink_ext_ack extack = {};
+
+	bo.net = dev_net(dev);
+	bo.block = &chain->flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	flow_indr_block_call(&chain->flow_block, dev, nft_indr_block_ing_cmd,
+			     &bo, cmd);
+
+	if (list_empty(&bo.cb_list))
+		return -EOPNOTSUPP;
+
+	return nft_block_setup(chain, &bo, cmd);
+}
+
 #define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
 
 static int nft_flow_offload_chain(struct nft_trans *trans,
 				  enum flow_block_command cmd)
 {
 	struct nft_chain *chain = trans->ctx.chain;
-	struct netlink_ext_ack extack = {};
-	struct flow_block_offload bo = {};
 	struct nft_base_chain *basechain;
 	struct net_device *dev;
-	int err;
 
 	if (!nft_is_base_chain(chain))
 		return -EOPNOTSUPP;
 
 	basechain = nft_base_chain(chain);
 	dev = basechain->ops.dev;
-	if (!dev || !dev->netdev_ops->ndo_setup_tc)
+	if (!dev)
 		return -EOPNOTSUPP;
 
 	/* Only default policy to accept is supported for now. */
@@ -197,26 +287,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
 	    nft_trans_chain_policy(trans) != NF_ACCEPT)
 		return -EOPNOTSUPP;
 
-	bo.command = cmd;
-	bo.block = &basechain->flow_block;
-	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
-	bo.extack = &extack;
-	INIT_LIST_HEAD(&bo.cb_list);
-
-	err = dev->netdev_ops->ndo_setup_tc(dev, FLOW_SETUP_BLOCK, &bo);
-	if (err < 0)
-		return err;
-
-	switch (cmd) {
-	case FLOW_BLOCK_BIND:
-		err = nft_flow_offload_bind(&bo, basechain);
-		break;
-	case FLOW_BLOCK_UNBIND:
-		err = nft_flow_offload_unbind(&bo, basechain);
-		break;
-	}
-
-	return err;
+	if (dev->netdev_ops->ndo_setup_tc)
+		return nft_block_offload_cmd(basechain, dev, cmd);
+	else
+		return nft_indr_block_offload_cmd(basechain, dev, cmd);
 }
 
 int nft_flow_rule_offload_commit(struct net *net)
@@ -266,3 +340,37 @@ int nft_flow_rule_offload_commit(struct net *net)
 
 	return err;
 }
+
+bool nft_indr_get_default_block(struct net_device *dev,
+				struct flow_indr_block_info *info)
+{
+	struct net *net = dev_net(dev);
+	const struct nft_table *table;
+	const struct nft_chain *chain;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(table, &net->nft.tables, list) {
+		if (table->family != NFPROTO_NETDEV)
+			continue;
+
+		list_for_each_entry_rcu(chain, &table->chains, list) {
+			if (nft_is_base_chain(chain)) {
+				struct nft_base_chain *basechain;
+
+				basechain = nft_base_chain(chain);
+				if (!strncmp(basechain->dev_name, dev->name,
+					     IFNAMSIZ)) {
+					info->flow_block = &basechain->flow_block;
+					info->ing_cmd_cb = nft_indr_block_ing_cmd;
+					rcu_read_unlock();
+					return true;
+				}
+			}
+		}
+	}
+
+	rcu_read_unlock();
+
+	return false;
+}
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net-next v5 2/6] cls_api: replace block with flow_block in tc_indr_block_dev
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
In-Reply-To: <1564628627-10021-1-git-send-email-wenxu@ucloud.cn>

From: wenxu <wenxu@ucloud.cn>

This patch make tc_indr_block_dev can separate from tc subsystem

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: new patch

 net/sched/cls_api.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2e3b58d..f9643fa 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -574,7 +574,7 @@ struct tc_indr_block_dev {
 	struct net_device *dev;
 	unsigned int refcnt;
 	struct list_head cb_list;
-	struct tcf_block *block;
+	struct flow_block *flow_block;
 };
 
 struct tc_indr_block_cb {
@@ -597,6 +597,14 @@ struct tc_indr_block_cb {
 				      tc_indr_setup_block_ht_params);
 }
 
+static void tc_indr_get_default_block(struct tc_indr_block_dev *indr_dev)
+{
+	struct tcf_block *block = tc_dev_ingress_block(indr_dev->dev);
+
+	if (block)
+		indr_dev->flow_block = &block->flow_block;
+}
+
 static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
 {
 	struct tc_indr_block_dev *indr_dev;
@@ -611,7 +619,7 @@ static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
 
 	INIT_LIST_HEAD(&indr_dev->cb_list);
 	indr_dev->dev = dev;
-	indr_dev->block = tc_dev_ingress_block(dev);
+	tc_indr_get_default_block(indr_dev);
 	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
 				   tc_indr_setup_block_ht_params)) {
 		kfree(indr_dev);
@@ -678,11 +686,14 @@ static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
 static void tc_indr_block_ing_cmd(struct net_device *dev,
-				  struct tcf_block *block,
+				  struct flow_block *flow_block,
 				  tc_indr_block_bind_cb_t *cb,
 				  void *cb_priv,
 				  enum flow_block_command command)
 {
+	struct tcf_block *block = flow_block ? container_of(flow_block,
+							    struct tcf_block,
+							    flow_block) : NULL;
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
@@ -694,7 +705,7 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	if (!block)
 		return;
 
-	bo.block = &block->flow_block;
+	bo.block = flow_block;
 
 	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
 
@@ -717,7 +728,7 @@ int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 	if (err)
 		goto err_dev_put;
 
-	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, cb_priv,
+	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, cb, cb_priv,
 			      FLOW_BLOCK_BIND);
 	return 0;
 
@@ -750,13 +761,14 @@ void __tc_indr_block_cb_unregister(struct net_device *dev,
 	if (!indr_dev)
 		return;
 
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, indr_block_cb->cb,
+						indr_block_cb->cb_ident);
 	if (!indr_block_cb)
 		return;
 
 	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, indr_block_cb->cb_priv,
-			      FLOW_BLOCK_UNBIND);
+	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, indr_block_cb->cb,
+			      indr_block_cb->cb_priv, FLOW_BLOCK_UNBIND);
 	tc_indr_block_cb_del(indr_block_cb);
 	tc_indr_block_dev_put(indr_dev);
 }
@@ -792,7 +804,8 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	if (!indr_dev)
 		return;
 
-	indr_dev->block = command == FLOW_BLOCK_BIND ? block : NULL;
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ?
+					  &block->flow_block : NULL;
 
 	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
 		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH net] r8152: fix typo in register name
From: Kevin Lo @ 2019-08-01  3:29 UTC (permalink / raw)
  To: Hayes Wang; +Cc: netdev

It is likely that PAL_BDC_CR should be PLA_BDC_CR.

Signed-off-by: Kevin Lo <kevlo@kevlo.org>
---
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 39e0768d734d..0cc03a9ff545 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -50,7 +50,7 @@
 #define PLA_TEREDO_WAKE_BASE	0xc0c4
 #define PLA_MAR			0xcd00
 #define PLA_BACKUP		0xd000
-#define PAL_BDC_CR		0xd1a0
+#define PLA_BDC_CR		0xd1a0
 #define PLA_TEREDO_TIMER	0xd2cc
 #define PLA_REALWOW_TIMER	0xd2e8
 #define PLA_SUSPEND_FLAG	0xd38a
@@ -274,7 +274,7 @@
 #define TEREDO_RS_EVENT_MASK	0x00fe
 #define OOB_TEREDO_EN		0x0001
 
-/* PAL_BDC_CR */
+/* PLA_BDC_CR */
 #define ALDPS_PROXY_MODE	0x0001
 
 /* PLA_EFUSE_CMD */
@@ -3191,9 +3191,9 @@ static void r8152b_enter_oob(struct r8152 *tp)
 
 	rtl_rx_vlan_en(tp, true);
 
-	ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PAL_BDC_CR);
+	ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_BDC_CR);
 	ocp_data |= ALDPS_PROXY_MODE;
-	ocp_write_word(tp, MCU_TYPE_PLA, PAL_BDC_CR, ocp_data);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_BDC_CR, ocp_data);
 
 	ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL);
 	ocp_data |= NOW_IS_OOB | DIS_MCU_CLROOB;
@@ -3577,9 +3577,9 @@ static void r8153_enter_oob(struct r8152 *tp)
 
 	rtl_rx_vlan_en(tp, true);
 
-	ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PAL_BDC_CR);
+	ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_BDC_CR);
 	ocp_data |= ALDPS_PROXY_MODE;
-	ocp_write_word(tp, MCU_TYPE_PLA, PAL_BDC_CR, ocp_data);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_BDC_CR, ocp_data);
 
 	ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL);
 	ocp_data |= NOW_IS_OOB | DIS_MCU_CLROOB;

^ permalink raw reply related

* Re: [PATCH bpf v2] libbpf : make libbpf_num_possible_cpus function thread safe
From: Alexei Starovoitov @ 2019-08-01  3:40 UTC (permalink / raw)
  To: Takshak Chahande
  Cc: Network Development, Alexei Starovoitov, Daniel Borkmann,
	Andrey Ignatov, Kernel Team, hechaol, Jakub Kicinski
In-Reply-To: <20190731221055.1478201-1-ctakshak@fb.com>

On Wed, Jul 31, 2019 at 3:11 PM Takshak Chahande <ctakshak@fb.com> wrote:
>
> Having static variable `cpus` in libbpf_num_possible_cpus function
> without guarding it with mutex makes this function thread-unsafe.
>
> If multiple threads accessing this function, in the current form; it
> leads to incrementing the static variable value `cpus` in the multiple
> of total available CPUs.
>
> Used local stack variable to calculate the number of possible CPUs and
> then updated the static variable using WRITE_ONCE().
>
> Changes since v1:
>  * added stack variable to calculate cpus
>  * serialized static variable update using WRITE_ONCE()
>  * fixed Fixes tag
>
> Fixes: 6446b3155521 ("bpf: add a new API libbpf_num_possible_cpus()")
> Signed-off-by: Takshak Chahande <ctakshak@fb.com>

Applied. Thanks.
Thanks for keeping 'changes since v1 part' as part of git history.

^ permalink raw reply

* Re: [PATCH net-next v2 1/6] net: dsa: mv88e6xxx: add support for MV88E6220
From: Andrew Lunn @ 2019-08-01  3:43 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rasmus Villemoes
In-Reply-To: <20190731082351.3157-2-h.feurstein@gmail.com>

On Wed, Jul 31, 2019 at 10:23:46AM +0200, Hubert Feurstein wrote:
> The MV88E6220 is almost the same as MV88E6250 except that the ports 2-4 are
> not routed to pins. So the usable ports are 0, 1, 5 and 6.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH bpf 1/2] bpf: fix x64 JIT code generation for jmp to 1st insn
From: Alexei Starovoitov @ 2019-08-01  3:43 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	netdev@vger.kernel.org, bpf@vger.kernel.org, Kernel Team
In-Reply-To: <B6B907F5-E6CA-4C0B-92F3-0411CA4F4D95@fb.com>

On Wed, Jul 31, 2019 at 12:36 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 30, 2019, at 6:38 PM, Alexei Starovoitov <ast@kernel.org> wrote:
> >
> > Introduction of bounded loops exposed old bug in x64 JIT.
> > JIT maintains the array of offsets to the end of all instructions to
> > compute jmp offsets.
> > addrs[0] - offset of the end of the 1st insn (that includes prologue).
> > addrs[1] - offset of the end of the 2nd insn.
> > JIT didn't keep the offset of the beginning of the 1st insn,
> > since classic BPF didn't have backward jumps and valid extended BPF
> > couldn't have a branch to 1st insn, because it didn't allow loops.
> > With bounded loops it's possible to construct a valid program that
> > jumps backwards to the 1st insn.
> > Fix JIT by computing:
> > addrs[0] - offset of the end of prologue == start of the 1st insn.
> > addrs[1] - offset of the end of 1st insn.
> >
> > Reported-by: syzbot+35101610ff3e83119b1b@syzkaller.appspotmail.com
> > Fixes: 2589726d12a1 ("bpf: introduce bounded loops")
> > Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> Do we need similar fix for x86_32?

Right. x86_32 would need similar fix.

Applied to bpf tree.

^ permalink raw reply

* Re: [PATCH net-next v2 3/6] net: dsa: mv88e6xxx: introduce invalid_port_mask in mv88e6xxx_info
From: Andrew Lunn @ 2019-08-01  3:45 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rasmus Villemoes
In-Reply-To: <20190731082351.3157-4-h.feurstein@gmail.com>

On Wed, Jul 31, 2019 at 10:23:48AM +0200, Hubert Feurstein wrote:
> With this it is possible to mark certain chip ports as invalid. This is
> required for example for the MV88E6220 (which is in general a MV88E6250
> with 7 ports) but the ports 2-4 are not routed to pins.
> 
> If a user configures an invalid port, an error is returned.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: order ptp structs numerically ascending
From: Andrew Lunn @ 2019-08-01  3:46 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Vivien Didelot, Florian Fainelli,
	David S. Miller, Rasmus Villemoes
In-Reply-To: <20190731082351.3157-6-h.feurstein@gmail.com>

On Wed, Jul 31, 2019 at 10:23:50AM +0200, Hubert Feurstein wrote:
> As it is done for all the other structs within this driver.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 6/6] net: dsa: mv88e6xxx: add PTP support for MV88E6250 family
From: Andrew Lunn @ 2019-08-01  3:49 UTC (permalink / raw)
  To: Hubert Feurstein
  Cc: netdev, linux-kernel, Richard Cochran, Vivien Didelot,
	Florian Fainelli, David S. Miller, Rasmus Villemoes
In-Reply-To: <20190731082351.3157-7-h.feurstein@gmail.com>

On Wed, Jul 31, 2019 at 10:23:51AM +0200, Hubert Feurstein wrote:
> This adds PTP support for the MV88E6250 family.
> 
> Signed-off-by: Hubert Feurstein <h.feurstein@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* RE: [PATCH net] r8152: fix typo in register name
From: Hayes Wang @ 2019-08-01  3:50 UTC (permalink / raw)
  To: Kevin Lo; +Cc: netdev@vger.kernel.org
In-Reply-To: <20190801032938.GA22256@ns.kevlo.org>

> From: Kevin Lo [mailto:kevlo@kevlo.org]
> Sent: Thursday, August 01, 2019 11:30 AM
> To: Hayes Wang
> Cc: netdev@vger.kernel.org
> Subject: [PATCH net] r8152: fix typo in register name
> 
> It is likely that PAL_BDC_CR should be PLA_BDC_CR.
> 
> Signed-off-by: Kevin Lo <kevlo@kevlo.org>

Acked-by: Hayes Wang <hayeswang@realtek.com>

Best Regards,
Hayes



^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox