* [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload
@ 2019-08-01  3:03 wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 1/6] cls_api: modify the tc_indr_block_ing_cmd parameters wenxu
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
This series patch make nftables offload support the vlan and
tunnel device offload through indr-block architecture.
The first four patches mv tc indr block to flow offload and
rename to flow-indr-block.
Because the new flow-indr-block can't get the tcf_block
directly. The fifthe patch provide a callback list to get 
flow_block of each subsystem immediately when the device
register and contain a block.
The last patch make nf_tables_offload support flow-indr-block.
wenxu (6):
  cls_api: modify the tc_indr_block_ing_cmd parameters.
  cls_api: replace block with flow_block in tc_indr_block_dev
  cls_api: add flow_indr_block_call function
  flow_offload: move tc indirect block to flow offload
  flow_offload: support get flow_block immediately
  netfilter: nf_tables_offload: support indr block call
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  10 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  11 +-
 include/net/flow_offload.h                         |  48 ++++
 include/net/netfilter/nf_tables_offload.h          |   2 +
 include/net/pkt_cls.h                              |  35 ---
 include/net/sch_generic.h                          |   3 -
 net/core/flow_offload.c                            | 251 ++++++++++++++++++++
 net/netfilter/nf_tables_api.c                      |   7 +
 net/netfilter/nf_tables_offload.c                  | 156 +++++++++++--
 net/sched/cls_api.c                                | 255 ++++-----------------
 10 files changed, 497 insertions(+), 281 deletions(-)
-- 
1.8.3.1
^ permalink raw reply	[flat|nested] 20+ messages in thread
* [PATCH net-next v5 1/6] cls_api: modify the tc_indr_block_ing_cmd parameters.
  2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
@ 2019-08-01  3:03 ` wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 2/6] cls_api: replace block with flow_block in tc_indr_block_dev wenxu
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
This patch make tc_indr_block_ing_cmd can't access struct
tc_indr_block_dev and tc_indr_block_cb.
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: new patch
 net/sched/cls_api.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3565d9a..2e3b58d 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -677,26 +677,28 @@ static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
 static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
-static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
-				  struct tc_indr_block_cb *indr_block_cb,
+static void tc_indr_block_ing_cmd(struct net_device *dev,
+				  struct tcf_block *block,
+				  tc_indr_block_bind_cb_t *cb,
+				  void *cb_priv,
 				  enum flow_block_command command)
 {
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
-		.net		= dev_net(indr_dev->dev),
-		.block_shared	= tcf_block_non_null_shared(indr_dev->block),
+		.net		= dev_net(dev),
+		.block_shared	= tcf_block_non_null_shared(block),
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	if (!indr_dev->block)
+	if (!block)
 		return;
 
-	bo.block = &indr_dev->block->flow_block;
+	bo.block = &block->flow_block;
 
-	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-			  &bo);
-	tcf_block_setup(indr_dev->block, &bo);
+	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
+
+	tcf_block_setup(block, &bo);
 }
 
 int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
@@ -715,7 +717,8 @@ int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 	if (err)
 		goto err_dev_put;
 
-	tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_BIND);
+	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, cb_priv,
+			      FLOW_BLOCK_BIND);
 	return 0;
 
 err_dev_put:
@@ -752,7 +755,8 @@ void __tc_indr_block_cb_unregister(struct net_device *dev,
 		return;
 
 	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_UNBIND);
+	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, indr_block_cb->cb_priv,
+			      FLOW_BLOCK_UNBIND);
 	tc_indr_block_cb_del(indr_block_cb);
 	tc_indr_block_dev_put(indr_dev);
 }
-- 
1.8.3.1
^ permalink raw reply related	[flat|nested] 20+ messages in thread
* [PATCH net-next v5 2/6] cls_api: replace block with flow_block in tc_indr_block_dev
  2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 1/6] cls_api: modify the tc_indr_block_ing_cmd parameters wenxu
@ 2019-08-01  3:03 ` wenxu
  2019-08-01  3:03 ` [PATCH net-next 3/6] cls_api: add flow_indr_block_call function wenxu
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
This patch make tc_indr_block_dev can separate from tc subsystem
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: new patch
 net/sched/cls_api.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2e3b58d..f9643fa 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -574,7 +574,7 @@ struct tc_indr_block_dev {
 	struct net_device *dev;
 	unsigned int refcnt;
 	struct list_head cb_list;
-	struct tcf_block *block;
+	struct flow_block *flow_block;
 };
 
 struct tc_indr_block_cb {
@@ -597,6 +597,14 @@ struct tc_indr_block_cb {
 				      tc_indr_setup_block_ht_params);
 }
 
+static void tc_indr_get_default_block(struct tc_indr_block_dev *indr_dev)
+{
+	struct tcf_block *block = tc_dev_ingress_block(indr_dev->dev);
+
+	if (block)
+		indr_dev->flow_block = &block->flow_block;
+}
+
 static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
 {
 	struct tc_indr_block_dev *indr_dev;
@@ -611,7 +619,7 @@ static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
 
 	INIT_LIST_HEAD(&indr_dev->cb_list);
 	indr_dev->dev = dev;
-	indr_dev->block = tc_dev_ingress_block(dev);
+	tc_indr_get_default_block(indr_dev);
 	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
 				   tc_indr_setup_block_ht_params)) {
 		kfree(indr_dev);
@@ -678,11 +686,14 @@ static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
 static void tc_indr_block_ing_cmd(struct net_device *dev,
-				  struct tcf_block *block,
+				  struct flow_block *flow_block,
 				  tc_indr_block_bind_cb_t *cb,
 				  void *cb_priv,
 				  enum flow_block_command command)
 {
+	struct tcf_block *block = flow_block ? container_of(flow_block,
+							    struct tcf_block,
+							    flow_block) : NULL;
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
@@ -694,7 +705,7 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	if (!block)
 		return;
 
-	bo.block = &block->flow_block;
+	bo.block = flow_block;
 
 	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
 
@@ -717,7 +728,7 @@ int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 	if (err)
 		goto err_dev_put;
 
-	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, cb_priv,
+	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, cb, cb_priv,
 			      FLOW_BLOCK_BIND);
 	return 0;
 
@@ -750,13 +761,14 @@ void __tc_indr_block_cb_unregister(struct net_device *dev,
 	if (!indr_dev)
 		return;
 
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, indr_block_cb->cb,
+						indr_block_cb->cb_ident);
 	if (!indr_block_cb)
 		return;
 
 	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(dev, indr_dev->block, cb, indr_block_cb->cb_priv,
-			      FLOW_BLOCK_UNBIND);
+	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, indr_block_cb->cb,
+			      indr_block_cb->cb_priv, FLOW_BLOCK_UNBIND);
 	tc_indr_block_cb_del(indr_block_cb);
 	tc_indr_block_dev_put(indr_dev);
 }
@@ -792,7 +804,8 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	if (!indr_dev)
 		return;
 
-	indr_dev->block = command == FLOW_BLOCK_BIND ? block : NULL;
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ?
+					  &block->flow_block : NULL;
 
 	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
 		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-- 
1.8.3.1
^ permalink raw reply related	[flat|nested] 20+ messages in thread
* [PATCH net-next 3/6] cls_api: add flow_indr_block_call function
  2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 1/6] cls_api: modify the tc_indr_block_ing_cmd parameters wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 2/6] cls_api: replace block with flow_block in tc_indr_block_dev wenxu
@ 2019-08-01  3:03 ` wenxu
  2019-08-05  6:02   ` Jiri Pirko
  2019-08-01  3:03 ` [PATCH net-next v5 4/6] flow_offload: move tc indirect block to flow offload wenxu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
This patch make indr_block_call don't access struct tc_indr_block_cb
and tc_indr_block_dev directly
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: new patch
 net/sched/cls_api.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index f9643fa..617b098 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -783,13 +783,30 @@ void tc_indr_block_cb_unregister(struct net_device *dev,
 }
 EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
 
+static void flow_indr_block_call(struct flow_block *flow_block,
+				 struct net_device *dev,
+				 struct flow_block_offload *bo,
+				 enum flow_block_command command)
+{
+	struct tc_indr_block_cb *indr_block_cb;
+	struct tc_indr_block_dev *indr_dev;
+
+	indr_dev = tc_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? flow_block : NULL;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+				  bo);
+}
+
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
 			       struct netlink_ext_ack *extack)
 {
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= ei->binder_type,
@@ -800,17 +817,7 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_dev->flow_block = command == FLOW_BLOCK_BIND ?
-					  &block->flow_block : NULL;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-				  &bo);
-
+	flow_indr_block_call(&block->flow_block, dev, &bo, command);
 	tcf_block_setup(block, &bo);
 }
 
-- 
1.8.3.1
^ permalink raw reply related	[flat|nested] 20+ messages in thread
* [PATCH net-next v5 4/6] flow_offload: move tc indirect block to flow offload
  2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
                   ` (2 preceding siblings ...)
  2019-08-01  3:03 ` [PATCH net-next 3/6] cls_api: add flow_indr_block_call function wenxu
@ 2019-08-01  3:03 ` wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately wenxu
  2019-08-01  3:03 ` [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call wenxu
  5 siblings, 0 replies; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
move tc indirect block to flow_offload and rename
it to flow indirect block.The nf_tables can use the
indr block architecture.
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: make flow_indr_block_cb/dev in c file
 
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  10 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  11 +-
 include/net/flow_offload.h                         |  31 +++
 include/net/pkt_cls.h                              |  35 ---
 include/net/sch_generic.h                          |   3 -
 net/core/flow_offload.c                            | 218 +++++++++++++++++++
 net/sched/cls_api.c                                | 241 +--------------------
 7 files changed, 265 insertions(+), 284 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 7f747cb..074573b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -785,9 +785,9 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
 {
 	int err;
 
-	err = __tc_indr_block_cb_register(netdev, rpriv,
-					  mlx5e_rep_indr_setup_tc_cb,
-					  rpriv);
+	err = __flow_indr_block_cb_register(netdev, rpriv,
+					    mlx5e_rep_indr_setup_tc_cb,
+					    rpriv);
 	if (err) {
 		struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
 
@@ -800,8 +800,8 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
 static void mlx5e_rep_indr_unregister_block(struct mlx5e_rep_priv *rpriv,
 					    struct net_device *netdev)
 {
-	__tc_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
-				      rpriv);
+	__flow_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
+					rpriv);
 }
 
 static int mlx5e_nic_rep_netdevice_event(struct notifier_block *nb,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index e209f15..7b490db 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1479,16 +1479,17 @@ int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
 		return NOTIFY_OK;
 
 	if (event == NETDEV_REGISTER) {
-		err = __tc_indr_block_cb_register(netdev, app,
-						  nfp_flower_indr_setup_tc_cb,
-						  app);
+		err = __flow_indr_block_cb_register(netdev, app,
+						    nfp_flower_indr_setup_tc_cb,
+						    app);
 		if (err)
 			nfp_flower_cmsg_warn(app,
 					     "Indirect block reg failed - %s\n",
 					     netdev->name);
 	} else if (event == NETDEV_UNREGISTER) {
-		__tc_indr_block_cb_unregister(netdev,
-					      nfp_flower_indr_setup_tc_cb, app);
+		__flow_indr_block_cb_unregister(netdev,
+						nfp_flower_indr_setup_tc_cb,
+						app);
 	}
 
 	return NOTIFY_OK;
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 00b9aab..c8d60a6 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -4,6 +4,7 @@
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <net/flow_dissector.h>
+#include <linux/rhashtable.h>
 
 struct flow_match {
 	struct flow_dissector	*dissector;
@@ -366,4 +367,34 @@ static inline void flow_block_init(struct flow_block *flow_block)
 	INIT_LIST_HEAD(&flow_block->cb_list);
 }
 
+typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
+				      enum tc_setup_type type, void *type_data);
+
+typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
+				       struct flow_block *flow_block,
+				       flow_indr_block_bind_cb_t *cb,
+				       void *cb_priv,
+				       enum flow_block_command command);
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				  flow_indr_block_bind_cb_t *cb,
+				  void *cb_ident);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_ident);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_ident);
+
+void flow_indr_block_call(struct flow_block *flow_block,
+			  struct net_device *dev,
+			  flow_indr_block_ing_cmd_t *ing_cmd_cb,
+			  struct flow_block_offload *bo,
+			  enum flow_block_command command);
+
 #endif /* _NET_FLOW_OFFLOAD_H */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809..0790a4e 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -70,15 +70,6 @@ static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 	return block->q;
 }
 
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident);
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident);
-
 int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 		 struct tcf_result *res, bool compat_mode);
 
@@ -137,32 +128,6 @@ void tc_setup_cb_block_unregister(struct tcf_block *block, flow_setup_cb_t *cb,
 {
 }
 
-static inline
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	return 0;
-}
-
-static inline
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	return 0;
-}
-
-static inline
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
-static inline
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
 static inline int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			       struct tcf_result *res, bool compat_mode)
 {
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 6b6b012..d9f359a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -23,9 +23,6 @@
 struct module;
 struct bpf_flow_keys;
 
-typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
-				    enum tc_setup_type type, void *type_data);
-
 struct qdisc_rate_table {
 	struct tc_ratespec rate;
 	u32		data[256];
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index d63b970..a1fdfa4 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -2,6 +2,7 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <net/flow_offload.h>
+#include <linux/rtnetlink.h>
 
 struct flow_rule *flow_rule_alloc(unsigned int num_actions)
 {
@@ -280,3 +281,220 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 	}
 }
 EXPORT_SYMBOL(flow_block_cb_setup_simple);
+
+static struct rhashtable indr_setup_block_ht;
+
+struct flow_indr_block_cb {
+	struct list_head list;
+	void *cb_priv;
+	flow_indr_block_bind_cb_t *cb;
+	void *cb_ident;
+};
+
+struct flow_indr_block_dev {
+	struct rhash_head ht_node;
+	struct net_device *dev;
+	unsigned int refcnt;
+	struct list_head cb_list;
+	flow_indr_block_ing_cmd_t *ing_cmd_cb;
+	struct flow_block *flow_block;
+};
+
+static const struct rhashtable_params flow_indr_setup_block_ht_params = {
+	.key_offset	= offsetof(struct flow_indr_block_dev, dev),
+	.head_offset	= offsetof(struct flow_indr_block_dev, ht_node),
+	.key_len	= sizeof(struct net_device *),
+};
+
+static struct flow_indr_block_dev *
+flow_indr_block_dev_lookup(struct net_device *dev)
+{
+	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
+				      flow_indr_setup_block_ht_params);
+}
+
+static struct flow_indr_block_dev *
+flow_indr_block_dev_get(struct net_device *dev)
+{
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (indr_dev)
+		goto inc_ref;
+
+	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
+	if (!indr_dev)
+		return NULL;
+
+	INIT_LIST_HEAD(&indr_dev->cb_list);
+	indr_dev->dev = dev;
+	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+				   flow_indr_setup_block_ht_params)) {
+		kfree(indr_dev);
+		return NULL;
+	}
+
+inc_ref:
+	indr_dev->refcnt++;
+	return indr_dev;
+}
+
+static void flow_indr_block_dev_put(struct flow_indr_block_dev *indr_dev)
+{
+	if (--indr_dev->refcnt)
+		return;
+
+	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+			       flow_indr_setup_block_ht_params);
+	kfree(indr_dev);
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_lookup(struct flow_indr_block_dev *indr_dev,
+			  flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		if (indr_block_cb->cb == cb &&
+		    indr_block_cb->cb_ident == cb_ident)
+			return indr_block_cb;
+	return NULL;
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_add(struct flow_indr_block_dev *indr_dev, void *cb_priv,
+		       flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+
+	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	if (indr_block_cb)
+		return ERR_PTR(-EEXIST);
+
+	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
+	if (!indr_block_cb)
+		return ERR_PTR(-ENOMEM);
+
+	indr_block_cb->cb_priv = cb_priv;
+	indr_block_cb->cb = cb;
+	indr_block_cb->cb_ident = cb_ident;
+	list_add(&indr_block_cb->list, &indr_dev->cb_list);
+
+	return indr_block_cb;
+}
+
+static void flow_indr_block_cb_del(struct flow_indr_block_cb *indr_block_cb)
+{
+	list_del(&indr_block_cb->list);
+	kfree(indr_block_cb);
+}
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				  flow_indr_block_bind_cb_t *cb,
+				  void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+	int err;
+
+	indr_dev = flow_indr_block_dev_get(dev);
+	if (!indr_dev)
+		return -ENOMEM;
+
+	indr_block_cb = flow_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
+	err = PTR_ERR_OR_ZERO(indr_block_cb);
+	if (err)
+		goto err_dev_put;
+
+	if (indr_dev->ing_cmd_cb)
+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
+				     indr_block_cb->cb, indr_block_cb->cb_priv,
+				     FLOW_BLOCK_BIND);
+
+	return 0;
+
+err_dev_put:
+	flow_indr_block_dev_put(indr_dev);
+	return err;
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_register);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				flow_indr_block_bind_cb_t *cb,
+				void *cb_ident)
+{
+	int err;
+
+	rtnl_lock();
+	err = __flow_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
+	rtnl_unlock();
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_register);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	if (!indr_block_cb)
+		return;
+
+	/* Send unbind message if required to free any block cbs. */
+	if (indr_dev->ing_cmd_cb)
+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
+				     indr_block_cb->cb, indr_block_cb->cb_priv,
+				     FLOW_BLOCK_UNBIND);
+
+	flow_indr_block_cb_del(indr_block_cb);
+	flow_indr_block_dev_put(indr_dev);
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_unregister);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_ident)
+{
+	rtnl_lock();
+	__flow_indr_block_cb_unregister(dev, cb, cb_ident);
+	rtnl_unlock();
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_unregister);
+
+void flow_indr_block_call(struct flow_block *flow_block,
+			  struct net_device *dev,
+			  flow_indr_block_ing_cmd_t *ing_cmd_cb,
+			  struct flow_block_offload *bo,
+			  enum flow_block_command command)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? flow_block : NULL;
+	indr_dev->ing_cmd_cb = command == FLOW_BLOCK_BIND ? ing_cmd_cb : NULL;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+				  bo);
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_call);
+
+static int __init init_flow_indr_rhashtable(void)
+{
+	return rhashtable_init(&indr_setup_block_ht,
+			       &flow_indr_setup_block_ht_params);
+}
+subsys_initcall(init_flow_indr_rhashtable);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 617b098..bd5e591 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -37,6 +37,7 @@
 #include <net/tc_act/tc_skbedit.h>
 #include <net/tc_act/tc_ct.h>
 #include <net/tc_act/tc_mpls.h>
+#include <net/flow_offload.h>
 
 extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
 
@@ -545,149 +546,12 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
 	}
 }
 
-static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
-{
-	const struct Qdisc_class_ops *cops;
-	struct Qdisc *qdisc;
-
-	if (!dev_ingress_queue(dev))
-		return NULL;
-
-	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
-	if (!qdisc)
-		return NULL;
-
-	cops = qdisc->ops->cl_ops;
-	if (!cops)
-		return NULL;
-
-	if (!cops->tcf_block)
-		return NULL;
-
-	return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
-}
-
-static struct rhashtable indr_setup_block_ht;
-
-struct tc_indr_block_dev {
-	struct rhash_head ht_node;
-	struct net_device *dev;
-	unsigned int refcnt;
-	struct list_head cb_list;
-	struct flow_block *flow_block;
-};
-
-struct tc_indr_block_cb {
-	struct list_head list;
-	void *cb_priv;
-	tc_indr_block_bind_cb_t *cb;
-	void *cb_ident;
-};
-
-static const struct rhashtable_params tc_indr_setup_block_ht_params = {
-	.key_offset	= offsetof(struct tc_indr_block_dev, dev),
-	.head_offset	= offsetof(struct tc_indr_block_dev, ht_node),
-	.key_len	= sizeof(struct net_device *),
-};
-
-static struct tc_indr_block_dev *
-tc_indr_block_dev_lookup(struct net_device *dev)
-{
-	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
-				      tc_indr_setup_block_ht_params);
-}
-
-static void tc_indr_get_default_block(struct tc_indr_block_dev *indr_dev)
-{
-	struct tcf_block *block = tc_dev_ingress_block(indr_dev->dev);
-
-	if (block)
-		indr_dev->flow_block = &block->flow_block;
-}
-
-static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
-{
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (indr_dev)
-		goto inc_ref;
-
-	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
-	if (!indr_dev)
-		return NULL;
-
-	INIT_LIST_HEAD(&indr_dev->cb_list);
-	indr_dev->dev = dev;
-	tc_indr_get_default_block(indr_dev);
-	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-				   tc_indr_setup_block_ht_params)) {
-		kfree(indr_dev);
-		return NULL;
-	}
-
-inc_ref:
-	indr_dev->refcnt++;
-	return indr_dev;
-}
-
-static void tc_indr_block_dev_put(struct tc_indr_block_dev *indr_dev)
-{
-	if (--indr_dev->refcnt)
-		return;
-
-	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-			       tc_indr_setup_block_ht_params);
-	kfree(indr_dev);
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_lookup(struct tc_indr_block_dev *indr_dev,
-			tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		if (indr_block_cb->cb == cb &&
-		    indr_block_cb->cb_ident == cb_ident)
-			return indr_block_cb;
-	return NULL;
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_add(struct tc_indr_block_dev *indr_dev, void *cb_priv,
-		     tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
-	if (indr_block_cb)
-		return ERR_PTR(-EEXIST);
-
-	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
-	if (!indr_block_cb)
-		return ERR_PTR(-ENOMEM);
-
-	indr_block_cb->cb_priv = cb_priv;
-	indr_block_cb->cb = cb;
-	indr_block_cb->cb_ident = cb_ident;
-	list_add(&indr_block_cb->list, &indr_dev->cb_list);
-
-	return indr_block_cb;
-}
-
-static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
-{
-	list_del(&indr_block_cb->list);
-	kfree(indr_block_cb);
-}
-
 static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
 static void tc_indr_block_ing_cmd(struct net_device *dev,
 				  struct flow_block *flow_block,
-				  tc_indr_block_bind_cb_t *cb,
+				  flow_indr_block_bind_cb_t *cb,
 				  void *cb_priv,
 				  enum flow_block_command command)
 {
@@ -712,96 +576,6 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	tcf_block_setup(block, &bo);
 }
 
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-	int err;
-
-	indr_dev = tc_indr_block_dev_get(dev);
-	if (!indr_dev)
-		return -ENOMEM;
-
-	indr_block_cb = tc_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
-	err = PTR_ERR_OR_ZERO(indr_block_cb);
-	if (err)
-		goto err_dev_put;
-
-	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, cb, cb_priv,
-			      FLOW_BLOCK_BIND);
-	return 0;
-
-err_dev_put:
-	tc_indr_block_dev_put(indr_dev);
-	return err;
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_register);
-
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	int err;
-
-	rtnl_lock();
-	err = __tc_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
-	rtnl_unlock();
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_register);
-
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, indr_block_cb->cb,
-						indr_block_cb->cb_ident);
-	if (!indr_block_cb)
-		return;
-
-	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(dev, indr_dev->flow_block, indr_block_cb->cb,
-			      indr_block_cb->cb_priv, FLOW_BLOCK_UNBIND);
-	tc_indr_block_cb_del(indr_block_cb);
-	tc_indr_block_dev_put(indr_dev);
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_unregister);
-
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	rtnl_lock();
-	__tc_indr_block_cb_unregister(dev, cb, cb_ident);
-	rtnl_unlock();
-}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
-
-static void flow_indr_block_call(struct flow_block *flow_block,
-				 struct net_device *dev,
-				 struct flow_block_offload *bo,
-				 enum flow_block_command command)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
-		return;
-
-	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? flow_block : NULL;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-				  bo);
-}
-
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
@@ -817,7 +591,9 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	flow_indr_block_call(&block->flow_block, dev, &bo, command);
+	flow_indr_block_call(&block->flow_block, dev, tc_indr_block_ing_cmd,
+			     &bo, command);
+
 	tcf_block_setup(block, &bo);
 }
 
@@ -3382,11 +3158,6 @@ static int __init tc_filter_init(void)
 	if (err)
 		goto err_register_pernet_subsys;
 
-	err = rhashtable_init(&indr_setup_block_ht,
-			      &tc_indr_setup_block_ht_params);
-	if (err)
-		goto err_rhash_setup_block_ht;
-
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
@@ -3400,8 +3171,6 @@ static int __init tc_filter_init(void)
 
 	return 0;
 
-err_rhash_setup_block_ht:
-	unregister_pernet_subsys(&tcf_net_ops);
 err_register_pernet_subsys:
 	destroy_workqueue(tc_filter_wq);
 	return err;
-- 
1.8.3.1
^ permalink raw reply related	[flat|nested] 20+ messages in thread
* [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
                   ` (3 preceding siblings ...)
  2019-08-01  3:03 ` [PATCH net-next v5 4/6] flow_offload: move tc indirect block to flow offload wenxu
@ 2019-08-01  3:03 ` wenxu
  2019-08-01 23:11   ` Jakub Kicinski
  2019-08-01  3:03 ` [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call wenxu
  5 siblings, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
The new flow-indr-block can't get the tcf_block
directly. It provide a callback list to find the flow_block immediately
when the device register and contain a ingress block.
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: add get_block_cb_list for both nft and tc
 include/net/flow_offload.h | 17 +++++++++++++++++
 net/core/flow_offload.c    | 33 +++++++++++++++++++++++++++++++++
 net/sched/cls_api.c        | 44 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+)
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index c8d60a6..db04e3f 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -376,6 +376,23 @@ typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
 				       void *cb_priv,
 				       enum flow_block_command command);
 
+struct flow_indr_block_info {
+	struct flow_block *flow_block;
+	flow_indr_block_ing_cmd_t *ing_cmd_cb;
+};
+
+typedef bool flow_indr_get_default_block_t(struct net_device *dev,
+					    struct flow_indr_block_info *info);
+
+struct flow_indr_get_block_entry {
+	flow_indr_get_default_block_t *get_block_cb;
+	struct list_head	list;
+};
+
+void flow_indr_add_default_block_cb(struct flow_indr_get_block_entry *entry);
+
+void flow_indr_del_default_block_cb(struct flow_indr_get_block_entry *entry);
+
 int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
 				  flow_indr_block_bind_cb_t *cb,
 				  void *cb_ident);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index a1fdfa4..8ff7a75b 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -282,6 +282,8 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 }
 EXPORT_SYMBOL(flow_block_cb_setup_simple);
 
+static LIST_HEAD(get_default_block_cb_list);
+
 static struct rhashtable indr_setup_block_ht;
 
 struct flow_indr_block_cb {
@@ -313,6 +315,24 @@ struct flow_indr_block_dev {
 				      flow_indr_setup_block_ht_params);
 }
 
+static void flow_get_default_block(struct flow_indr_block_dev *indr_dev)
+{
+	struct flow_indr_get_block_entry *entry_cb;
+	struct flow_indr_block_info info;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(entry_cb, &get_default_block_cb_list, list) {
+		if (entry_cb->get_block_cb(indr_dev->dev, &info)) {
+			indr_dev->flow_block = info.flow_block;
+			indr_dev->ing_cmd_cb = info.ing_cmd_cb;
+			break;
+		}
+	}
+
+	rcu_read_unlock();
+}
+
 static struct flow_indr_block_dev *
 flow_indr_block_dev_get(struct net_device *dev)
 {
@@ -328,6 +348,7 @@ struct flow_indr_block_dev {
 
 	INIT_LIST_HEAD(&indr_dev->cb_list);
 	indr_dev->dev = dev;
+	flow_get_default_block(indr_dev);
 	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
 				   flow_indr_setup_block_ht_params)) {
 		kfree(indr_dev);
@@ -492,6 +513,18 @@ void flow_indr_block_call(struct flow_block *flow_block,
 }
 EXPORT_SYMBOL_GPL(flow_indr_block_call);
 
+void flow_indr_add_default_block_cb(struct flow_indr_get_block_entry *entry)
+{
+	list_add_tail_rcu(&entry->list, &get_default_block_cb_list);
+}
+EXPORT_SYMBOL_GPL(flow_indr_add_default_block_cb);
+
+void flow_indr_del_default_block_cb(struct flow_indr_get_block_entry *entry)
+{
+	list_del_rcu(&entry->list);
+}
+EXPORT_SYMBOL_GPL(flow_indr_del_default_block_cb);
+
 static int __init init_flow_indr_rhashtable(void)
 {
 	return rhashtable_init(&indr_setup_block_ht,
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index bd5e591..8bf918c 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -576,6 +576,43 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	tcf_block_setup(block, &bo);
 }
 
+static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
+{
+	const struct Qdisc_class_ops *cops;
+	struct Qdisc *qdisc;
+
+	if (!dev_ingress_queue(dev))
+		return NULL;
+
+	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
+	if (!qdisc)
+		return NULL;
+
+	cops = qdisc->ops->cl_ops;
+	if (!cops)
+		return NULL;
+
+	if (!cops->tcf_block)
+		return NULL;
+
+	return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
+}
+
+static bool tc_indr_get_default_block(struct net_device *dev,
+				      struct flow_indr_block_info *info)
+{
+	struct tcf_block *block = tc_dev_ingress_block(dev);
+
+	if (block) {
+		info->flow_block = &block->flow_block;
+		info->ing_cmd_cb = tc_indr_block_ing_cmd;
+
+		return true;
+	}
+
+	return false;
+}
+
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
@@ -3146,6 +3183,11 @@ static void __net_exit tcf_net_exit(struct net *net)
 	.size = sizeof(struct tcf_net),
 };
 
+static struct flow_indr_get_block_entry get_block_entry = {
+	.get_block_cb = tc_indr_get_default_block,
+	.list = LIST_HEAD_INIT(get_block_entry.list),
+};
+
 static int __init tc_filter_init(void)
 {
 	int err;
@@ -3158,6 +3200,8 @@ static int __init tc_filter_init(void)
 	if (err)
 		goto err_register_pernet_subsys;
 
+	flow_indr_add_default_block_cb(&get_block_entry);
+
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
-- 
1.8.3.1
^ permalink raw reply related	[flat|nested] 20+ messages in thread
* [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call
  2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
                   ` (4 preceding siblings ...)
  2019-08-01  3:03 ` [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately wenxu
@ 2019-08-01  3:03 ` wenxu
  2019-08-01  3:58   ` Yunsheng Lin
  5 siblings, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-01  3:03 UTC (permalink / raw)
  To: jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
nftable support indr-block call. It makes nftable an offload vlan
and tunnel device.
nft add table netdev firewall
nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v5: add nft_get_default_block
 include/net/netfilter/nf_tables_offload.h |   2 +
 net/netfilter/nf_tables_api.c             |   7 ++
 net/netfilter/nf_tables_offload.c         | 156 +++++++++++++++++++++++++-----
 3 files changed, 141 insertions(+), 24 deletions(-)
diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index 3196663..ac69087 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -63,6 +63,8 @@ struct nft_flow_rule {
 struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule);
 void nft_flow_rule_destroy(struct nft_flow_rule *flow);
 int nft_flow_rule_offload_commit(struct net *net);
+bool nft_indr_get_default_block(struct net_device *dev,
+				struct flow_indr_block_info *info);
 
 #define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg)		\
 	(__reg)->base_offset	=					\
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 605a7cf..6a1d0b2 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7593,6 +7593,11 @@ static void __net_exit nf_tables_exit_net(struct net *net)
 	.exit	= nf_tables_exit_net,
 };
 
+static struct flow_indr_get_block_entry get_block_entry = {
+	.get_block_cb = nft_indr_get_default_block,
+	.list = LIST_HEAD_INIT(get_block_entry.list),
+};
+
 static int __init nf_tables_module_init(void)
 {
 	int err;
@@ -7624,6 +7629,7 @@ static int __init nf_tables_module_init(void)
 		goto err5;
 
 	nft_chain_route_init();
+	flow_indr_add_default_block_cb(&get_block_entry);
 	return err;
 err5:
 	rhltable_destroy(&nft_objname_ht);
@@ -7640,6 +7646,7 @@ static int __init nf_tables_module_init(void)
 
 static void __exit nf_tables_module_exit(void)
 {
+	flow_indr_del_default_block_cb(&get_block_entry);
 	nfnetlink_subsys_unregister(&nf_tables_subsys);
 	unregister_netdevice_notifier(&nf_tables_flowtable_notifier);
 	nft_chain_filter_fini();
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5..59c9629 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -171,24 +171,114 @@ static int nft_flow_offload_unbind(struct flow_block_offload *bo,
 	return 0;
 }
 
+static int nft_block_setup(struct nft_base_chain *basechain,
+			   struct flow_block_offload *bo,
+			   enum flow_block_command cmd)
+{
+	int err;
+
+	switch (cmd) {
+	case FLOW_BLOCK_BIND:
+		err = nft_flow_offload_bind(bo, basechain);
+		break;
+	case FLOW_BLOCK_UNBIND:
+		err = nft_flow_offload_unbind(bo, basechain);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		err = -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+static int nft_block_offload_cmd(struct nft_base_chain *chain,
+				 struct net_device *dev,
+				 enum flow_block_command cmd)
+{
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo = {};
+	int err;
+
+	bo.net = dev_net(dev);
+	bo.block = &chain->flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	if (err < 0)
+		return err;
+
+	return nft_block_setup(chain, &bo, cmd);
+}
+
+static void nft_indr_block_ing_cmd(struct net_device *dev,
+				   struct flow_block *flow_block,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_priv,
+				   enum flow_block_command cmd)
+{
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo = {};
+	struct nft_base_chain *chain;
+
+	if (flow_block)
+		return;
+
+	chain = container_of(flow_block, struct nft_base_chain, flow_block);
+
+	bo.net = dev_net(dev);
+	bo.block = flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
+
+	nft_block_setup(chain, &bo, cmd);
+}
+
+static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
+				      struct net_device *dev,
+				      enum flow_block_command cmd)
+{
+	struct flow_block_offload bo = {};
+	struct netlink_ext_ack extack = {};
+
+	bo.net = dev_net(dev);
+	bo.block = &chain->flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	flow_indr_block_call(&chain->flow_block, dev, nft_indr_block_ing_cmd,
+			     &bo, cmd);
+
+	if (list_empty(&bo.cb_list))
+		return -EOPNOTSUPP;
+
+	return nft_block_setup(chain, &bo, cmd);
+}
+
 #define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
 
 static int nft_flow_offload_chain(struct nft_trans *trans,
 				  enum flow_block_command cmd)
 {
 	struct nft_chain *chain = trans->ctx.chain;
-	struct netlink_ext_ack extack = {};
-	struct flow_block_offload bo = {};
 	struct nft_base_chain *basechain;
 	struct net_device *dev;
-	int err;
 
 	if (!nft_is_base_chain(chain))
 		return -EOPNOTSUPP;
 
 	basechain = nft_base_chain(chain);
 	dev = basechain->ops.dev;
-	if (!dev || !dev->netdev_ops->ndo_setup_tc)
+	if (!dev)
 		return -EOPNOTSUPP;
 
 	/* Only default policy to accept is supported for now. */
@@ -197,26 +287,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
 	    nft_trans_chain_policy(trans) != NF_ACCEPT)
 		return -EOPNOTSUPP;
 
-	bo.command = cmd;
-	bo.block = &basechain->flow_block;
-	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
-	bo.extack = &extack;
-	INIT_LIST_HEAD(&bo.cb_list);
-
-	err = dev->netdev_ops->ndo_setup_tc(dev, FLOW_SETUP_BLOCK, &bo);
-	if (err < 0)
-		return err;
-
-	switch (cmd) {
-	case FLOW_BLOCK_BIND:
-		err = nft_flow_offload_bind(&bo, basechain);
-		break;
-	case FLOW_BLOCK_UNBIND:
-		err = nft_flow_offload_unbind(&bo, basechain);
-		break;
-	}
-
-	return err;
+	if (dev->netdev_ops->ndo_setup_tc)
+		return nft_block_offload_cmd(basechain, dev, cmd);
+	else
+		return nft_indr_block_offload_cmd(basechain, dev, cmd);
 }
 
 int nft_flow_rule_offload_commit(struct net *net)
@@ -266,3 +340,37 @@ int nft_flow_rule_offload_commit(struct net *net)
 
 	return err;
 }
+
+bool nft_indr_get_default_block(struct net_device *dev,
+				struct flow_indr_block_info *info)
+{
+	struct net *net = dev_net(dev);
+	const struct nft_table *table;
+	const struct nft_chain *chain;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(table, &net->nft.tables, list) {
+		if (table->family != NFPROTO_NETDEV)
+			continue;
+
+		list_for_each_entry_rcu(chain, &table->chains, list) {
+			if (nft_is_base_chain(chain)) {
+				struct nft_base_chain *basechain;
+
+				basechain = nft_base_chain(chain);
+				if (!strncmp(basechain->dev_name, dev->name,
+					     IFNAMSIZ)) {
+					info->flow_block = &basechain->flow_block;
+					info->ing_cmd_cb = nft_indr_block_ing_cmd;
+					rcu_read_unlock();
+					return true;
+				}
+			}
+		}
+	}
+
+	rcu_read_unlock();
+
+	return false;
+}
-- 
1.8.3.1
^ permalink raw reply related	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call
  2019-08-01  3:03 ` [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call wenxu
@ 2019-08-01  3:58   ` Yunsheng Lin
  2019-08-01  4:47     ` wenxu
  0 siblings, 1 reply; 20+ messages in thread
From: Yunsheng Lin @ 2019-08-01  3:58 UTC (permalink / raw)
  To: wenxu, jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
On 2019/8/1 11:03, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> nftable support indr-block call. It makes nftable an offload vlan
> and tunnel device.
> 
> nft add table netdev firewall
> nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
> nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
> nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
> nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0
> 
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
> v5: add nft_get_default_block
> 
>  include/net/netfilter/nf_tables_offload.h |   2 +
>  net/netfilter/nf_tables_api.c             |   7 ++
>  net/netfilter/nf_tables_offload.c         | 156 +++++++++++++++++++++++++-----
>  3 files changed, 141 insertions(+), 24 deletions(-)
> 
> diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
> index 3196663..ac69087 100644
> --- a/include/net/netfilter/nf_tables_offload.h
> +++ b/include/net/netfilter/nf_tables_offload.h
> @@ -63,6 +63,8 @@ struct nft_flow_rule {
>  struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule);
>  void nft_flow_rule_destroy(struct nft_flow_rule *flow);
>  int nft_flow_rule_offload_commit(struct net *net);
> +bool nft_indr_get_default_block(struct net_device *dev,
> +				struct flow_indr_block_info *info);
>  
>  #define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg)		\
>  	(__reg)->base_offset	=					\
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> index 605a7cf..6a1d0b2 100644
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
> @@ -7593,6 +7593,11 @@ static void __net_exit nf_tables_exit_net(struct net *net)
>  	.exit	= nf_tables_exit_net,
>  };
>  
> +static struct flow_indr_get_block_entry get_block_entry = {
> +	.get_block_cb = nft_indr_get_default_block,
> +	.list = LIST_HEAD_INIT(get_block_entry.list),
> +};
> +
>  static int __init nf_tables_module_init(void)
>  {
>  	int err;
> @@ -7624,6 +7629,7 @@ static int __init nf_tables_module_init(void)
>  		goto err5;
>  
>  	nft_chain_route_init();
> +	flow_indr_add_default_block_cb(&get_block_entry);
>  	return err;
>  err5:
>  	rhltable_destroy(&nft_objname_ht);
> @@ -7640,6 +7646,7 @@ static int __init nf_tables_module_init(void)
>  
>  static void __exit nf_tables_module_exit(void)
>  {
> +	flow_indr_del_default_block_cb(&get_block_entry);
>  	nfnetlink_subsys_unregister(&nf_tables_subsys);
>  	unregister_netdevice_notifier(&nf_tables_flowtable_notifier);
>  	nft_chain_filter_fini();
> diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
> index 64f5fd5..59c9629 100644
> --- a/net/netfilter/nf_tables_offload.c
> +++ b/net/netfilter/nf_tables_offload.c
> @@ -171,24 +171,114 @@ static int nft_flow_offload_unbind(struct flow_block_offload *bo,
>  	return 0;
>  }
>  
> +static int nft_block_setup(struct nft_base_chain *basechain,
> +			   struct flow_block_offload *bo,
> +			   enum flow_block_command cmd)
> +{
> +	int err;
> +
> +	switch (cmd) {
> +	case FLOW_BLOCK_BIND:
> +		err = nft_flow_offload_bind(bo, basechain);
> +		break;
> +	case FLOW_BLOCK_UNBIND:
> +		err = nft_flow_offload_unbind(bo, basechain);
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		err = -EOPNOTSUPP;
> +	}
> +
> +	return err;
> +}
> +
> +static int nft_block_offload_cmd(struct nft_base_chain *chain,
> +				 struct net_device *dev,
> +				 enum flow_block_command cmd)
> +{
> +	struct netlink_ext_ack extack = {};
> +	struct flow_block_offload bo = {};
> +	int err;
> +
> +	bo.net = dev_net(dev);
> +	bo.block = &chain->flow_block;
> +	bo.command = cmd;
> +	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
> +	bo.extack = &extack;
> +	INIT_LIST_HEAD(&bo.cb_list);
> +
> +	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> +	if (err < 0)
> +		return err;
> +
> +	return nft_block_setup(chain, &bo, cmd);
> +}
> +
> +static void nft_indr_block_ing_cmd(struct net_device *dev,
> +				   struct flow_block *flow_block,
> +				   flow_indr_block_bind_cb_t *cb,
> +				   void *cb_priv,
> +				   enum flow_block_command cmd)
> +{
> +	struct netlink_ext_ack extack = {};
> +	struct flow_block_offload bo = {};
> +	struct nft_base_chain *chain;
> +
> +	if (flow_block)
> +		return;
Maybe "if (!flow_block)" ?
> +
> +	chain = container_of(flow_block, struct nft_base_chain, flow_block);
> +
> +	bo.net = dev_net(dev);
> +	bo.block = flow_block;
> +	bo.command = cmd;
> +	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
> +	bo.extack = &extack;
> +	INIT_LIST_HEAD(&bo.cb_list);
> +
> +	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
> +
> +	nft_block_setup(chain, &bo, cmd);
> +}
> +
> +static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
> +				      struct net_device *dev,
> +				      enum flow_block_command cmd)
> +{
> +	struct flow_block_offload bo = {};
> +	struct netlink_ext_ack extack = {};
> +
> +	bo.net = dev_net(dev);
> +	bo.block = &chain->flow_block;
> +	bo.command = cmd;
> +	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
> +	bo.extack = &extack;
> +	INIT_LIST_HEAD(&bo.cb_list);
> +
> +	flow_indr_block_call(&chain->flow_block, dev, nft_indr_block_ing_cmd,
> +			     &bo, cmd);
> +
> +	if (list_empty(&bo.cb_list))
> +		return -EOPNOTSUPP;
> +
> +	return nft_block_setup(chain, &bo, cmd);
> +}
> +
>  #define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
>  
>  static int nft_flow_offload_chain(struct nft_trans *trans,
>  				  enum flow_block_command cmd)
>  {
>  	struct nft_chain *chain = trans->ctx.chain;
> -	struct netlink_ext_ack extack = {};
> -	struct flow_block_offload bo = {};
>  	struct nft_base_chain *basechain;
>  	struct net_device *dev;
> -	int err;
>  
>  	if (!nft_is_base_chain(chain))
>  		return -EOPNOTSUPP;
>  
>  	basechain = nft_base_chain(chain);
>  	dev = basechain->ops.dev;
> -	if (!dev || !dev->netdev_ops->ndo_setup_tc)
> +	if (!dev)
>  		return -EOPNOTSUPP;
>  
>  	/* Only default policy to accept is supported for now. */
> @@ -197,26 +287,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
>  	    nft_trans_chain_policy(trans) != NF_ACCEPT)
>  		return -EOPNOTSUPP;
>  
> -	bo.command = cmd;
> -	bo.block = &basechain->flow_block;
> -	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
> -	bo.extack = &extack;
> -	INIT_LIST_HEAD(&bo.cb_list);
> -
> -	err = dev->netdev_ops->ndo_setup_tc(dev, FLOW_SETUP_BLOCK, &bo);
> -	if (err < 0)
> -		return err;
> -
> -	switch (cmd) {
> -	case FLOW_BLOCK_BIND:
> -		err = nft_flow_offload_bind(&bo, basechain);
> -		break;
> -	case FLOW_BLOCK_UNBIND:
> -		err = nft_flow_offload_unbind(&bo, basechain);
> -		break;
> -	}
> -
> -	return err;
> +	if (dev->netdev_ops->ndo_setup_tc)
> +		return nft_block_offload_cmd(basechain, dev, cmd);
> +	else
> +		return nft_indr_block_offload_cmd(basechain, dev, cmd);
>  }
>  
>  int nft_flow_rule_offload_commit(struct net *net)
> @@ -266,3 +340,37 @@ int nft_flow_rule_offload_commit(struct net *net)
>  
>  	return err;
>  }
> +
> +bool nft_indr_get_default_block(struct net_device *dev,
> +				struct flow_indr_block_info *info)
> +{
> +	struct net *net = dev_net(dev);
> +	const struct nft_table *table;
> +	const struct nft_chain *chain;
> +
> +	rcu_read_lock();
> +
> +	list_for_each_entry_rcu(table, &net->nft.tables, list) {
> +		if (table->family != NFPROTO_NETDEV)
> +			continue;
> +
> +		list_for_each_entry_rcu(chain, &table->chains, list) {
> +			if (nft_is_base_chain(chain)) {
> +				struct nft_base_chain *basechain;
> +
> +				basechain = nft_base_chain(chain);
> +				if (!strncmp(basechain->dev_name, dev->name,
> +					     IFNAMSIZ)) {
> +					info->flow_block = &basechain->flow_block;
> +					info->ing_cmd_cb = nft_indr_block_ing_cmd;
> +					rcu_read_unlock();
> +					return true;
> +				}
> +			}
> +		}
> +	}
> +
> +	rcu_read_unlock();
> +
> +	return false;
> +}
> 
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call
  2019-08-01  3:58   ` Yunsheng Lin
@ 2019-08-01  4:47     ` wenxu
  0 siblings, 0 replies; 20+ messages in thread
From: wenxu @ 2019-08-01  4:47 UTC (permalink / raw)
  To: Yunsheng Lin, jiri, pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev
On 8/1/2019 11:58 AM, Yunsheng Lin wrote:
> On 2019/8/1 11:03, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> nftable support indr-block call. It makes nftable an offload vlan
>> and tunnel device.
>>
>> nft add table netdev firewall
>> nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
>> nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
>> nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
>> nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0
>>
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>> ---
>> v5: add nft_get_default_block
>>
>>  include/net/netfilter/nf_tables_offload.h |   2 +
>>  net/netfilter/nf_tables_api.c             |   7 ++
>>  net/netfilter/nf_tables_offload.c         | 156 +++++++++++++++++++++++++-----
>>  3 files changed, 141 insertions(+), 24 deletions(-)
>>
>> diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
>> index 3196663..ac69087 100644
>> --- a/include/net/netfilter/nf_tables_offload.h
>> +++ b/include/net/netfilter/nf_tables_offload.h
>> @@ -63,6 +63,8 @@ struct nft_flow_rule {
>>  struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule);
>>  void nft_flow_rule_destroy(struct nft_flow_rule *flow);
>>  int nft_flow_rule_offload_commit(struct net *net);
>> +bool nft_indr_get_default_block(struct net_device *dev,
>> +				struct flow_indr_block_info *info);
>>  
>>  #define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg)		\
>>  	(__reg)->base_offset	=					\
>> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
>> index 605a7cf..6a1d0b2 100644
>> --- a/net/netfilter/nf_tables_api.c
>> +++ b/net/netfilter/nf_tables_api.c
>> @@ -7593,6 +7593,11 @@ static void __net_exit nf_tables_exit_net(struct net *net)
>>  	.exit	= nf_tables_exit_net,
>>  };
>>  
>> +static struct flow_indr_get_block_entry get_block_entry = {
>> +	.get_block_cb = nft_indr_get_default_block,
>> +	.list = LIST_HEAD_INIT(get_block_entry.list),
>> +};
>> +
>>  static int __init nf_tables_module_init(void)
>>  {
>>  	int err;
>> @@ -7624,6 +7629,7 @@ static int __init nf_tables_module_init(void)
>>  		goto err5;
>>  
>>  	nft_chain_route_init();
>> +	flow_indr_add_default_block_cb(&get_block_entry);
>>  	return err;
>>  err5:
>>  	rhltable_destroy(&nft_objname_ht);
>> @@ -7640,6 +7646,7 @@ static int __init nf_tables_module_init(void)
>>  
>>  static void __exit nf_tables_module_exit(void)
>>  {
>> +	flow_indr_del_default_block_cb(&get_block_entry);
>>  	nfnetlink_subsys_unregister(&nf_tables_subsys);
>>  	unregister_netdevice_notifier(&nf_tables_flowtable_notifier);
>>  	nft_chain_filter_fini();
>> diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
>> index 64f5fd5..59c9629 100644
>> --- a/net/netfilter/nf_tables_offload.c
>> +++ b/net/netfilter/nf_tables_offload.c
>> @@ -171,24 +171,114 @@ static int nft_flow_offload_unbind(struct flow_block_offload *bo,
>>  	return 0;
>>  }
>>  
>> +static int nft_block_setup(struct nft_base_chain *basechain,
>> +			   struct flow_block_offload *bo,
>> +			   enum flow_block_command cmd)
>> +{
>> +	int err;
>> +
>> +	switch (cmd) {
>> +	case FLOW_BLOCK_BIND:
>> +		err = nft_flow_offload_bind(bo, basechain);
>> +		break;
>> +	case FLOW_BLOCK_UNBIND:
>> +		err = nft_flow_offload_unbind(bo, basechain);
>> +		break;
>> +	default:
>> +		WARN_ON_ONCE(1);
>> +		err = -EOPNOTSUPP;
>> +	}
>> +
>> +	return err;
>> +}
>> +
>> +static int nft_block_offload_cmd(struct nft_base_chain *chain,
>> +				 struct net_device *dev,
>> +				 enum flow_block_command cmd)
>> +{
>> +	struct netlink_ext_ack extack = {};
>> +	struct flow_block_offload bo = {};
>> +	int err;
>> +
>> +	bo.net = dev_net(dev);
>> +	bo.block = &chain->flow_block;
>> +	bo.command = cmd;
>> +	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
>> +	bo.extack = &extack;
>> +	INIT_LIST_HEAD(&bo.cb_list);
>> +
>> +	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
>> +	if (err < 0)
>> +		return err;
>> +
>> +	return nft_block_setup(chain, &bo, cmd);
>> +}
>> +
>> +static void nft_indr_block_ing_cmd(struct net_device *dev,
>> +				   struct flow_block *flow_block,
>> +				   flow_indr_block_bind_cb_t *cb,
>> +				   void *cb_priv,
>> +				   enum flow_block_command cmd)
>> +{
>> +	struct netlink_ext_ack extack = {};
>> +	struct flow_block_offload bo = {};
>> +	struct nft_base_chain *chain;
>> +
>> +	if (flow_block)
>> +		return;
> Maybe "if (!flow_block)" ?
yes it's a mistake. Thx!
>
>> +
>> +	chain = container_of(flow_block, struct nft_base_chain, flow_block);
>> +
>> +	bo.net = dev_net(dev);
>> +	bo.block = flow_block;
>> +	bo.command = cmd;
>> +	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
>> +	bo.extack = &extack;
>> +	INIT_LIST_HEAD(&bo.cb_list);
>> +
>> +	cb(dev, cb_priv, TC_SETUP_BLOCK, &bo);
>> +
>> +	nft_block_setup(chain, &bo, cmd);
>> +}
>> +
>> +static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
>> +				      struct net_device *dev,
>> +				      enum flow_block_command cmd)
>> +{
>> +	struct flow_block_offload bo = {};
>> +	struct netlink_ext_ack extack = {};
>> +
>> +	bo.net = dev_net(dev);
>> +	bo.block = &chain->flow_block;
>> +	bo.command = cmd;
>> +	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
>> +	bo.extack = &extack;
>> +	INIT_LIST_HEAD(&bo.cb_list);
>> +
>> +	flow_indr_block_call(&chain->flow_block, dev, nft_indr_block_ing_cmd,
>> +			     &bo, cmd);
>> +
>> +	if (list_empty(&bo.cb_list))
>> +		return -EOPNOTSUPP;
>> +
>> +	return nft_block_setup(chain, &bo, cmd);
>> +}
>> +
>>  #define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
>>  
>>  static int nft_flow_offload_chain(struct nft_trans *trans,
>>  				  enum flow_block_command cmd)
>>  {
>>  	struct nft_chain *chain = trans->ctx.chain;
>> -	struct netlink_ext_ack extack = {};
>> -	struct flow_block_offload bo = {};
>>  	struct nft_base_chain *basechain;
>>  	struct net_device *dev;
>> -	int err;
>>  
>>  	if (!nft_is_base_chain(chain))
>>  		return -EOPNOTSUPP;
>>  
>>  	basechain = nft_base_chain(chain);
>>  	dev = basechain->ops.dev;
>> -	if (!dev || !dev->netdev_ops->ndo_setup_tc)
>> +	if (!dev)
>>  		return -EOPNOTSUPP;
>>  
>>  	/* Only default policy to accept is supported for now. */
>> @@ -197,26 +287,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
>>  	    nft_trans_chain_policy(trans) != NF_ACCEPT)
>>  		return -EOPNOTSUPP;
>>  
>> -	bo.command = cmd;
>> -	bo.block = &basechain->flow_block;
>> -	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
>> -	bo.extack = &extack;
>> -	INIT_LIST_HEAD(&bo.cb_list);
>> -
>> -	err = dev->netdev_ops->ndo_setup_tc(dev, FLOW_SETUP_BLOCK, &bo);
>> -	if (err < 0)
>> -		return err;
>> -
>> -	switch (cmd) {
>> -	case FLOW_BLOCK_BIND:
>> -		err = nft_flow_offload_bind(&bo, basechain);
>> -		break;
>> -	case FLOW_BLOCK_UNBIND:
>> -		err = nft_flow_offload_unbind(&bo, basechain);
>> -		break;
>> -	}
>> -
>> -	return err;
>> +	if (dev->netdev_ops->ndo_setup_tc)
>> +		return nft_block_offload_cmd(basechain, dev, cmd);
>> +	else
>> +		return nft_indr_block_offload_cmd(basechain, dev, cmd);
>>  }
>>  
>>  int nft_flow_rule_offload_commit(struct net *net)
>> @@ -266,3 +340,37 @@ int nft_flow_rule_offload_commit(struct net *net)
>>  
>>  	return err;
>>  }
>> +
>> +bool nft_indr_get_default_block(struct net_device *dev,
>> +				struct flow_indr_block_info *info)
>> +{
>> +	struct net *net = dev_net(dev);
>> +	const struct nft_table *table;
>> +	const struct nft_chain *chain;
>> +
>> +	rcu_read_lock();
>> +
>> +	list_for_each_entry_rcu(table, &net->nft.tables, list) {
>> +		if (table->family != NFPROTO_NETDEV)
>> +			continue;
>> +
>> +		list_for_each_entry_rcu(chain, &table->chains, list) {
>> +			if (nft_is_base_chain(chain)) {
>> +				struct nft_base_chain *basechain;
>> +
>> +				basechain = nft_base_chain(chain);
>> +				if (!strncmp(basechain->dev_name, dev->name,
>> +					     IFNAMSIZ)) {
>> +					info->flow_block = &basechain->flow_block;
>> +					info->ing_cmd_cb = nft_indr_block_ing_cmd;
>> +					rcu_read_unlock();
>> +					return true;
>> +				}
>> +			}
>> +		}
>> +	}
>> +
>> +	rcu_read_unlock();
>> +
>> +	return false;
>> +}
>>
>
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-01  3:03 ` [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately wenxu
@ 2019-08-01 23:11   ` Jakub Kicinski
  2019-08-02  2:47     ` wenxu
  2019-08-02 10:45     ` wenxu
  0 siblings, 2 replies; 20+ messages in thread
From: Jakub Kicinski @ 2019-08-01 23:11 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
On Thu,  1 Aug 2019 11:03:46 +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> The new flow-indr-block can't get the tcf_block
> directly. It provide a callback list to find the flow_block immediately
> when the device register and contain a ingress block.
> 
> Signed-off-by: wenxu <wenxu@ucloud.cn>
First of all thanks for splitting the series up into more patches, 
it is easier to follow the logic now!
> @@ -328,6 +348,7 @@ struct flow_indr_block_dev {
>  
>  	INIT_LIST_HEAD(&indr_dev->cb_list);
>  	indr_dev->dev = dev;
> +	flow_get_default_block(indr_dev);
>  	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
>  				   flow_indr_setup_block_ht_params)) {
>  		kfree(indr_dev);
I wonder if it's still practical to keep the block information in the
indr_dev structure at all. The way this used to work was:
[hash table of devices]     --------------
                 |         |    netdev    |
                 |         |    refcnt    |
  indir_dev[tun0]|  ------ | cached block | ---- [ TC block ]
                 |         |   callbacks  | .
                 |          --------------   \__ [cb, cb_priv, cb_ident]
                 |                               [cb, cb_priv, cb_ident]
                 |          --------------
                 |         |    netdev    |
                 |         |    refcnt    |
  indir_dev[tun1]|  ------ | cached block | ---- [ TC block ]
                 |         |   callbacks  |.
-----------------           --------------   \__ [cb, cb_priv, cb_ident]
                                                 [cb, cb_priv, cb_ident]
In the example above we have two tunnels tun0 and tun1, each one has a
indr_dev structure allocated, and for each one of them two drivers
registered for callbacks (hence the callbacks list has two entries).
We used to cache the TC block in the indr_dev structure, but now that
there are multiple subsytems using the indr_dev we either have to have
a list of cached blocks (with entries for each subsystem) or just always
iterate over the subsystems :(
After all the same device may have both a TC block and a NFT block.
I think always iterating would be easier:
The indr_dev struct would no longer have the block pointer, instead
when new driver registers for the callback instead of:
	if (indr_dev->ing_cmd_cb)
		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
				     indr_block_cb->cb, indr_block_cb->cb_priv,
				     FLOW_BLOCK_BIND);
We'd have something like the loop in flow_get_default_block():
	for each (subsystem)
		subsystem->handle_new_indir_cb(indr_dev, cb);
And then per-subsystem logic would actually call the cb. Or:
	for each (subsystem)
		block = get_default_block(indir_dev)
		indr_dev->ing_cmd_cb(...)
I hope this makes sense.
Also please double check nft unload logic has an RCU synchronization
point, I'm not 100% confident rcu_barrier() implies synchronize_rcu().
Perhaps someone more knowledgeable can chime in :)
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-01 23:11   ` Jakub Kicinski
@ 2019-08-02  2:47     ` wenxu
  2019-08-02  2:56       ` Jakub Kicinski
  2019-08-02 10:45     ` wenxu
  1 sibling, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-02  2:47 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
On 8/2/2019 7:11 AM, Jakub Kicinski wrote:
> On Thu,  1 Aug 2019 11:03:46 +0800, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> The new flow-indr-block can't get the tcf_block
>> directly. It provide a callback list to find the flow_block immediately
>> when the device register and contain a ingress block.
>>
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> First of all thanks for splitting the series up into more patches, 
> it is easier to follow the logic now!
>
>> @@ -328,6 +348,7 @@ struct flow_indr_block_dev {
>>  
>>  	INIT_LIST_HEAD(&indr_dev->cb_list);
>>  	indr_dev->dev = dev;
>> +	flow_get_default_block(indr_dev);
>>  	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
>>  				   flow_indr_setup_block_ht_params)) {
>>  		kfree(indr_dev);
> I wonder if it's still practical to keep the block information in the
> indr_dev structure at all. The way this used to work was:
>
>
> [hash table of devices]     --------------
>                  |         |    netdev    |
>                  |         |    refcnt    |
>   indir_dev[tun0]|  ------ | cached block | ---- [ TC block ]
>                  |         |   callbacks  | .
>                  |          --------------   \__ [cb, cb_priv, cb_ident]
>                  |                               [cb, cb_priv, cb_ident]
>                  |          --------------
>                  |         |    netdev    |
>                  |         |    refcnt    |
>   indir_dev[tun1]|  ------ | cached block | ---- [ TC block ]
>                  |         |   callbacks  |.
> -----------------           --------------   \__ [cb, cb_priv, cb_ident]
>                                                  [cb, cb_priv, cb_ident]
>
>
> In the example above we have two tunnels tun0 and tun1, each one has a
> indr_dev structure allocated, and for each one of them two drivers
> registered for callbacks (hence the callbacks list has two entries).
>
> We used to cache the TC block in the indr_dev structure, but now that
> there are multiple subsytems using the indr_dev we either have to have
> a list of cached blocks (with entries for each subsystem) or just always
> iterate over the subsystems :(
>
> After all the same device may have both a TC block and a NFT block.
Only one subsystem can be used for the same device for both indr-dev and hw-dev
the flow_block_cb_is_busy avoid the situation you mentioned.
>
> I think always iterating would be easier:
>
> The indr_dev struct would no longer have the block pointer, instead
> when new driver registers for the callback instead of:
>
> 	if (indr_dev->ing_cmd_cb)
> 		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
> 				     indr_block_cb->cb, indr_block_cb->cb_priv,
> 				     FLOW_BLOCK_BIND);
>
> We'd have something like the loop in flow_get_default_block():
>
> 	for each (subsystem)
> 		subsystem->handle_new_indir_cb(indr_dev, cb);
>
> And then per-subsystem logic would actually call the cb. Or:
>
> 	for each (subsystem)
> 		block = get_default_block(indir_dev)
> 		indr_dev->ing_cmd_cb(...)
>
> I hope this makes sense.
>
>
>
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-02  2:47     ` wenxu
@ 2019-08-02  2:56       ` Jakub Kicinski
  0 siblings, 0 replies; 20+ messages in thread
From: Jakub Kicinski @ 2019-08-02  2:56 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
On Fri, 2 Aug 2019 10:47:26 +0800, wenxu wrote:
> > After all the same device may have both a TC block and a NFT block.  
> 
> Only one subsystem can be used for the same device for both indr-dev and hw-dev
> the flow_block_cb_is_busy avoid the situation you mentioned.
AFAIU that's a temporary limitation until drivers learn to manage
multiple tables.
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-01 23:11   ` Jakub Kicinski
  2019-08-02  2:47     ` wenxu
@ 2019-08-02 10:45     ` wenxu
  2019-08-02 13:09       ` wenxu
  1 sibling, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-02 10:45 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
On 8/2/2019 7:11 AM, Jakub Kicinski wrote:
> On Thu,  1 Aug 2019 11:03:46 +0800, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> The new flow-indr-block can't get the tcf_block
>> directly. It provide a callback list to find the flow_block immediately
>> when the device register and contain a ingress block.
>>
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> First of all thanks for splitting the series up into more patches, 
> it is easier to follow the logic now!
>
>> @@ -328,6 +348,7 @@ struct flow_indr_block_dev {
>>  
>>  	INIT_LIST_HEAD(&indr_dev->cb_list);
>>  	indr_dev->dev = dev;
>> +	flow_get_default_block(indr_dev);
>>  	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
>>  				   flow_indr_setup_block_ht_params)) {
>>  		kfree(indr_dev);
> I wonder if it's still practical to keep the block information in the
> indr_dev structure at all. The way this used to work was:
>
>
> [hash table of devices]     --------------
>                  |         |    netdev    |
>                  |         |    refcnt    |
>   indir_dev[tun0]|  ------ | cached block | ---- [ TC block ]
>                  |         |   callbacks  | .
>                  |          --------------   \__ [cb, cb_priv, cb_ident]
>                  |                               [cb, cb_priv, cb_ident]
>                  |          --------------
>                  |         |    netdev    |
>                  |         |    refcnt    |
>   indir_dev[tun1]|  ------ | cached block | ---- [ TC block ]
>                  |         |   callbacks  |.
> -----------------           --------------   \__ [cb, cb_priv, cb_ident]
>                                                  [cb, cb_priv, cb_ident]
>
>
> In the example above we have two tunnels tun0 and tun1, each one has a
> indr_dev structure allocated, and for each one of them two drivers
> registered for callbacks (hence the callbacks list has two entries).
>
> We used to cache the TC block in the indr_dev structure, but now that
> there are multiple subsytems using the indr_dev we either have to have
> a list of cached blocks (with entries for each subsystem) or just always
> iterate over the subsystems :(
>
> After all the same device may have both a TC block and a NFT block.
>
> I think always iterating would be easier:
>
> The indr_dev struct would no longer have the block pointer, instead
> when new driver registers for the callback instead of:
>
> 	if (indr_dev->ing_cmd_cb)
> 		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
> 				     indr_block_cb->cb, indr_block_cb->cb_priv,
> 				     FLOW_BLOCK_BIND);
>
> We'd have something like the loop in flow_get_default_block():
>
> 	for each (subsystem)
> 		subsystem->handle_new_indir_cb(indr_dev, cb);
>
> And then per-subsystem logic would actually call the cb. Or:
>
> 	for each (subsystem)
> 		block = get_default_block(indir_dev)
> 		indr_dev->ing_cmd_cb(...)
            nft dev chian is also based on register_netdevice_notifier, So for unregister case,
the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister. is right?
So maybe we can cache the block as a list of all the subsystem in  indr_dev ?
> I hope this makes sense.
>
>
> Also please double check nft unload logic has an RCU synchronization
> point, I'm not 100% confident rcu_barrier() implies synchronize_rcu().
> Perhaps someone more knowledgeable can chime in :)
>
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-02 10:45     ` wenxu
@ 2019-08-02 13:09       ` wenxu
  2019-08-02 18:02         ` Jakub Kicinski
  0 siblings, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-02 13:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
在 2019/8/2 18:45, wenxu 写道:
> On 8/2/2019 7:11 AM, Jakub Kicinski wrote:
>> On Thu,  1 Aug 2019 11:03:46 +0800, wenxu@ucloud.cn wrote:
>>> From: wenxu <wenxu@ucloud.cn>
>>>
>>> The new flow-indr-block can't get the tcf_block
>>> directly. It provide a callback list to find the flow_block immediately
>>> when the device register and contain a ingress block.
>>>
>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>> First of all thanks for splitting the series up into more patches, 
>> it is easier to follow the logic now!
>>
>>> @@ -328,6 +348,7 @@ struct flow_indr_block_dev {
>>>  
>>>  	INIT_LIST_HEAD(&indr_dev->cb_list);
>>>  	indr_dev->dev = dev;
>>> +	flow_get_default_block(indr_dev);
>>>  	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
>>>  				   flow_indr_setup_block_ht_params)) {
>>>  		kfree(indr_dev);
>> I wonder if it's still practical to keep the block information in the
>> indr_dev structure at all. The way this used to work was:
>>
>>
>> [hash table of devices]     --------------
>>                  |         |    netdev    |
>>                  |         |    refcnt    |
>>   indir_dev[tun0]|  ------ | cached block | ---- [ TC block ]
>>                  |         |   callbacks  | .
>>                  |          --------------   \__ [cb, cb_priv, cb_ident]
>>                  |                               [cb, cb_priv, cb_ident]
>>                  |          --------------
>>                  |         |    netdev    |
>>                  |         |    refcnt    |
>>   indir_dev[tun1]|  ------ | cached block | ---- [ TC block ]
>>                  |         |   callbacks  |.
>> -----------------           --------------   \__ [cb, cb_priv, cb_ident]
>>                                                  [cb, cb_priv, cb_ident]
>>
>>
>> In the example above we have two tunnels tun0 and tun1, each one has a
>> indr_dev structure allocated, and for each one of them two drivers
>> registered for callbacks (hence the callbacks list has two entries).
>>
>> We used to cache the TC block in the indr_dev structure, but now that
>> there are multiple subsytems using the indr_dev we either have to have
>> a list of cached blocks (with entries for each subsystem) or just always
>> iterate over the subsystems :(
>>
>> After all the same device may have both a TC block and a NFT block.
>>
>> I think always iterating would be easier:
>>
>> The indr_dev struct would no longer have the block pointer, instead
>> when new driver registers for the callback instead of:
>>
>> 	if (indr_dev->ing_cmd_cb)
>> 		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
>> 				     indr_block_cb->cb, indr_block_cb->cb_priv,
>> 				     FLOW_BLOCK_BIND);
>>
>> We'd have something like the loop in flow_get_default_block():
>>
>> 	for each (subsystem)
>> 		subsystem->handle_new_indir_cb(indr_dev, cb);
>>
>> And then per-subsystem logic would actually call the cb. Or:
>>
>> 	for each (subsystem)
>> 		block = get_default_block(indir_dev)
>> 		indr_dev->ing_cmd_cb(...)
>             nft dev chian is also based on register_netdevice_notifier, So for unregister case,
>
> the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister. is right?
>
> So maybe we can cache the block as a list of all the subsystem in  indr_dev ?
when the device is unregister the nft netdev chain related to this device will also be delete through netdevice_notifier
. So for unregister case,the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister.
cache for the block is not work because the chain already be delete and free. Maybe it improve the prio of
rep_netdev_event can help this?
>
>
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-02 13:09       ` wenxu
@ 2019-08-02 18:02         ` Jakub Kicinski
  2019-08-02 23:19           ` wenxu
  0 siblings, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2019-08-02 18:02 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
On Fri, 2 Aug 2019 21:09:03 +0800, wenxu wrote:
> >> We'd have something like the loop in flow_get_default_block():
> >>
> >> 	for each (subsystem)
> >> 		subsystem->handle_new_indir_cb(indr_dev, cb);
> >>
> >> And then per-subsystem logic would actually call the cb. Or:
> >>
> >> 	for each (subsystem)
> >> 		block = get_default_block(indir_dev)
> >> 		indr_dev->ing_cmd_cb(...)  
> >             nft dev chian is also based on register_netdevice_notifier, So for unregister case,
> >
> > the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister. is right?
> >
> > So maybe we can cache the block as a list of all the subsystem in  indr_dev ?  
> 
> 
> when the device is unregister the nft netdev chain related to this device will also be delete through netdevice_notifier
> 
> . So for unregister case,the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister.
Hm, but I don't think that should be an issue. The ordering should be
like one of the following two:
device unregister:
  - driver notifier callback
    - unregister flow cb
      - UNBIND cb
        - free driver's block state
    - free driver's device state
  - nft block destroy
    # doesn't see driver's CB any more
Or:
device unregister:
  - nft block destroy
    - UNBIND cb
      - free driver's block state
  - driver notifier callback
    - free driver's state
No?
> cache for the block is not work because the chain already be delete and free. Maybe it improve the prio of
> 
> rep_netdev_event can help this?
In theory the cache should work in a similar way as drivers, because
once the indr_dev is created and the initial block is found, the cached
value is just recorded in BIND/UNBIND calls. So if BIND/UNBIND works for
drivers it will also put the right info in the cache.
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-02 18:02         ` Jakub Kicinski
@ 2019-08-02 23:19           ` wenxu
  2019-08-03  0:21             ` Jakub Kicinski
  0 siblings, 1 reply; 20+ messages in thread
From: wenxu @ 2019-08-02 23:19 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
在 2019/8/3 2:02, Jakub Kicinski 写道:
> On Fri, 2 Aug 2019 21:09:03 +0800, wenxu wrote:
>>>> We'd have something like the loop in flow_get_default_block():
>>>>
>>>> 	for each (subsystem)
>>>> 		subsystem->handle_new_indir_cb(indr_dev, cb);
>>>>
>>>> And then per-subsystem logic would actually call the cb. Or:
>>>>
>>>> 	for each (subsystem)
>>>> 		block = get_default_block(indir_dev)
>>>> 		indr_dev->ing_cmd_cb(...)  
>>>             nft dev chian is also based on register_netdevice_notifier, So for unregister case,
>>>
>>> the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister. is right?
>>>
>>> So maybe we can cache the block as a list of all the subsystem in  indr_dev ?  
>>
>> when the device is unregister the nft netdev chain related to this device will also be delete through netdevice_notifier
>>
>> . So for unregister case,the basechian(block) of nft maybe delete before the __tc_indr_block_cb_unregister.
> Hm, but I don't think that should be an issue. The ordering should be
> like one of the following two:
>
> device unregister:
>   - driver notifier callback
>     - unregister flow cb
>       - UNBIND cb
>         - free driver's block state
>     - free driver's device state
>   - nft block destroy
>     # doesn't see driver's CB any more
>
> Or:
>
> device unregister:
>   - nft block destroy
>     - UNBIND cb
>       - free driver's block state
>   - driver notifier callback
>     - free driver's state
>
> No?
For the second case maybe can't unbind cb? because the nft block is destroied. There is no way
to find the block(chain) in nft.
>
>> cache for the block is not work because the chain already be delete and free. Maybe it improve the prio of
>>
>> rep_netdev_event can help this?
> In theory the cache should work in a similar way as drivers, because
> once the indr_dev is created and the initial block is found, the cached
> value is just recorded in BIND/UNBIND calls. So if BIND/UNBIND works for
> drivers it will also put the right info in the cache.
>
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-02 23:19           ` wenxu
@ 2019-08-03  0:21             ` Jakub Kicinski
  2019-08-03 13:42               ` wenxu
  0 siblings, 1 reply; 20+ messages in thread
From: Jakub Kicinski @ 2019-08-03  0:21 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
On Sat, 3 Aug 2019 07:19:31 +0800, wenxu wrote:
> > Or:
> >
> > device unregister:
> >   - nft block destroy
> >     - UNBIND cb
> >       - free driver's block state
> >   - driver notifier callback
> >     - free driver's state
> >
> > No?  
> 
> For the second case maybe can't unbind cb? because the nft block is
> destroied. There is no way to find the block(chain) in nft.
But before the block is destroyed doesn't nft send an UNBIND event to
the drivers, always?
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
  2019-08-03  0:21             ` Jakub Kicinski
@ 2019-08-03 13:42               ` wenxu
  0 siblings, 0 replies; 20+ messages in thread
From: wenxu @ 2019-08-03 13:42 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
在 2019/8/3 8:21, Jakub Kicinski 写道:
> On Sat, 3 Aug 2019 07:19:31 +0800, wenxu wrote:
>>> Or:
>>>
>>> device unregister:
>>>   - nft block destroy
>>>     - UNBIND cb
>>>       - free driver's block state
>>>   - driver notifier callback
>>>     - free driver's state
>>>
>>> No?  
>> For the second case maybe can't unbind cb? because the nft block is
>> destroied. There is no way to find the block(chain) in nft.
> But before the block is destroyed doesn't nft send an UNBIND event to
> the drivers, always?
you are correct, it will be send an UBIND event when the block is destroyed
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/6] cls_api: add flow_indr_block_call function
  2019-08-01  3:03 ` [PATCH net-next 3/6] cls_api: add flow_indr_block_call function wenxu
@ 2019-08-05  6:02   ` Jiri Pirko
  2019-08-05  6:26     ` wenxu
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2019-08-05  6:02 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, jakub.kicinski, netfilter-devel, netdev
Re subject. You don't have "v5" in this patch. I don't understand how
that happened. Do you use --subject-prefix in git-format-patch?
^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH net-next 3/6] cls_api: add flow_indr_block_call function
  2019-08-05  6:02   ` Jiri Pirko
@ 2019-08-05  6:26     ` wenxu
  0 siblings, 0 replies; 20+ messages in thread
From: wenxu @ 2019-08-05  6:26 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: pablo, fw, jakub.kicinski, netfilter-devel, netdev
v5 contain this patch but with non-version tag,
I used --subject-prefix in git-format-patch. I am sorry to  make a mistake when modify the
commit log. So should I repost the v6?
On 8/5/2019 2:02 PM, Jiri Pirko wrote:
> Re subject. You don't have "v5" in this patch. I don't understand how
> that happened. Do you use --subject-prefix in git-format-patch?
>
^ permalink raw reply	[flat|nested] 20+ messages in thread
end of thread, other threads:[~2019-08-05  6:27 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-01  3:03 [PATCH net-next v5 0/6] flow_offload: add indr-block in nf_table_offload wenxu
2019-08-01  3:03 ` [PATCH net-next v5 1/6] cls_api: modify the tc_indr_block_ing_cmd parameters wenxu
2019-08-01  3:03 ` [PATCH net-next v5 2/6] cls_api: replace block with flow_block in tc_indr_block_dev wenxu
2019-08-01  3:03 ` [PATCH net-next 3/6] cls_api: add flow_indr_block_call function wenxu
2019-08-05  6:02   ` Jiri Pirko
2019-08-05  6:26     ` wenxu
2019-08-01  3:03 ` [PATCH net-next v5 4/6] flow_offload: move tc indirect block to flow offload wenxu
2019-08-01  3:03 ` [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately wenxu
2019-08-01 23:11   ` Jakub Kicinski
2019-08-02  2:47     ` wenxu
2019-08-02  2:56       ` Jakub Kicinski
2019-08-02 10:45     ` wenxu
2019-08-02 13:09       ` wenxu
2019-08-02 18:02         ` Jakub Kicinski
2019-08-02 23:19           ` wenxu
2019-08-03  0:21             ` Jakub Kicinski
2019-08-03 13:42               ` wenxu
2019-08-01  3:03 ` [PATCH net-next v5 6/6] netfilter: nf_tables_offload: support indr block call wenxu
2019-08-01  3:58   ` Yunsheng Lin
2019-08-01  4:47     ` wenxu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).