Netdev List
 help / color / mirror / Atom feed
* [patch net-next v3 00/10] net: sched: allow qdiscs to share filter block instances
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel

From: Jiri Pirko <jiri@mellanox.com>

Currently the filters added to qdiscs are independent. So for example if you
have 2 netdevices and you create ingress qdisc on both and you want to add
identical filter rules both, you need to add them twice. This patchset
makes this easier and mainly saves resources allowing to share all filters
within a qdisc - I call it a "filter block". Also this helps to save
resources when we do offload to hw for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identification. If we don't pass any
block number, a new one will be generated by kernel:

$ tc qdisc add dev ens7 ingress block 22
                                ^^^^^^^^
$ tc qdisc add dev ens8 ingress block 22
                                ^^^^^^^^

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22

To make is more visual, the situation looks like this:

   ens7 ingress qdisc                 ens7 ingress qdisc
          |                                  |
          |                                  |
          +---------->  block 22  <----------+

Unlimited number of qdiscs may share the same block.

Now we can add filter to any of qdiscs sharing the same block:

$ tc filter add dev ens7 ingress protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop


We will see the same output if we list filters for ens7 and ens8, including stats:

$ tc -s filter show dev ens7 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 39 sec used 2 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

$ tc -s filter show dev ens8 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 40 sec used 3 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

---
v2->v3:
- removed original patch 1, removing tp->q cls_bpf dependency. Fixed by
  Jakub in the meantime.
- patch 1:
 - rebased on top of the current net-next
- patch 5:
 - new patch
- patch 6:
 - removed "p_" prefix from block index function args
- patch 9:
 - add tc offload feature handling

Jiri Pirko (10):
  net: sched: introduce support for multiple filter chain pointers
    registration
  net: sched: avoid usage of tp->q in tcf_classify
  net: sched: introduce block mechanism to handle netif_keep_dst calls
  net: sched: remove classid and q fields from tcf_proto
  net: sched: keep track of offloaded filters and check tc offload
    feature
  net: sched: allow ingress and clsact qdiscs to share filter blocks
  mlxsw: spectrum_acl: Reshuffle code around
    mlxsw_sp_acl_ruleset_create/destroy
  mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
  mlxsw: spectrum_acl: Implement TC block sharing
  mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind
    ops

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 182 +++++++++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  44 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 302 +++++++++++++----
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    |  44 ++-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 +--
 include/net/pkt_cls.h                              |   4 +
 include/net/sch_generic.h                          |  26 +-
 include/uapi/linux/pkt_sched.h                     |  11 +
 net/sched/cls_api.c                                | 359 ++++++++++++++++++---
 net/sched/cls_bpf.c                                |   8 +-
 net/sched/cls_flow.c                               |   2 +-
 net/sched/cls_flower.c                             |   3 +-
 net/sched/cls_matchall.c                           |   3 +-
 net/sched/cls_route.c                              |   2 +-
 net/sched/cls_u32.c                                |  13 +-
 net/sched/sch_ingress.c                            |  89 ++++-
 16 files changed, 913 insertions(+), 220 deletions(-)

-- 
2.9.5

^ permalink raw reply

* [patch net-next v3 01/10] net: sched: introduce support for multiple filter chain pointers registration
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

So far, there was possible only to register a single filter chain
pointer to block->chain[0]. However, when the blocks will get shareable,
we need to allow multiple filter chain pointers registration.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- rebased on top of the current net-next
---
 include/net/pkt_cls.h     |   3 +
 include/net/sch_generic.h |   5 +-
 net/sched/cls_api.c       | 225 ++++++++++++++++++++++++++++++++++++++++------
 3 files changed, 206 insertions(+), 27 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 0105445..c04154f 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -29,6 +29,8 @@ struct tcf_block_ext_info {
 	enum tcf_block_binder_type binder_type;
 	tcf_chain_head_change_t *chain_head_change;
 	void *chain_head_change_priv;
+	bool shareable;
+	u32 block_index;
 };
 
 struct tcf_block_cb;
@@ -48,6 +50,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 
 static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 {
+	WARN_ON(block->refcnt != 1);
 	return block->q;
 }
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 8f8c0af..9ad1a51 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -269,8 +269,7 @@ typedef void tcf_chain_head_change_t(struct tcf_proto *tp_head, void *priv);
 
 struct tcf_chain {
 	struct tcf_proto __rcu *filter_chain;
-	tcf_chain_head_change_t *chain_head_change;
-	void *chain_head_change_priv;
+	struct list_head filter_chain_list;
 	struct list_head list;
 	struct tcf_block *block;
 	u32 index; /* chain index */
@@ -279,6 +278,8 @@ struct tcf_chain {
 
 struct tcf_block {
 	struct list_head chain_list;
+	u32 index; /* block index for shared blocks */
+	unsigned int refcnt;
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 5b9b8a6..dce4c3d 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -25,6 +25,7 @@
 #include <linux/kmod.h>
 #include <linux/err.h>
 #include <linux/slab.h>
+#include <linux/idr.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -180,6 +181,12 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
+struct tcf_filter_chain_list_item {
+	struct list_head list;
+	tcf_chain_head_change_t *chain_head_change;
+	void *chain_head_change_priv;
+};
+
 static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 					  u32 chain_index)
 {
@@ -188,6 +195,7 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 	if (!chain)
 		return NULL;
+	INIT_LIST_HEAD(&chain->filter_chain_list);
 	list_add_tail(&chain->list, &block->chain_list);
 	chain->block = block;
 	chain->index = chain_index;
@@ -195,12 +203,19 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	return chain;
 }
 
+static void tcf_chain_head_change_item(struct tcf_filter_chain_list_item *item,
+				       struct tcf_proto *tp_head)
+{
+	if (item->chain_head_change)
+		item->chain_head_change(tp_head, item->chain_head_change_priv);
+}
 static void tcf_chain_head_change(struct tcf_chain *chain,
 				  struct tcf_proto *tp_head)
 {
-	if (chain->chain_head_change)
-		chain->chain_head_change(tp_head,
-					 chain->chain_head_change_priv);
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list)
+		tcf_chain_head_change_item(item, tp_head);
 }
 
 static void tcf_chain_flush(struct tcf_chain *chain)
@@ -281,15 +296,84 @@ static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
 }
 
-int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
-		      struct tcf_block_ext_info *ei)
+static int
+tcf_chain_head_change_cb_add(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item)
+		return -ENOMEM;
+	item->chain_head_change = ei->chain_head_change;
+	item->chain_head_change_priv = ei->chain_head_change_priv;
+	if (chain->filter_chain)
+		tcf_chain_head_change_item(item, chain->filter_chain);
+	list_add(&item->list, &chain->filter_chain_list);
+	return 0;
+}
+
+static void
+tcf_chain_head_change_cb_del(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list) {
+		if ((!ei->chain_head_change && !ei->chain_head_change_priv) ||
+		    (item->chain_head_change == ei->chain_head_change &&
+		     item->chain_head_change_priv == ei->chain_head_change_priv)) {
+			tcf_chain_head_change_item(item, NULL);
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
+struct tcf_net {
+	struct idr idr;
+};
+
+static unsigned int tcf_net_id;
+
+static int tcf_block_insert(struct tcf_block *block, struct net *net,
+			    u32 block_index)
 {
-	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+	int idr_start;
+	int idr_end;
+	int index;
+
+	if (block_index >= INT_MAX)
+		return -EINVAL;
+	idr_start = block_index ? block_index : 1;
+	idr_end = block_index ? block_index + 1 : INT_MAX;
+
+	index = idr_alloc(&tn->idr, block, idr_start, idr_end, GFP_KERNEL);
+	if (index < 0)
+		return index;
+	block->index = index;
+	return 0;
+}
+
+static void tcf_block_remove(struct tcf_block *block, struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_remove(&tn->idr, block->index);
+}
+
+static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q)
+{
+	struct tcf_block *block;
 	struct tcf_chain *chain;
 	int err;
 
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
 	if (!block)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
 
@@ -299,17 +383,72 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		err = -ENOMEM;
 		goto err_chain_create;
 	}
-	WARN_ON(!ei->chain_head_change);
-	chain->chain_head_change = ei->chain_head_change;
-	chain->chain_head_change_priv = ei->chain_head_change_priv;
 	block->net = qdisc_net(q);
+	block->refcnt = 1;
+	block->net = net;
 	block->q = q;
+	return block;
+
+err_chain_create:
+	kfree(block);
+	return ERR_PTR(err);
+}
+
+static struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	return idr_find(&tn->idr, block_index);
+}
+
+static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+{
+	return list_first_entry(&block->chain_list, struct tcf_chain, list);
+}
+
+int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
+		      struct tcf_block_ext_info *ei)
+{
+	struct net *net = qdisc_net(q);
+	struct tcf_block *block = NULL;
+	bool created = false;
+	int err;
+
+	if (ei->shareable) {
+		block = tcf_block_lookup(net, ei->block_index);
+		if (block)
+			block->refcnt++;
+	}
+
+	if (!block) {
+		block = tcf_block_create(net, q);
+		if (IS_ERR(block))
+			return PTR_ERR(block);
+		created = true;
+		if (ei->shareable) {
+			err = tcf_block_insert(block, net, ei->block_index);
+			if (err)
+				goto err_block_insert;
+		}
+	}
+
+	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block), ei);
+	if (err)
+		goto err_chain_head_change_cb_add;
+
 	tcf_block_offload_bind(block, q, ei);
 	*p_block = block;
 	return 0;
 
-err_chain_create:
-	kfree(block);
+err_chain_head_change_cb_add:
+	if (created) {
+		if (ei->shareable)
+			tcf_block_remove(block, net);
+err_block_insert:
+		kfree(block);
+	} else {
+		block->refcnt--;
+	}
 	return err;
 }
 EXPORT_SYMBOL(tcf_block_get_ext);
@@ -342,24 +481,32 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 {
 	struct tcf_chain *chain, *tmp;
 
-	/* Hold a refcnt for all chains, so that they don't disappear
-	 * while we are iterating.
-	 */
-	list_for_each_entry(chain, &block->chain_list, list)
-		tcf_chain_hold(chain);
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
 
-	list_for_each_entry(chain, &block->chain_list, list)
-		tcf_chain_flush(chain);
+	if (--block->refcnt == 0) {
+		if (ei->shareable)
+			tcf_block_remove(block, block->net);
+
+		/* Hold a refcnt for all chains, except 0, so that they
+		 * don't disappear while we are iterating.
+		 */
+		list_for_each_entry(chain, &block->chain_list, list)
+			tcf_chain_hold(chain);
+
+		list_for_each_entry(chain, &block->chain_list, list)
+			tcf_chain_flush(chain);
+	}
 
 	tcf_block_offload_unbind(block, q, ei);
 
-	/* At this point, all the chains should have refcnt >= 1. */
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
-		tcf_chain_put(chain);
+	if (block->refcnt == 0) {
+		/* At this point, all the chains should have refcnt >= 1. */
+		list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+			tcf_chain_put(chain);
 
-	/* Finally, put chain 0 and allow block to be freed. */
-	chain = list_first_entry(&block->chain_list, struct tcf_chain, list);
-	tcf_chain_put(chain);
+		/* Finally, put chain 0 and allow block to be freed. */
+		tcf_chain_put(tcf_block_chain_zero(block));
+	}
 }
 EXPORT_SYMBOL(tcf_block_put_ext);
 
@@ -1246,12 +1393,40 @@ int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
 
+static __net_init int tcf_net_init(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_init(&tn->idr);
+	return 0;
+}
+
+static void __net_exit tcf_net_exit(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_destroy(&tn->idr);
+}
+
+static struct pernet_operations tcf_net_ops = {
+	.init = tcf_net_init,
+	.exit = tcf_net_exit,
+	.id   = &tcf_net_id,
+	.size = sizeof(struct tcf_net),
+};
+
 static int __init tc_filter_init(void)
 {
+	int err;
+
 	tc_filter_wq = alloc_ordered_workqueue("tc_filter_workqueue", 0);
 	if (!tc_filter_wq)
 		return -ENOMEM;
 
+	err = register_pernet_subsys(&tcf_net_ops);
+	if (err)
+		return err;
+
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_ctl_tfilter,
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 02/10] net: sched: avoid usage of tp->q in tcf_classify
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Use block index in the messages instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index dce4c3d..de7791c 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -665,8 +665,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 #ifdef CONFIG_NET_CLS_ACT
 reset:
 	if (unlikely(limit++ >= max_reclassify_loop)) {
-		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
-				       tp->q->ops->id, tp->prio & 0xffff,
+		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",
+				       tp->chain->block->index,
+				       tp->prio & 0xffff,
 				       ntohs(tp->protocol));
 		return TC_ACT_SHOT;
 	}
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 03/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Couple of classifiers call netif_keep_dst directly on q->dev. That is
not possible to do directly for shared blocke where multiple qdiscs are
owning the block. So introduce a infrastructure to keep track of the
block owners in list and use this list to implement block variant of
netif_keep_dst.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
- fix binder_type to check "egress" as well as pointed out by Daniel
---
 include/net/pkt_cls.h     |  1 +
 include/net/sch_generic.h |  2 ++
 net/sched/cls_api.c       | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 net/sched/cls_bpf.c       |  4 +--
 net/sched/cls_flow.c      |  2 +-
 net/sched/cls_route.c     |  2 +-
 6 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index c04154f..80261e7 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -40,6 +40,7 @@ bool tcf_queue_work(struct work_struct *work);
 struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
 				bool create);
 void tcf_chain_put(struct tcf_chain *chain);
+void tcf_block_netif_keep_dst(struct tcf_block *block);
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain, struct Qdisc *q);
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9ad1a51..70fa3ec 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -283,6 +283,8 @@ struct tcf_block {
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
+	struct list_head owner_list;
+	bool keep_dst;
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index de7791c..4f51a5d 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -376,6 +376,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q)
 		return ERR_PTR(-ENOMEM);
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
+	INIT_LIST_HEAD(&block->owner_list);
 
 	/* Create chain 0 by default, it has to be always present. */
 	chain = tcf_chain_create(block, 0);
@@ -406,6 +407,65 @@ static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
 	return list_first_entry(&block->chain_list, struct tcf_chain, list);
 }
 
+struct tcf_block_owner_item {
+	struct list_head list;
+	struct Qdisc *q;
+	enum tcf_block_binder_type binder_type;
+};
+
+static void
+tcf_block_owner_netif_keep_dst(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	if (block->keep_dst &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
+		netif_keep_dst(qdisc_dev(q));
+}
+
+void tcf_block_netif_keep_dst(struct tcf_block *block)
+{
+	struct tcf_block_owner_item *item;
+
+	block->keep_dst = true;
+	list_for_each_entry(item, &block->owner_list, list)
+		tcf_block_owner_netif_keep_dst(block, item->q,
+					       item->binder_type);
+}
+EXPORT_SYMBOL(tcf_block_netif_keep_dst);
+
+static int tcf_block_owner_add(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item)
+		return -ENOMEM;
+	item->q = q;
+	item->binder_type = binder_type;
+	list_add(&item->list, &block->owner_list);
+	return 0;
+}
+
+static void tcf_block_owner_del(struct tcf_block *block,
+				struct Qdisc *q,
+				enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	list_for_each_entry(item, &block->owner_list, list) {
+		if (item->q == q && item->binder_type == binder_type) {
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		      struct tcf_block_ext_info *ei)
 {
@@ -432,6 +492,12 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		}
 	}
 
+	err = tcf_block_owner_add(block, q, ei->binder_type);
+	if (err)
+		goto err_block_owner_add;
+
+	tcf_block_owner_netif_keep_dst(block, q, ei->binder_type);
+
 	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block), ei);
 	if (err)
 		goto err_chain_head_change_cb_add;
@@ -441,6 +507,8 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 	return 0;
 
 err_chain_head_change_cb_add:
+	tcf_block_owner_del(block, q, ei->binder_type);
+err_block_owner_add:
 	if (created) {
 		if (ei->shareable)
 			tcf_block_remove(block, net);
@@ -482,6 +550,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 	struct tcf_chain *chain, *tmp;
 
 	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
+	tcf_block_owner_del(block, q, ei->binder_type);
 
 	if (--block->refcnt == 0) {
 		if (ei->shareable)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 6fe798c..69d7e9a 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -409,8 +409,8 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog,
 	prog->bpf_name = name;
 	prog->filter = fp;
 
-	if (fp->dst_needed && !(tp->q->flags & TCQ_F_INGRESS))
-		netif_keep_dst(qdisc_dev(tp->q));
+	if (fp->dst_needed)
+		tcf_block_netif_keep_dst(tp->chain->block);
 
 	return 0;
 }
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 25c2a88..28cd6fb 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -526,7 +526,7 @@ static int flow_change(struct net *net, struct sk_buff *in_skb,
 
 	timer_setup(&fnew->perturb_timer, flow_perturbation, TIMER_DEFERRABLE);
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 
 	if (tb[TCA_FLOW_KEYS]) {
 		fnew->keymask = keymask;
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index ac9a5b8..a1f2b1b 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -527,7 +527,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
 		if (f->handle < f1->handle)
 			break;
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 	rcu_assign_pointer(f->next, f1);
 	rcu_assign_pointer(*fp, f);
 
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 04/10] net: sched: remove classid and q fields from tcf_proto
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Both are no longer used, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h | 2 --
 net/sched/cls_api.c       | 7 ++-----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 70fa3ec..a75bbfa 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -249,8 +249,6 @@ struct tcf_proto {
 
 	/* All the rest */
 	u32			prio;
-	u32			classid;
-	struct Qdisc		*q;
 	void			*data;
 	const struct tcf_proto_ops	*ops;
 	struct tcf_chain	*chain;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 4f51a5d..de9dbcb 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -123,8 +123,7 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 }
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
-					  u32 prio, u32 parent, struct Qdisc *q,
-					  struct tcf_chain *chain)
+					  u32 prio, struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 	int err;
@@ -158,8 +157,6 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	tp->classify = tp->ops->classify;
 	tp->protocol = protocol;
 	tp->prio = prio;
-	tp->classid = parent;
-	tp->q = q;
 	tp->chain = chain;
 
 	err = tp->ops->init(tp);
@@ -1062,7 +1059,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			prio = tcf_auto_prio(tcf_chain_tp_prev(&chain_info));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, prio, parent, q, chain);
+				      protocol, prio, chain);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 05/10] net: sched: keep track of offloaded filters and check tc offload feature
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

During block bind, we need to check tc offload feature. If it is
disabled yet still the block contains offloaded filters, forbid the
bind. Also forbid to register callback for a block that already
containes offloaded filters, as the play back is not supported now.
For keeping track of offloaded filters there is a new counter
introduced, alongside with couple of helpers called from cls_* code.
These helpers set and clear TCA_CLS_FLAGS_IN_HW flag.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- new patch
---
 include/net/sch_generic.h | 17 +++++++++++++++
 net/sched/cls_api.c       | 53 +++++++++++++++++++++++++++++++++++++----------
 net/sched/cls_bpf.c       |  4 +++-
 net/sched/cls_flower.c    |  3 ++-
 net/sched/cls_matchall.c  |  3 ++-
 net/sched/cls_u32.c       | 13 ++++++------
 6 files changed, 73 insertions(+), 20 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a75bbfa..9c08026 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -283,8 +283,25 @@ struct tcf_block {
 	struct list_head cb_list;
 	struct list_head owner_list;
 	bool keep_dst;
+	unsigned int offloadcnt;
 };
 
+static inline void tcf_block_offload_inc(struct tcf_block *block, u32 *flags)
+{
+	if (*flags & TCA_CLS_FLAGS_IN_HW)
+		return;
+	*flags |= TCA_CLS_FLAGS_IN_HW;
+	block->offloadcnt++;
+}
+
+static inline void tcf_block_offload_dec(struct tcf_block *block, u32 *flags)
+{
+	if (!(*flags & TCA_CLS_FLAGS_IN_HW))
+		return;
+	*flags &= ~TCA_CLS_FLAGS_IN_HW;
+	block->offloadcnt--;
+}
+
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 {
 	struct qdisc_skb_cb *qcb;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index de9dbcb..ac25142 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -266,31 +266,50 @@ void tcf_chain_put(struct tcf_chain *chain)
 }
 EXPORT_SYMBOL(tcf_chain_put);
 
-static void tcf_block_offload_cmd(struct tcf_block *block, struct Qdisc *q,
+static bool tcf_block_offload_in_use(struct tcf_block *block)
+{
+	return block->offloadcnt;
+}
+
+static void tcf_block_offload_cmd(struct tcf_block *block,
+				  struct net_device *dev,
 				  struct tcf_block_ext_info *ei,
 				  enum tc_block_command command)
 {
-	struct net_device *dev = q->dev_queue->dev;
 	struct tc_block_offload bo = {};
 
-	if (!dev->netdev_ops->ndo_setup_tc)
-		return;
 	bo.command = command;
 	bo.binder_type = ei->binder_type;
 	bo.block = block;
 	dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
 }
 
-static void tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
-				   struct tcf_block_ext_info *ei)
+static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
+				  struct tcf_block_ext_info *ei)
 {
-	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_BIND);
+	struct net_device *dev = q->dev_queue->dev;
+
+	if (!dev->netdev_ops->ndo_setup_tc)
+		return 0;
+
+	/* If tc offload feature is disabled and the block we try to bind
+	 * to already has some offloaded filters, forbid to bind.
+	 */
+	if (!tc_can_offload(dev) && tcf_block_offload_in_use(block))
+		return -EOPNOTSUPP;
+
+	tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_BIND);
+	return 0;
 }
 
 static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 				     struct tcf_block_ext_info *ei)
 {
-	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
+	struct net_device *dev = q->dev_queue->dev;
+
+	if (!dev->netdev_ops->ndo_setup_tc)
+		return;
+	tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_UNBIND);
 }
 
 static int
@@ -499,10 +518,15 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 	if (err)
 		goto err_chain_head_change_cb_add;
 
-	tcf_block_offload_bind(block, q, ei);
+	err = tcf_block_offload_bind(block, q, ei);
+	if (err)
+		goto err_block_offload_bind;
+
 	*p_block = block;
 	return 0;
 
+err_block_offload_bind:
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
 err_chain_head_change_cb_add:
 	tcf_block_owner_del(block, q, ei->binder_type);
 err_block_owner_add:
@@ -630,9 +654,16 @@ struct tcf_block_cb *__tcf_block_cb_register(struct tcf_block *block,
 {
 	struct tcf_block_cb *block_cb;
 
+	/* At this point, playback of previous block cb calls is not supported,
+	 * so forbid to register to block which already has some offloaded
+	 * filters present.
+	 */
+	if (tcf_block_offload_in_use(block))
+		return ERR_PTR(-EOPNOTSUPP);
+
 	block_cb = kzalloc(sizeof(*block_cb), GFP_KERNEL);
 	if (!block_cb)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	block_cb->cb = cb;
 	block_cb->cb_ident = cb_ident;
 	block_cb->cb_priv = cb_priv;
@@ -648,7 +679,7 @@ int tcf_block_cb_register(struct tcf_block *block,
 	struct tcf_block_cb *block_cb;
 
 	block_cb = __tcf_block_cb_register(block, cb, cb_ident, cb_priv);
-	return block_cb ? 0 : -ENOMEM;
+	return IS_ERR(block_cb) ? PTR_ERR(block_cb) : 0;
 }
 EXPORT_SYMBOL(tcf_block_cb_register);
 
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 69d7e9a..9cf61e7 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -170,8 +170,10 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 			cls_bpf_offload_cmd(tp, prog, TC_CLSBPF_DESTROY);
 			return err;
 		} else if (err > 0) {
-			prog->gen_flags |= TCA_CLS_FLAGS_IN_HW;
+			tcf_block_offload_inc(block, &prog->gen_flags);
 		}
+	} else {
+		tcf_block_offload_dec(block, &prog->gen_flags);
 	}
 
 	if (addorrep && skip_sw && !(prog->gen_flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 6132a73..f61df19 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -229,6 +229,7 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f)
 
 	tc_setup_cb_call(block, &f->exts, TC_SETUP_CLSFLOWER,
 			 &cls_flower, false);
+	tcf_block_offload_dec(block, &f->flags);
 }
 
 static int fl_hw_replace_filter(struct tcf_proto *tp,
@@ -256,7 +257,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
 		fl_hw_destroy_filter(tp, f);
 		return err;
 	} else if (err > 0) {
-		f->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &f->flags);
 	}
 
 	if (skip_sw && !(f->flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 66d4e00..d0e57c8 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -81,6 +81,7 @@ static void mall_destroy_hw_filter(struct tcf_proto *tp,
 	cls_mall.cookie = cookie;
 
 	tc_setup_cb_call(block, NULL, TC_SETUP_CLSMATCHALL, &cls_mall, false);
+	tcf_block_offload_dec(block, &head->flags);
 }
 
 static int mall_replace_hw_filter(struct tcf_proto *tp,
@@ -103,7 +104,7 @@ static int mall_replace_hw_filter(struct tcf_proto *tp,
 		mall_destroy_hw_filter(tp, head, cookie);
 		return err;
 	} else if (err > 0) {
-		head->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &head->flags);
 	}
 
 	if (skip_sw && !(head->flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index ac152b4..0a5a2cb 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -530,16 +530,17 @@ static int u32_replace_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
 	return 0;
 }
 
-static void u32_remove_hw_knode(struct tcf_proto *tp, u32 handle)
+static void u32_remove_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n)
 {
 	struct tcf_block *block = tp->chain->block;
 	struct tc_cls_u32_offload cls_u32 = {};
 
 	tc_cls_common_offload_init(&cls_u32.common, tp);
 	cls_u32.command = TC_CLSU32_DELETE_KNODE;
-	cls_u32.knode.handle = handle;
+	cls_u32.knode.handle = n->handle;
 
 	tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, false);
+	tcf_block_offload_dec(block, &n->flags);
 }
 
 static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
@@ -568,10 +569,10 @@ static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
 
 	err = tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, skip_sw);
 	if (err < 0) {
-		u32_remove_hw_knode(tp, n->handle);
+		u32_remove_hw_knode(tp, n);
 		return err;
 	} else if (err > 0) {
-		n->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &n->flags);
 	}
 
 	if (skip_sw && !(n->flags & TCA_CLS_FLAGS_IN_HW))
@@ -590,7 +591,7 @@ static void u32_clear_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht)
 			RCU_INIT_POINTER(ht->ht[h],
 					 rtnl_dereference(n->next));
 			tcf_unbind_filter(tp, &n->res);
-			u32_remove_hw_knode(tp, n->handle);
+			u32_remove_hw_knode(tp, n);
 			idr_remove_ext(&ht->handle_idr, n->handle);
 			if (tcf_exts_get_net(&n->exts))
 				call_rcu(&n->rcu, u32_delete_key_freepf_rcu);
@@ -683,7 +684,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last)
 		goto out;
 
 	if (TC_U32_KEY(ht->handle)) {
-		u32_remove_hw_knode(tp, ht->handle);
+		u32_remove_hw_knode(tp, (struct tc_u_knode *)ht);
 		ret = u32_delete_key(tp, (struct tc_u_knode *)ht);
 		goto out;
 	}
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the previously introduced shared filter blocks
infrastructure and allow ingress and clsact qdisc instances to share
filter blocks. The block index is coming from userspace as qdisc option.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- removed "p_" prefix from block index function args
---
 include/uapi/linux/pkt_sched.h | 11 ++++++
 net/sched/sch_ingress.c        | 89 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index af3cc2f..0feee36 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -935,4 +935,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 5ecc38f..3b27b60 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -60,6 +60,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
 	struct mini_Qdisc_pair *miniqp = priv;
 
 	mini_qdisc_pair_swap(miniqp, tp_head);
+};
+
+static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int ingress_parse_opt(struct nlattr *opt, u32 *ingress_block_index)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	*ingress_block_index = 0;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, ingress_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK])
+		*ingress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+	return 0;
 }
 
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
@@ -70,6 +93,11 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 
 	mini_qdisc_pair_init(&q->miniqp, sch, &dev->miniq_ingress);
 
+	err = ingress_parse_opt(opt, &q->block_info.block_index);
+	if (err)
+		return err;
+
+	q->block_info.shareable = true;
 	q->block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->block_info.chain_head_change = clsact_chain_head_change;
 	q->block_info.chain_head_change_priv = &q->miniqp;
@@ -94,11 +122,14 @@ static void ingress_destroy(struct Qdisc *sch)
 
 static int ingress_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
+	struct ingress_sched_data *q = qdisc_priv(sch);
 	struct nlattr *nest;
 
 	nest = nla_nest_start(skb, TCA_OPTIONS);
 	if (nest == NULL)
 		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->block->index))
+		goto nla_put_failure;
 
 	return nla_nest_end(skb, nest);
 
@@ -166,6 +197,35 @@ static struct tcf_block *clsact_tcf_block(struct Qdisc *sch, unsigned long cl)
 	}
 }
 
+static const struct nla_policy clsact_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+	[TCA_CLSACT_EGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int clsact_parse_opt(struct nlattr *opt, u32 *ingress_block_index,
+			    u32 *egress_block_index)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	*ingress_block_index = 0;
+	*egress_block_index = 0;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, clsact_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK])
+		*ingress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+	if (tb[TCA_CLSACT_EGRESS_BLOCK])
+		*egress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+	return 0;
+}
+
 static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct clsact_sched_data *q = qdisc_priv(sch);
@@ -174,6 +234,12 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 
 	mini_qdisc_pair_init(&q->miniqp_ingress, sch, &dev->miniq_ingress);
 
+	err = clsact_parse_opt(opt, &q->ingress_block_info.block_index,
+			       &q->egress_block_info.block_index);
+	if (err)
+		return err;
+
+	q->ingress_block_info.shareable = true;
 	q->ingress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->ingress_block_info.chain_head_change = clsact_chain_head_change;
 	q->ingress_block_info.chain_head_change_priv = &q->miniqp_ingress;
@@ -184,6 +250,7 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 
 	mini_qdisc_pair_init(&q->miniqp_egress, sch, &dev->miniq_egress);
 
+	q->egress_block_info.shareable = true;
 	q->egress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS;
 	q->egress_block_info.chain_head_change = clsact_chain_head_change;
 	q->egress_block_info.chain_head_change_priv = &q->miniqp_egress;
@@ -215,6 +282,26 @@ static void clsact_destroy(struct Qdisc *sch)
 	net_dec_egress_queue();
 }
 
+static int clsact_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct clsact_sched_data *q = qdisc_priv(sch);
+	struct nlattr *nest;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->ingress_block->index))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_EGRESS_BLOCK, q->egress_block->index))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, nest);
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -1;
+}
+
 static const struct Qdisc_class_ops clsact_class_ops = {
 	.leaf		=	ingress_leaf,
 	.find		=	clsact_find,
@@ -230,7 +317,7 @@ static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.priv_size	=	sizeof(struct clsact_sched_data),
 	.init		=	clsact_init,
 	.destroy	=	clsact_destroy,
-	.dump		=	ingress_dump,
+	.dump		=	clsact_dump,
 	.owner		=	THIS_MODULE,
 };
 
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 07/10] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That required move of ht insertion/removal and bind/unbind
calls into mlxsw_sp_acl_ruleset_create/destroy.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 102 ++++++++++-----------
 1 file changed, 46 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 93dcd31..ead4cb8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -118,8 +118,26 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
+static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+				     struct mlxsw_sp_acl_ruleset *ruleset,
+				     struct net_device *dev, bool ingress)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+}
+
+static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+					struct mlxsw_sp_acl_ruleset *ruleset)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+}
+
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
+			    bool ingress, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -132,6 +150,9 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
+	ruleset->ht_key.dev = dev;
+	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
 	err = rhashtable_init(&ruleset->rule_ht, &mlxsw_sp_acl_rule_ht_params);
@@ -142,68 +163,49 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		goto err_ops_ruleset_add;
 
-	return ruleset;
-
-err_ops_ruleset_add:
-	rhashtable_destroy(&ruleset->rule_ht);
-err_rhashtable_init:
-	kfree(ruleset);
-	return ERR_PTR(err);
-}
-
-static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
-					 struct mlxsw_sp_acl_ruleset *ruleset)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-
-	ops->ruleset_del(mlxsw_sp, ruleset->priv);
-	rhashtable_destroy(&ruleset->rule_ht);
-	kfree(ruleset);
-}
-
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress,
-				     u32 chain_index)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
-	int err;
-
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
-	ruleset->ht_key.chain_index = chain_index;
 	err = rhashtable_insert_fast(&acl->ruleset_ht, &ruleset->ht_node,
 				     mlxsw_sp_acl_ruleset_ht_params);
 	if (err)
-		return err;
-	if (!ruleset->ht_key.chain_index) {
+		goto err_ht_insert;
+
+	if (!chain_index) {
 		/* We only need ruleset with chain index 0, the implicit one,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
+						dev, ingress);
 		if (err)
-			goto err_ops_ruleset_bind;
+			goto err_ruleset_bind;
 	}
-	return 0;
 
-err_ops_ruleset_bind:
+	return ruleset;
+
+err_ruleset_bind:
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
-	return err;
+err_ht_insert:
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+err_ops_ruleset_add:
+	rhashtable_destroy(&ruleset->rule_ht);
+err_rhashtable_init:
+	kfree(ruleset);
+	return ERR_PTR(err);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
+					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+	rhashtable_destroy(&ruleset->rule_ht);
+	kfree(ruleset);
 }
 
 static void mlxsw_sp_acl_ruleset_ref_inc(struct mlxsw_sp_acl_ruleset *ruleset)
@@ -216,7 +218,6 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 {
 	if (--ruleset->ref_count)
 		return;
-	mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
 }
 
@@ -263,7 +264,6 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	const struct mlxsw_sp_acl_profile_ops *ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 	struct mlxsw_sp_acl_ruleset *ruleset;
-	int err;
 
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
@@ -275,18 +275,8 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	ruleset = mlxsw_sp_acl_ruleset_create(mlxsw_sp, ops);
-	if (IS_ERR(ruleset))
-		return ruleset;
-	err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset, dev,
-					ingress, chain_index);
-	if (err)
-		goto err_ruleset_bind;
-	return ruleset;
-
-err_ruleset_bind:
-	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
-	return ERR_PTR(err);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
+					   chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 08/10] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Instead, pass netdev and ingress flag to ruleset unbind op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  3 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  9 ++++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 33 +++++++++++-----------
 3 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index a0adcd8..523e64e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -477,7 +477,8 @@ struct mlxsw_sp_acl_profile_ops {
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
 			    struct net_device *dev, bool ingress);
-	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
+	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
+			       struct net_device *dev, bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index ead4cb8..7fb41a4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -128,11 +128,12 @@ static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 }
 
 static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+					struct mlxsw_sp_acl_ruleset *ruleset,
+					struct net_device *dev, bool ingress)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
 }
 
 static struct mlxsw_sp_acl_ruleset *
@@ -200,7 +201,9 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
+					    ruleset->ht_key.dev,
+					    ruleset->ht_key.ingress);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 7e8284b..50b2f9a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -154,10 +154,6 @@ struct mlxsw_sp_acl_tcam_group {
 	struct list_head region_list;
 	unsigned int region_count;
 	struct rhashtable chunk_ht;
-	struct {
-		u16 local_port;
-		bool ingress;
-	} bound;
 	struct mlxsw_sp_acl_tcam_group_ops *ops;
 	const struct mlxsw_sp_acl_tcam_pattern *patterns;
 	unsigned int patterns_count;
@@ -271,26 +267,28 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 		return -EINVAL;
 
 	mlxsw_sp_port = netdev_priv(dev);
-	group->bound.local_port = mlxsw_sp_port->local_port;
-	group->bound.ingress = ingress;
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_BIND, group->bound.local_port,
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
 			    group->id);
 	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
 
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
-			       struct mlxsw_sp_acl_tcam_group *group)
+			       struct mlxsw_sp_acl_tcam_group *group,
+			       struct net_device *dev, bool ingress)
 {
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_UNBIND, group->bound.local_port,
+	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
+		return;
+
+	mlxsw_sp_port = netdev_priv(dev);
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
 			    group->id);
 	mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
@@ -1066,11 +1064,12 @@ mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					void *ruleset_priv)
+					void *ruleset_priv,
+					struct net_device *dev, bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
 }
 
 static u16
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 09/10] mlxsw: spectrum_acl: Implement TC block sharing
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the prepared TC and in-driver ACL infrastructure and
introduce block sharing offload. For that, a new struct "block" is
introduced in spectrum_acl in order to hold a list of specific
block-port bindings.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- add tc offload feature handling
v1->v2:
- new patch
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 182 +++++++++++++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  37 +++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 237 ++++++++++++++++++---
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 ++--
 4 files changed, 401 insertions(+), 96 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 3b9c8a0..d5e893a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1747,72 +1747,186 @@ static int mlxsw_sp_setup_tc_cls_matchall(struct mlxsw_sp_port *mlxsw_sp_port,
 }
 
 static int
-mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_port *mlxsw_sp_port,
-			     struct tc_cls_flower_offload *f,
-			     bool ingress)
+mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_acl_block *acl_block,
+			     struct tc_cls_flower_offload *f)
 {
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_acl_block_mlxsw_sp(acl_block);
+
 	switch (f->command) {
 	case TC_CLSFLOWER_REPLACE:
-		return mlxsw_sp_flower_replace(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_replace(mlxsw_sp, acl_block, f);
 	case TC_CLSFLOWER_DESTROY:
-		mlxsw_sp_flower_destroy(mlxsw_sp_port, ingress, f);
+		mlxsw_sp_flower_destroy(mlxsw_sp, acl_block, f);
 		return 0;
 	case TC_CLSFLOWER_STATS:
-		return mlxsw_sp_flower_stats(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_stats(mlxsw_sp, acl_block, f);
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
-				      void *cb_priv, bool ingress)
+static int mlxsw_sp_setup_tc_block_cb_matchall(enum tc_setup_type type,
+					       void *type_data,
+					       void *cb_priv, bool ingress)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = cb_priv;
 
-	if (!tc_can_offload(mlxsw_sp_port->dev))
-		return -EOPNOTSUPP;
-
 	switch (type) {
 	case TC_SETUP_CLSMATCHALL:
+		if (!tc_can_offload(mlxsw_sp_port->dev))
+			return -EOPNOTSUPP;
+
 		return mlxsw_sp_setup_tc_cls_matchall(mlxsw_sp_port, type_data,
 						      ingress);
 	case TC_SETUP_CLSFLOWER:
-		return mlxsw_sp_setup_tc_cls_flower(mlxsw_sp_port, type_data,
-						    ingress);
+		return 0;
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb_ig(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int mlxsw_sp_setup_tc_block_cb_matchall_ig(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, true);
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, true);
 }
 
-static int mlxsw_sp_setup_tc_block_cb_eg(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int mlxsw_sp_setup_tc_block_cb_matchall_eg(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, false);
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, false);
+}
+
+static int mlxsw_sp_setup_tc_block_cb_flower(enum tc_setup_type type,
+					     void *type_data, void *cb_priv)
+{
+	struct mlxsw_sp_acl_block *acl_block = cb_priv;
+
+	switch (type) {
+	case TC_SETUP_CLSMATCHALL:
+		return 0;
+	case TC_SETUP_CLSFLOWER:
+		if (mlxsw_sp_acl_block_disabled(acl_block))
+			return -EOPNOTSUPP;
+
+		return mlxsw_sp_setup_tc_cls_flower(acl_block, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+mlxsw_sp_setup_tc_block_flower_bind(struct mlxsw_sp_port *mlxsw_sp_port,
+				    struct tcf_block *block, bool ingress)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb) {
+		acl_block = mlxsw_sp_acl_block_create(mlxsw_sp, block->net);
+		if (!acl_block)
+			return -ENOMEM;
+		block_cb = __tcf_block_cb_register(block,
+						   mlxsw_sp_setup_tc_block_cb_flower,
+						   mlxsw_sp, acl_block);
+		if (IS_ERR(block_cb)) {
+			err = PTR_ERR(block_cb);
+			goto err_cb_register;
+		}
+	} else {
+		acl_block = tcf_block_cb_priv(block_cb);
+	}
+	tcf_block_cb_incref(block_cb);
+	err = mlxsw_sp_acl_block_bind(mlxsw_sp, acl_block,
+				      mlxsw_sp_port, ingress);
+	if (err)
+		goto err_block_bind;
+
+	if (ingress)
+		mlxsw_sp_port->ing_acl_block = acl_block;
+	else
+		mlxsw_sp_port->eg_acl_block = acl_block;
+
+	return 0;
+
+err_block_bind:
+	if (!tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+err_cb_register:
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
+	return err;
+}
+
+static void
+mlxsw_sp_setup_tc_block_flower_unbind(struct mlxsw_sp_port *mlxsw_sp_port,
+				      struct tcf_block *block, bool ingress)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb)
+		return;
+
+	if (ingress)
+		mlxsw_sp_port->ing_acl_block = NULL;
+	else
+		mlxsw_sp_port->eg_acl_block = NULL;
+
+	acl_block = tcf_block_cb_priv(block_cb);
+	err = mlxsw_sp_acl_block_unbind(mlxsw_sp, acl_block,
+					mlxsw_sp_port, ingress);
+	if (!err && !tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
 }
 
 static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 				   struct tc_block_offload *f)
 {
 	tc_setup_cb_t *cb;
+	bool ingress;
+	int err;
 
-	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_ig;
-	else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_eg;
-	else
+	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_ig;
+		ingress = true;
+	} else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_eg;
+		ingress = false;
+	} else {
 		return -EOPNOTSUPP;
+	}
 
 	switch (f->command) {
 	case TC_BLOCK_BIND:
-		return tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
-					     mlxsw_sp_port);
+		err = tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
+					    mlxsw_sp_port);
+		if (err)
+			return err;
+		err = mlxsw_sp_setup_tc_block_flower_bind(mlxsw_sp_port,
+							  f->block, ingress);
+		if (err) {
+			tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
+			return err;
+		}
+		return 0;
 	case TC_BLOCK_UNBIND:
+		mlxsw_sp_setup_tc_block_flower_unbind(mlxsw_sp_port,
+						      f->block, ingress);
 		tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
 		return 0;
 	default:
@@ -1840,10 +1954,18 @@ static int mlxsw_sp_feature_hw_tc(struct net_device *dev, bool enable)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
 
-	if (!enable && (mlxsw_sp_port->acl_rule_count ||
-			!list_empty(&mlxsw_sp_port->mall_tc_list))) {
-		netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
-		return -EINVAL;
+	if (!enable) {
+		if (mlxsw_sp_acl_block_rule_count(mlxsw_sp_port->ing_acl_block) ||
+		    mlxsw_sp_acl_block_rule_count(mlxsw_sp_port->eg_acl_block) ||
+		    !list_empty(&mlxsw_sp_port->mall_tc_list)) {
+			netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
+			return -EINVAL;
+		}
+		mlxsw_sp_acl_block_disable_inc(mlxsw_sp_port->ing_acl_block);
+		mlxsw_sp_acl_block_disable_inc(mlxsw_sp_port->eg_acl_block);
+	} else {
+		mlxsw_sp_acl_block_disable_dec(mlxsw_sp_port->ing_acl_block);
+		mlxsw_sp_acl_block_disable_dec(mlxsw_sp_port->eg_acl_block);
 	}
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 523e64e..ab6ada7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -270,7 +270,8 @@ struct mlxsw_sp_port {
 	struct mlxsw_sp_port_sample *sample;
 	struct list_head vlans_list;
 	struct mlxsw_sp_qdisc root_qdisc;
-	unsigned acl_rule_count;
+	struct mlxsw_sp_acl_block *ing_acl_block;
+	struct mlxsw_sp_acl_block *eg_acl_block;
 };
 
 static inline bool
@@ -498,17 +499,34 @@ struct mlxsw_sp_acl_ops {
 				       enum mlxsw_sp_acl_profile profile);
 };
 
+struct mlxsw_sp_acl_block;
 struct mlxsw_sp_acl_ruleset;
 
 /* spectrum_acl.c */
 struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl);
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block);
+unsigned int mlxsw_sp_acl_block_rule_count(struct mlxsw_sp_acl_block *block);
+void mlxsw_sp_acl_block_disable_inc(struct mlxsw_sp_acl_block *block);
+void mlxsw_sp_acl_block_disable_dec(struct mlxsw_sp_acl_block *block);
+bool mlxsw_sp_acl_block_disabled(struct mlxsw_sp_acl_block *block);
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net);
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block);
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile);
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
 			      struct mlxsw_sp_acl_ruleset *ruleset);
@@ -575,11 +593,14 @@ void mlxsw_sp_acl_fini(struct mlxsw_sp *mlxsw_sp);
 extern const struct mlxsw_sp_acl_ops mlxsw_sp_acl_tcam_ops;
 
 /* spectrum_flower.c */
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f);
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f);
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f);
 
 /* spectrum_qdisc.c */
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 7fb41a4..f98bca9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -39,6 +39,7 @@
 #include <linux/string.h>
 #include <linux/rhashtable.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/tc_act/tc_vlan.h>
 
 #include "reg.h"
@@ -70,9 +71,23 @@ struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl)
 	return acl->afk;
 }
 
-struct mlxsw_sp_acl_ruleset_ht_key {
-	struct net_device *dev; /* dev this ruleset is bound to */
+struct mlxsw_sp_acl_block_binding {
+	struct list_head list;
+	struct net_device *dev;
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	bool ingress;
+};
+
+struct mlxsw_sp_acl_block {
+	struct list_head binding_list;
+	struct mlxsw_sp_acl_ruleset *ruleset_zero;
+	struct mlxsw_sp *mlxsw_sp;
+	unsigned int rule_count;
+	unsigned int disable_count;
+};
+
+struct mlxsw_sp_acl_ruleset_ht_key {
+	struct mlxsw_sp_acl_block *block;
 	u32 chain_index;
 	const struct mlxsw_sp_acl_profile_ops *ops;
 };
@@ -118,27 +133,185 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress)
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block)
+{
+	return block->mlxsw_sp;
+}
+
+unsigned int mlxsw_sp_acl_block_rule_count(struct mlxsw_sp_acl_block *block)
+{
+	return block ? block->rule_count : 0;
+}
+
+void mlxsw_sp_acl_block_disable_inc(struct mlxsw_sp_acl_block *block)
+{
+	if (block)
+		block->disable_count++;
+}
+
+void mlxsw_sp_acl_block_disable_dec(struct mlxsw_sp_acl_block *block)
+{
+	if (block)
+		block->disable_count--;
+}
+
+bool mlxsw_sp_acl_block_disabled(struct mlxsw_sp_acl_block *block)
 {
+	return block->disable_count;
+}
+
+static int
+mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_acl_block_binding *binding)
+{
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
+				 binding->mlxsw_sp_port->dev, binding->ingress);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset,
-					struct net_device *dev, bool ingress)
+static void
+mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_acl_block_binding *binding)
 {
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
+			    binding->mlxsw_sp_port->dev, binding->ingress);
+}
+
+static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
+{
+	return block->ruleset_zero;
+}
+
+static int
+mlxsw_sp_acl_ruleset_block_bind(struct mlxsw_sp *mlxsw_sp,
+				struct mlxsw_sp_acl_ruleset *ruleset,
+				struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	block->ruleset_zero = ruleset;
+	list_for_each_entry(binding, &block->binding_list, list) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto rollback;
+	}
+	return 0;
+
+rollback:
+	list_for_each_entry_continue_reverse(binding, &block->binding_list,
+					     list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+
+	return err;
+}
+
+static void
+mlxsw_sp_acl_ruleset_block_unbind(struct mlxsw_sp *mlxsw_sp,
+				  struct mlxsw_sp_acl_ruleset *ruleset,
+				  struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+}
+
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net)
+{
+	struct mlxsw_sp_acl_block *block;
+
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
+	if (!block)
+		return NULL;
+	INIT_LIST_HEAD(&block->binding_list);
+	block->mlxsw_sp = mlxsw_sp;
+	return block;
+}
+
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block)
+{
+	WARN_ON(!list_empty(&block->binding_list));
+	kfree(block);
+}
+
+static struct mlxsw_sp_acl_block_binding *
+mlxsw_sp_acl_block_lookup(struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_port *mlxsw_sp_port, bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		if (binding->mlxsw_sp_port == mlxsw_sp_port &&
+		    binding->ingress == ingress)
+			return binding;
+	return NULL;
+}
+
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	if (WARN_ON(mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress)))
+		return -EEXIST;
+
+	binding = kzalloc(sizeof(*binding), GFP_KERNEL);
+	if (!binding)
+		return -ENOMEM;
+	binding->mlxsw_sp_port = mlxsw_sp_port;
+	binding->ingress = ingress;
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block)) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto err_ruleset_bind;
+	}
+
+	list_add(&binding->list, &block->binding_list);
+	return 0;
+
+err_ruleset_bind:
+	kfree(binding);
+	return err;
+}
+
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	binding = mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress);
+	if (!binding)
+		return -ENOENT;
+
+	list_del(&binding->list);
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block))
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+
+	kfree(binding);
+	return 0;
 }
 
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -151,8 +324,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.block = block;
 	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
@@ -174,8 +346,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
-						dev, ingress);
+		err = mlxsw_sp_acl_ruleset_block_bind(mlxsw_sp, ruleset, block);
 		if (err)
 			goto err_ruleset_bind;
 	}
@@ -198,12 +369,12 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+	struct mlxsw_sp_acl_block *block = ruleset->ht_key.block;
+	u32 chain_index = ruleset->ht_key.chain_index;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
-	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
-					    ruleset->ht_key.dev,
-					    ruleset->ht_key.ingress);
+	if (!chain_index)
+		mlxsw_sp_acl_ruleset_block_unbind(mlxsw_sp, ruleset, block);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
@@ -225,15 +396,14 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 }
 
 static struct mlxsw_sp_acl_ruleset *
-__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
-			      bool ingress, u32 chain_index,
+__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl,
+			      struct mlxsw_sp_acl_block *block, u32 chain_index,
 			      const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl_ruleset_ht_key ht_key;
 
 	memset(&ht_key, 0, sizeof(ht_key));
-	ht_key.dev = dev;
-	ht_key.ingress = ingress;
+	ht_key.block = block;
 	ht_key.chain_index = chain_index;
 	ht_key.ops = ops;
 	return rhashtable_lookup_fast(&acl->ruleset_ht, &ht_key,
@@ -241,8 +411,8 @@ __mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -252,16 +422,15 @@ mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
 		return ERR_PTR(-EINVAL);
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (!ruleset)
 		return ERR_PTR(-ENOENT);
 	return ruleset;
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -272,14 +441,12 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ops)
 		return ERR_PTR(-EINVAL);
 
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (ruleset) {
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
-					   chain_index, ops);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, block, chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
@@ -528,6 +695,7 @@ int mlxsw_sp_acl_rule_add(struct mlxsw_sp *mlxsw_sp,
 		goto err_rhashtable_insert;
 
 	list_add_tail(&rule->list, &mlxsw_sp->acl->rules);
+	ruleset->ht_key.block->rule_count++;
 	return 0;
 
 err_rhashtable_insert:
@@ -541,6 +709,7 @@ void mlxsw_sp_acl_rule_del(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl_ruleset *ruleset = rule->ruleset;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
+	ruleset->ht_key.block->rule_count--;
 	list_del(&rule->list);
 	rhashtable_remove_fast(&ruleset->rule_ht, &rule->ht_node,
 			       mlxsw_sp_acl_rule_ht_params);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index 42e8a36..cf7b97d4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -35,6 +35,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/flow_dissector.h>
 #include <net/pkt_cls.h>
 #include <net/tc_act/tc_gact.h>
@@ -45,7 +46,7 @@
 #include "core_acl_flex_keys.h"
 
 static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
-					 struct net_device *dev, bool ingress,
+					 struct mlxsw_sp_acl_block *block,
 					 struct mlxsw_sp_acl_rule_info *rulei,
 					 struct tcf_exts *exts)
 {
@@ -80,8 +81,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			struct mlxsw_sp_acl_ruleset *ruleset;
 			u16 group_id;
 
-			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, dev,
-							      ingress,
+			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, block,
 							      chain_index,
 							      MLXSW_SP_ACL_PROFILE_FLOWER);
 			if (IS_ERR(ruleset))
@@ -104,9 +104,6 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 				return err;
 
 			out_dev = tcf_mirred_dev(a);
-			if (out_dev == dev)
-				out_dev = NULL;
-
 			err = mlxsw_sp_acl_rulei_act_fwd(mlxsw_sp, rulei,
 							 out_dev);
 			if (err)
@@ -265,7 +262,7 @@ static int mlxsw_sp_flower_parse_ip(struct mlxsw_sp *mlxsw_sp,
 }
 
 static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
-				 struct net_device *dev, bool ingress,
+				 struct mlxsw_sp_acl_block *block,
 				 struct mlxsw_sp_acl_rule_info *rulei,
 				 struct tc_cls_flower_offload *f)
 {
@@ -383,21 +380,19 @@ static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		return err;
 
-	return mlxsw_sp_flower_parse_actions(mlxsw_sp, dev, ingress,
-					     rulei, f->exts);
+	return mlxsw_sp_flower_parse_actions(mlxsw_sp, block, rulei, f->exts);
 }
 
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
-	struct net_device *dev = mlxsw_sp_port->dev;
 	struct mlxsw_sp_acl_rule_info *rulei;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, dev, ingress,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
@@ -410,7 +405,7 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	rulei = mlxsw_sp_acl_rule_rulei(rule);
-	err = mlxsw_sp_flower_parse(mlxsw_sp, dev, ingress, rulei, f);
+	err = mlxsw_sp_flower_parse(mlxsw_sp, block, rulei, f);
 	if (err)
 		goto err_flower_parse;
 
@@ -423,7 +418,6 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 		goto err_rule_add;
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
-	mlxsw_sp_port->acl_rule_count++;
 	return 0;
 
 err_rule_add:
@@ -435,15 +429,15 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	return err;
 }
 
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
 		return;
@@ -455,13 +449,12 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
-	mlxsw_sp_port->acl_rule_count--;
 }
 
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	u64 packets;
@@ -469,8 +462,8 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	u64 bytes;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (WARN_ON(IS_ERR(ruleset)))
 		return -EINVAL;
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v3 10/10] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops
From: Jiri Pirko @ 2017-12-13 15:10 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

No need to convert from mlxsw_sp_port to net_device and back again.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  6 +++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  4 ++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 27 +++++++++-------------
 3 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index ab6ada7..525552d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -477,9 +477,11 @@ struct mlxsw_sp_acl_profile_ops {
 			   void *priv, void *ruleset_priv);
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			    struct net_device *dev, bool ingress);
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
 	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			       struct net_device *dev, bool ingress);
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index f98bca9..9439bfa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -169,7 +169,7 @@ mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
-				 binding->mlxsw_sp_port->dev, binding->ingress);
+				 binding->mlxsw_sp_port, binding->ingress);
 }
 
 static void
@@ -181,7 +181,7 @@ mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
-			    binding->mlxsw_sp_port->dev, binding->ingress);
+			    binding->mlxsw_sp_port, binding->ingress);
 }
 
 static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 50b2f9a..c6e180c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -258,15 +258,11 @@ static void mlxsw_sp_acl_tcam_group_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 			     struct mlxsw_sp_acl_tcam_group *group,
-			     struct net_device *dev, bool ingress)
+			     struct mlxsw_sp_port *mlxsw_sp_port,
+			     bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (!mlxsw_sp_port_dev_check(dev))
-		return -EINVAL;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
@@ -277,15 +273,11 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
 			       struct mlxsw_sp_acl_tcam_group *group,
-			       struct net_device *dev, bool ingress)
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
-		return;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
@@ -1054,22 +1046,25 @@ mlxsw_sp_acl_tcam_flower_ruleset_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 				      void *ruleset_priv,
-				      struct net_device *dev, bool ingress)
+				      struct mlxsw_sp_port *mlxsw_sp_port,
+				      bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
 	return mlxsw_sp_acl_tcam_group_bind(mlxsw_sp, &ruleset->group,
-					    dev, ingress);
+					    mlxsw_sp_port, ingress);
 }
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 					void *ruleset_priv,
-					struct net_device *dev, bool ingress)
+					struct mlxsw_sp_port *mlxsw_sp_port,
+					bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group,
+				       mlxsw_sp_port, ingress);
 }
 
 static u16
-- 
2.9.5

^ permalink raw reply related

* [patch iproute2] tc: implement filter block sharing to ingress and clsact qdiscs
From: Jiri Pirko @ 2017-12-13 15:13 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel
In-Reply-To: <20171213151038.29665-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_sched.h | 11 +++++++++
 tc/q_clsact.c                  | 56 ++++++++++++++++++++++++++++++++++++++----
 tc/q_ingress.c                 | 32 +++++++++++++++++++++---
 3 files changed, 91 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index af3cc2f..0feee36 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -935,4 +935,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/tc/q_clsact.c b/tc/q_clsact.c
index 341f653..06d67db 100644
--- a/tc/q_clsact.c
+++ b/tc/q_clsact.c
@@ -7,23 +7,69 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... clsact\n");
+	fprintf(stderr, "Usage: ... clsact [ingress_block BLOCK_INDEX] [egress_block BLOCK_INDEX]\n");
 }
 
 static int clsact_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			    struct nlmsghdr *n, const char *dev)
 {
-	if (argc > 0) {
-		fprintf(stderr, "What is \"%s\"?\n", *argv);
-		explain();
-		return -1;
+	struct rtattr *tail;
+	unsigned int ingress_block = 0;
+	unsigned int egress_block = 0;
+
+	while (argc > 0) {
+		if (strcmp(*argv, "ingress_block") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&ingress_block, *argv, 0)) {
+				fprintf(stderr, "Illegal \"ingress_block\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "egress_block") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&egress_block, *argv, 0)) {
+				fprintf(stderr, "Illegal \"egress_block\"\n");
+				return -1;
+			}
+		} else {
+			fprintf(stderr, "What is \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		NEXT_ARG_FWD();
 	}
 
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (ingress_block)
+		addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, ingress_block);
+	if (egress_block)
+		addattr32(n, 1024, TCA_CLSACT_EGRESS_BLOCK, egress_block);
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
 
 static int clsact_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 {
+	struct rtattr *tb[TCA_CLSACT_MAX + 1];
+	unsigned int block;
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "ingress_block",
+			   "ingress_block %u ", block);
+	}
+	if (tb[TCA_CLSACT_EGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_EGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "egress_block",
+			   "egress_block %u ", block);
+	}
 	return 0;
 }
 
diff --git a/tc/q_ingress.c b/tc/q_ingress.c
index 1e42229..6899c4d 100644
--- a/tc/q_ingress.c
+++ b/tc/q_ingress.c
@@ -17,30 +17,56 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... ingress\n");
+	fprintf(stderr, "Usage: ... ingress [block BLOCK_INDEX]\n");
 }
 
 static int ingress_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			     struct nlmsghdr *n, const char *dev)
 {
+	struct rtattr *tail;
+	unsigned int block;
+
 	while (argc > 0) {
 		if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
-			argc--; argv++;
+		} else if (strcmp(*argv, "block") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&block, *argv, 0)) {
+				fprintf(stderr, "Illegal \"block\"\n");
+				return -1;
+			}
 		} else {
 			fprintf(stderr, "What is \"%s\"?\n", *argv);
 			explain();
 			return -1;
 		}
+		NEXT_ARG_FWD();
 	}
 
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (block)
+		addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, block);
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
 
 static int ingress_print_opt(struct qdisc_util *qu, FILE *f,
 			     struct rtattr *opt)
 {
-	fprintf(f, "---------------- ");
+	struct rtattr *tb[TCA_CLSACT_MAX + 1];
+	unsigned int block;
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "block", "block %u ", block);
+	}
 	return 0;
 }
 
-- 
2.9.5

^ permalink raw reply related

* [PATCH v4 net-next 1/4] libbpf: add ability to guess program type based on section name
From: Roman Gushchin @ 2017-12-13 15:18 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, kernel-team, ast, daniel, jakub.kicinski, kafai,
	guro, Quentin Monnet, David Ahern
In-Reply-To: <20171213151854.21960-1-guro@fb.com>

The bpf_prog_load() function will guess program type if it's not
specified explicitly. This functionality will be used to implement
loading of different programs without asking a user to specify
the program type. In first order it will be used by bpftool.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: David Ahern <dsahern@gmail.com>
---
 tools/lib/bpf/libbpf.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 5aa45f89da93..205b7822fa0a 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1721,6 +1721,45 @@ BPF_PROG_TYPE_FNS(tracepoint, BPF_PROG_TYPE_TRACEPOINT);
 BPF_PROG_TYPE_FNS(xdp, BPF_PROG_TYPE_XDP);
 BPF_PROG_TYPE_FNS(perf_event, BPF_PROG_TYPE_PERF_EVENT);
 
+#define BPF_PROG_SEC(string, type) { string, sizeof(string), type }
+static const struct {
+	const char *sec;
+	size_t len;
+	enum bpf_prog_type prog_type;
+} section_names[] = {
+	BPF_PROG_SEC("socket",		BPF_PROG_TYPE_SOCKET_FILTER),
+	BPF_PROG_SEC("kprobe/",		BPF_PROG_TYPE_KPROBE),
+	BPF_PROG_SEC("kretprobe/",	BPF_PROG_TYPE_KPROBE),
+	BPF_PROG_SEC("tracepoint/",	BPF_PROG_TYPE_TRACEPOINT),
+	BPF_PROG_SEC("xdp",		BPF_PROG_TYPE_XDP),
+	BPF_PROG_SEC("perf_event",	BPF_PROG_TYPE_PERF_EVENT),
+	BPF_PROG_SEC("cgroup/skb",	BPF_PROG_TYPE_CGROUP_SKB),
+	BPF_PROG_SEC("cgroup/sock",	BPF_PROG_TYPE_CGROUP_SOCK),
+	BPF_PROG_SEC("cgroup/dev",	BPF_PROG_TYPE_CGROUP_DEVICE),
+	BPF_PROG_SEC("sockops",		BPF_PROG_TYPE_SOCK_OPS),
+	BPF_PROG_SEC("sk_skb",		BPF_PROG_TYPE_SK_SKB),
+};
+#undef BPF_PROG_SEC
+
+static enum bpf_prog_type bpf_program__guess_type(struct bpf_program *prog)
+{
+	int i;
+
+	if (!prog->section_name)
+		goto err;
+
+	for (i = 0; i < ARRAY_SIZE(section_names); i++)
+		if (strncmp(prog->section_name, section_names[i].sec,
+			    section_names[i].len) == 0)
+			return section_names[i].prog_type;
+
+err:
+	pr_warning("failed to guess program type based on section name %s\n",
+		   prog->section_name);
+
+	return BPF_PROG_TYPE_UNSPEC;
+}
+
 int bpf_map__fd(struct bpf_map *map)
 {
 	return map ? map->fd : -EINVAL;
@@ -1832,6 +1871,18 @@ int bpf_prog_load(const char *file, enum bpf_prog_type type,
 		return -ENOENT;
 	}
 
+	/*
+	 * If type is not specified, try to guess it based on
+	 * section name.
+	 */
+	if (type == BPF_PROG_TYPE_UNSPEC) {
+		type = bpf_program__guess_type(prog);
+		if (type == BPF_PROG_TYPE_UNSPEC) {
+			bpf_object__close(obj);
+			return -EINVAL;
+		}
+	}
+
 	bpf_program__set_type(prog, type);
 	err = bpf_object__load(obj);
 	if (err) {
-- 
2.14.3

^ permalink raw reply related

* [PATCH v4 net-next 0/4] bpftool: cgroup bpf operations
From: Roman Gushchin @ 2017-12-13 15:18 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, kernel-team, ast, daniel, jakub.kicinski, kafai,
	guro, Quentin Monnet, David Ahern

This patchset adds basic cgroup bpf operations to bpftool.

Right now there is no convenient way to perform these operations.
The /samples/bpf/load_sock_ops.c implements attach/detacg operations,
but only for BPF_CGROUP_SOCK_OPS programs. Bps (part of bcc) implements
bpf introspection, but lacks any cgroup-related specific.

I find having a tool to perform these basic operations in the kernel tree
very useful, as it can be used in the corresponding bpf documentation
without creating additional dependencies. And bpftool seems to be
a right tool to extend with such functionality.

v4:
  - ATTACH_FLAGS and ATTACH_TYPE are listed and described in docs and usage
  - ATTACH_FLAG names converted to "multi" and "override"
  - do_attach() recognizes ATTACH_FLAG abbreviations, e.g "mul"
  - Local variables sorted ("reverse Christmas tree")
  - unknown attach flags value will be never truncated

v3:
  - SRC replaced with OBJ in prog load docs
  - Output unknown attach type in hex
  - License header in SPDX format
  - Minor style fixes (e.g. variable reordering)

v2:
  - Added prog load operations
  - All cgroup operations are looking like bpftool cgroup <command>
  - All cgroup-related stuff is moved to a separate file
  - Added support for attach flags
  - Added support for attaching/detaching programs by id, pinned name, etc
  - Changed cgroup detach arguments order
  - Added empty json output for succesful programs
  - Style fixed: includes order, strncmp and macroses, error handling
  - Added man pages

v1:
  https://lwn.net/Articles/740366/

Roman Gushchin (4):
  libbpf: add ability to guess program type based on section name
  libbpf: prefer global symbols as bpf program name source
  bpftool: implement prog load command
  bpftool: implement cgroup bpf operations

 tools/bpf/bpftool/Documentation/bpftool-cgroup.rst | 118 ++++++++
 tools/bpf/bpftool/Documentation/bpftool-map.rst    |   2 +-
 tools/bpf/bpftool/Documentation/bpftool-prog.rst   |  12 +-
 tools/bpf/bpftool/Documentation/bpftool.rst        |   8 +-
 tools/bpf/bpftool/cgroup.c                         | 307 +++++++++++++++++++++
 tools/bpf/bpftool/common.c                         |  71 ++---
 tools/bpf/bpftool/main.c                           |   3 +-
 tools/bpf/bpftool/main.h                           |   2 +
 tools/bpf/bpftool/prog.c                           |  29 +-
 tools/lib/bpf/libbpf.c                             |  53 ++++
 10 files changed, 566 insertions(+), 39 deletions(-)
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
 create mode 100644 tools/bpf/bpftool/cgroup.c

-- 
2.14.3

^ permalink raw reply

* [PATCH v4 net-next 4/4] bpftool: implement cgroup bpf operations
From: Roman Gushchin @ 2017-12-13 15:18 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, kernel-team, ast, daniel, jakub.kicinski, kafai,
	guro, Quentin Monnet, David Ahern
In-Reply-To: <20171213151854.21960-1-guro@fb.com>

This patch adds basic cgroup bpf operations to bpftool:
cgroup list, attach and detach commands.

Usage is described in the corresponding man pages,
and examples are provided.

Syntax:
$ bpftool cgroup list CGROUP
$ bpftool cgroup attach CGROUP ATTACH_TYPE PROG [ATTACH_FLAGS]
$ bpftool cgroup detach CGROUP ATTACH_TYPE PROG

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
---
 tools/bpf/bpftool/Documentation/bpftool-cgroup.rst | 118 ++++++++
 tools/bpf/bpftool/Documentation/bpftool-map.rst    |   2 +-
 tools/bpf/bpftool/Documentation/bpftool-prog.rst   |   2 +-
 tools/bpf/bpftool/Documentation/bpftool.rst        |   6 +-
 tools/bpf/bpftool/cgroup.c                         | 307 +++++++++++++++++++++
 tools/bpf/bpftool/main.c                           |   3 +-
 tools/bpf/bpftool/main.h                           |   1 +
 7 files changed, 434 insertions(+), 5 deletions(-)
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
 create mode 100644 tools/bpf/bpftool/cgroup.c

diff --git a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
new file mode 100644
index 000000000000..45c71b1f682b
--- /dev/null
+++ b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
@@ -0,0 +1,118 @@
+================
+bpftool-cgroup
+================
+-------------------------------------------------------------------------------
+tool for inspection and simple manipulation of eBPF progs
+-------------------------------------------------------------------------------
+
+:Manual section: 8
+
+SYNOPSIS
+========
+
+	**bpftool** [*OPTIONS*] **cgroup** *COMMAND*
+
+	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
+
+	*COMMANDS* :=
+	{ **list** | **attach** | **detach** | **help** }
+
+MAP COMMANDS
+=============
+
+|	**bpftool** **cgroup list** *CGROUP*
+|	**bpftool** **cgroup attach** *CGROUP* *ATTACH_TYPE* *PROG* [*ATTACH_FLAGS*]
+|	**bpftool** **cgroup detach** *CGROUP* *ATTACH_TYPE* *PROG*
+|	**bpftool** **cgroup help**
+|
+|	*PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* }
+|	*ATTACH_TYPE* := { *ingress* | *egress* | *sock_create* | *sock_ops* | *device* }
+|	*ATTACH_FLAGS* := { *multi* | *override* }
+
+DESCRIPTION
+===========
+	**bpftool cgroup list** *CGROUP*
+		  List all programs attached to the cgroup *CGROUP*.
+
+		  Output will start with program ID followed by attach type,
+		  attach flags and program name.
+
+	**bpftool cgroup attach** *CGROUP* *ATTACH_TYPE* *PROG* [*ATTACH_FLAGS*]
+		  Attach program *PROG* to the cgroup *CGROUP* with attach type
+		  *ATTACH_TYPE* and optional *ATTACH_FLAGS*.
+
+		  *ATTACH_FLAGS* can be one of: **override** if a sub-cgroup installs
+		  some bpf program, the program in this cgroup yields to sub-cgroup
+		  program; **multi** if a sub-cgroup installs some bpf program,
+		  that cgroup program gets run in addition to the program in this
+		  cgroup.
+
+		  Only one program is allowed to be attached to a cgroup with
+		  no attach flags or the **override** flag. Attaching another
+		  program will release old program and attach the new one.
+
+		  Multiple programs are allowed to be attached to a cgroup with
+		  **multi**. They are executed in FIFO order (those that were
+		  attached first, run first).
+
+		  Non-default *ATTACH_FLAGS* are supported by kernel version 4.14
+		  and later.
+
+		  *ATTACH_TYPE* can be on of:
+		  **ingress** ingress path of the inet socket (since 4.10);
+		  **egress** egress path of the inet socket (since 4.10);
+		  **sock_create** opening of an inet socket (since 4.10);
+		  **sock_ops** various socket operations (since 4.12);
+		  **device** device access (since 4.15).
+
+	**bpftool cgroup detach** *CGROUP* *ATTACH_TYPE* *PROG*
+		  Detach *PROG* from the cgroup *CGROUP* and attach type
+		  *ATTACH_TYPE*.
+
+	**bpftool prog help**
+		  Print short help message.
+
+OPTIONS
+=======
+	-h, --help
+		  Print short generic help message (similar to **bpftool help**).
+
+	-v, --version
+		  Print version number (similar to **bpftool version**).
+
+	-j, --json
+		  Generate JSON output. For commands that cannot produce JSON, this
+		  option has no effect.
+
+	-p, --pretty
+		  Generate human-readable JSON output. Implies **-j**.
+
+	-f, --bpffs
+		  Show file names of pinned programs.
+
+EXAMPLES
+========
+|
+| **# mount -t bpf none /sys/fs/bpf/**
+| **# mkdir /sys/fs/cgroup/test.slice**
+| **# bpftool prog load ./device_cgroup.o /sys/fs/bpf/prog**
+| **# bpftool cgroup attach /sys/fs/cgroup/test.slice/ device id 1 allow_multi**
+
+**# bpftool cgroup list /sys/fs/cgroup/test.slice/**
+
+::
+
+    ID       AttachType      AttachFlags     Name
+    1        device          allow_multi     bpf_prog1
+
+|
+| **# bpftool cgroup detach /sys/fs/cgroup/test.slice/ device id 1**
+| **# bpftool cgroup list /sys/fs/cgroup/test.slice/**
+
+::
+
+    ID       AttachType      AttachFlags     Name
+
+SEE ALSO
+========
+	**bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8)
diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index 9f51a268eb06..421cabc417e6 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -128,4 +128,4 @@ EXAMPLES
 
 SEE ALSO
 ========
-	**bpftool**\ (8), **bpftool-prog**\ (8)
+	**bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8)
diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
index ffdb20e8280f..81c97c0e9b67 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
@@ -155,4 +155,4 @@ EXAMPLES
 
 SEE ALSO
 ========
-	**bpftool**\ (8), **bpftool-map**\ (8)
+	**bpftool**\ (8), **bpftool-map**\ (8), **bpftool-cgroup**\ (8)
diff --git a/tools/bpf/bpftool/Documentation/bpftool.rst b/tools/bpf/bpftool/Documentation/bpftool.rst
index f547a0c0aa34..6732a5a617e4 100644
--- a/tools/bpf/bpftool/Documentation/bpftool.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool.rst
@@ -16,7 +16,7 @@ SYNOPSIS
 
 	**bpftool** **version**
 
-	*OBJECT* := { **map** | **program** }
+	*OBJECT* := { **map** | **program** | **cgroup** }
 
 	*OPTIONS* := { { **-V** | **--version** } | { **-h** | **--help** }
 	| { **-j** | **--json** } [{ **-p** | **--pretty** }] }
@@ -28,6 +28,8 @@ SYNOPSIS
 	*PROG-COMMANDS* := { **show** | **dump jited** | **dump xlated** | **pin**
 	| **load** | **help** }
 
+	*CGROUP-COMMANDS* := { **list** | **attach** | **detach** | **help** }
+
 DESCRIPTION
 ===========
 	*bpftool* allows for inspection and simple modification of BPF objects
@@ -53,4 +55,4 @@ OPTIONS
 
 SEE ALSO
 ========
-	**bpftool-map**\ (8), **bpftool-prog**\ (8)
+	**bpftool-map**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8)
diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
new file mode 100644
index 000000000000..34ca303d72bc
--- /dev/null
+++ b/tools/bpf/bpftool/cgroup.c
@@ -0,0 +1,307 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright (C) 2017 Facebook
+// Author: Roman Gushchin <guro@fb.com>
+
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <bpf.h>
+
+#include "main.h"
+
+#define HELP_SPEC_ATTACH_FLAGS						\
+	"ATTACH_FLAGS := { multi | override }"
+
+#define HELP_SPEC_ATTACH_TYPES						\
+	"ATTACH_TYPE := { ingress | egress | sock_create | sock_ops | device }"
+
+static const char * const attach_type_strings[] = {
+	[BPF_CGROUP_INET_INGRESS] = "ingress",
+	[BPF_CGROUP_INET_EGRESS] = "egress",
+	[BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
+	[BPF_CGROUP_SOCK_OPS] = "sock_ops",
+	[BPF_CGROUP_DEVICE] = "device",
+	[__MAX_BPF_ATTACH_TYPE] = NULL,
+};
+
+static enum bpf_attach_type parse_attach_type(const char *str)
+{
+	enum bpf_attach_type type;
+
+	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
+		if (attach_type_strings[type] &&
+		    is_prefix(str, attach_type_strings[type]))
+			return type;
+	}
+
+	return __MAX_BPF_ATTACH_TYPE;
+}
+
+static int list_bpf_prog(int id, const char *attach_type_str,
+			 const char *attach_flags_str)
+{
+	struct bpf_prog_info info = {};
+	__u32 info_len = sizeof(info);
+	int prog_fd;
+
+	prog_fd = bpf_prog_get_fd_by_id(id);
+	if (prog_fd < 0)
+		return -1;
+
+	if (bpf_obj_get_info_by_fd(prog_fd, &info, &info_len)) {
+		close(prog_fd);
+		return -1;
+	}
+
+	if (json_output) {
+		jsonw_start_object(json_wtr);
+		jsonw_uint_field(json_wtr, "id", info.id);
+		jsonw_string_field(json_wtr, "attach_type",
+				   attach_type_str);
+		jsonw_string_field(json_wtr, "attach_flags",
+				   attach_flags_str);
+		jsonw_string_field(json_wtr, "name", info.name);
+		jsonw_end_object(json_wtr);
+	} else {
+		printf("%-8u %-15s %-15s %-15s\n", info.id,
+		       attach_type_str,
+		       attach_flags_str,
+		       info.name);
+	}
+
+	close(prog_fd);
+	return 0;
+}
+
+static int list_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type)
+{
+	__u32 prog_ids[1024] = {0};
+	char *attach_flags_str;
+	__u32 prog_cnt, iter;
+	__u32 attach_flags;
+	char buf[32];
+	int ret;
+
+	prog_cnt = ARRAY_SIZE(prog_ids);
+	ret = bpf_prog_query(cgroup_fd, type, 0, &attach_flags, prog_ids,
+			     &prog_cnt);
+	if (ret)
+		return ret;
+
+	if (prog_cnt == 0)
+		return 0;
+
+	switch (attach_flags) {
+	case BPF_F_ALLOW_MULTI:
+		attach_flags_str = "multi";
+		break;
+	case BPF_F_ALLOW_OVERRIDE:
+		attach_flags_str = "override";
+		break;
+	case 0:
+		attach_flags_str = "";
+		break;
+	default:
+		snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
+		attach_flags_str = buf;
+	}
+
+	for (iter = 0; iter < prog_cnt; iter++)
+		list_bpf_prog(prog_ids[iter], attach_type_strings[type],
+			      attach_flags_str);
+
+	return 0;
+}
+
+static int do_list(int argc, char **argv)
+{
+	enum bpf_attach_type type;
+	int cgroup_fd;
+	int ret = -1;
+
+	if (argc < 1) {
+		p_err("too few parameters for cgroup list\n");
+		goto exit;
+	} else if (argc > 1) {
+		p_err("too many parameters for cgroup list\n");
+		goto exit;
+	}
+
+	cgroup_fd = open(argv[0], O_RDONLY);
+	if (cgroup_fd < 0) {
+		p_err("can't open cgroup %s\n", argv[1]);
+		goto exit;
+	}
+
+	if (json_output)
+		jsonw_start_array(json_wtr);
+	else
+		printf("%-8s %-15s %-15s %-15s\n", "ID", "AttachType",
+		       "AttachFlags", "Name");
+
+	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
+		/*
+		 * Not all attach types may be supported, so it's expected,
+		 * that some requests will fail.
+		 * If we were able to get the list for at least one
+		 * attach type, let's return 0.
+		 */
+		if (list_attached_bpf_progs(cgroup_fd, type) == 0)
+			ret = 0;
+	}
+
+	if (json_output)
+		jsonw_end_array(json_wtr);
+
+	close(cgroup_fd);
+exit:
+	return ret;
+}
+
+static int do_attach(int argc, char **argv)
+{
+	enum bpf_attach_type attach_type;
+	int cgroup_fd, prog_fd;
+	int attach_flags = 0;
+	int ret = -1;
+	int i;
+
+	if (argc < 4) {
+		p_err("too few parameters for cgroup attach\n");
+		goto exit;
+	}
+
+	cgroup_fd = open(argv[0], O_RDONLY);
+	if (cgroup_fd < 0) {
+		p_err("can't open cgroup %s\n", argv[1]);
+		goto exit;
+	}
+
+	attach_type = parse_attach_type(argv[1]);
+	if (attach_type == __MAX_BPF_ATTACH_TYPE) {
+		p_err("invalid attach type\n");
+		goto exit_cgroup;
+	}
+
+	argc -= 2;
+	argv = &argv[2];
+	prog_fd = prog_parse_fd(&argc, &argv);
+	if (prog_fd < 0)
+		goto exit_cgroup;
+
+	for (i = 0; i < argc; i++) {
+		if (is_prefix(argv[i], "multi")) {
+			attach_flags |= BPF_F_ALLOW_MULTI;
+		} else if (is_prefix(argv[i], "override")) {
+			attach_flags |= BPF_F_ALLOW_OVERRIDE;
+		} else {
+			p_err("unknown option: %s\n", argv[i]);
+			goto exit_cgroup;
+		}
+	}
+
+	if (bpf_prog_attach(prog_fd, cgroup_fd, attach_type, attach_flags)) {
+		p_err("failed to attach program");
+		goto exit_prog;
+	}
+
+	if (json_output)
+		jsonw_null(json_wtr);
+
+	ret = 0;
+
+exit_prog:
+	close(prog_fd);
+exit_cgroup:
+	close(cgroup_fd);
+exit:
+	return ret;
+}
+
+static int do_detach(int argc, char **argv)
+{
+	enum bpf_attach_type attach_type;
+	int prog_fd, cgroup_fd;
+	int ret = -1;
+
+	if (argc < 4) {
+		p_err("too few parameters for cgroup detach\n");
+		goto exit;
+	}
+
+	cgroup_fd = open(argv[0], O_RDONLY);
+	if (cgroup_fd < 0) {
+		p_err("can't open cgroup %s\n", argv[1]);
+		goto exit;
+	}
+
+	attach_type = parse_attach_type(argv[1]);
+	if (attach_type == __MAX_BPF_ATTACH_TYPE) {
+		p_err("invalid attach type");
+		goto exit_cgroup;
+	}
+
+	argc -= 2;
+	argv = &argv[2];
+	prog_fd = prog_parse_fd(&argc, &argv);
+	if (prog_fd < 0)
+		goto exit_cgroup;
+
+	if (bpf_prog_detach2(prog_fd, cgroup_fd, attach_type)) {
+		p_err("failed to detach program");
+		goto exit_prog;
+	}
+
+	if (json_output)
+		jsonw_null(json_wtr);
+
+	ret = 0;
+
+exit_prog:
+	close(prog_fd);
+exit_cgroup:
+	close(cgroup_fd);
+exit:
+	return ret;
+}
+
+static int do_help(int argc, char **argv)
+{
+	if (json_output) {
+		jsonw_null(json_wtr);
+		return 0;
+	}
+
+	fprintf(stderr,
+		"Usage: %s %s list CGROUP\n"
+		"       %s %s attach CGROUP ATTACH_TYPE PROG [ATTACH_FLAGS]\n"
+		"       %s %s detach CGROUP ATTACH_TYPE PROG\n"
+		"       %s %s help\n"
+		"\n"
+		"       " HELP_SPEC_ATTACH_TYPES "\n"
+		"       " HELP_SPEC_ATTACH_FLAGS "\n"
+		"       " HELP_SPEC_PROGRAM "\n"
+		"       " HELP_SPEC_OPTIONS "\n"
+		"",
+		bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2], bin_name, argv[-2]);
+
+	return 0;
+}
+
+static const struct cmd cmds[] = {
+	{ "list",	do_list },
+	{ "attach",	do_attach },
+	{ "detach",	do_detach },
+	{ "help",	do_help },
+	{ 0 }
+};
+
+int do_cgroup(int argc, char **argv)
+{
+	return cmd_select(cmds, argc, argv, do_help);
+}
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index d294bc8168be..ecd53ccf1239 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -85,7 +85,7 @@ static int do_help(int argc, char **argv)
 		"       %s batch file FILE\n"
 		"       %s version\n"
 		"\n"
-		"       OBJECT := { prog | map }\n"
+		"       OBJECT := { prog | map | cgroup }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, bin_name, bin_name);
@@ -173,6 +173,7 @@ static const struct cmd cmds[] = {
 	{ "batch",	do_batch },
 	{ "prog",	do_prog },
 	{ "map",	do_map },
+	{ "cgroup",	do_cgroup },
 	{ "version",	do_version },
 	{ 0 }
 };
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index bec1ccbb49c7..8f6d3cac0347 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -115,6 +115,7 @@ int do_pin_fd(int fd, const char *name);
 
 int do_prog(int argc, char **arg);
 int do_map(int argc, char **arg);
+int do_cgroup(int argc, char **arg);
 
 int prog_parse_fd(int *argc, char ***argv);
 
-- 
2.14.3

^ permalink raw reply related

* [PATCH v4 net-next 2/4] libbpf: prefer global symbols as bpf program name source
From: Roman Gushchin @ 2017-12-13 15:18 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, kernel-team, ast, daniel, jakub.kicinski, kafai,
	guro, Quentin Monnet, David Ahern
In-Reply-To: <20171213151854.21960-1-guro@fb.com>

Libbpf picks the name of the first symbol in the corresponding
elf section to use as a program name. But without taking symbol's
scope into account it may end's up with some local label
as a program name. E.g.:

$ bpftool prog
1: type 15  name LBB0_10    tag 0390a5136ba23f5c
	loaded_at Dec 07/17:22  uid 0
	xlated 456B  not jited  memlock 4096B

Fix this by preferring global symbols as program name.

For instance:
$ bpftool prog
1: type 15  name bpf_prog1  tag 0390a5136ba23f5c
	loaded_at Dec 07/17:26  uid 0
	xlated 456B  not jited  memlock 4096B

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: David Ahern <dsahern@gmail.com>
---
 tools/lib/bpf/libbpf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 205b7822fa0a..65d0d0aff4fa 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -387,6 +387,8 @@ bpf_object__init_prog_names(struct bpf_object *obj)
 				continue;
 			if (sym.st_shndx != prog->idx)
 				continue;
+			if (GELF_ST_BIND(sym.st_info) != STB_GLOBAL)
+				continue;
 
 			name = elf_strptr(obj->efile.elf,
 					  obj->efile.strtabidx,
-- 
2.14.3

^ permalink raw reply related

* [PATCH v4 net-next 3/4] bpftool: implement prog load command
From: Roman Gushchin @ 2017-12-13 15:18 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, kernel-team, ast, daniel, jakub.kicinski, kafai,
	guro, Quentin Monnet, David Ahern
In-Reply-To: <20171213151854.21960-1-guro@fb.com>

Add the prog load command to load a bpf program from a specified
binary file and pin it to bpffs.

Usage description and examples are given in the corresponding man
page.

Syntax:
$ bpftool prog load OBJ FILE

FILE is a non-existing file on bpffs.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: David Ahern <dsahern@gmail.com>
---
 tools/bpf/bpftool/Documentation/bpftool-prog.rst | 10 +++-
 tools/bpf/bpftool/Documentation/bpftool.rst      |  2 +-
 tools/bpf/bpftool/common.c                       | 71 +++++++++++++-----------
 tools/bpf/bpftool/main.h                         |  1 +
 tools/bpf/bpftool/prog.c                         | 29 +++++++++-
 5 files changed, 79 insertions(+), 34 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
index 36e8d1c3c40d..ffdb20e8280f 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
@@ -15,7 +15,7 @@ SYNOPSIS
 	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
 
 	*COMMANDS* :=
-	{ **show** | **dump xlated** | **dump jited** | **pin** | **help** }
+	{ **show** | **dump xlated** | **dump jited** | **pin** | **load** | **help** }
 
 MAP COMMANDS
 =============
@@ -24,6 +24,7 @@ MAP COMMANDS
 |	**bpftool** **prog dump xlated** *PROG* [{**file** *FILE* | **opcodes**}]
 |	**bpftool** **prog dump jited**  *PROG* [{**file** *FILE* | **opcodes**}]
 |	**bpftool** **prog pin** *PROG* *FILE*
+|	**bpftool** **prog load** *OBJ* *FILE*
 |	**bpftool** **prog help**
 |
 |	*PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* }
@@ -57,6 +58,11 @@ DESCRIPTION
 
 		  Note: *FILE* must be located in *bpffs* mount.
 
+	**bpftool prog load** *OBJ* *FILE*
+		  Load bpf program from binary *OBJ* and pin as *FILE*.
+
+		  Note: *FILE* must be located in *bpffs* mount.
+
 	**bpftool prog help**
 		  Print short help message.
 
@@ -126,8 +132,10 @@ EXAMPLES
 |
 | **# mount -t bpf none /sys/fs/bpf/**
 | **# bpftool prog pin id 10 /sys/fs/bpf/prog**
+| **# bpftool prog load ./my_prog.o /sys/fs/bpf/prog2**
 | **# ls -l /sys/fs/bpf/**
 |   -rw------- 1 root root 0 Jul 22 01:43 prog
+|   -rw------- 1 root root 0 Jul 22 01:44 prog2
 
 **# bpftool prog dum jited pinned /sys/fs/bpf/prog opcodes**
 
diff --git a/tools/bpf/bpftool/Documentation/bpftool.rst b/tools/bpf/bpftool/Documentation/bpftool.rst
index 926c03d5a8da..f547a0c0aa34 100644
--- a/tools/bpf/bpftool/Documentation/bpftool.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool.rst
@@ -26,7 +26,7 @@ SYNOPSIS
 	| **pin** | **help** }
 
 	*PROG-COMMANDS* := { **show** | **dump jited** | **dump xlated** | **pin**
-	| **help** }
+	| **load** | **help** }
 
 DESCRIPTION
 ===========
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index 2bd3b280e6dd..b62c94e3997a 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -163,13 +163,49 @@ int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type)
 	return fd;
 }
 
-int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32))
+int do_pin_fd(int fd, const char *name)
 {
 	char err_str[ERR_MAX_LEN];
-	unsigned int id;
-	char *endptr;
 	char *file;
 	char *dir;
+	int err = 0;
+
+	err = bpf_obj_pin(fd, name);
+	if (!err)
+		goto out;
+
+	file = malloc(strlen(name) + 1);
+	strcpy(file, name);
+	dir = dirname(file);
+
+	if (errno != EPERM || is_bpffs(dir)) {
+		p_err("can't pin the object (%s): %s", name, strerror(errno));
+		goto out_free;
+	}
+
+	/* Attempt to mount bpffs, then retry pinning. */
+	err = mnt_bpffs(dir, err_str, ERR_MAX_LEN);
+	if (!err) {
+		err = bpf_obj_pin(fd, name);
+		if (err)
+			p_err("can't pin the object (%s): %s", name,
+			      strerror(errno));
+	} else {
+		err_str[ERR_MAX_LEN - 1] = '\0';
+		p_err("can't mount BPF file system to pin the object (%s): %s",
+		      name, err_str);
+	}
+
+out_free:
+	free(file);
+out:
+	return err;
+}
+
+int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32))
+{
+	unsigned int id;
+	char *endptr;
 	int err;
 	int fd;
 
@@ -195,35 +231,8 @@ int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32))
 		return -1;
 	}
 
-	err = bpf_obj_pin(fd, *argv);
-	if (!err)
-		goto out_close;
-
-	file = malloc(strlen(*argv) + 1);
-	strcpy(file, *argv);
-	dir = dirname(file);
-
-	if (errno != EPERM || is_bpffs(dir)) {
-		p_err("can't pin the object (%s): %s", *argv, strerror(errno));
-		goto out_free;
-	}
+	err = do_pin_fd(fd, *argv);
 
-	/* Attempt to mount bpffs, then retry pinning. */
-	err = mnt_bpffs(dir, err_str, ERR_MAX_LEN);
-	if (!err) {
-		err = bpf_obj_pin(fd, *argv);
-		if (err)
-			p_err("can't pin the object (%s): %s", *argv,
-			      strerror(errno));
-	} else {
-		err_str[ERR_MAX_LEN - 1] = '\0';
-		p_err("can't mount BPF file system to pin the object (%s): %s",
-		      *argv, err_str);
-	}
-
-out_free:
-	free(file);
-out_close:
 	close(fd);
 	return err;
 }
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index bff330b49791..bec1ccbb49c7 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -111,6 +111,7 @@ char *get_fdinfo(int fd, const char *key);
 int open_obj_pinned(char *path);
 int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type);
 int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32));
+int do_pin_fd(int fd, const char *name);
 
 int do_prog(int argc, char **arg);
 int do_map(int argc, char **arg);
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index ad619b96c276..037484ceaeaf 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -45,6 +45,7 @@
 #include <sys/stat.h>
 
 #include <bpf.h>
+#include <libbpf.h>
 
 #include "main.h"
 #include "disasm.h"
@@ -635,6 +636,30 @@ static int do_pin(int argc, char **argv)
 	return err;
 }
 
+static int do_load(int argc, char **argv)
+{
+	struct bpf_object *obj;
+	int prog_fd;
+
+	if (argc != 2)
+		usage();
+
+	if (bpf_prog_load(argv[0], BPF_PROG_TYPE_UNSPEC, &obj, &prog_fd)) {
+		p_err("failed to load program\n");
+		return -1;
+	}
+
+	if (do_pin_fd(prog_fd, argv[1])) {
+		p_err("failed to pin program\n");
+		return -1;
+	}
+
+	if (json_output)
+		jsonw_null(json_wtr);
+
+	return 0;
+}
+
 static int do_help(int argc, char **argv)
 {
 	if (json_output) {
@@ -647,13 +672,14 @@ static int do_help(int argc, char **argv)
 		"       %s %s dump xlated PROG [{ file FILE | opcodes }]\n"
 		"       %s %s dump jited  PROG [{ file FILE | opcodes }]\n"
 		"       %s %s pin   PROG FILE\n"
+		"       %s %s load  OBJ  FILE\n"
 		"       %s %s help\n"
 		"\n"
 		"       " HELP_SPEC_PROGRAM "\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
-		bin_name, argv[-2], bin_name, argv[-2]);
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
 
 	return 0;
 }
@@ -663,6 +689,7 @@ static const struct cmd cmds[] = {
 	{ "help",	do_help },
 	{ "dump",	do_dump },
 	{ "pin",	do_pin },
+	{ "load",	do_load },
 	{ 0 }
 };
 
-- 
2.14.3

^ permalink raw reply related

* Re: [PATCH] ptp: make PTP a menuconfig to ease disabling it all
From: Richard Cochran @ 2017-12-13 15:22 UTC (permalink / raw)
  To: Vincent Legoll; +Cc: netdev, Linux Kernel ML
In-Reply-To: <CAEwRq=rcE9DKO5vVSvwazBaobpwnmWffN7oAPRaScOn1o8gcWA@mail.gmail.com>

On Wed, Dec 13, 2017 at 10:01:15AM +0100, Vincent Legoll wrote:
> I don't think there are alternatives available in Kconfig to achieve
> the goal of easing the management of a .config with menuconfig.
> 
> Do you have any idea about how to do it without using menuconfig ?

scripts/config

HTH,
Richard

^ permalink raw reply

* Re: [PATCH v2 1/2] seq_file: Introduce DEFINE_SHOW_ATTRIBUTE() helper macro
From: Andy Shevchenko @ 2017-12-13 15:38 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Johan Hedberg, open list:BLUETOOTH DRIVERS, David S. Miller,
	Network Development, Randy Dunlap
In-Reply-To: <1511871594.25007.464.camel@linux.intel.com>

On Tue, 2017-11-28 at 14:19 +0200, Andy Shevchenko wrote:
> On Tue, 2017-11-28 at 09:43 +0100, Marcel Holtmann wrote:
> > Hi Andy,
> > 
> > > The DEFINE_SHOW_ATTRIBUTE() helper macro would be useful for
> > > current
> > > users, which are many of them, and for new comers to decrease code
> > > duplication.
> > can we define this in hci_debugfs.c first and get that patch merged
> > into bluetooth-next. And only as a follow up patch try to generalize
> > this in seq_file.h. I really don’t like to carry a patch for
> > seq_file.h in the bluetooth-next tree.
> > 
> 
> This was in v1 [1].
> 
> Feel free to apply it instead. Frankly I don't understand what's wrong
> with carrying patch against seq_file.h? Is it "person non grata"?
> 
> [1]: https://marc.info/?l=linux-bluetooth&m=151138535801354&w=2

Marcel, can you apply v1? Or should I resend it as v3?

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply

* Re: [PATCH net-next] bpf/tracing: fix kernel/events/core.c compilation error
From: Daniel Borkmann @ 2017-12-13 15:44 UTC (permalink / raw)
  To: Yonghong Song, ast, sfr, netdev; +Cc: kernel-team
In-Reply-To: <20171213074252.823255-1-yhs@fb.com>

On 12/13/2017 08:42 AM, Yonghong Song wrote:
> Commit f371b304f12e ("bpf/tracing: allow user space to
> query prog array on the same tp") introduced a perf
> ioctl command to query prog array attached to the
> same perf tracepoint. The commit introduced a
> compilation error when either CONFIG_BPF_SYSCALL or
> CONFIG_EVENT_TRACING is not defined:
>   kernel/events/core.o: In function `perf_ioctl':
>   core.c:(.text+0x98c4): undefined reference to `bpf_event_query_prog_array'
> 
> This patch fixed this error.
> 
> Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp")
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Yonghong Song <yhs@fb.com>

Looking at _perf_ioctl(), we also have perf_event_set_bpf_prog()
there. It's basically under CONFIG_EVENT_TRACING, which later calls
perf_event_attach_bpf_prog() which is under CONFIG_BPF_EVENTS, so
where we have the dummy handler returning -EOPNOTSUPP when BPF
events is not set. bpf_trace.c is only built when CONFIG_BPF_EVENTS
is set and that by itself depends on BPF_SYSCALL already. So it would
be more correct to do the same thing here ...

#if defined(CONFIG_EVENT_TRACING) && defined(CONFIG_BPF_EVENTS)
[...]

> ---
>  include/linux/bpf.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 93e15b9..8dbbfd7 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -254,7 +254,6 @@ typedef unsigned long (*bpf_ctx_copy_t)(void *dst, const void *src,
>  
>  u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
>  		     void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy);
> -int bpf_event_query_prog_array(struct perf_event *event, void __user *info);
>  
>  int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>  			  union bpf_attr __user *uattr);
> @@ -558,6 +557,15 @@ static inline int sock_map_prog(struct bpf_map *map,
>  }
>  #endif
>  
> +#if defined(CONFIG_EVENT_TRACING) && defined(CONFIG_BPF_SYSCALL)
> +int bpf_event_query_prog_array(struct perf_event *event, void __user *info);
> +#else
> +static inline int bpf_event_query_prog_array(struct perf_event *event, void __user *info)
> +{
> +	return -EOPNOTSUPP;
> +}
> +#endif
> +
>  /* verifier prototypes for helper functions called from eBPF programs */
>  extern const struct bpf_func_proto bpf_map_lookup_elem_proto;
>  extern const struct bpf_func_proto bpf_map_update_elem_proto;
> 

^ permalink raw reply

* Re: [PATCHv2 2/3] net: socionext: Add Synquacer NetSec driver
From: Jassi Brar @ 2017-12-13 15:46 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: <netdev@vger.kernel.org>, devicetree@vger.kernel.org,
	David S. Miller, Arnd Bergmann, Rob Herring, Mark Rutland,
	Jassi Brar
In-Reply-To: <CAKv+Gu-6gb10VromMaTJkzpu1DovDO7B5-GWEQPs04qAG=2ksw@mail.gmail.com>

On Tue, Dec 12, 2017 at 10:54 PM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> On 12 December 2017 at 17:15,  <jassisinghbrar@gmail.com> wrote:
....
>> +
>> +static int netsec_netdev_load_microcode(struct netsec_priv *priv)
>> +{
>> +       int err;
>> +
>> +       err = netsec_netdev_load_ucode_region(
>> +               priv, NETSEC_REG_DMAC_HM_CMD_BUF,
>> +               le32_to_cpup(priv->eeprom_base + NETSEC_EEPROM_HM_ME_ADDRESS_H),
>> +               le32_to_cpup(priv->eeprom_base + NETSEC_EEPROM_HM_ME_ADDRESS_L),
>> +               le32_to_cpup(priv->eeprom_base + NETSEC_EEPROM_HM_ME_SIZE));
>
> ... here, and in other places below, you are still dereferencing
> eeprom_base as if it is a pointer, which is not allowed with in the
> __iomem address space. This should use readl() instead [which happens
> to incorporate the endian swap as well]
>
Ahh, how could it have been just two revisions!

thanks

^ permalink raw reply

* Re: [PATCHv2 2/3] net: socionext: Add Synquacer NetSec driver
From: Jassi Brar @ 2017-12-13 15:46 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ard Biesheuvel,
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	David S. Miller, Arnd Bergmann, Rob Herring, Mark Rutland,
	Jassi Brar
In-Reply-To: <20171212204811.GD1523-g2DYL2Zd6BY@public.gmane.org>

On Wed, Dec 13, 2017 at 2:18 AM, Andrew Lunn <andrew-g2DYL2Zd6BY@public.gmane.org> wrote:
>> > +static int netsec_mac_update_to_phy_state(struct netsec_priv *priv)
>> > +{
>> > +       struct phy_device *phydev = priv->ndev->phydev;
>> > +       u32 value = 0;
>> > +
>> > +       value = phydev->duplex ? NETSEC_GMAC_MCR_REG_FULL_DUPLEX_COMMON :
>> > +                                NETSEC_GMAC_MCR_REG_HALF_DUPLEX_COMMON;
>> > +
>> > +       if (phydev->speed != SPEED_1000)
>> > +               value |= NETSEC_MCR_PS;
>> > +
>> > +       if (priv->phy_interface != PHY_INTERFACE_MODE_GMII &&
>> > +           phydev->speed == SPEED_100)
>> > +               value |= NETSEC_GMAC_MCR_REG_FES;
>> > +
>> > +       value |= NETSEC_GMAC_MCR_REG_CST | NETSEC_GMAC_MCR_REG_JE;
>> > +
>> > +       if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII)
>> > +               value |= NETSEC_GMAC_MCR_REG_IBN;
>
> phy_interface_mode_is_rgmii() ??
>
Yes, thank you.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next] bpf/tracing: fix kernel/events/core.c compilation error
From: Alexei Starovoitov @ 2017-12-13 15:50 UTC (permalink / raw)
  To: Daniel Borkmann, Yonghong Song, sfr, netdev; +Cc: kernel-team
In-Reply-To: <b9670428-a023-5f09-5665-298bb815b85b@iogearbox.net>

On 12/13/17 7:44 AM, Daniel Borkmann wrote:
> On 12/13/2017 08:42 AM, Yonghong Song wrote:
>> Commit f371b304f12e ("bpf/tracing: allow user space to
>> query prog array on the same tp") introduced a perf
>> ioctl command to query prog array attached to the
>> same perf tracepoint. The commit introduced a
>> compilation error when either CONFIG_BPF_SYSCALL or
>> CONFIG_EVENT_TRACING is not defined:
>>   kernel/events/core.o: In function `perf_ioctl':
>>   core.c:(.text+0x98c4): undefined reference to `bpf_event_query_prog_array'
>>
>> This patch fixed this error.
>>
>> Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp")
>> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>
> Looking at _perf_ioctl(), we also have perf_event_set_bpf_prog()
> there. It's basically under CONFIG_EVENT_TRACING, which later calls
> perf_event_attach_bpf_prog() which is under CONFIG_BPF_EVENTS, so
> where we have the dummy handler returning -EOPNOTSUPP when BPF
> events is not set. bpf_trace.c is only built when CONFIG_BPF_EVENTS
> is set and that by itself depends on BPF_SYSCALL already. So it would
> be more correct to do the same thing here ...
>
> #if defined(CONFIG_EVENT_TRACING) && defined(CONFIG_BPF_EVENTS)
> [...]

+1
#ifdef CONFIG_BPF_EVENTS
works, whereas CONFIG_EVENT_TRACING probably not, since kprobe
can be disabled independently which will turn off BPF_EVENTS
and body of bpf_event_query_prog_array() will be gone.

^ permalink raw reply

* Re: [PATCH net 2/3] net/mlx4_core: Fix wrong calculation of free counters
From: Tariq Toukan @ 2017-12-13 16:07 UTC (permalink / raw)
  To: Tariq Toukan, David S. Miller; +Cc: netdev, Eran Ben Elisha
In-Reply-To: <1513153804-19545-3-git-send-email-tariqt@mellanox.com>



On 13/12/2017 10:30 AM, Tariq Toukan wrote:
> From: Eran Ben Elisha <eranbe@mellanox.com>
> 
> The field res_free indicates the total number of counters which are free
> to allocation (reserved and unreserved). Fix a bug of reducing the
> reserved counters from this field before allocation.
> 
> Due to this bug, free counters which are not reserved couldn't be
> allocated, fixed it.
> 
> Fixes: 9de92c60beaa ("net/mlx4_core: Adjust counter grant policy in the resource tracker")
> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>

I forgot to mention a reviewer for this patch, and would edit the commit 
message as well.

I'll send a re-spin shortly.

> ---
>   drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> index 04304dd894c6..606a0e0beeae 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> @@ -611,7 +611,6 @@ int mlx4_init_resource_tracker(struct mlx4_dev *dev)
>   						MLX4_MAX_PORTS;
>   				else
>   					res_alloc->guaranteed[t] = 0;
> -				res_alloc->res_free -= res_alloc->guaranteed[t];
>   				break;
>   			default:
>   				break;
> 

^ permalink raw reply

* Re: [PATCH net] tcp: refresh tcp_mstamp from timers callbacks
From: Mike Maloney @ 2017-12-13 16:10 UTC (permalink / raw)
  To: Soheil Hassas Yeganeh
  Cc: Neal Cardwell, Eric Dumazet, David Miller, netdev, Mike Maloney
In-Reply-To: <CACSApvaJ5Tzw_-zZuDyHY4U5CQMBa6Mi5FrF8PPyWKZhd64XqA@mail.gmail.com>

Acked-by:  Mike Maloney <maloney@google.com>

Thanks for the quick fix!

On Tue, Dec 12, 2017 at 9:42 PM, Soheil Hassas Yeganeh
<soheil.kdev@gmail.com> wrote:
> On Tue, Dec 12, 2017 at 9:26 PM, Neal Cardwell <ncardwell@google.com> wrote:
>> On Tue, Dec 12, 2017 at 9:22 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> Only the retransmit timer currently refreshes tcp_mstamp
>>>
>>> We should do the same for delayed acks and keepalives.
>>>
>>> Even if RFC 7323 does not request it, this is consistent to what linux
>>> did in the past, when TS values were based on jiffies.
>>>
>>> Fixes: 385e20706fac ("tcp: use tp->tcp_mstamp in output path")
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Cc: Soheil Hassas Yeganeh <soheil@google.com>
>>> Cc: Mike Maloney <maloney@google.com>
>>> Cc: Neal Cardwell <ncardwell@google.com>
>>> ---
>>
>> Acked-by: Neal Cardwell <ncardwell@google.com>
>>
>> Thanks, Eric!
>>
>> neal
>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
>
> This is a very nice catch! Thank you Eric!

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox