Netdev List


* Re: [PATCHv4 2/3] net: socionext: Add Synquacer NetSec driver
From: Ard Biesheuvel @ 2017-12-23 15:40 UTC (permalink / raw)
  To: Jassi Brar
  Cc: <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Devicetree List, David S. Miller, Arnd Bergmann, Andrew Lunn,
	Rob Herring, Mark Rutland, Masami Hiramatsu, Jassi Brar
In-Reply-To: <CABb+yY3jMLYx3fTSuHfhfR9HDvDqzqDyYGz7T=5k4-HLfcJukg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 23 December 2017 at 15:01, Jassi Brar <jassisinghbrar-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Sat, Dec 23, 2017 at 4:09 PM, Ard Biesheuvel
> <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
>> On 23 December 2017 at 05:45,  <jassisinghbrar-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> From: Jassi Brar <jaswinder.singh-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>>
>>> This driver adds support for Socionext "netsec" IP Gigabit
>>> Ethernet + PHY IP used in the Synquacer SC2A11 SoC.
>>>
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>> Signed-off-by: Jassi Brar <jaswinder.singh-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>> ---
>>>  drivers/net/ethernet/Kconfig            |    1 +
>>>  drivers/net/ethernet/Makefile           |    1 +
>>>  drivers/net/ethernet/socionext/Kconfig  |   29 +
>>>  drivers/net/ethernet/socionext/Makefile |    1 +
>>>  drivers/net/ethernet/socionext/netsec.c | 1844 +++++++++++++++++++++++++++++++
>>>  5 files changed, 1876 insertions(+)
>>>  create mode 100644 drivers/net/ethernet/socionext/Kconfig
>>>  create mode 100644 drivers/net/ethernet/socionext/Makefile
>>>  create mode 100644 drivers/net/ethernet/socionext/netsec.c
>>>
>> ...
>>> diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
>>> new file mode 100644
>>> index 0000000..6af047b
>>> --- /dev/null
>>> +++ b/drivers/net/ethernet/socionext/netsec.c
>> ...
>>> +static int netsec_netdev_load_ucode_region(struct netsec_priv *priv, u32 reg,
>>> +                                          u32 addr_h, u32 addr_l, u32 size)
>>> +{
>>> +       u64 base = (u64)addr_h << 32 | addr_l;
>>> +       void __iomem *ucode;
>>> +       u32 i;
>>> +
>>> +       ucode = ioremap(base, size * sizeof(u32));
>>> +       if (!ucode)
>>> +               return -ENOMEM;
>>> +
>>> +       for (i = 0; i < size; i++)
>>> +               netsec_write(priv, reg, readl(ucode + i));
>>> +
>>
>> This is incorrect. The microcode is written one u32 word at a time,
>> and indexing ucode like this results in byte indexing, not u32
>> indexing.
>>
> Ouch! careless mistake. I was too eager to get done with netsec before
> I leave for holidays.
>
>> I changed the ucode declaration locally to
>>
>> u32 __iomem *ucode;
>>
>> and now everything works fine again.
>>
> Or we keep the void pointer but do    readl(ucode + i * 4)  ?
>

Whichever you prefer.
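
To illustrate the difference outside the kernel, here is a minimal userspace sketch. It is not driver code: demo_read32() is a hypothetical stand-in for readl(), and the two helpers only model how the address is computed. GNU C treats arithmetic on a void pointer as byte-sized, which is why "ucode + i" advanced one byte per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for readl(): load 32 bits from a byte address. */
static uint32_t demo_read32(const uint8_t *addr)
{
	uint32_t v;

	memcpy(&v, addr, sizeof(v));
	return v;
}

/* Byte-stepped access, as with "void __iomem *ucode; readl(ucode + i)". */
static uint32_t demo_word_byte_step(const uint32_t *buf, size_t nwords, size_t i)
{
	assert(i + sizeof(uint32_t) <= nwords * sizeof(uint32_t));
	return demo_read32((const uint8_t *)buf + i);
}

/* Word-stepped access, as with "u32 __iomem *ucode" or "ucode + i * 4". */
static uint32_t demo_word_u32_step(const uint32_t *buf, size_t nwords, size_t i)
{
	assert(i < nwords);
	return demo_read32((const uint8_t *)(buf + i));
}
```

With a buffer of distinct words, demo_word_u32_step(buf, n, 1) returns the second word, while demo_word_byte_step(buf, n, 1) returns a value straddling the first two words. Either proposed fix gives the word-stepped behaviour.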

>
>>
>>> +       iounmap(ucode);
>>> +       return 0;
>>> +}
>>> +
>> ...
>>> +static int netsec_register_mdio(struct netsec_priv *priv, u32 phy_addr)
>>> +{
>>> +       struct mii_bus *bus;
>>> +       int ret;
>>> +
>>> +       bus = devm_mdiobus_alloc(priv->dev);
>>> +       if (!bus)
>>> +               return -ENOMEM;
>>> +
>>> +       snprintf(bus->id, MII_BUS_ID_SIZE, "%s", dev_name(priv->dev));
>>> +       bus->priv = priv;
>>> +       bus->name = "SNI NETSEC MDIO";
>>> +       bus->read = netsec_phy_read;
>>> +       bus->write = netsec_phy_write;
>>> +       bus->parent = priv->dev;
>>> +       priv->mii_bus = bus;
>>> +
>>> +       if (dev_of_node(priv->dev)) {
>>> +               struct device_node *mdio_node, *parent = dev_of_node(priv->dev);
>>> +
>>> +               mdio_node = of_get_child_by_name(parent, "mdio");
>>> +               if (mdio_node) {
>>> +                       parent = mdio_node;
>>> +               } else {
>>> +                       /* older f/w doesn't populate the mdio subnode,
>>> +                        * allow relaxed upgrade of f/w in due time.
>>> +                        */
>>> +                       dev_err(priv->dev, "Upgrade f/w for mdio subnode!\n");
>>
>> I wouldn't mind if you dropped this fallback altogether, and would
>> simply stick with the new binding only. However, if you prefer to keep
>> it, could you change this to dev_info()? It is not really an error
>> condition, and dev_err/dev_warns have the annoying tendency to pierce
>> through 'quiet' boot splashes.
>>
> Yes, it should have been dev_info. But I would like to keep it,
> at least for a couple of months. For example, my board needs jtag to
> upgrade f/w.
>

Fair enough.

> Thanks.

Likewise! And happy holidays.

* [patch net-next v4 00/10] net: sched: allow qdiscs to share filter block instances
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern

From: Jiri Pirko <jiri@mellanox.com>

Currently the filters added to qdiscs are independent. So, for example, if
you have 2 netdevices, you create an ingress qdisc on both, and you want to
add identical filter rules to both, you need to add them twice. This patchset
makes this easier and, mainly, saves resources by allowing qdiscs to share
all their filters - I call the shared set a "filter block". This also helps
to save resources when we offload to hw, for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identifier. If we don't pass any
block number, a new one will be generated by the kernel:

$ tc qdisc add dev ens7 ingress block 22
                                ^^^^^^^^
$ tc qdisc add dev ens8 ingress block 22
                                ^^^^^^^^

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
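
The index selection described above ("22" when requested, a generated number otherwise) can be sketched in userspace as follows. This is only a model under stated assumptions: the fixed-size table and the demo_* names are hypothetical, and the patch itself uses an IDR per network namespace. It covers only the allocation step; looking up an already existing block to share it happens before this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_INDEX 64 /* hypothetical small limit for the sketch */

struct demo_index_table {
	bool used[DEMO_MAX_INDEX];
};

/*
 * requested == 0: pick the first free index, starting at 1.
 * requested != 0: allocate exactly that index or fail.
 * Returns the allocated index, or -1 when nothing in the range is free.
 */
static int demo_index_alloc(struct demo_index_table *t, unsigned int requested)
{
	unsigned int start = requested ? requested : 1;
	unsigned int end = requested ? requested + 1 : DEMO_MAX_INDEX;
	unsigned int i;

	assert(requested < DEMO_MAX_INDEX);
	for (i = start; i < end; i++) {
		if (!t->used[i]) {
			t->used[i] = true;
			return (int)i;
		}
	}
	return -1;
}
```

Index 0 is never handed out in this sketch, so it can keep its meaning of "no block number passed".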

To make it more visual, the situation looks like this:

   ens7 ingress qdisc                 ens8 ingress qdisc
          |                                  |
          |                                  |
          +---------->  block 22  <----------+

An unlimited number of qdiscs may share the same block.

Now we can add a filter to any of the qdiscs sharing the same block:

$ tc filter add dev ens7 ingress protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop


We will see the same output if we list filters for ens7 and ens8, including stats:

$ tc -s filter show dev ens7 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 39 sec used 2 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

$ tc -s filter show dev ens8 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 40 sec used 3 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
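
The identical output for ens7 and ens8 follows from both qdiscs pointing at one block. A minimal userspace sketch of that relationship, assuming hypothetical demo_* types that are not the kernel's tcf_block or Qdisc:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only. */
struct demo_block {
	unsigned int index;    /* block number, e.g. 22 */
	unsigned int refcnt;   /* number of qdiscs sharing the block */
	unsigned int nfilters; /* filters are stored once, in the block */
	unsigned long dropped; /* statistics live with the shared filter */
};

struct demo_qdisc {
	const char *dev;
	struct demo_block *block;
};

/* Take a reference on an existing block, or create one when none is given. */
static struct demo_block *demo_block_get(struct demo_block *existing,
					 unsigned int index)
{
	struct demo_block *b = existing;

	if (!b) {
		b = calloc(1, sizeof(*b));
		if (!b)
			return NULL;
		b->index = index;
	}
	b->refcnt++;
	return b;
}

/* Drop one reference; free the block on the last put. Returns 1 if freed. */
static int demo_block_put(struct demo_block *b)
{
	assert(b->refcnt > 0);
	if (--b->refcnt)
		return 0;
	free(b);
	return 1;
}
```

A filter added through one qdisc is visible through the other because there is only one copy, and the block goes away only when the last qdisc releases it.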

---
v3->v4:
- patch 1:
 - rebased on top of the current net-next
 - added some extack strings
- patch 3:
 - rebased on top of the current net-next
- patch 5:
 - propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind
   caller
- patch 6:
 - rebased on top of the current net-next

v2->v3:
- removed original patch 1, removing tp->q cls_bpf dependency. Fixed by
  Jakub in the meantime.
- patch 1:
 - rebased on top of the current net-next
- patch 5:
 - new patch
- patch 6:
 - removed "p_" prefix from block index function args
- patch 9:
 - add tc offload feature handling

Jiri Pirko (10):
  net: sched: introduce support for multiple filter chain pointers
    registration
  net: sched: avoid usage of tp->q in tcf_classify
  net: sched: introduce block mechanism to handle netif_keep_dst calls
  net: sched: remove classid and q fields from tcf_proto
  net: sched: keep track of offloaded filters and check tc offload
    feature
  net: sched: allow ingress and clsact qdiscs to share filter blocks
  mlxsw: spectrum_acl: Reshuffle code around
    mlxsw_sp_acl_ruleset_create/destroy
  mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
  mlxsw: spectrum_acl: Implement TC block sharing
  mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind
    ops

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 182 ++++++++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  44 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 302 ++++++++++++----
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    |  44 ++-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 +--
 include/net/pkt_cls.h                              |   4 +
 include/net/sch_generic.h                          |  26 +-
 include/uapi/linux/pkt_sched.h                     |  11 +
 net/sched/cls_api.c                                | 380 ++++++++++++++++++---
 net/sched/cls_bpf.c                                |   9 +-
 net/sched/cls_flow.c                               |   2 +-
 net/sched/cls_flower.c                             |   3 +-
 net/sched/cls_matchall.c                           |   3 +-
 net/sched/cls_route.c                              |   2 +-
 net/sched/cls_u32.c                                |  13 +-
 net/sched/sch_ingress.c                            |  89 ++++-
 16 files changed, 931 insertions(+), 224 deletions(-)

-- 
2.9.5

* [patch net-next v4 01/10] net: sched: introduce support for multiple filter chain pointers registration
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

So far, it was only possible to register a single filter chain
pointer to block->chain[0]. However, once blocks become shareable,
we need to allow registration of multiple filter chain pointers.
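
As a rough userspace sketch of the idea (hypothetical demo_* names, a plain singly linked list instead of list_head, and the chain head stored by the notifier for brevity), each registrant gets its own callback item, and every head change is fanned out to all of them:

```c
#include <assert.h>
#include <stddef.h>

typedef void demo_head_change_t(void *tp_head, void *priv);

/* One registered "filter chain pointer" owner. */
struct demo_cb_item {
	struct demo_cb_item *next;
	demo_head_change_t *cb;
	void *priv;
};

struct demo_chain {
	void *filter_chain;       /* current head of the filter chain */
	struct demo_cb_item *cbs; /* everyone who wants to hear about changes */
};

/* Register a callback; a late registrant is brought up to date at once. */
static void demo_cb_add(struct demo_chain *chain, struct demo_cb_item *item,
			demo_head_change_t *cb, void *priv)
{
	assert(cb);
	item->cb = cb;
	item->priv = priv;
	if (chain->filter_chain)
		cb(chain->filter_chain, priv);
	item->next = chain->cbs;
	chain->cbs = item;
}

/* Record the new head and notify every registered owner. */
static void demo_head_change(struct demo_chain *chain, void *tp_head)
{
	struct demo_cb_item *item;

	chain->filter_chain = tp_head;
	for (item = chain->cbs; item; item = item->next)
		item->cb(tp_head, item->priv);
}

/* Example callback: store the new head in the owner's own pointer. */
static void demo_store_head(void *tp_head, void *priv)
{
	*(void **)priv = tp_head;
}
```

The immediate call in demo_cb_add() mirrors the case where a qdisc binds to a block that already has filters installed.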

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v3->v4:
- rebased on top of the current net-next
- added some extack strings
v2->v3:
- rebased on top of the current net-next
---
 include/net/pkt_cls.h     |   3 +
 include/net/sch_generic.h |   5 +-
 net/sched/cls_api.c       | 236 +++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 216 insertions(+), 28 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 31574c9..35ab7c9 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -29,6 +29,8 @@ struct tcf_block_ext_info {
 	enum tcf_block_binder_type binder_type;
 	tcf_chain_head_change_t *chain_head_change;
 	void *chain_head_change_priv;
+	bool shareable;
+	u32 block_index;
 };
 
 struct tcf_block_cb;
@@ -50,6 +52,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 
 static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 {
+	WARN_ON(block->refcnt != 1);
 	return block->q;
 }
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index ac029d5..5cc4d71 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -275,8 +275,7 @@ typedef void tcf_chain_head_change_t(struct tcf_proto *tp_head, void *priv);
 
 struct tcf_chain {
 	struct tcf_proto __rcu *filter_chain;
-	tcf_chain_head_change_t *chain_head_change;
-	void *chain_head_change_priv;
+	struct list_head filter_chain_list;
 	struct list_head list;
 	struct tcf_block *block;
 	u32 index; /* chain index */
@@ -285,6 +284,8 @@ struct tcf_chain {
 
 struct tcf_block {
 	struct list_head chain_list;
+	u32 index; /* block index for shared blocks */
+	unsigned int refcnt;
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 4591b87..9b1b4fa 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -24,6 +24,7 @@
 #include <linux/init.h>
 #include <linux/kmod.h>
 #include <linux/slab.h>
+#include <linux/idr.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -179,6 +180,12 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
 	kfree_rcu(tp, rcu);
 }
 
+struct tcf_filter_chain_list_item {
+	struct list_head list;
+	tcf_chain_head_change_t *chain_head_change;
+	void *chain_head_change_priv;
+};
+
 static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 					  u32 chain_index)
 {
@@ -187,6 +194,7 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
 	if (!chain)
 		return NULL;
+	INIT_LIST_HEAD(&chain->filter_chain_list);
 	list_add_tail(&chain->list, &block->chain_list);
 	chain->block = block;
 	chain->index = chain_index;
@@ -194,12 +202,19 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
 	return chain;
 }
 
+static void tcf_chain_head_change_item(struct tcf_filter_chain_list_item *item,
+				       struct tcf_proto *tp_head)
+{
+	if (item->chain_head_change)
+		item->chain_head_change(tp_head, item->chain_head_change_priv);
+}
 static void tcf_chain_head_change(struct tcf_chain *chain,
 				  struct tcf_proto *tp_head)
 {
-	if (chain->chain_head_change)
-		chain->chain_head_change(tp_head,
-					 chain->chain_head_change_priv);
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list)
+		tcf_chain_head_change_item(item, tp_head);
 }
 
 static void tcf_chain_flush(struct tcf_chain *chain)
@@ -280,17 +295,91 @@ static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
 }
 
-int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
-		      struct tcf_block_ext_info *ei,
-		      struct netlink_ext_ack *extack)
+static int
+tcf_chain_head_change_cb_add(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei,
+			     struct netlink_ext_ack *extack)
 {
-	struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+	struct tcf_filter_chain_list_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item) {
+		NL_SET_ERR_MSG(extack, "Memory allocation for head change callback item failed");
+		return -ENOMEM;
+	}
+	item->chain_head_change = ei->chain_head_change;
+	item->chain_head_change_priv = ei->chain_head_change_priv;
+	if (chain->filter_chain)
+		tcf_chain_head_change_item(item, chain->filter_chain);
+	list_add(&item->list, &chain->filter_chain_list);
+	return 0;
+}
+
+static void
+tcf_chain_head_change_cb_del(struct tcf_chain *chain,
+			     struct tcf_block_ext_info *ei)
+{
+	struct tcf_filter_chain_list_item *item;
+
+	list_for_each_entry(item, &chain->filter_chain_list, list) {
+		if ((!ei->chain_head_change && !ei->chain_head_change_priv) ||
+		    (item->chain_head_change == ei->chain_head_change &&
+		     item->chain_head_change_priv == ei->chain_head_change_priv)) {
+			tcf_chain_head_change_item(item, NULL);
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
+struct tcf_net {
+	struct idr idr;
+};
+
+static unsigned int tcf_net_id;
+
+static int tcf_block_insert(struct tcf_block *block, struct net *net,
+			    u32 block_index, struct netlink_ext_ack *extack)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+	int idr_start;
+	int idr_end;
+	int index;
+
+	if (block_index >= INT_MAX) {
+		NL_SET_ERR_MSG(extack, "Invalid block index value (>= INT_MAX)");
+		return -EINVAL;
+	}
+	idr_start = block_index ? block_index : 1;
+	idr_end = block_index ? block_index + 1 : INT_MAX;
+
+	index = idr_alloc(&tn->idr, block, idr_start, idr_end, GFP_KERNEL);
+	if (index < 0)
+		return index;
+	block->index = index;
+	return 0;
+}
+
+static void tcf_block_remove(struct tcf_block *block, struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_remove(&tn->idr, block->index);
+}
+
+static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
+					  struct netlink_ext_ack *extack)
+{
+	struct tcf_block *block;
 	struct tcf_chain *chain;
 	int err;
 
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
 	if (!block) {
 		NL_SET_ERR_MSG(extack, "Memory allocation for block failed");
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	}
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
@@ -302,17 +391,75 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		err = -ENOMEM;
 		goto err_chain_create;
 	}
-	WARN_ON(!ei->chain_head_change);
-	chain->chain_head_change = ei->chain_head_change;
-	chain->chain_head_change_priv = ei->chain_head_change_priv;
 	block->net = qdisc_net(q);
+	block->refcnt = 1;
+	block->net = net;
 	block->q = q;
+	return block;
+
+err_chain_create:
+	kfree(block);
+	return ERR_PTR(err);
+}
+
+static struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	return idr_find(&tn->idr, block_index);
+}
+
+static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+{
+	return list_first_entry(&block->chain_list, struct tcf_chain, list);
+}
+
+int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
+		      struct tcf_block_ext_info *ei,
+		      struct netlink_ext_ack *extack)
+{
+	struct net *net = qdisc_net(q);
+	struct tcf_block *block = NULL;
+	bool created = false;
+	int err;
+
+	if (ei->shareable) {
+		block = tcf_block_lookup(net, ei->block_index);
+		if (block)
+			block->refcnt++;
+	}
+
+	if (!block) {
+		block = tcf_block_create(net, q, extack);
+		if (IS_ERR(block))
+			return PTR_ERR(block);
+		created = true;
+		if (ei->shareable) {
+			err = tcf_block_insert(block, net,
+					       ei->block_index, extack);
+			if (err)
+				goto err_block_insert;
+		}
+	}
+
+	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block),
+					   ei, extack);
+	if (err)
+		goto err_chain_head_change_cb_add;
+
 	tcf_block_offload_bind(block, q, ei);
 	*p_block = block;
 	return 0;
 
-err_chain_create:
-	kfree(block);
+err_chain_head_change_cb_add:
+	if (created) {
+		if (ei->shareable)
+			tcf_block_remove(block, net);
+err_block_insert:
+		kfree(block);
+	} else {
+		block->refcnt--;
+	}
 	return err;
 }
 EXPORT_SYMBOL(tcf_block_get_ext);
@@ -346,26 +493,35 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 {
 	struct tcf_chain *chain, *tmp;
 
-	/* Hold a refcnt for all chains, so that they don't disappear
-	 * while we are iterating.
-	 */
 	if (!block)
 		return;
-	list_for_each_entry(chain, &block->chain_list, list)
-		tcf_chain_hold(chain);
 
-	list_for_each_entry(chain, &block->chain_list, list)
-		tcf_chain_flush(chain);
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
+
+	if (--block->refcnt == 0) {
+		if (ei->shareable)
+			tcf_block_remove(block, block->net);
+
+		/* Hold a refcnt for all chains, so that they don't disappear
+		 * while we are iterating.
+		 */
+		list_for_each_entry(chain, &block->chain_list, list)
+			tcf_chain_hold(chain);
+
+		list_for_each_entry(chain, &block->chain_list, list)
+			tcf_chain_flush(chain);
+	}
 
 	tcf_block_offload_unbind(block, q, ei);
 
-	/* At this point, all the chains should have refcnt >= 1. */
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
-		tcf_chain_put(chain);
+	if (block->refcnt == 0) {
+		/* At this point, all the chains should have refcnt >= 1. */
+		list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+			tcf_chain_put(chain);
 
-	/* Finally, put chain 0 and allow block to be freed. */
-	chain = list_first_entry(&block->chain_list, struct tcf_chain, list);
-	tcf_chain_put(chain);
+		/* Finally, put chain 0 and allow block to be freed. */
+		tcf_chain_put(tcf_block_chain_zero(block));
+	}
 }
 EXPORT_SYMBOL(tcf_block_put_ext);
 
@@ -1250,12 +1406,40 @@ int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
 
+static __net_init int tcf_net_init(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_init(&tn->idr);
+	return 0;
+}
+
+static void __net_exit tcf_net_exit(struct net *net)
+{
+	struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+	idr_destroy(&tn->idr);
+}
+
+static struct pernet_operations tcf_net_ops = {
+	.init = tcf_net_init,
+	.exit = tcf_net_exit,
+	.id   = &tcf_net_id,
+	.size = sizeof(struct tcf_net),
+};
+
 static int __init tc_filter_init(void)
 {
+	int err;
+
 	tc_filter_wq = alloc_ordered_workqueue("tc_filter_workqueue", 0);
 	if (!tc_filter_wq)
 		return -ENOMEM;
 
+	err = register_pernet_subsys(&tcf_net_ops);
+	if (err)
+		return err;
+
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_ctl_tfilter, NULL, 0);
 	rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_ctl_tfilter,
-- 
2.9.5

* [patch net-next v4 02/10] net: sched: avoid usage of tp->q in tcf_classify
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Use block index in the messages instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/sched/cls_api.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 9b1b4fa..b93eca8 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -678,8 +678,9 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 #ifdef CONFIG_NET_CLS_ACT
 reset:
 	if (unlikely(limit++ >= max_reclassify_loop)) {
-		net_notice_ratelimited("%s: reclassify loop, rule prio %u, protocol %02x\n",
-				       tp->q->ops->id, tp->prio & 0xffff,
+		net_notice_ratelimited("%u: reclassify loop, rule prio %u, protocol %02x\n",
+				       tp->chain->block->index,
+				       tp->prio & 0xffff,
 				       ntohs(tp->protocol));
 		return TC_ACT_SHOT;
 	}
-- 
2.9.5

* [patch net-next v4 03/10] net: sched: introduce block mechanism to handle netif_keep_dst calls
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

A couple of classifiers call netif_keep_dst directly on q->dev. That is
not possible for a shared block, where multiple qdiscs own the block.
So introduce an infrastructure to keep track of the block owners in a
list and use this list to implement a block variant of
netif_keep_dst.
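
The owner tracking can be pictured with the following userspace sketch. All names are hypothetical, a fixed array replaces the linked list, and the binder-type check that skips clsact owners is left out to keep it short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_OWNERS 8 /* hypothetical limit for the sketch */

struct demo_dev {
	bool keep_dst; /* models the effect of netif_keep_dst() on a device */
};

struct demo_block {
	struct demo_dev *owners[DEMO_MAX_OWNERS];
	size_t nowners;
	bool keep_dst; /* remembered so that later owners get it too */
};

/* A classifier asks for keep_dst: mark the block and all current owners. */
static void demo_block_keep_dst(struct demo_block *block)
{
	size_t i;

	block->keep_dst = true;
	for (i = 0; i < block->nowners; i++)
		block->owners[i]->keep_dst = true;
}

/* A new owner joins: record it and apply any earlier request. */
static void demo_block_owner_add(struct demo_block *block, struct demo_dev *dev)
{
	assert(block->nowners < DEMO_MAX_OWNERS);
	block->owners[block->nowners++] = dev;
	if (block->keep_dst)
		dev->keep_dst = true;
}
```

Remembering the request in the block is what makes the order of "add filter" and "bind another qdisc" irrelevant.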

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v3->v4:
- rebased on top of the current net-next
v1->v2:
- fix binder_type to check "egress" as well as pointed out by Daniel
---
 include/net/pkt_cls.h     |  1 +
 include/net/sch_generic.h |  2 ++
 net/sched/cls_api.c       | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 net/sched/cls_bpf.c       |  4 +--
 net/sched/cls_flow.c      |  2 +-
 net/sched/cls_route.c     |  2 +-
 6 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 35ab7c9..4837f4a 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -40,6 +40,7 @@ bool tcf_queue_work(struct work_struct *work);
 struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
 				bool create);
 void tcf_chain_put(struct tcf_chain *chain);
+void tcf_block_netif_keep_dst(struct tcf_block *block);
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain, struct Qdisc *q,
 		  struct netlink_ext_ack *extack);
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 5cc4d71..df97c3e 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -289,6 +289,8 @@ struct tcf_block {
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
+	struct list_head owner_list;
+	bool keep_dst;
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index b93eca8..8a7d77a 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -383,6 +383,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	}
 	INIT_LIST_HEAD(&block->chain_list);
 	INIT_LIST_HEAD(&block->cb_list);
+	INIT_LIST_HEAD(&block->owner_list);
 
 	/* Create chain 0 by default, it has to be always present. */
 	chain = tcf_chain_create(block, 0);
@@ -414,6 +415,65 @@ static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
 	return list_first_entry(&block->chain_list, struct tcf_chain, list);
 }
 
+struct tcf_block_owner_item {
+	struct list_head list;
+	struct Qdisc *q;
+	enum tcf_block_binder_type binder_type;
+};
+
+static void
+tcf_block_owner_netif_keep_dst(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	if (block->keep_dst &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS &&
+	    binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
+		netif_keep_dst(qdisc_dev(q));
+}
+
+void tcf_block_netif_keep_dst(struct tcf_block *block)
+{
+	struct tcf_block_owner_item *item;
+
+	block->keep_dst = true;
+	list_for_each_entry(item, &block->owner_list, list)
+		tcf_block_owner_netif_keep_dst(block, item->q,
+					       item->binder_type);
+}
+EXPORT_SYMBOL(tcf_block_netif_keep_dst);
+
+static int tcf_block_owner_add(struct tcf_block *block,
+			       struct Qdisc *q,
+			       enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	item = kmalloc(sizeof(*item), GFP_KERNEL);
+	if (!item)
+		return -ENOMEM;
+	item->q = q;
+	item->binder_type = binder_type;
+	list_add(&item->list, &block->owner_list);
+	return 0;
+}
+
+static void tcf_block_owner_del(struct tcf_block *block,
+				struct Qdisc *q,
+				enum tcf_block_binder_type binder_type)
+{
+	struct tcf_block_owner_item *item;
+
+	list_for_each_entry(item, &block->owner_list, list) {
+		if (item->q == q && item->binder_type == binder_type) {
+			list_del(&item->list);
+			kfree(item);
+			return;
+		}
+	}
+	WARN_ON(1);
+}
+
 int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		      struct tcf_block_ext_info *ei,
 		      struct netlink_ext_ack *extack)
@@ -442,6 +502,12 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 		}
 	}
 
+	err = tcf_block_owner_add(block, q, ei->binder_type);
+	if (err)
+		goto err_block_owner_add;
+
+	tcf_block_owner_netif_keep_dst(block, q, ei->binder_type);
+
 	err = tcf_chain_head_change_cb_add(tcf_block_chain_zero(block),
 					   ei, extack);
 	if (err)
@@ -452,6 +518,8 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 	return 0;
 
 err_chain_head_change_cb_add:
+	tcf_block_owner_del(block, q, ei->binder_type);
+err_block_owner_add:
 	if (created) {
 		if (ei->shareable)
 			tcf_block_remove(block, net);
@@ -497,6 +565,7 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 		return;
 
 	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
+	tcf_block_owner_del(block, q, ei->binder_type);
 
 	if (--block->refcnt == 0) {
 		if (ei->shareable)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 8d78e7f..d79cc50 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -392,8 +392,8 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog,
 	prog->bpf_name = name;
 	prog->filter = fp;
 
-	if (fp->dst_needed && !(tp->q->flags & TCQ_F_INGRESS))
-		netif_keep_dst(qdisc_dev(tp->q));
+	if (fp->dst_needed)
+		tcf_block_netif_keep_dst(tp->chain->block);
 
 	return 0;
 }
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 25c2a88..28cd6fb 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -526,7 +526,7 @@ static int flow_change(struct net *net, struct sk_buff *in_skb,
 
 	timer_setup(&fnew->perturb_timer, flow_perturbation, TIMER_DEFERRABLE);
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 
 	if (tb[TCA_FLOW_KEYS]) {
 		fnew->keymask = keymask;
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index ac9a5b8..a1f2b1b 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -527,7 +527,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
 		if (f->handle < f1->handle)
 			break;
 
-	netif_keep_dst(qdisc_dev(tp->q));
+	tcf_block_netif_keep_dst(tp->chain->block);
 	rcu_assign_pointer(f->next, f1);
 	rcu_assign_pointer(*fp, f);
 
-- 
2.9.5

* [patch net-next v4 04/10] net: sched: remove classid and q fields from tcf_proto
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Both are no longer used, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/sch_generic.h | 2 --
 net/sched/cls_api.c       | 7 ++-----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index df97c3e..dba2214 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -255,8 +255,6 @@ struct tcf_proto {
 
 	/* All the rest */
 	u32			prio;
-	u32			classid;
-	struct Qdisc		*q;
 	void			*data;
 	const struct tcf_proto_ops	*ops;
 	struct tcf_chain	*chain;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8a7d77a..07767b4 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -122,8 +122,7 @@ static inline u32 tcf_auto_prio(struct tcf_proto *tp)
 }
 
 static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
-					  u32 prio, u32 parent, struct Qdisc *q,
-					  struct tcf_chain *chain)
+					  u32 prio, struct tcf_chain *chain)
 {
 	struct tcf_proto *tp;
 	int err;
@@ -157,8 +156,6 @@ static struct tcf_proto *tcf_proto_create(const char *kind, u32 protocol,
 	tp->classify = tp->ops->classify;
 	tp->protocol = protocol;
 	tp->prio = prio;
-	tp->classid = parent;
-	tp->q = q;
 	tp->chain = chain;
 
 	err = tp->ops->init(tp);
@@ -1075,7 +1072,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 			prio = tcf_auto_prio(tcf_chain_tp_prev(&chain_info));
 
 		tp = tcf_proto_create(nla_data(tca[TCA_KIND]),
-				      protocol, prio, parent, q, chain);
+				      protocol, prio, chain);
 		if (IS_ERR(tp)) {
 			err = PTR_ERR(tp);
 			goto errout;
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v4 05/10] net: sched: keep track of offloaded filters and check tc offload feature
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

During block bind, we need to check the tc offload feature. If it is
disabled but the block still contains offloaded filters, forbid the
bind. Also forbid registering a callback for a block that already
contains offloaded filters, as playback is not supported yet.
To keep track of offloaded filters, a new counter is introduced,
along with a couple of helpers called from cls_* code.
These helpers set and clear the TCA_CLS_FLAGS_IN_HW flag.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v3->v4:
- propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind
  caller
v2->v3:
- new patch
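
Note for readers: the counter and the bind check can be modelled outside
the kernel. The following is only a userspace sketch, not the kernel
code. All model_* names are made up for illustration, MODEL_FLAG_IN_HW
stands in for TCA_CLS_FLAGS_IN_HW, and can_offload/driver_ret are
hypothetical inputs standing in for tc_can_offload(dev) and the
ndo_setup_tc return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for TCA_CLS_FLAGS_IN_HW; the bit value is illustrative. */
#define MODEL_FLAG_IN_HW (1u << 2)

struct model_block {
	unsigned int offloadcnt;	/* filters currently marked in hardware */
};

static void model_offload_inc(struct model_block *block, uint32_t *flags)
{
	if (*flags & MODEL_FLAG_IN_HW)
		return;		/* already counted */
	*flags |= MODEL_FLAG_IN_HW;
	block->offloadcnt++;
}

static void model_offload_dec(struct model_block *block, uint32_t *flags)
{
	if (!(*flags & MODEL_FLAG_IN_HW))
		return;		/* never counted, nothing to undo */
	assert(block->offloadcnt > 0);
	*flags &= ~MODEL_FLAG_IN_HW;
	block->offloadcnt--;
}

/* Model of the bind decision; both int arguments are hypothetical inputs. */
static int model_block_bind(const struct model_block *block, int can_offload,
			    int driver_ret)
{
	if (!can_offload && block->offloadcnt)
		return -EOPNOTSUPP;	/* offloaded filters, offload disabled */
	if (driver_ret == -EOPNOTSUPP)
		return 0;		/* driver does not support binding */
	return driver_ret;
}
```

Because the flag guards the counter, calling inc or dec twice for the
same filter changes the count only once, which is what lets the cls_*
code call the helpers unconditionally.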

---
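
Note for readers: this patch also switches __tcf_block_cb_register()
from returning NULL on failure to returning an encoded error pointer,
so that callers can tell -EOPNOTSUPP from -ENOMEM. A rough userspace
model of that convention follows. The model_* helpers are hypothetical
stand-ins written for this note; they are not the kernel's ERR_PTR(),
PTR_ERR() and IS_ERR() themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an encoded pointer. */
static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer lies in the top MODEL_MAX_ERRNO addresses. */
static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_cb {
	int ident;
};

/* Refuse registration while offloaded filters exist (offloadcnt != 0). */
static struct model_cb *model_cb_register_raw(unsigned int offloadcnt,
					      int ident)
{
	struct model_cb *cb;

	if (offloadcnt)
		return model_err_ptr(-EOPNOTSUPP);
	cb = calloc(1, sizeof(*cb));
	if (!cb)
		return model_err_ptr(-ENOMEM);
	cb->ident = ident;
	return cb;
}

/* Wrapper that turns the encoded pointer back into an int error code. */
static int model_cb_register(unsigned int offloadcnt, int ident,
			     struct model_cb **out)
{
	struct model_cb *cb = model_cb_register_raw(offloadcnt, ident);

	assert(out != NULL);
	if (model_is_err(cb))
		return (int)model_ptr_err(cb);
	*out = cb;
	return 0;
}
```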
 include/net/sch_generic.h | 17 +++++++++++++
 net/sched/cls_api.c       | 63 ++++++++++++++++++++++++++++++++++++-----------
 net/sched/cls_bpf.c       |  5 +++-
 net/sched/cls_flower.c    |  3 ++-
 net/sched/cls_matchall.c  |  3 ++-
 net/sched/cls_u32.c       | 13 +++++-----
 6 files changed, 81 insertions(+), 23 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index dba2214..22a3a1d 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -289,8 +289,25 @@ struct tcf_block {
 	struct list_head cb_list;
 	struct list_head owner_list;
 	bool keep_dst;
+	unsigned int offloadcnt;
 };
 
+static inline void tcf_block_offload_inc(struct tcf_block *block, u32 *flags)
+{
+	if (*flags & TCA_CLS_FLAGS_IN_HW)
+		return;
+	*flags |= TCA_CLS_FLAGS_IN_HW;
+	block->offloadcnt++;
+}
+
+static inline void tcf_block_offload_dec(struct tcf_block *block, u32 *flags)
+{
+	if (!(*flags & TCA_CLS_FLAGS_IN_HW))
+		return;
+	*flags &= ~TCA_CLS_FLAGS_IN_HW;
+	block->offloadcnt--;
+}
+
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 {
 	struct qdisc_skb_cb *qcb;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 07767b4..37eea70 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -265,31 +265,54 @@ void tcf_chain_put(struct tcf_chain *chain)
 }
 EXPORT_SYMBOL(tcf_chain_put);
 
-static void tcf_block_offload_cmd(struct tcf_block *block, struct Qdisc *q,
-				  struct tcf_block_ext_info *ei,
-				  enum tc_block_command command)
+static bool tcf_block_offload_in_use(struct tcf_block *block)
+{
+	return block->offloadcnt;
+}
+
+static int tcf_block_offload_cmd(struct tcf_block *block,
+				 struct net_device *dev,
+				 struct tcf_block_ext_info *ei,
+				 enum tc_block_command command)
 {
-	struct net_device *dev = q->dev_queue->dev;
 	struct tc_block_offload bo = {};
 
-	if (!dev->netdev_ops->ndo_setup_tc)
-		return;
 	bo.command = command;
 	bo.binder_type = ei->binder_type;
 	bo.block = block;
-	dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
 }
 
-static void tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
-				   struct tcf_block_ext_info *ei)
+static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
+				  struct tcf_block_ext_info *ei)
 {
-	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_BIND);
+	struct net_device *dev = q->dev_queue->dev;
+	int err;
+
+	if (!dev->netdev_ops->ndo_setup_tc)
+		return 0;
+
+	/* If the tc offload feature is disabled and the block we try to
+	 * bind to already has some offloaded filters, forbid the bind.
+	 */
+	if (!tc_can_offload(dev) && tcf_block_offload_in_use(block))
+		return -EOPNOTSUPP;
+
+	err = tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_BIND);
+	if (err == -EOPNOTSUPP)
+		/* Driver does not support binding. */
+		return 0;
+	return err;
 }
 
 static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 				     struct tcf_block_ext_info *ei)
 {
-	tcf_block_offload_cmd(block, q, ei, TC_BLOCK_UNBIND);
+	struct net_device *dev = q->dev_queue->dev;
+
+	if (!dev->netdev_ops->ndo_setup_tc)
+		return;
+	tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_UNBIND);
 }
 
 static int
@@ -510,10 +533,15 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 	if (err)
 		goto err_chain_head_change_cb_add;
 
-	tcf_block_offload_bind(block, q, ei);
+	err = tcf_block_offload_bind(block, q, ei);
+	if (err)
+		goto err_block_offload_bind;
+
 	*p_block = block;
 	return 0;
 
+err_block_offload_bind:
+	tcf_chain_head_change_cb_del(tcf_block_chain_zero(block), ei);
 err_chain_head_change_cb_add:
 	tcf_block_owner_del(block, q, ei->binder_type);
 err_block_owner_add:
@@ -643,9 +671,16 @@ struct tcf_block_cb *__tcf_block_cb_register(struct tcf_block *block,
 {
 	struct tcf_block_cb *block_cb;
 
+	/* At this point, playback of previous block cb calls is not supported,
+	 * so forbid registering to a block which already has some offloaded
+	 * filters present.
+	 */
+	if (tcf_block_offload_in_use(block))
+		return ERR_PTR(-EOPNOTSUPP);
+
 	block_cb = kzalloc(sizeof(*block_cb), GFP_KERNEL);
 	if (!block_cb)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	block_cb->cb = cb;
 	block_cb->cb_ident = cb_ident;
 	block_cb->cb_priv = cb_priv;
@@ -661,7 +696,7 @@ int tcf_block_cb_register(struct tcf_block *block,
 	struct tcf_block_cb *block_cb;
 
 	block_cb = __tcf_block_cb_register(block, cb, cb_ident, cb_priv);
-	return block_cb ? 0 : -ENOMEM;
+	return IS_ERR(block_cb) ? PTR_ERR(block_cb) : 0;
 }
 EXPORT_SYMBOL(tcf_block_cb_register);
 
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index d79cc50..cf72aef 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -167,13 +167,16 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 	cls_bpf.exts_integrated = obj->exts_integrated;
 	cls_bpf.gen_flags = obj->gen_flags;
 
+	if (oldprog)
+		tcf_block_offload_dec(block, &oldprog->gen_flags);
+
 	err = tc_setup_cb_call(block, NULL, TC_SETUP_CLSBPF, &cls_bpf, skip_sw);
 	if (prog) {
 		if (err < 0) {
 			cls_bpf_offload_cmd(tp, oldprog, prog);
 			return err;
 		} else if (err > 0) {
-			prog->gen_flags |= TCA_CLS_FLAGS_IN_HW;
+			tcf_block_offload_inc(block, &prog->gen_flags);
 		}
 	}
 
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 6132a73..f61df19 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -229,6 +229,7 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f)
 
 	tc_setup_cb_call(block, &f->exts, TC_SETUP_CLSFLOWER,
 			 &cls_flower, false);
+	tcf_block_offload_dec(block, &f->flags);
 }
 
 static int fl_hw_replace_filter(struct tcf_proto *tp,
@@ -256,7 +257,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
 		fl_hw_destroy_filter(tp, f);
 		return err;
 	} else if (err > 0) {
-		f->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &f->flags);
 	}
 
 	if (skip_sw && !(f->flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 66d4e00..d0e57c8 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -81,6 +81,7 @@ static void mall_destroy_hw_filter(struct tcf_proto *tp,
 	cls_mall.cookie = cookie;
 
 	tc_setup_cb_call(block, NULL, TC_SETUP_CLSMATCHALL, &cls_mall, false);
+	tcf_block_offload_dec(block, &head->flags);
 }
 
 static int mall_replace_hw_filter(struct tcf_proto *tp,
@@ -103,7 +104,7 @@ static int mall_replace_hw_filter(struct tcf_proto *tp,
 		mall_destroy_hw_filter(tp, head, cookie);
 		return err;
 	} else if (err > 0) {
-		head->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &head->flags);
 	}
 
 	if (skip_sw && !(head->flags & TCA_CLS_FLAGS_IN_HW))
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 507859c..020d328 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -529,16 +529,17 @@ static int u32_replace_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
 	return 0;
 }
 
-static void u32_remove_hw_knode(struct tcf_proto *tp, u32 handle)
+static void u32_remove_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n)
 {
 	struct tcf_block *block = tp->chain->block;
 	struct tc_cls_u32_offload cls_u32 = {};
 
 	tc_cls_common_offload_init(&cls_u32.common, tp);
 	cls_u32.command = TC_CLSU32_DELETE_KNODE;
-	cls_u32.knode.handle = handle;
+	cls_u32.knode.handle = n->handle;
 
 	tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, false);
+	tcf_block_offload_dec(block, &n->flags);
 }
 
 static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
@@ -567,10 +568,10 @@ static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
 
 	err = tc_setup_cb_call(block, NULL, TC_SETUP_CLSU32, &cls_u32, skip_sw);
 	if (err < 0) {
-		u32_remove_hw_knode(tp, n->handle);
+		u32_remove_hw_knode(tp, n);
 		return err;
 	} else if (err > 0) {
-		n->flags |= TCA_CLS_FLAGS_IN_HW;
+		tcf_block_offload_inc(block, &n->flags);
 	}
 
 	if (skip_sw && !(n->flags & TCA_CLS_FLAGS_IN_HW))
@@ -589,7 +590,7 @@ static void u32_clear_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht)
 			RCU_INIT_POINTER(ht->ht[h],
 					 rtnl_dereference(n->next));
 			tcf_unbind_filter(tp, &n->res);
-			u32_remove_hw_knode(tp, n->handle);
+			u32_remove_hw_knode(tp, n);
 			idr_remove_ext(&ht->handle_idr, n->handle);
 			if (tcf_exts_get_net(&n->exts))
 				call_rcu(&n->rcu, u32_delete_key_freepf_rcu);
@@ -682,7 +683,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last)
 		goto out;
 
 	if (TC_U32_KEY(ht->handle)) {
-		u32_remove_hw_knode(tp, ht->handle);
+		u32_remove_hw_knode(tp, (struct tc_u_knode *)ht);
 		ret = u32_delete_key(tp, (struct tc_u_knode *)ht);
 		goto out;
 	}
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v4 06/10] net: sched: allow ingress and clsact qdiscs to share filter blocks
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the previously introduced shared filter blocks
infrastructure and allow ingress and clsact qdisc instances to share
filter blocks. The block index comes from userspace as a qdisc option.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v3->v4:
- rebased on top of the current net-next
v2->v3:
- removed "p_" prefix from block index function args
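
Note for readers: the option handling can be pictured with a small
userspace model. This is a sketch under assumptions, not the netlink
code. struct model_attr and the MODEL_ATTR_* values are invented here
to stand in for nested TCA_CLSACT_* attributes. As in the patch, both
indexes default to 0 when no options are passed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	MODEL_ATTR_UNSPEC,
	MODEL_ATTR_INGRESS_BLOCK,
	MODEL_ATTR_EGRESS_BLOCK,
	MODEL_ATTR_MAX = MODEL_ATTR_EGRESS_BLOCK
};

/* Hypothetical flattened attribute: a type and a u32 payload. */
struct model_attr {
	int type;
	uint32_t value;
};

static int model_parse_opt(const struct model_attr *opt, size_t n,
			   uint32_t *ingress_block_index,
			   uint32_t *egress_block_index)
{
	const struct model_attr *tb[MODEL_ATTR_MAX + 1] = { NULL };
	size_t i;

	assert(ingress_block_index && egress_block_index);
	*ingress_block_index = 0;
	*egress_block_index = 0;

	if (!opt)
		return 0;	/* no options: keep the defaults */

	/* Index known attributes by type; out-of-range types are skipped. */
	for (i = 0; i < n; i++) {
		if (opt[i].type > MODEL_ATTR_UNSPEC &&
		    opt[i].type <= MODEL_ATTR_MAX)
			tb[opt[i].type] = &opt[i];
	}

	if (tb[MODEL_ATTR_INGRESS_BLOCK])
		*ingress_block_index = tb[MODEL_ATTR_INGRESS_BLOCK]->value;
	if (tb[MODEL_ATTR_EGRESS_BLOCK])
		*egress_block_index = tb[MODEL_ATTR_EGRESS_BLOCK]->value;
	return 0;
}
```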
---
 include/uapi/linux/pkt_sched.h | 11 ++++++
 net/sched/sch_ingress.c        | 89 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 37b5096..8cc554a 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -934,4 +934,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 7ca2be2..debcb8f 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -61,6 +61,29 @@ static void clsact_chain_head_change(struct tcf_proto *tp_head, void *priv)
 	struct mini_Qdisc_pair *miniqp = priv;
 
 	mini_qdisc_pair_swap(miniqp, tp_head);
+}
+
+static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int ingress_parse_opt(struct nlattr *opt, u32 *ingress_block_index)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	*ingress_block_index = 0;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, ingress_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK])
+		*ingress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+	return 0;
 }
 
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt,
@@ -74,6 +97,11 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt,
 
 	mini_qdisc_pair_init(&q->miniqp, sch, &dev->miniq_ingress);
 
+	err = ingress_parse_opt(opt, &q->block_info.block_index);
+	if (err)
+		return err;
+
+	q->block_info.shareable = true;
 	q->block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->block_info.chain_head_change = clsact_chain_head_change;
 	q->block_info.chain_head_change_priv = &q->miniqp;
@@ -97,11 +125,14 @@ static void ingress_destroy(struct Qdisc *sch)
 
 static int ingress_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
+	struct ingress_sched_data *q = qdisc_priv(sch);
 	struct nlattr *nest;
 
 	nest = nla_nest_start(skb, TCA_OPTIONS);
 	if (nest == NULL)
 		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->block->index))
+		goto nla_put_failure;
 
 	return nla_nest_end(skb, nest);
 
@@ -170,6 +201,35 @@ static struct tcf_block *clsact_tcf_block(struct Qdisc *sch, unsigned long cl,
 	}
 }
 
+static const struct nla_policy clsact_policy[TCA_CLSACT_MAX + 1] = {
+	[TCA_CLSACT_INGRESS_BLOCK]	= { .type = NLA_U32 },
+	[TCA_CLSACT_EGRESS_BLOCK]	= { .type = NLA_U32 },
+};
+
+static int clsact_parse_opt(struct nlattr *opt, u32 *ingress_block_index,
+			    u32 *egress_block_index)
+{
+	struct nlattr *tb[TCA_CLSACT_MAX + 1];
+	int err;
+
+	*ingress_block_index = 0;
+	*egress_block_index = 0;
+
+	if (!opt)
+		return 0;
+	err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, clsact_policy, NULL);
+	if (err)
+		return err;
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK])
+		*ingress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+	if (tb[TCA_CLSACT_EGRESS_BLOCK])
+		*egress_block_index =
+			nla_get_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+	return 0;
+}
+
 static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
 		       struct netlink_ext_ack *extack)
 {
@@ -182,6 +242,12 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
 
 	mini_qdisc_pair_init(&q->miniqp_ingress, sch, &dev->miniq_ingress);
 
+	err = clsact_parse_opt(opt, &q->ingress_block_info.block_index,
+			       &q->egress_block_info.block_index);
+	if (err)
+		return err;
+
+	q->ingress_block_info.shareable = true;
 	q->ingress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
 	q->ingress_block_info.chain_head_change = clsact_chain_head_change;
 	q->ingress_block_info.chain_head_change_priv = &q->miniqp_ingress;
@@ -193,6 +259,7 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
 
 	mini_qdisc_pair_init(&q->miniqp_egress, sch, &dev->miniq_egress);
 
+	q->egress_block_info.shareable = true;
 	q->egress_block_info.binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS;
 	q->egress_block_info.chain_head_change = clsact_chain_head_change;
 	q->egress_block_info.chain_head_change_priv = &q->miniqp_egress;
@@ -218,6 +285,26 @@ static void clsact_destroy(struct Qdisc *sch)
 	net_dec_egress_queue();
 }
 
+static int clsact_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct clsact_sched_data *q = qdisc_priv(sch);
+	struct nlattr *nest;
+
+	nest = nla_nest_start(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->ingress_block->index))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, TCA_CLSACT_EGRESS_BLOCK, q->egress_block->index))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, nest);
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -1;
+}
+
 static const struct Qdisc_class_ops clsact_class_ops = {
 	.leaf		=	ingress_leaf,
 	.find		=	clsact_find,
@@ -233,7 +320,7 @@ static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.priv_size	=	sizeof(struct clsact_sched_data),
 	.init		=	clsact_init,
 	.destroy	=	clsact_destroy,
-	.dump		=	ingress_dump,
+	.dump		=	clsact_dump,
 	.owner		=	THIS_MODULE,
 };
 
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v4 07/10] mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That required moving ht insertion/removal and bind/unbind
calls into mlxsw_sp_acl_ruleset_create/destroy.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
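
Note for readers: after the reshuffle, create() owns every setup step
and unwinds them in reverse order through a goto ladder, and destroy()
mirrors it. The sketch below models only that control flow in
userspace. The step names, struct model_hw and the fail_mask fault
injection are all hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

enum model_step {
	STEP_HT_INIT   = 1 << 0,
	STEP_OPS_ADD   = 1 << 1,
	STEP_HT_INSERT = 1 << 2,
	STEP_BIND      = 1 << 3
};

struct model_hw {
	unsigned int done;	/* bitmask of steps currently in effect */
};

struct model_ruleset {
	unsigned int chain_index;
};

/* Perform one step, or fail if fail_mask asks for it. */
static int model_step(struct model_hw *hw, enum model_step s,
		      unsigned int fail_mask)
{
	if (fail_mask & s)
		return -1;
	hw->done |= s;
	return 0;
}

static void model_undo(struct model_hw *hw, enum model_step s)
{
	assert(hw->done & s);	/* only undo what was actually done */
	hw->done &= ~(unsigned int)s;
}

static struct model_ruleset *
model_ruleset_create(struct model_hw *hw, unsigned int chain_index,
		     unsigned int fail_mask)
{
	struct model_ruleset *rs = malloc(sizeof(*rs));

	if (!rs)
		return NULL;
	rs->chain_index = chain_index;

	if (model_step(hw, STEP_HT_INIT, fail_mask))
		goto err_ht_init;
	if (model_step(hw, STEP_OPS_ADD, fail_mask))
		goto err_ops_add;
	if (model_step(hw, STEP_HT_INSERT, fail_mask))
		goto err_ht_insert;
	/* Only chain 0 is bound directly, as in the patch. */
	if (!chain_index && model_step(hw, STEP_BIND, fail_mask))
		goto err_bind;
	return rs;

err_bind:
	model_undo(hw, STEP_HT_INSERT);
err_ht_insert:
	model_undo(hw, STEP_OPS_ADD);
err_ops_add:
	model_undo(hw, STEP_HT_INIT);
err_ht_init:
	free(rs);
	return NULL;
}

static void model_ruleset_destroy(struct model_hw *hw,
				  struct model_ruleset *rs)
{
	if (!rs->chain_index)
		model_undo(hw, STEP_BIND);
	model_undo(hw, STEP_HT_INSERT);
	model_undo(hw, STEP_OPS_ADD);
	model_undo(hw, STEP_HT_INIT);
	free(rs);
}
```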
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 102 ++++++++++-----------
 1 file changed, 46 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 93dcd31..ead4cb8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -118,8 +118,26 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
+static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+				     struct mlxsw_sp_acl_ruleset *ruleset,
+				     struct net_device *dev, bool ingress)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+}
+
+static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+					struct mlxsw_sp_acl_ruleset *ruleset)
+{
+	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+}
+
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
+			    bool ingress, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -132,6 +150,9 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
+	ruleset->ht_key.dev = dev;
+	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
 	err = rhashtable_init(&ruleset->rule_ht, &mlxsw_sp_acl_rule_ht_params);
@@ -142,68 +163,49 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		goto err_ops_ruleset_add;
 
-	return ruleset;
-
-err_ops_ruleset_add:
-	rhashtable_destroy(&ruleset->rule_ht);
-err_rhashtable_init:
-	kfree(ruleset);
-	return ERR_PTR(err);
-}
-
-static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
-					 struct mlxsw_sp_acl_ruleset *ruleset)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-
-	ops->ruleset_del(mlxsw_sp, ruleset->priv);
-	rhashtable_destroy(&ruleset->rule_ht);
-	kfree(ruleset);
-}
-
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress,
-				     u32 chain_index)
-{
-	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
-	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
-	int err;
-
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
-	ruleset->ht_key.chain_index = chain_index;
 	err = rhashtable_insert_fast(&acl->ruleset_ht, &ruleset->ht_node,
 				     mlxsw_sp_acl_ruleset_ht_params);
 	if (err)
-		return err;
-	if (!ruleset->ht_key.chain_index) {
+		goto err_ht_insert;
+
+	if (!chain_index) {
 		/* We only need ruleset with chain index 0, the implicit one,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
+						dev, ingress);
 		if (err)
-			goto err_ops_ruleset_bind;
+			goto err_ruleset_bind;
 	}
-	return 0;
 
-err_ops_ruleset_bind:
+	return ruleset;
+
+err_ruleset_bind:
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
-	return err;
+err_ht_insert:
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+err_ops_ruleset_add:
+	rhashtable_destroy(&ruleset->rule_ht);
+err_rhashtable_init:
+	kfree(ruleset);
+	return ERR_PTR(err);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
+					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
+	ops->ruleset_del(mlxsw_sp, ruleset->priv);
+	rhashtable_destroy(&ruleset->rule_ht);
+	kfree(ruleset);
 }
 
 static void mlxsw_sp_acl_ruleset_ref_inc(struct mlxsw_sp_acl_ruleset *ruleset)
@@ -216,7 +218,6 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 {
 	if (--ruleset->ref_count)
 		return;
-	mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
 	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
 }
 
@@ -263,7 +264,6 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	const struct mlxsw_sp_acl_profile_ops *ops;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 	struct mlxsw_sp_acl_ruleset *ruleset;
-	int err;
 
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
@@ -275,18 +275,8 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	ruleset = mlxsw_sp_acl_ruleset_create(mlxsw_sp, ops);
-	if (IS_ERR(ruleset))
-		return ruleset;
-	err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset, dev,
-					ingress, chain_index);
-	if (err)
-		goto err_ruleset_bind;
-	return ruleset;
-
-err_ruleset_bind:
-	mlxsw_sp_acl_ruleset_destroy(mlxsw_sp, ruleset);
-	return ERR_PTR(err);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
+					   chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v4 08/10] mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Instead, pass netdev and ingress flag to ruleset unbind op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  3 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  9 ++++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 33 +++++++++++-----------
 3 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index a0adcd8..523e64e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -477,7 +477,8 @@ struct mlxsw_sp_acl_profile_ops {
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
 			    struct net_device *dev, bool ingress);
-	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
+	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
+			       struct net_device *dev, bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index ead4cb8..7fb41a4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -128,11 +128,12 @@ static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 }
 
 static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset)
+					struct mlxsw_sp_acl_ruleset *ruleset,
+					struct net_device *dev, bool ingress)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
 }
 
 static struct mlxsw_sp_acl_ruleset *
@@ -200,7 +201,9 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
 	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset);
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
+					    ruleset->ht_key.dev,
+					    ruleset->ht_key.ingress);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 7e8284b..50b2f9a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -154,10 +154,6 @@ struct mlxsw_sp_acl_tcam_group {
 	struct list_head region_list;
 	unsigned int region_count;
 	struct rhashtable chunk_ht;
-	struct {
-		u16 local_port;
-		bool ingress;
-	} bound;
 	struct mlxsw_sp_acl_tcam_group_ops *ops;
 	const struct mlxsw_sp_acl_tcam_pattern *patterns;
 	unsigned int patterns_count;
@@ -271,26 +267,28 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 		return -EINVAL;
 
 	mlxsw_sp_port = netdev_priv(dev);
-	group->bound.local_port = mlxsw_sp_port->local_port;
-	group->bound.ingress = ingress;
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_BIND, group->bound.local_port,
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
 			    group->id);
 	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
 
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
-			       struct mlxsw_sp_acl_tcam_group *group)
+			       struct mlxsw_sp_acl_tcam_group *group,
+			       struct net_device *dev, bool ingress)
 {
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	mlxsw_reg_ppbt_pack(ppbt_pl,
-			    group->bound.ingress ? MLXSW_REG_PXBT_E_IACL :
-						   MLXSW_REG_PXBT_E_EACL,
-			    MLXSW_REG_PXBT_OP_UNBIND, group->bound.local_port,
+	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
+		return;
+
+	mlxsw_sp_port = netdev_priv(dev);
+	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
+					       MLXSW_REG_PXBT_E_EACL,
+			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
 			    group->id);
 	mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ppbt), ppbt_pl);
 }
@@ -1066,11 +1064,12 @@ mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					void *ruleset_priv)
+					void *ruleset_priv,
+					struct net_device *dev, bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
 }
 
 static u16
-- 
2.9.5

^ permalink raw reply related

* [patch net-next v4 09/10] mlxsw: spectrum_acl: Implement TC block sharing
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Benefit from the prepared TC and in-driver ACL infrastructure and
introduce block sharing offload. For that, a new struct "block" is
introduced in spectrum_acl in order to hold a list of specific
block-port bindings.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v2->v3:
- add tc offload feature handling
v1->v2:
- new patch
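
Note for readers: the idea of a block holding a list of block-port
bindings can be sketched as below. This is a userspace model under
assumptions, not the driver code. The model_* names, the int port
identifier and the -EEXIST/-ENOENT return values are choices made for
this illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_binding {
	struct model_binding *next;
	int port;		/* stand-in for a port pointer */
	bool ingress;
};

struct model_acl_block {
	struct model_binding *bindings;	/* singly linked list head */
};

/* Return the link that points at the matching binding, or NULL. */
static struct model_binding **
model_binding_find(struct model_acl_block *block, int port, bool ingress)
{
	struct model_binding **pp;

	for (pp = &block->bindings; *pp; pp = &(*pp)->next)
		if ((*pp)->port == port && (*pp)->ingress == ingress)
			return pp;
	return NULL;
}

static int model_block_bind(struct model_acl_block *block, int port,
			    bool ingress)
{
	struct model_binding *b;

	assert(block);
	if (model_binding_find(block, port, ingress))
		return -EEXIST;	/* each port/direction binds once */
	b = malloc(sizeof(*b));
	if (!b)
		return -ENOMEM;
	b->port = port;
	b->ingress = ingress;
	b->next = block->bindings;
	block->bindings = b;
	return 0;
}

static int model_block_unbind(struct model_acl_block *block, int port,
			      bool ingress)
{
	struct model_binding **pp = model_binding_find(block, port, ingress);
	struct model_binding *b;

	if (!pp)
		return -ENOENT;
	b = *pp;
	*pp = b->next;	/* unlink, then free */
	free(b);
	return 0;
}

static unsigned int
model_block_binding_count(const struct model_acl_block *block)
{
	const struct model_binding *b;
	unsigned int n = 0;

	for (b = block->bindings; b; b = b->next)
		n++;
	return n;
}
```

With the bindings kept on the block rather than on a single port, one
set of rules can be applied to every port that shares the block.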
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     | 182 +++++++++++++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  37 +++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 237 ++++++++++++++++++---
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  41 ++--
 4 files changed, 401 insertions(+), 96 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index d373df7..fa2896f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1747,72 +1747,186 @@ static int mlxsw_sp_setup_tc_cls_matchall(struct mlxsw_sp_port *mlxsw_sp_port,
 }
 
 static int
-mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_port *mlxsw_sp_port,
-			     struct tc_cls_flower_offload *f,
-			     bool ingress)
+mlxsw_sp_setup_tc_cls_flower(struct mlxsw_sp_acl_block *acl_block,
+			     struct tc_cls_flower_offload *f)
 {
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_acl_block_mlxsw_sp(acl_block);
+
 	switch (f->command) {
 	case TC_CLSFLOWER_REPLACE:
-		return mlxsw_sp_flower_replace(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_replace(mlxsw_sp, acl_block, f);
 	case TC_CLSFLOWER_DESTROY:
-		mlxsw_sp_flower_destroy(mlxsw_sp_port, ingress, f);
+		mlxsw_sp_flower_destroy(mlxsw_sp, acl_block, f);
 		return 0;
 	case TC_CLSFLOWER_STATS:
-		return mlxsw_sp_flower_stats(mlxsw_sp_port, ingress, f);
+		return mlxsw_sp_flower_stats(mlxsw_sp, acl_block, f);
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
-				      void *cb_priv, bool ingress)
+static int mlxsw_sp_setup_tc_block_cb_matchall(enum tc_setup_type type,
+					       void *type_data,
+					       void *cb_priv, bool ingress)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = cb_priv;
 
-	if (!tc_can_offload(mlxsw_sp_port->dev))
-		return -EOPNOTSUPP;
-
 	switch (type) {
 	case TC_SETUP_CLSMATCHALL:
+		if (!tc_can_offload(mlxsw_sp_port->dev))
+			return -EOPNOTSUPP;
+
 		return mlxsw_sp_setup_tc_cls_matchall(mlxsw_sp_port, type_data,
 						      ingress);
 	case TC_SETUP_CLSFLOWER:
-		return mlxsw_sp_setup_tc_cls_flower(mlxsw_sp_port, type_data,
-						    ingress);
+		return 0;
 	default:
 		return -EOPNOTSUPP;
 	}
 }
 
-static int mlxsw_sp_setup_tc_block_cb_ig(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int mlxsw_sp_setup_tc_block_cb_matchall_ig(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, true);
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, true);
 }
 
-static int mlxsw_sp_setup_tc_block_cb_eg(enum tc_setup_type type,
-					 void *type_data, void *cb_priv)
+static int mlxsw_sp_setup_tc_block_cb_matchall_eg(enum tc_setup_type type,
+						  void *type_data,
+						  void *cb_priv)
 {
-	return mlxsw_sp_setup_tc_block_cb(type, type_data, cb_priv, false);
+	return mlxsw_sp_setup_tc_block_cb_matchall(type, type_data,
+						   cb_priv, false);
+}
+
+static int mlxsw_sp_setup_tc_block_cb_flower(enum tc_setup_type type,
+					     void *type_data, void *cb_priv)
+{
+	struct mlxsw_sp_acl_block *acl_block = cb_priv;
+
+	switch (type) {
+	case TC_SETUP_CLSMATCHALL:
+		return 0;
+	case TC_SETUP_CLSFLOWER:
+		if (mlxsw_sp_acl_block_disabled(acl_block))
+			return -EOPNOTSUPP;
+
+		return mlxsw_sp_setup_tc_cls_flower(acl_block, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+mlxsw_sp_setup_tc_block_flower_bind(struct mlxsw_sp_port *mlxsw_sp_port,
+				    struct tcf_block *block, bool ingress)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb) {
+		acl_block = mlxsw_sp_acl_block_create(mlxsw_sp, block->net);
+		if (!acl_block)
+			return -ENOMEM;
+		block_cb = __tcf_block_cb_register(block,
+						   mlxsw_sp_setup_tc_block_cb_flower,
+						   mlxsw_sp, acl_block);
+		if (IS_ERR(block_cb)) {
+			err = PTR_ERR(block_cb);
+			goto err_cb_register;
+		}
+	} else {
+		acl_block = tcf_block_cb_priv(block_cb);
+	}
+	tcf_block_cb_incref(block_cb);
+	err = mlxsw_sp_acl_block_bind(mlxsw_sp, acl_block,
+				      mlxsw_sp_port, ingress);
+	if (err)
+		goto err_block_bind;
+
+	if (ingress)
+		mlxsw_sp_port->ing_acl_block = acl_block;
+	else
+		mlxsw_sp_port->eg_acl_block = acl_block;
+
+	return 0;
+
+err_block_bind:
+	if (!tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+err_cb_register:
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
+	return err;
+}
+
+static void
+mlxsw_sp_setup_tc_block_flower_unbind(struct mlxsw_sp_port *mlxsw_sp_port,
+				      struct tcf_block *block, bool ingress)
+{
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	struct mlxsw_sp_acl_block *acl_block;
+	struct tcf_block_cb *block_cb;
+	int err;
+
+	block_cb = tcf_block_cb_lookup(block, mlxsw_sp_setup_tc_block_cb_flower,
+				       mlxsw_sp);
+	if (!block_cb)
+		return;
+
+	if (ingress)
+		mlxsw_sp_port->ing_acl_block = NULL;
+	else
+		mlxsw_sp_port->eg_acl_block = NULL;
+
+	acl_block = tcf_block_cb_priv(block_cb);
+	err = mlxsw_sp_acl_block_unbind(mlxsw_sp, acl_block,
+					mlxsw_sp_port, ingress);
+	if (!err && !tcf_block_cb_decref(block_cb)) {
+		__tcf_block_cb_unregister(block_cb);
+		mlxsw_sp_acl_block_destroy(acl_block);
+	}
 }
 
 static int mlxsw_sp_setup_tc_block(struct mlxsw_sp_port *mlxsw_sp_port,
 				   struct tc_block_offload *f)
 {
 	tc_setup_cb_t *cb;
+	bool ingress;
+	int err;
 
-	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_ig;
-	else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS)
-		cb = mlxsw_sp_setup_tc_block_cb_eg;
-	else
+	if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_ig;
+		ingress = true;
+	} else if (f->binder_type == TCF_BLOCK_BINDER_TYPE_CLSACT_EGRESS) {
+		cb = mlxsw_sp_setup_tc_block_cb_matchall_eg;
+		ingress = false;
+	} else {
 		return -EOPNOTSUPP;
+	}
 
 	switch (f->command) {
 	case TC_BLOCK_BIND:
-		return tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
-					     mlxsw_sp_port);
+		err = tcf_block_cb_register(f->block, cb, mlxsw_sp_port,
+					    mlxsw_sp_port);
+		if (err)
+			return err;
+		err = mlxsw_sp_setup_tc_block_flower_bind(mlxsw_sp_port,
+							  f->block, ingress);
+		if (err) {
+			tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
+			return err;
+		}
+		return 0;
 	case TC_BLOCK_UNBIND:
+		mlxsw_sp_setup_tc_block_flower_unbind(mlxsw_sp_port,
+						      f->block, ingress);
 		tcf_block_cb_unregister(f->block, cb, mlxsw_sp_port);
 		return 0;
 	default:
@@ -1840,10 +1954,18 @@ static int mlxsw_sp_feature_hw_tc(struct net_device *dev, bool enable)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
 
-	if (!enable && (mlxsw_sp_port->acl_rule_count ||
-			!list_empty(&mlxsw_sp_port->mall_tc_list))) {
-		netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
-		return -EINVAL;
+	if (!enable) {
+		if (mlxsw_sp_acl_block_rule_count(mlxsw_sp_port->ing_acl_block) ||
+		    mlxsw_sp_acl_block_rule_count(mlxsw_sp_port->eg_acl_block) ||
+		    !list_empty(&mlxsw_sp_port->mall_tc_list)) {
+			netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
+			return -EINVAL;
+		}
+		mlxsw_sp_acl_block_disable_inc(mlxsw_sp_port->ing_acl_block);
+		mlxsw_sp_acl_block_disable_inc(mlxsw_sp_port->eg_acl_block);
+	} else {
+		mlxsw_sp_acl_block_disable_dec(mlxsw_sp_port->ing_acl_block);
+		mlxsw_sp_acl_block_disable_dec(mlxsw_sp_port->eg_acl_block);
 	}
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 523e64e..ab6ada7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -270,7 +270,8 @@ struct mlxsw_sp_port {
 	struct mlxsw_sp_port_sample *sample;
 	struct list_head vlans_list;
 	struct mlxsw_sp_qdisc root_qdisc;
-	unsigned acl_rule_count;
+	struct mlxsw_sp_acl_block *ing_acl_block;
+	struct mlxsw_sp_acl_block *eg_acl_block;
 };
 
 static inline bool
@@ -498,17 +499,34 @@ struct mlxsw_sp_acl_ops {
 				       enum mlxsw_sp_acl_profile profile);
 };
 
+struct mlxsw_sp_acl_block;
 struct mlxsw_sp_acl_ruleset;
 
 /* spectrum_acl.c */
 struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl);
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block);
+unsigned int mlxsw_sp_acl_block_rule_count(struct mlxsw_sp_acl_block *block);
+void mlxsw_sp_acl_block_disable_inc(struct mlxsw_sp_acl_block *block);
+void mlxsw_sp_acl_block_disable_dec(struct mlxsw_sp_acl_block *block);
+bool mlxsw_sp_acl_block_disabled(struct mlxsw_sp_acl_block *block);
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net);
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block);
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile);
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile);
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
 			      struct mlxsw_sp_acl_ruleset *ruleset);
@@ -575,11 +593,14 @@ void mlxsw_sp_acl_fini(struct mlxsw_sp *mlxsw_sp);
 extern const struct mlxsw_sp_acl_ops mlxsw_sp_acl_tcam_ops;
 
 /* spectrum_flower.c */
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f);
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f);
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f);
 
 /* spectrum_qdisc.c */
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index 7fb41a4..f98bca9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -39,6 +39,7 @@
 #include <linux/string.h>
 #include <linux/rhashtable.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/tc_act/tc_vlan.h>
 
 #include "reg.h"
@@ -70,9 +71,23 @@ struct mlxsw_afk *mlxsw_sp_acl_afk(struct mlxsw_sp_acl *acl)
 	return acl->afk;
 }
 
-struct mlxsw_sp_acl_ruleset_ht_key {
-	struct net_device *dev; /* dev this ruleset is bound to */
+struct mlxsw_sp_acl_block_binding {
+	struct list_head list;
+	struct net_device *dev;
+	struct mlxsw_sp_port *mlxsw_sp_port;
 	bool ingress;
+};
+
+struct mlxsw_sp_acl_block {
+	struct list_head binding_list;
+	struct mlxsw_sp_acl_ruleset *ruleset_zero;
+	struct mlxsw_sp *mlxsw_sp;
+	unsigned int rule_count;
+	unsigned int disable_count;
+};
+
+struct mlxsw_sp_acl_ruleset_ht_key {
+	struct mlxsw_sp_acl_block *block;
 	u32 chain_index;
 	const struct mlxsw_sp_acl_profile_ops *ops;
 };
@@ -118,27 +133,185 @@ struct mlxsw_sp_fid *mlxsw_sp_acl_dummy_fid(struct mlxsw_sp *mlxsw_sp)
 	return mlxsw_sp->acl->dummy_fid;
 }
 
-static int mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
-				     struct mlxsw_sp_acl_ruleset *ruleset,
-				     struct net_device *dev, bool ingress)
+struct mlxsw_sp *mlxsw_sp_acl_block_mlxsw_sp(struct mlxsw_sp_acl_block *block)
+{
+	return block->mlxsw_sp;
+}
+
+unsigned int mlxsw_sp_acl_block_rule_count(struct mlxsw_sp_acl_block *block)
+{
+	return block ? block->rule_count : 0;
+}
+
+void mlxsw_sp_acl_block_disable_inc(struct mlxsw_sp_acl_block *block)
+{
+	if (block)
+		block->disable_count++;
+}
+
+void mlxsw_sp_acl_block_disable_dec(struct mlxsw_sp_acl_block *block)
+{
+	if (block)
+		block->disable_count--;
+}
+
+bool mlxsw_sp_acl_block_disabled(struct mlxsw_sp_acl_block *block)
 {
+	return block->disable_count;
+}
+
+static int
+mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_acl_block_binding *binding)
+{
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	return ops->ruleset_bind(mlxsw_sp, ruleset->priv, dev, ingress);
+	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
+				 binding->mlxsw_sp_port->dev, binding->ingress);
 }
 
-static void mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
-					struct mlxsw_sp_acl_ruleset *ruleset,
-					struct net_device *dev, bool ingress)
+static void
+mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_acl_block_binding *binding)
 {
+	struct mlxsw_sp_acl_ruleset *ruleset = block->ruleset_zero;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
-	ops->ruleset_unbind(mlxsw_sp, ruleset->priv, dev, ingress);
+	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
+			    binding->mlxsw_sp_port->dev, binding->ingress);
+}
+
+static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
+{
+	return block->ruleset_zero;
+}
+
+static int
+mlxsw_sp_acl_ruleset_block_bind(struct mlxsw_sp *mlxsw_sp,
+				struct mlxsw_sp_acl_ruleset *ruleset,
+				struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	block->ruleset_zero = ruleset;
+	list_for_each_entry(binding, &block->binding_list, list) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto rollback;
+	}
+	return 0;
+
+rollback:
+	list_for_each_entry_continue_reverse(binding, &block->binding_list,
+					     list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+
+	return err;
+}
+
+static void
+mlxsw_sp_acl_ruleset_block_unbind(struct mlxsw_sp *mlxsw_sp,
+				  struct mlxsw_sp_acl_ruleset *ruleset,
+				  struct mlxsw_sp_acl_block *block)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+	block->ruleset_zero = NULL;
+}
+
+struct mlxsw_sp_acl_block *mlxsw_sp_acl_block_create(struct mlxsw_sp *mlxsw_sp,
+						     struct net *net)
+{
+	struct mlxsw_sp_acl_block *block;
+
+	block = kzalloc(sizeof(*block), GFP_KERNEL);
+	if (!block)
+		return NULL;
+	INIT_LIST_HEAD(&block->binding_list);
+	block->mlxsw_sp = mlxsw_sp;
+	return block;
+}
+
+void mlxsw_sp_acl_block_destroy(struct mlxsw_sp_acl_block *block)
+{
+	WARN_ON(!list_empty(&block->binding_list));
+	kfree(block);
+}
+
+static struct mlxsw_sp_acl_block_binding *
+mlxsw_sp_acl_block_lookup(struct mlxsw_sp_acl_block *block,
+			  struct mlxsw_sp_port *mlxsw_sp_port, bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	list_for_each_entry(binding, &block->binding_list, list)
+		if (binding->mlxsw_sp_port == mlxsw_sp_port &&
+		    binding->ingress == ingress)
+			return binding;
+	return NULL;
+}
+
+int mlxsw_sp_acl_block_bind(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+	int err;
+
+	if (WARN_ON(mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress)))
+		return -EEXIST;
+
+	binding = kzalloc(sizeof(*binding), GFP_KERNEL);
+	if (!binding)
+		return -ENOMEM;
+	binding->mlxsw_sp_port = mlxsw_sp_port;
+	binding->ingress = ingress;
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block)) {
+		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, block, binding);
+		if (err)
+			goto err_ruleset_bind;
+	}
+
+	list_add(&binding->list, &block->binding_list);
+	return 0;
+
+err_ruleset_bind:
+	kfree(binding);
+	return err;
+}
+
+int mlxsw_sp_acl_block_unbind(struct mlxsw_sp *mlxsw_sp,
+			      struct mlxsw_sp_acl_block *block,
+			      struct mlxsw_sp_port *mlxsw_sp_port,
+			      bool ingress)
+{
+	struct mlxsw_sp_acl_block_binding *binding;
+
+	binding = mlxsw_sp_acl_block_lookup(block, mlxsw_sp_port, ingress);
+	if (!binding)
+		return -ENOENT;
+
+	list_del(&binding->list);
+
+	if (mlxsw_sp_acl_ruleset_block_bound(block))
+		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, block, binding);
+
+	kfree(binding);
+	return 0;
 }
 
 static struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
@@ -151,8 +324,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ruleset)
 		return ERR_PTR(-ENOMEM);
 	ruleset->ref_count = 1;
-	ruleset->ht_key.dev = dev;
-	ruleset->ht_key.ingress = ingress;
+	ruleset->ht_key.block = block;
 	ruleset->ht_key.chain_index = chain_index;
 	ruleset->ht_key.ops = ops;
 
@@ -174,8 +346,7 @@ mlxsw_sp_acl_ruleset_create(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 		 * to be directly bound to device. The rest of the rulesets
 		 * are bound by "Goto action set".
 		 */
-		err = mlxsw_sp_acl_ruleset_bind(mlxsw_sp, ruleset,
-						dev, ingress);
+		err = mlxsw_sp_acl_ruleset_block_bind(mlxsw_sp, ruleset, block);
 		if (err)
 			goto err_ruleset_bind;
 	}
@@ -198,12 +369,12 @@ static void mlxsw_sp_acl_ruleset_destroy(struct mlxsw_sp *mlxsw_sp,
 					 struct mlxsw_sp_acl_ruleset *ruleset)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
+	struct mlxsw_sp_acl_block *block = ruleset->ht_key.block;
+	u32 chain_index = ruleset->ht_key.chain_index;
 	struct mlxsw_sp_acl *acl = mlxsw_sp->acl;
 
-	if (!ruleset->ht_key.chain_index)
-		mlxsw_sp_acl_ruleset_unbind(mlxsw_sp, ruleset,
-					    ruleset->ht_key.dev,
-					    ruleset->ht_key.ingress);
+	if (!chain_index)
+		mlxsw_sp_acl_ruleset_block_unbind(mlxsw_sp, ruleset, block);
 	rhashtable_remove_fast(&acl->ruleset_ht, &ruleset->ht_node,
 			       mlxsw_sp_acl_ruleset_ht_params);
 	ops->ruleset_del(mlxsw_sp, ruleset->priv);
@@ -225,15 +396,14 @@ static void mlxsw_sp_acl_ruleset_ref_dec(struct mlxsw_sp *mlxsw_sp,
 }
 
 static struct mlxsw_sp_acl_ruleset *
-__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
-			      bool ingress, u32 chain_index,
+__mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl,
+			      struct mlxsw_sp_acl_block *block, u32 chain_index,
 			      const struct mlxsw_sp_acl_profile_ops *ops)
 {
 	struct mlxsw_sp_acl_ruleset_ht_key ht_key;
 
 	memset(&ht_key, 0, sizeof(ht_key));
-	ht_key.dev = dev;
-	ht_key.ingress = ingress;
+	ht_key.block = block;
 	ht_key.chain_index = chain_index;
 	ht_key.ops = ops;
 	return rhashtable_lookup_fast(&acl->ruleset_ht, &ht_key,
@@ -241,8 +411,8 @@ __mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp_acl *acl, struct net_device *dev,
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			    bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block, u32 chain_index,
 			    enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -252,16 +422,15 @@ mlxsw_sp_acl_ruleset_lookup(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	ops = acl->ops->profile_ops(mlxsw_sp, profile);
 	if (!ops)
 		return ERR_PTR(-EINVAL);
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (!ruleset)
 		return ERR_PTR(-ENOENT);
 	return ruleset;
 }
 
 struct mlxsw_sp_acl_ruleset *
-mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
-			 bool ingress, u32 chain_index,
+mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp,
+			 struct mlxsw_sp_acl_block *block, u32 chain_index,
 			 enum mlxsw_sp_acl_profile profile)
 {
 	const struct mlxsw_sp_acl_profile_ops *ops;
@@ -272,14 +441,12 @@ mlxsw_sp_acl_ruleset_get(struct mlxsw_sp *mlxsw_sp, struct net_device *dev,
 	if (!ops)
 		return ERR_PTR(-EINVAL);
 
-	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, dev, ingress,
-						chain_index, ops);
+	ruleset = __mlxsw_sp_acl_ruleset_lookup(acl, block, chain_index, ops);
 	if (ruleset) {
 		mlxsw_sp_acl_ruleset_ref_inc(ruleset);
 		return ruleset;
 	}
-	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, dev, ingress,
-					   chain_index, ops);
+	return mlxsw_sp_acl_ruleset_create(mlxsw_sp, block, chain_index, ops);
 }
 
 void mlxsw_sp_acl_ruleset_put(struct mlxsw_sp *mlxsw_sp,
@@ -528,6 +695,7 @@ int mlxsw_sp_acl_rule_add(struct mlxsw_sp *mlxsw_sp,
 		goto err_rhashtable_insert;
 
 	list_add_tail(&rule->list, &mlxsw_sp->acl->rules);
+	ruleset->ht_key.block->rule_count++;
 	return 0;
 
 err_rhashtable_insert:
@@ -541,6 +709,7 @@ void mlxsw_sp_acl_rule_del(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_acl_ruleset *ruleset = rule->ruleset;
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
+	ruleset->ht_key.block->rule_count--;
 	list_del(&rule->list);
 	rhashtable_remove_fast(&ruleset->rule_ht, &rule->ht_node,
 			       mlxsw_sp_acl_rule_ht_params);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index 42e8a36..cf7b97d4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -35,6 +35,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/netdevice.h>
+#include <net/net_namespace.h>
 #include <net/flow_dissector.h>
 #include <net/pkt_cls.h>
 #include <net/tc_act/tc_gact.h>
@@ -45,7 +46,7 @@
 #include "core_acl_flex_keys.h"
 
 static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
-					 struct net_device *dev, bool ingress,
+					 struct mlxsw_sp_acl_block *block,
 					 struct mlxsw_sp_acl_rule_info *rulei,
 					 struct tcf_exts *exts)
 {
@@ -80,8 +81,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			struct mlxsw_sp_acl_ruleset *ruleset;
 			u16 group_id;
 
-			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, dev,
-							      ingress,
+			ruleset = mlxsw_sp_acl_ruleset_lookup(mlxsw_sp, block,
 							      chain_index,
 							      MLXSW_SP_ACL_PROFILE_FLOWER);
 			if (IS_ERR(ruleset))
@@ -104,9 +104,6 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 				return err;
 
 			out_dev = tcf_mirred_dev(a);
-			if (out_dev == dev)
-				out_dev = NULL;
-
 			err = mlxsw_sp_acl_rulei_act_fwd(mlxsw_sp, rulei,
 							 out_dev);
 			if (err)
@@ -265,7 +262,7 @@ static int mlxsw_sp_flower_parse_ip(struct mlxsw_sp *mlxsw_sp,
 }
 
 static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
-				 struct net_device *dev, bool ingress,
+				 struct mlxsw_sp_acl_block *block,
 				 struct mlxsw_sp_acl_rule_info *rulei,
 				 struct tc_cls_flower_offload *f)
 {
@@ -383,21 +380,19 @@ static int mlxsw_sp_flower_parse(struct mlxsw_sp *mlxsw_sp,
 	if (err)
 		return err;
 
-	return mlxsw_sp_flower_parse_actions(mlxsw_sp, dev, ingress,
-					     rulei, f->exts);
+	return mlxsw_sp_flower_parse_actions(mlxsw_sp, block, rulei, f->exts);
 }
 
-int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_replace(struct mlxsw_sp *mlxsw_sp,
+			    struct mlxsw_sp_acl_block *block,
 			    struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
-	struct net_device *dev = mlxsw_sp_port->dev;
 	struct mlxsw_sp_acl_rule_info *rulei;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, dev, ingress,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
 					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
@@ -410,7 +405,7 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	rulei = mlxsw_sp_acl_rule_rulei(rule);
-	err = mlxsw_sp_flower_parse(mlxsw_sp, dev, ingress, rulei, f);
+	err = mlxsw_sp_flower_parse(mlxsw_sp, block, rulei, f);
 	if (err)
 		goto err_flower_parse;
 
@@ -423,7 +418,6 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 		goto err_rule_add;
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
-	mlxsw_sp_port->acl_rule_count++;
 	return 0;
 
 err_rule_add:
@@ -435,15 +429,15 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	return err;
 }
 
-void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+void mlxsw_sp_flower_destroy(struct mlxsw_sp *mlxsw_sp,
+			     struct mlxsw_sp_acl_block *block,
 			     struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (IS_ERR(ruleset))
 		return;
@@ -455,13 +449,12 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
-	mlxsw_sp_port->acl_rule_count--;
 }
 
-int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
+int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
+			  struct mlxsw_sp_acl_block *block,
 			  struct tc_cls_flower_offload *f)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct mlxsw_sp_acl_ruleset *ruleset;
 	struct mlxsw_sp_acl_rule *rule;
 	u64 packets;
@@ -469,8 +462,8 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	u64 bytes;
 	int err;
 
-	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, mlxsw_sp_port->dev,
-					   ingress, f->common.chain_index,
+	ruleset = mlxsw_sp_acl_ruleset_get(mlxsw_sp, block,
+					   f->common.chain_index,
 					   MLXSW_SP_ACL_PROFILE_FLOWER);
 	if (WARN_ON(IS_ERR(ruleset)))
 		return -EINVAL;
-- 
2.9.5

^ permalink raw reply related
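
Editor's sketch, not part of the patch: mlxsw_sp_acl_ruleset_block_bind() above binds the chain-0 ruleset to every binding of a shared block and, on failure, unbinds the ones already bound in reverse order. The userspace model below illustrates only that bind-all-or-roll-back pattern. Every name in it (struct binding, struct block, hw_bind, hw_unbind, block_bind_all, the fail_port knob) is hypothetical and stands in for the driver's real structures and register writes; a fixed array replaces the kernel's list_head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BINDINGS 8

/* Hypothetical stand-in for one (port, direction) binding of a shared block. */
struct binding {
	int port;
	bool ingress;
	bool bound;
};

/* Hypothetical shared block: a fixed array instead of a kernel list. */
struct block {
	struct binding bindings[MAX_BINDINGS];
	size_t nbindings;
};

/* Pretend hardware bind; fails for the port passed as fail_port. */
static int hw_bind(struct binding *b, int fail_port)
{
	if (b->port == fail_port)
		return -1;
	b->bound = true;
	return 0;
}

/* Pretend hardware unbind; only valid on a binding that is bound. */
static void hw_unbind(struct binding *b)
{
	assert(b->bound);
	b->bound = false;
}

/* Bind every binding of the block, or leave none bound on failure. */
static int block_bind_all(struct block *blk, int fail_port)
{
	size_t i;
	int err;

	for (i = 0; i < blk->nbindings; i++) {
		err = hw_bind(&blk->bindings[i], fail_port);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	/* i indexes the binding that failed; undo those before it, newest first. */
	while (i-- > 0)
		hw_unbind(&blk->bindings[i]);
	return err;
}
```

With three bindings where the third one fails, block_bind_all() returns the error and leaves no binding bound, which is the property the kernel code gets from list_for_each_entry_continue_reverse().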

* [patch net-next v4 10/10] mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops
From: Jiri Pirko @ 2017-12-23 15:54 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

No need to convert from mlxsw_sp_port to net_device and back again.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |  6 +++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |  4 ++--
 .../ethernet/mellanox/mlxsw/spectrum_acl_tcam.c    | 27 +++++++++-------------
 3 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index ab6ada7..525552d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -477,9 +477,11 @@ struct mlxsw_sp_acl_profile_ops {
 			   void *priv, void *ruleset_priv);
 	void (*ruleset_del)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv);
 	int (*ruleset_bind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			    struct net_device *dev, bool ingress);
+			    struct mlxsw_sp_port *mlxsw_sp_port,
+			    bool ingress);
 	void (*ruleset_unbind)(struct mlxsw_sp *mlxsw_sp, void *ruleset_priv,
-			       struct net_device *dev, bool ingress);
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress);
 	u16 (*ruleset_group_id)(void *ruleset_priv);
 	size_t rule_priv_size;
 	int (*rule_add)(struct mlxsw_sp *mlxsw_sp,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index f98bca9..9439bfa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -169,7 +169,7 @@ mlxsw_sp_acl_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	return ops->ruleset_bind(mlxsw_sp, ruleset->priv,
-				 binding->mlxsw_sp_port->dev, binding->ingress);
+				 binding->mlxsw_sp_port, binding->ingress);
 }
 
 static void
@@ -181,7 +181,7 @@ mlxsw_sp_acl_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 	const struct mlxsw_sp_acl_profile_ops *ops = ruleset->ht_key.ops;
 
 	ops->ruleset_unbind(mlxsw_sp, ruleset->priv,
-			    binding->mlxsw_sp_port->dev, binding->ingress);
+			    binding->mlxsw_sp_port, binding->ingress);
 }
 
 static bool mlxsw_sp_acl_ruleset_block_bound(struct mlxsw_sp_acl_block *block)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
index 50b2f9a..c6e180c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c
@@ -258,15 +258,11 @@ static void mlxsw_sp_acl_tcam_group_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 			     struct mlxsw_sp_acl_tcam_group *group,
-			     struct net_device *dev, bool ingress)
+			     struct mlxsw_sp_port *mlxsw_sp_port,
+			     bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (!mlxsw_sp_port_dev_check(dev))
-		return -EINVAL;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_BIND, mlxsw_sp_port->local_port,
@@ -277,15 +273,11 @@ mlxsw_sp_acl_tcam_group_bind(struct mlxsw_sp *mlxsw_sp,
 static void
 mlxsw_sp_acl_tcam_group_unbind(struct mlxsw_sp *mlxsw_sp,
 			       struct mlxsw_sp_acl_tcam_group *group,
-			       struct net_device *dev, bool ingress)
+			       struct mlxsw_sp_port *mlxsw_sp_port,
+			       bool ingress)
 {
-	struct mlxsw_sp_port *mlxsw_sp_port;
 	char ppbt_pl[MLXSW_REG_PPBT_LEN];
 
-	if (WARN_ON(!mlxsw_sp_port_dev_check(dev)))
-		return;
-
-	mlxsw_sp_port = netdev_priv(dev);
 	mlxsw_reg_ppbt_pack(ppbt_pl, ingress ? MLXSW_REG_PXBT_E_IACL :
 					       MLXSW_REG_PXBT_E_EACL,
 			    MLXSW_REG_PXBT_OP_UNBIND, mlxsw_sp_port->local_port,
@@ -1054,22 +1046,25 @@ mlxsw_sp_acl_tcam_flower_ruleset_del(struct mlxsw_sp *mlxsw_sp,
 static int
 mlxsw_sp_acl_tcam_flower_ruleset_bind(struct mlxsw_sp *mlxsw_sp,
 				      void *ruleset_priv,
-				      struct net_device *dev, bool ingress)
+				      struct mlxsw_sp_port *mlxsw_sp_port,
+				      bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
 	return mlxsw_sp_acl_tcam_group_bind(mlxsw_sp, &ruleset->group,
-					    dev, ingress);
+					    mlxsw_sp_port, ingress);
 }
 
 static void
 mlxsw_sp_acl_tcam_flower_ruleset_unbind(struct mlxsw_sp *mlxsw_sp,
 					void *ruleset_priv,
-					struct net_device *dev, bool ingress)
+					struct mlxsw_sp_port *mlxsw_sp_port,
+					bool ingress)
 {
 	struct mlxsw_sp_acl_tcam_flower_ruleset *ruleset = ruleset_priv;
 
-	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group, dev, ingress);
+	mlxsw_sp_acl_tcam_group_unbind(mlxsw_sp, &ruleset->group,
+				       mlxsw_sp_port, ingress);
 }
 
 static u16
-- 
2.9.5

^ permalink raw reply related
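
Editor's sketch, not part of the patch: the change above drops a net_device round trip from the bind/unbind path. The userspace model below contrasts the two call shapes under simplified assumptions. struct port, struct netdev, group_bind_dev and group_bind_port are hypothetical names, and the is_switch_port flag only imitates the role that mlxsw_sp_port_dev_check() plays in the removed code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for mlxsw_sp_port and net_device. */
struct port {
	int local_port;
};

struct netdev {
	int is_switch_port;	/* imitates a "does this dev belong to us" check */
	struct port priv;	/* imitates netdev_priv(dev) */
};

/* Old shape: take the generic device, validate it, then recover the port. */
static int group_bind_dev(struct netdev *dev, int *bound_port)
{
	assert(bound_port != NULL);
	if (!dev->is_switch_port)
		return -EINVAL;
	*bound_port = dev->priv.local_port;
	return 0;
}

/* New shape: the caller already holds the port, so no check is needed. */
static int group_bind_port(const struct port *p, int *bound_port)
{
	assert(bound_port != NULL);
	*bound_port = p->local_port;
	return 0;
}
```

The second form has no failure path for a foreign device, because the type of the argument already guarantees that the caller passed a port of this driver.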

* Re: [PATCH 2/2] Revert "xfrm: Fix stack-out-of-bounds read in xfrm_state_find."
From: David Miller @ 2017-12-23 15:56 UTC (permalink / raw)
  To: steffen.klassert; +Cc: herbert, robsonde, netdev
In-Reply-To: <20171223092217.4wbnjhpw2itbzaty@gauss3.secunet.de>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Sat, 23 Dec 2017 10:22:17 +0100

> On Thu, Nov 16, 2017 at 11:00:40AM +0100, Steffen Klassert wrote:
>> This reverts commit c9f3f813d462c72dbe412cee6a5cbacf13c4ad5e.
>> 
>> This commit breaks transport mode when the policy template
>> has wildcard addresses configured, so revert it.
>> 
>> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> 
> David, can you please queue this one up for v4.14-stable?
> Commit ID is 94802151894d482e82c324edf2c658f8e6b96508
> 
> v4.14 is unusable for some people without this revert.

Yes, but it adds back the stack out-of-bounds bug.

If I queue up the revert, I would also need to queue up whatever
follow-on you used to fix the out-of-bounds bug properly.  Which
commit is that?

^ permalink raw reply

* Re: [PATCH v3 00/27] kill devm_ioremap_nocache
From: Guenter Roeck @ 2017-12-23 15:57 UTC (permalink / raw)
  To: Greg KH, Yisheng Xie
  Cc: linux-mips, ulf.hansson, jakub.kicinski, platform-driver-x86,
	airlied, linux-wireless, linus.walleij, alsa-devel, dri-devel,
	linux-kernel, linux-ide, linux-mtd, daniel.vetter, dan.j.williams,
	jason, linux-rtc, boris.brezillon, mchehab, dmaengine, vinod.koul,
	richard, marek.vasut, industrypack-devel, linux-pci, dvhart, wg,
	linux-media, seanpaul, devel, linux-watchdog, arnd, b.zolnierkie,
	marc.zyngier, jslaby
In-Reply-To: <20171223134831.GB10103@kroah.com>

On 12/23/2017 05:48 AM, Greg KH wrote:
> On Sat, Dec 23, 2017 at 06:55:25PM +0800, Yisheng Xie wrote:
>> Hi all,
>>
>> While using devm_ioremap and reviewing the related code, I found that
>> devm_ioremap and devm_ioremap_nocache are almost identical, except that
>> one uses ioremap while the other uses ioremap_nocache.
> 
> For all arches?  Really?  Look at MIPS, and x86, they have different
> functions.
> 

Both mips and x86 end up mapping the same function, but other arches don't.
mn10300 is one where ioremap and ioremap_nocache are definitely different.

Guenter

>> While ioremap's
>> default function is ioremap_nocache, so devm_ioremap_nocache also have the
>> same function with devm_ioremap, which can just be killed to reduce the size
>> of devres.o(from 20304 bytes to 18992 bytes in my compile environment).
>>
>> I have posted two versions, which use macro instead of function for
>> devm_ioremap_nocache[1] or devm_ioremap[2]. And Greg suggest me to kill
>> devm_ioremap_nocache for no need to keep a macro around for the duplicate
>> thing. So here comes v3 and please help to review.
> 
> I don't think this can be done, what am I missing?  These functions are
> not identical, sorry for missing that before.
> 
> thanks,
> 
> greg k-h
> 

^ permalink raw reply

* Re: kasan for bpf
From: David Miller @ 2017-12-23 16:03 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: jannh, netdev, daniel, ben, linux-kernel, kernel-team
In-Reply-To: <20171223043155.biljcns7iwe7n633@ast-mbp>

From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Fri, 22 Dec 2017 20:31:56 -0800

> Thoughts?

Even though you propose it as the opposite, it sounds like a crutch
for the verifier.

If we strictly control objects that the eBPF program can access,
verifier ensures this, and all other objects go through helpers,
then I cannot see what kasan for bpf can buy us.

To me it tells the world "yes, verifier and carefully designed helpers
are insufficient" and that's not the message I have been giving to
rooms full of hundreds of people listening to my xdp/bpf
presentations.

^ permalink raw reply

* [patch iproute2] tc: implement filter block sharing to ingress and clsact qdiscs
From: Jiri Pirko @ 2017-12-23 16:06 UTC (permalink / raw)
  To: netdev
  Cc: davem, jhs, xiyou.wangcong, mlxsw, andrew, vivien.didelot,
	f.fainelli, michael.chan, ganeshgr, saeedm, matanb, leonro,
	idosch, jakub.kicinski, simon.horman, pieter.jansenvanvuuren,
	john.hurley, alexander.h.duyck, ogerlitz, john.fastabend, daniel,
	dsahern
In-Reply-To: <20171223155436.9014-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/uapi/linux/pkt_sched.h | 11 +++++++++
 tc/q_clsact.c                  | 56 ++++++++++++++++++++++++++++++++++++++----
 tc/q_ingress.c                 | 32 +++++++++++++++++++++---
 3 files changed, 91 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 37b5096..8cc554a 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -934,4 +934,15 @@ enum {
 
 #define TCA_CBS_MAX (__TCA_CBS_MAX - 1)
 
+/* Ingress/clsact */
+
+enum {
+	TCA_CLSACT_UNSPEC,
+	TCA_CLSACT_INGRESS_BLOCK,
+	TCA_CLSACT_EGRESS_BLOCK,
+	__TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX	(__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/tc/q_clsact.c b/tc/q_clsact.c
index 341f653..06d67db 100644
--- a/tc/q_clsact.c
+++ b/tc/q_clsact.c
@@ -7,23 +7,69 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... clsact\n");
+	fprintf(stderr, "Usage: ... clsact [ingress_block BLOCK_INDEX] [egress_block BLOCK_INDEX]\n");
 }
 
 static int clsact_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			    struct nlmsghdr *n, const char *dev)
 {
-	if (argc > 0) {
-		fprintf(stderr, "What is \"%s\"?\n", *argv);
-		explain();
-		return -1;
+	struct rtattr *tail;
+	unsigned int ingress_block = 0;
+	unsigned int egress_block = 0;
+
+	while (argc > 0) {
+		if (strcmp(*argv, "ingress_block") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&ingress_block, *argv, 0)) {
+				fprintf(stderr, "Illegal \"ingress_block\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "egress_block") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&egress_block, *argv, 0)) {
+				fprintf(stderr, "Illegal \"egress_block\"\n");
+				return -1;
+			}
+		} else {
+			fprintf(stderr, "What is \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		NEXT_ARG_FWD();
 	}
 
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (ingress_block)
+		addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, ingress_block);
+	if (egress_block)
+		addattr32(n, 1024, TCA_CLSACT_EGRESS_BLOCK, egress_block);
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
 
 static int clsact_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 {
+	struct rtattr *tb[TCA_CLSACT_MAX + 1];
+	unsigned int block;
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "ingress_block",
+			   "ingress_block %u ", block);
+	}
+	if (tb[TCA_CLSACT_EGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_EGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "egress_block",
+			   "egress_block %u ", block);
+	}
 	return 0;
 }
 
diff --git a/tc/q_ingress.c b/tc/q_ingress.c
index 1e42229..6899c4d 100644
--- a/tc/q_ingress.c
+++ b/tc/q_ingress.c
@@ -17,30 +17,56 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... ingress\n");
+	fprintf(stderr, "Usage: ... ingress [block BLOCK_INDEX]\n");
 }
 
 static int ingress_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			     struct nlmsghdr *n, const char *dev)
 {
+	struct rtattr *tail;
+	unsigned int block = 0;
+
 	while (argc > 0) {
 		if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
-			argc--; argv++;
+		} else if (strcmp(*argv, "block") == 0) {
+			NEXT_ARG();
+			if (get_unsigned(&block, *argv, 0)) {
+				fprintf(stderr, "Illegal \"block\"\n");
+				return -1;
+			}
 		} else {
 			fprintf(stderr, "What is \"%s\"?\n", *argv);
 			explain();
 			return -1;
 		}
+		NEXT_ARG_FWD();
 	}
 
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+	if (block)
+		addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, block);
+	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
 
 static int ingress_print_opt(struct qdisc_util *qu, FILE *f,
 			     struct rtattr *opt)
 {
-	fprintf(f, "---------------- ");
+	struct rtattr *tb[TCA_CLSACT_MAX + 1];
+	unsigned int block;
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+	if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+	    RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+		block = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+		print_uint(PRINT_ANY, "block", "block %u ", block);
+	}
 	return 0;
 }
 
-- 
2.9.5

^ permalink raw reply related

* Re: [PATCH 2/2] Revert "xfrm: Fix stack-out-of-bounds read in xfrm_state_find."
From: Steffen Klassert @ 2017-12-23 16:09 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, robsonde, netdev
In-Reply-To: <20171223.105612.1264946295573777970.davem@davemloft.net>

On Sat, Dec 23, 2017 at 10:56:12AM -0500, David Miller wrote:
> From: Steffen Klassert <steffen.klassert@secunet.com>
> Date: Sat, 23 Dec 2017 10:22:17 +0100
> 
> > On Thu, Nov 16, 2017 at 11:00:40AM +0100, Steffen Klassert wrote:
> >> This reverts commit c9f3f813d462c72dbe412cee6a5cbacf13c4ad5e.
> >> 
> >> This commit breaks transport mode when the policy template
> >> has wildcard addresses configured, so revert it.
> >> 
> >> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> > 
> > David, can you please queue this one up for v4.14-stable?
> > Commit ID is 94802151894d482e82c324edf2c658f8e6b96508
> > 
> > v4.14 is unusable for some people without this revert.
> 
> Yes, but it adds back the stack out-of-bounds bug.
> 
> If I queue up the revert, I would also need to queue up whatever
> follow-on you used to fix the out-of-bounds bug properly.  Which
> commit is that?

This is commit ddc47e4404b58f03e98345398fb12d38fe291512
("xfrm: Fix stack-out-of-bounds read on socket policy lookup.")

It is included in the pull request for the net tree that
I sent yesterday. The patch looks safe, but I am not so sure
if it should go directly to stable. These bugs reported by
the syzbot are usually quite subtle and I already broke
something when I tried to fix the original stack out-of-bounds
bug. So maybe we should wait until the v4.15 release before
backporting...

^ permalink raw reply

* Re: 4.4.103 linux kernel regression
From: Konstantin Khlebnikov @ 2017-12-23 16:36 UTC (permalink / raw)
  To: Greg KH, Mathias Tillman
  Cc: netdev, stable, xiyou.wangcong, dsahern, jeffy.chen, davem
In-Reply-To: <20171223135249.GA16617@kroah.com>

[-- Attachment #1: Type: text/plain, Size: 1493 bytes --]

On 23.12.2017 16:52, Greg KH wrote:
> adding stable@ and netdev@
> 
> On Sat, Dec 23, 2017 at 10:49:27AM +0000, Mathias Tillman wrote:
>> Hi, I wanted to make you aware of a recent regression to the Linux kernel
>> introduced with commit 2417da3f4d6bc4fc6c77f613f0e2264090892aa5:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/net/ipv6?h=linux-4.4.y&id=2417da3f4d6bc4fc6c77f613f0e2264090892aa5
> 
> Is this issue also present in Linus's tree?
> 
>> I have reported it here:
>> https://bugzilla.kernel.org/show_bug.cgi?id=198189
> 
> Bugzilla doesn't work for networking bugs, nor stable stuff, just for a
> few subsystems, sorry.
> 
>> Basically, that commit causes an endless loop if, for some reason, not all
>> devices are unregistered in the rollback_registered_many function in
>> net/dev.c
>>
>> Decided to contact you directly since I have yet to receive any reply on
>> the bug report, and I wasn't entirely sure what the procedure was. Please
>> do let me know if I have to change anything in the report.
> 
> I can revert it, but it would be good to verify if this is an issue in
> the latest releases or not first.

Most likely the bug fixed by that commit hid a refcount leak for the loopback device.

Mathias, please try the debug patch from the attachment.
It logs all refcount changes for loopback in non-host net namespaces.
Hopefully the log will be tiny and show what is missing.

Looks like vsftpd creates and destroys empty net-ns, like "unshare -n true"

[-- Attachment #2: net-debug-lo-refcnt --]
[-- Type: text/plain, Size: 1892 bytes --]

net: debug lo refcnt

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 include/linux/netdevice.h |    4 ++++
 net/core/dev.c            |   14 ++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 310e729c47a4..b483b0eb22e7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3141,6 +3141,8 @@ extern int		netdev_budget;
 /* Called by rtnetlink.c:rtnl_unlock() */
 void netdev_run_todo(void);
 
+void netdev_refcnt_log(const struct net_device *dev, char op);
+
 /**
  *	dev_put - release reference to device
  *	@dev: network device
@@ -3150,6 +3152,7 @@ void netdev_run_todo(void);
 static inline void dev_put(struct net_device *dev)
 {
 	this_cpu_dec(*dev->pcpu_refcnt);
+	netdev_refcnt_log(dev, '-');
 }
 
 /**
@@ -3161,6 +3164,7 @@ static inline void dev_put(struct net_device *dev)
 static inline void dev_hold(struct net_device *dev)
 {
 	this_cpu_inc(*dev->pcpu_refcnt);
+	netdev_refcnt_log(dev, '+');
 }
 
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
diff --git a/net/core/dev.c b/net/core/dev.c
index 2e47d40388fc..d56c834140c8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6957,6 +6957,20 @@ int netdev_refcnt_read(const struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_refcnt_read);
 
+void netdev_refcnt_log(const struct net_device *dev, char op)
+{
+	static DEFINE_SPINLOCK(lock);
+	unsigned long flags;
+
+	if ((dev->flags & IFF_LOOPBACK) && !net_eq(dev_net(dev), &init_net)) {
+		spin_lock_irqsave(&lock, flags);
+		printk("%c %p %d\n", op, dev, netdev_refcnt_read(dev));
+		dump_stack();
+		spin_unlock_irqrestore(&lock, flags);
+	}
+}
+EXPORT_SYMBOL(netdev_refcnt_log);
+
 /**
  * netdev_wait_allrefs - wait until all references are gone.
  * @dev: target net_device

^ permalink raw reply related

* [RFT net-next 0/2] dwmac-meson8b: clock rounding fixes for Meson8b
From: Martin Blumenstingl @ 2017-12-23 17:04 UTC (permalink / raw)
  To: netdev, ingrassia
  Cc: linus.luessing, khilman, linux-amlogic, jbrunet, narmstrong,
	peppe.cavallaro, alexandre.torgue, Martin Blumenstingl

Hi Dave,

please do not apply this series until it has a Tested-by from Emiliano.

Hi Emiliano,

you reported [0] that you couldn't get dwmac-meson8b to work on your
Odroid-C1. With your findings (register dumps, clk_summary output, etc.)
I think I was able to find a fix: it consists of two patches (which you
find in this series)

Unfortunately I don't have any Meson8b boards with RGMII PHY so I could
only partially test this (I could only check if the clocks were
calculated correctly when using a dummy 500002394Hz input clock instead
of MPLL2).

Could you please give this series a try and let me know about the
results?
You obviously still need your two "ARM: dts: meson8b" patches which
- add the "amlogic,meson8b-dwmac" compatible to meson8b.dtsi
- enable Ethernet on the Odroid-C1

I have tested this myself on a Khadas VIM (GXL SoC, internal RMII PHY)
and a Khadas VIM2 (GXM SoC, external RGMII PHY). Both are still working
fine (so let's hope that this also fixes your Meson8b issue :)).

[0] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005596.html

Martin Blumenstingl (2):
  net: stmmac: dwmac-meson8b: fix setting the PHY clock on Meson8b
  net: stmmac: dwmac-meson8b: don't try to change m250_div parent's rate

 drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

-- 
2.15.1

^ permalink raw reply

* [RFT net-next 1/2] net: stmmac: dwmac-meson8b: fix setting the PHY clock on Meson8b
From: Martin Blumenstingl @ 2017-12-23 17:04 UTC (permalink / raw)
  To: netdev, ingrassia
  Cc: linus.luessing, khilman, linux-amlogic, jbrunet, narmstrong,
	peppe.cavallaro, alexandre.torgue, Martin Blumenstingl
In-Reply-To: <20171223170433.8150-1-martin.blumenstingl@googlemail.com>

Meson8b only supports MPLL2 as clock input. The rate of the MPLL2 clock
set by Odroid-C1's u-boot is close to 500MHz. The exact rate is
500002394Hz, which is calculated in drivers/clk/meson/clk-mpll.c
using the following formula:
DIV_ROUND_UP_ULL((u64)parent_rate * SDM_DEN, (SDM_DEN * n2) + sdm)
Odroid-C1's u-boot configures MPLL2 with the following values:
- SDM_DEN = 16384
- SDM = 1638
- N2 = 5
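
As a worked check of the formula above, here is a small user-space sketch
(not the clk-mpll.c code itself). The helper names are made up for this
example, and the 2550MHz parent rate passed in the usage note is an
assumption: this mail does not state the parent rate of MPLL2.

```c
#include <assert.h>
#include <stdint.h>

#define SDM_DEN 16384ULL

/* Divide rounding up, like the kernel's DIV_ROUND_UP_ULL(). */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* MPLL output rate in Hz for a given parent rate, N2 and SDM setting. */
static uint64_t mpll_rate(uint64_t parent_rate, uint64_t n2, uint64_t sdm)
{
	return div_round_up_u64(parent_rate * SDM_DEN, SDM_DEN * n2 + sdm);
}
```

Calling mpll_rate(2550000000ULL, 5, 1638) with the assumed 2550MHz parent
gives 500002394, which matches the rate quoted above.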

The 250MHz and 25MHz clocks inside dwmac-meson8b driver are derived
from the MPLL2 clock. Due to MPLL2 running slightly faster than 500MHz
the common clock framework chooses dividers which are too big to
generate the 250MHz and 25MHz clocks. Emiliano Ingrassia observed that
the divider for the 250MHz clock was set to 0x5 which results in a clock
rate of close to 100MHz instead of 250MHz. The divider for the 25MHz
clock is set to 0x0 (which means "divide by 5") so the resulting RGMII
clock is running at 20MHz (plus a few additional Hz). The RTL8211F PHY
on Odroid-C1 however fails to operate with a 20MHz RGMII clock.

Round the divider's clock rates to prevent this issue on Meson8b. This
means we'll now end up with a clock rate of 25000120Hz (= 25MHz plus
120Hz).
This has no effect on the Meson GX SoCs since there fclk_div2 is used as
input clock, which has a rate of 1000MHz (and thus is divisible cleanly
to 250MHz and 25MHz).

Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Reported-by: Emiliano Ingrassia <ingrassia@epigenesys.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
index 4404650b32c5..c71966332387 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
@@ -144,7 +144,9 @@ static int meson8b_init_clk(struct meson8b_dwmac *dwmac)
 	dwmac->m250_div.shift = PRG_ETH0_CLK_M250_DIV_SHIFT;
 	dwmac->m250_div.width = PRG_ETH0_CLK_M250_DIV_WIDTH;
 	dwmac->m250_div.hw.init = &init;
-	dwmac->m250_div.flags = CLK_DIVIDER_ONE_BASED | CLK_DIVIDER_ALLOW_ZERO;
+	dwmac->m250_div.flags = CLK_DIVIDER_ONE_BASED |
+				CLK_DIVIDER_ALLOW_ZERO |
+				CLK_DIVIDER_ROUND_CLOSEST;
 
 	dwmac->m250_div_clk = devm_clk_register(dev, &dwmac->m250_div.hw);
 	if (WARN_ON(IS_ERR(dwmac->m250_div_clk)))
@@ -164,7 +166,8 @@ static int meson8b_init_clk(struct meson8b_dwmac *dwmac)
 	dwmac->m25_div.width = PRG_ETH0_CLK_M25_DIV_WIDTH;
 	dwmac->m25_div.table = clk_25m_div_table;
 	dwmac->m25_div.hw.init = &init;
-	dwmac->m25_div.flags = CLK_DIVIDER_ALLOW_ZERO;
+	dwmac->m25_div.flags = CLK_DIVIDER_ALLOW_ZERO |
+				CLK_DIVIDER_ROUND_CLOSEST;
 
 	dwmac->m25_div_clk = devm_clk_register(dev, &dwmac->m25_div.hw);
 	if (WARN_ON(IS_ERR(dwmac->m25_div_clk)))
-- 
2.15.1

^ permalink raw reply related

* [RFT net-next 2/2] net: stmmac: dwmac-meson8b: don't try to change m250_div parent's rate
From: Martin Blumenstingl @ 2017-12-23 17:04 UTC (permalink / raw)
  To: netdev, ingrassia
  Cc: linus.luessing, khilman, linux-amlogic, jbrunet, narmstrong,
	peppe.cavallaro, alexandre.torgue, Martin Blumenstingl
In-Reply-To: <20171223170433.8150-1-martin.blumenstingl@googlemail.com>

Trying to set the rate of m250_div's parent clock makes no sense since
it's a mux which has neither CLK_MUX_ROUND_CLOSEST nor
CLK_SET_RATE_PARENT set.
It even does harm on Meson8b SoCs where the input clock for the mux
cannot be divided down to 250MHz evenly (the parent rate is 500002394Hz)
which is why we need to use CLK_DIVIDER_ROUND_CLOSEST for the m250_div
clock. The clk-divider driver however ignores the
CLK_DIVIDER_ROUND_CLOSEST flag if CLK_SET_RATE_PARENT is set (because
it simply tries to set the best possible clock rate for the parent,
which does nothing in our case since the parent is a mux which doesn't
allow rate changes as explained above).

This fixes setting the RGMII clock on Meson8b SoCs which ended up with a
~20MHz clock instead of the expected ~25MHz.
The dwmac-meson8b driver requests a 25MHz clock rate for the m25_div
(which only supports "divide by 5" and "divide by 10") clock which is
derived from the m250_div clock. Due to clk-divider ignoring the
CLK_DIVIDER_ROUND_CLOSEST flag the resulting m250_div clock was set to
~100MHz (divider = 5) and the m25_div clock was set to ~20MHz (divider =
5) by the common clock framework (as this value is closest to 25MHz if
we would not have set CLK_DIVIDER_ROUND_CLOSEST). What we actually need
however is a rate of ~250MHz on the m250_div clock (divider = 2) and
~25MHz on the m25_div clock (divider = 10) - these are also the values
chosen by the out-of-tree vendor driver.
With this we end up with a RGMII clock of 25000120Hz (which is as close
to 25MHz as we can get with an input clock of 500002394Hz).
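
The arithmetic behind these numbers can be sketched in user space as
follows. This is only an illustration with made-up helper names, and it
assumes that the rate below a divider is the parent rate divided by the
divider, rounded up.

```c
#include <assert.h>
#include <stdint.h>

/* Rate below a divider, assuming round-up division. */
static uint64_t rate_after_div(uint64_t parent_rate, uint64_t div)
{
	assert(div != 0);
	return (parent_rate + div - 1) / div;
}

/* RGMII clock rate for a given pair of m250_div and m25_div dividers. */
static uint64_t rgmii_rate(uint64_t input_rate, uint64_t m250_div,
			   uint64_t m25_div)
{
	return rate_after_div(rate_after_div(input_rate, m250_div), m25_div);
}
```

With the 500002394Hz input, dividers 2 and 10 give 25000120Hz, while
dividers 5 and 5 give 20000096Hz, i.e. the ~20MHz clock described above.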

SoCs from the Meson GX series are not affected by this change because
the input clock is FCLK_DIV2 whose rate cannot be changed. Additionally
the GX SoCs don't need to use the "closest" divider since the parent
clock is a multiple of 250MHz.

Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
index c71966332387..26f41c117d63 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
@@ -135,7 +135,7 @@ static int meson8b_init_clk(struct meson8b_dwmac *dwmac)
 	snprintf(clk_name, sizeof(clk_name), "%s#m250_div", dev_name(dev));
 	init.name = devm_kstrdup(dev, clk_name, GFP_KERNEL);
 	init.ops = &clk_divider_ops;
-	init.flags = CLK_SET_RATE_PARENT;
+	init.flags = 0;
 	clk_div_parents[0] = __clk_get_name(dwmac->m250_mux_clk);
 	init.parent_names = clk_div_parents;
 	init.num_parents = ARRAY_SIZE(clk_div_parents);
-- 
2.15.1

^ permalink raw reply related

* [PATCH v2 net-next 0/2] kcm: Fix two locking issues
From: Tom Herbert @ 2017-12-23 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, dvyukov, rohit, Tom Herbert

One issue is lockdep warnings when sock_owned_by_user returns true
in strparser. Fix is to add and call sock_owned_by_user_nocheck since
the check for owned by user is not an error condition in this case.

The other issue is a potential deadlock between TX and RX paths.

KCM socket lock and the psock socket lock are acquired in both
the RX and TX path, however they take the locks in opposite order
which can lead to deadlock. The fix is to add try_sock_lock to see
if psock socket lock can get acquired in the TX path with KCM lock
held. If not, then KCM socket is released and the psock socket lock
and KCM socket lock are acquired in the same order as the RX path.

Tested:

Ran KCM traffic without incident.

v2: Remove patches to address potential deadlock. I couldn't convince
myself this is an issue after looking at the code some more.

Tom Herbert (2):
  sock: Add sock_owned_by_user_nocheck
  strparser: Call sock_owned_by_user_nocheck

 include/net/sock.h        | 5 +++++
 net/strparser/strparser.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

-- 
2.11.0

^ permalink raw reply

* [PATCH v2 net-next 1/2] sock: Add sock_owned_by_user_nocheck
From: Tom Herbert @ 2017-12-23 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, dvyukov, rohit, Tom Herbert
In-Reply-To: <20171223171716.16130-1-tom@quantonium.net>

This allows checking socket lock ownership without producing lockdep
warnings.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 include/net/sock.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 6c1db823f8b9..66fd3951e6f3 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1515,6 +1515,11 @@ static inline bool sock_owned_by_user(const struct sock *sk)
 	return sk->sk_lock.owned;
 }
 
+static inline bool sock_owned_by_user_nocheck(const struct sock *sk)
+{
+	return sk->sk_lock.owned;
+}
+
 /* no reclassification while locks are held */
 static inline bool sock_allow_reclassification(const struct sock *csk)
 {
-- 
2.11.0

^ permalink raw reply related

* [PATCH v2 net-next 2/2] strparser: Call sock_owned_by_user_nocheck
From: Tom Herbert @ 2017-12-23 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, dvyukov, rohit, Tom Herbert
In-Reply-To: <20171223171716.16130-1-tom@quantonium.net>

strparser wants to check socket ownership without producing any
warnings. As indicated by the comment in the code, it is permissible
for owned_by_user to return true.
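
To illustrate the difference, here is a user-space toy model (not the
kernel code). It assumes the checked helper warns whenever the caller
does not hold the lock, while the _nocheck variant only reports the
owned flag. The struct, its fields and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket lock state. */
struct toy_sock {
	bool owned;          /* lock is owned by a process-context user */
	bool held_by_caller; /* what a lockdep-style check would verify */
	int warnings;        /* warnings raised by the checked helper */
};

/* Checked variant: complains if the caller does not hold the lock. */
static bool toy_owned_by_user(struct toy_sock *sk)
{
	assert(sk);
	if (!sk->held_by_caller)
		sk->warnings++;
	return sk->owned;
}

/* Unchecked variant: reports ownership and never warns. */
static bool toy_owned_by_user_nocheck(const struct toy_sock *sk)
{
	assert(sk);
	return sk->owned;
}
```

A caller in BH context that only wants to know whether to defer work
would use the unchecked variant, since "owned by someone else" is an
expected answer there and not an error.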

Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 net/strparser/strparser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
index c5fda15ba319..1fdab5c4eda8 100644
--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -401,7 +401,7 @@ void strp_data_ready(struct strparser *strp)
 	 * allows a thread in BH context to safely check if the process
 	 * lock is held. In this case, if the lock is held, queue work.
 	 */
-	if (sock_owned_by_user(strp->sk)) {
+	if (sock_owned_by_user_nocheck(strp->sk)) {
 		queue_work(strp_wq, &strp->work);
 		return;
 	}
-- 
2.11.0

^ permalink raw reply related

* [PATCH bpf-next] bpf: fix stacksafe exploration when comparing states
From: Gianluca Borello @ 2017-12-23 10:09 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, Gianluca Borello

Commit cc2b14d51053 ("bpf: teach verifier to recognize zero initialized
stack") introduced a very relaxed check when comparing stacks of different
states, effectively returning a positive result in many cases where it
shouldn't.

This can create problems in cases such as the following C pseudocode:

long var;
long *x = bpf_map_lookup(...);
if (!x)
        return;

if (*x != 0xbeef)
        var = 0;
else
        var = 1;

/* This is the key part, calling a helper causes an explored state
 * to be saved with the information that "var" is on the stack as
 * STACK_ZERO, since the helper is first met by the verifier after
 * the "var = 0" assignment. This state will however be wrongly used
 * also for the "var = 1" case, so the verifier assumes "var" is always
 * 0 and will replace the NULL assignment with nops, because the
 * search pruning prevents it from exploring the faulty branch.
 */
bpf_ktime_get_ns();

if (var)
        *(long *)0 = 0xbeef;

Fix the issue by making sure that the stack is fully explored before
returning a positive comparison result.
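
The effect of the one-line change can be shown with a heavily simplified
user-space model of the comparison loop. This is a sketch, not the
verifier code: the toy_* types and the early_return switch are invented
for the example, and slot types are compared for plain equality only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_slot_type { TOY_STACK_INVALID, TOY_STACK_MISC, TOY_STACK_ZERO };

struct toy_slot {
	bool live_read;           /* explored state read this slot */
	enum toy_slot_type type;
};

/*
 * Return true if "cur" may be pruned against the explored state "old".
 * With early_return set, the loop stops at the first slot the old state
 * did not read, which models the behavior before the fix.
 */
static bool toy_stacksafe(const struct toy_slot *old,
			  const struct toy_slot *cur,
			  size_t n, bool early_return)
{
	assert(old && cur);
	for (size_t i = 0; i < n; i++) {
		if (!old[i].live_read) {
			if (early_return)
				return true;	/* skips all remaining slots */
			continue;		/* keep checking later slots */
		}
		if (old[i].type == TOY_STACK_INVALID)
			continue;
		if (old[i].type != cur[i].type)
			return false;
	}
	return true;
}
```

If slot 0 was never read by the old state and slot 1 differs between the
two states, the early-return variant reports the states as equivalent,
while the variant that continues reports the mismatch.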

Also attach a couple tests that highlight the bad behavior. In the first
test, without this fix instructions 16 and 17 are replaced with nops
instead of being rejected by the verifier.

The second test, instead, allows a program to make a potentially illegal
read from the stack.

Fixes: cc2b14d51053 ("bpf: teach verifier to recognize zero initialized stack")
Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/verifier.c                       |  2 +-
 tools/testing/selftests/bpf/test_verifier.c | 51 +++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8b442ae125d0..93e1c77dae1d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4107,7 +4107,7 @@ static bool stacksafe(struct bpf_func_state *old,
 
 		if (!(old->stack[spi].spilled_ptr.live & REG_LIVE_READ))
 			/* explored state didn't use this */
-			return true;
+			continue;
 
 		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
 			continue;
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 3bacff0d6f91..5e79515d10c5 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -9715,6 +9715,57 @@ static struct bpf_test tests[] = {
 		.result = REJECT,
 		.prog_type = BPF_PROG_TYPE_XDP,
 	},
+	{
+		"search pruning: all branches should be verified (nop operation)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 11),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0xbeef, 2),
+			BPF_MOV64_IMM(BPF_REG_4, 0),
+			BPF_JMP_A(1),
+			BPF_MOV64_IMM(BPF_REG_4, 1),
+			BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_4, -16),
+			BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -16),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_5, 0, 2),
+			BPF_MOV64_IMM(BPF_REG_6, 0),
+			BPF_ST_MEM(BPF_DW, BPF_REG_6, 0, 0xdead),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map1 = { 3 },
+		.errstr = "R6 invalid mem access 'inv'",
+		.result = REJECT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
+	{
+		"search pruning: all branches should be verified (invalid stack access)",
+		.insns = {
+			BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+			BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+			BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+			BPF_LD_MAP_FD(BPF_REG_1, 0),
+			BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0),
+			BPF_MOV64_IMM(BPF_REG_4, 0),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0xbeef, 2),
+			BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_4, -16),
+			BPF_JMP_A(1),
+			BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_4, -24),
+			BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns),
+			BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_10, -16),
+			BPF_EXIT_INSN(),
+		},
+		.fixup_map1 = { 3 },
+		.errstr = "invalid read from stack off -16+0 size 8",
+		.result = REJECT,
+		.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	},
 };
 
 static int probe_filter_length(const struct bpf_insn *fp)
-- 
2.14.1

^ permalink raw reply related
