netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use
@ 2023-12-14 14:10 Victor Nogueira
  2023-12-14 14:10 ` [PATCH net-next v6 1/3] net/sched: Introduce tc block netdev tracking infra Victor Nogueira
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Victor Nogueira @ 2023-12-14 14:10 UTC (permalink / raw)
  To: jhs, davem, edumazet, kuba, pabeni, xiyou.wangcong, jiri
  Cc: mleitner, vladbu, paulb, pctammela, netdev, kernel

__context__
The "tc block" is a collection of netdevs/ports which allow qdiscs to share
match-action block instances (as opposed to the traditional tc filter per
netdev/port)[1].

Up to this point in the implementation, the block is unaware of its ports.
This patch makes the tc block ports available to the datapath.

For the datapath we provide a use case of the tc block in a mirred
action in patch 3. For users can levarage mirred to do something like
the following:

$ tc qdisc add dev ens7 ingress_block 22
$ tc qdisc add dev ens8 ingress_block 22
$ tc qdisc add dev ens9 ingress_block 22
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22

In this case, if the packet arrives on ens8, it will be copied and sent to
all ports in the block including ens8. Note that the packet is still in the
pipeline at this point - meaning other actions could be added after the mirror
because mirred copies/clones the skb. Example the following is valid:

$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 \
action mirred egress mirror blockid 22 \
action action vlan push id 123 \
action mirred egress redirect dev dummy0

redirect behavior always steals the packet from the pipeline and therefore
the skb is no longer available for a subsequent action as illustrated above
(in redirecting to dummy0).

The behavior of redirecting to a tc block is therefore adapted to work in the
same manner. So a setup as such:
$ tc qdisc add dev ens7 ingress_block 22
$ tc qdisc add dev ens8 ingress_block 22
$ tc qdisc add dev ens9 ingress_block 22
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action mirred egress redirect blockid 22

for a matching packet arriving on ens7 will first send a copy/clone to ens8
(as in the "mirror" behavior) then to ens9 as in the redirect behavior above.
Once this processing is done - no other actions are able to process this skb.
i.e it is removed from the "pipeline".

In this case, if the packet arrives on ens8, it will be copied and sent to
all ports in the block including ens8.

Patch 1 separates/exports mirror and redirect functions from act_mirred
Patch 2 introduces the required infra.
Patch 3 Allows mirred to blocks

Subsequent patches will come with tdc test cases.

__Acknowledgements__
Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this
patchset
better. The idea of integrating the ports into the tc block was
suggested
by Jiri Pirko.

[1] See commit ca46abd6f89f ("Merge branch'net-sched-allow-qdiscs-to-share-filter-block-instances'")

Changes in v2:
  - Remove RFC tag
  - Add more details in patch 0(Jiri)
  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
    Reported-by: kernel test robot <lkp@intel.com> (and
horms@kernel.org)
  - Fix bad dev dereference in printk of blockcast action (Simon)

Changes in v3:
  - Add missing xa_destroy (pointed out by Vlad)
  - Remove bugfix pointed by Vlad (will send in separate patch)
  - Removed ports from subject in patch #2 and typos (suggested by
    Marcelo)
  - Remove net_notice_ratelimited debug messages in error
    cases (suggested by Marcelo)
  - Minor changes to appease sparse's lock context warning

Changes in v4:
  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
  - Fix typo in cover letter (pointed out by Paolo)
  - Create a module description for act_blockcast
    (reported by Paolo and CI)

Changes in v5:
  - Added new patch which separated mirred into mirror and redirect
    functions (suggested by Jiri)
  - Instead of repeating the code to mirror in blockcast use mirror
    exported function by patch1 (tcf_mirror_act)
  - Make Block ID into act_blockcast's parameter passed by user space
    instead of always getting it from SKB (suggested by Jiri)
  - Add tx_type parameter which will specify what transmission behaviour
    we want (as described earlier)

Changes in v6:
  - Remove blockcast and make it a part of mirred
  - Block ID is now a mirred parameter
  - We now allow redirecting and mirroring to either ingress or egress

Victor Nogueira (3):
  net/sched: Introduce tc block netdev tracking infra
  net/sched: cls_api: Expose tc block to the datapath
  net/sched: act_mirred: Allow mirred to block

 include/net/sch_generic.h             |   6 +
 include/net/tc_act/tc_mirred.h        |   1 +
 include/uapi/linux/tc_act/tc_mirred.h |   1 +
 net/sched/act_mirred.c                | 280 +++++++++++++++++++-------
 net/sched/cls_api.c                   |   5 +-
 net/sched/sch_api.c                   |  55 +++++
 net/sched/sch_generic.c               |  31 ++-
 7 files changed, 302 insertions(+), 77 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net-next v6 1/3] net/sched: Introduce tc block netdev tracking infra
  2023-12-14 14:10 [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Victor Nogueira
@ 2023-12-14 14:10 ` Victor Nogueira
  2023-12-14 14:10 ` [PATCH net-next v6 2/3] net/sched: cls_api: Expose tc block to the datapath Victor Nogueira
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Victor Nogueira @ 2023-12-14 14:10 UTC (permalink / raw)
  To: jhs, davem, edumazet, kuba, pabeni, xiyou.wangcong, jiri
  Cc: mleitner, vladbu, paulb, pctammela, netdev, kernel

This commit makes tc blocks track which ports have been added to them.
And, with that, we'll be able to use this new information to send
packets to the block's ports. Which will be done in the patch #3 of this
series.

Suggested-by: Jiri Pirko <jiri@nvidia.com>
Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 include/net/sch_generic.h |  4 +++
 net/sched/cls_api.c       |  2 ++
 net/sched/sch_api.c       | 55 +++++++++++++++++++++++++++++++++++++++
 net/sched/sch_generic.c   | 31 ++++++++++++++++++++--
 4 files changed, 90 insertions(+), 2 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index dcb9160e6467..cefca55dd4f9 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -19,6 +19,7 @@
 #include <net/gen_stats.h>
 #include <net/rtnetlink.h>
 #include <net/flow_offload.h>
+#include <linux/xarray.h>
 
 struct Qdisc_ops;
 struct qdisc_walker;
@@ -126,6 +127,8 @@ struct Qdisc {
 
 	struct rcu_head		rcu;
 	netdevice_tracker	dev_tracker;
+	netdevice_tracker	in_block_tracker;
+	netdevice_tracker	eg_block_tracker;
 	/* private data */
 	long privdata[] ____cacheline_aligned;
 };
@@ -457,6 +460,7 @@ struct tcf_chain {
 };
 
 struct tcf_block {
+	struct xarray ports; /* datapath accessible */
 	/* Lock protects tcf_block and lifetime-management data of chains
 	 * attached to the block (refcnt, action_refcnt, explicitly_created).
 	 */
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index dc1c19a25882..6020a32ecff2 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -531,6 +531,7 @@ static void tcf_block_destroy(struct tcf_block *block)
 {
 	mutex_destroy(&block->lock);
 	mutex_destroy(&block->proto_destroy_lock);
+	xa_destroy(&block->ports);
 	kfree_rcu(block, rcu);
 }
 
@@ -1002,6 +1003,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	refcount_set(&block->refcnt, 1);
 	block->net = net;
 	block->index = block_index;
+	xa_init(&block->ports);
 
 	/* Don't store q pointer for blocks which are shared */
 	if (!tcf_block_shared(block))
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index e9eaf637220e..09ec64f2f463 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1180,6 +1180,57 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 	return 0;
 }
 
+static int qdisc_block_add_dev(struct Qdisc *sch, struct net_device *dev,
+			       struct nlattr **tca,
+			       struct netlink_ext_ack *extack)
+{
+	const struct Qdisc_class_ops *cl_ops = sch->ops->cl_ops;
+	struct tcf_block *in_block = NULL;
+	struct tcf_block *eg_block = NULL;
+	int err;
+
+	if (tca[TCA_INGRESS_BLOCK]) {
+		/* works for both ingress and clsact */
+		in_block = cl_ops->tcf_block(sch, TC_H_MIN_INGRESS, NULL);
+		if (!in_block) {
+			NL_SET_ERR_MSG(extack, "Shared ingress block missing");
+			return -EINVAL;
+		}
+
+		err = xa_insert(&in_block->ports, dev->ifindex, dev, GFP_KERNEL);
+		if (err) {
+			NL_SET_ERR_MSG(extack, "Ingress block dev insert failed");
+			return err;
+		}
+
+		netdev_hold(dev, &sch->in_block_tracker, GFP_KERNEL);
+	}
+
+	if (tca[TCA_EGRESS_BLOCK]) {
+		eg_block = cl_ops->tcf_block(sch, TC_H_MIN_EGRESS, NULL);
+		if (!eg_block) {
+			NL_SET_ERR_MSG(extack, "Shared egress block missing");
+			err = -EINVAL;
+			goto err_out;
+		}
+
+		err = xa_insert(&eg_block->ports, dev->ifindex, dev, GFP_KERNEL);
+		if (err) {
+			NL_SET_ERR_MSG(extack, "Egress block dev insert failed");
+			goto err_out;
+		}
+		netdev_hold(dev, &sch->eg_block_tracker, GFP_KERNEL);
+	}
+
+	return 0;
+err_out:
+	if (in_block) {
+		xa_erase(&in_block->ports, dev->ifindex);
+		netdev_put(dev, &sch->in_block_tracker);
+	}
+	return err;
+}
+
 static int qdisc_block_indexes_set(struct Qdisc *sch, struct nlattr **tca,
 				   struct netlink_ext_ack *extack)
 {
@@ -1350,6 +1401,10 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 	qdisc_hash_add(sch, false);
 	trace_qdisc_create(ops, dev, parent);
 
+	err = qdisc_block_add_dev(sch, dev, tca, extack);
+	if (err)
+		goto err_out4;
+
 	return sch;
 
 err_out4:
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 8dd0e5925342..32bed60dea9f 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1050,7 +1050,11 @@ static void qdisc_free_cb(struct rcu_head *head)
 
 static void __qdisc_destroy(struct Qdisc *qdisc)
 {
-	const struct Qdisc_ops  *ops = qdisc->ops;
+	struct net_device *dev = qdisc_dev(qdisc);
+	const struct Qdisc_ops *ops = qdisc->ops;
+	const struct Qdisc_class_ops *cops;
+	struct tcf_block *block;
+	u32 block_index;
 
 #ifdef CONFIG_NET_SCHED
 	qdisc_hash_del(qdisc);
@@ -1061,11 +1065,34 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
 
 	qdisc_reset(qdisc);
 
+	cops = ops->cl_ops;
+	if (ops->ingress_block_get) {
+		block_index = ops->ingress_block_get(qdisc);
+		if (block_index) {
+			block = cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
+			if (block) {
+				if (xa_erase(&block->ports, dev->ifindex))
+					netdev_put(dev, &qdisc->in_block_tracker);
+			}
+		}
+	}
+
+	if (ops->egress_block_get) {
+		block_index = ops->egress_block_get(qdisc);
+		if (block_index) {
+			block = cops->tcf_block(qdisc, TC_H_MIN_EGRESS, NULL);
+			if (block) {
+				if (xa_erase(&block->ports, dev->ifindex))
+					netdev_put(dev, &qdisc->eg_block_tracker);
+			}
+		}
+	}
+
 	if (ops->destroy)
 		ops->destroy(qdisc);
 
 	module_put(ops->owner);
-	netdev_put(qdisc_dev(qdisc), &qdisc->dev_tracker);
+	netdev_put(dev, &qdisc->dev_tracker);
 
 	trace_qdisc_destroy(qdisc);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net-next v6 2/3] net/sched: cls_api: Expose tc block to the datapath
  2023-12-14 14:10 [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Victor Nogueira
  2023-12-14 14:10 ` [PATCH net-next v6 1/3] net/sched: Introduce tc block netdev tracking infra Victor Nogueira
@ 2023-12-14 14:10 ` Victor Nogueira
  2023-12-14 14:10 ` [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block Victor Nogueira
  2023-12-14 14:29 ` [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Marcelo Ricardo Leitner
  3 siblings, 0 replies; 7+ messages in thread
From: Victor Nogueira @ 2023-12-14 14:10 UTC (permalink / raw)
  To: jhs, davem, edumazet, kuba, pabeni, xiyou.wangcong, jiri
  Cc: mleitner, vladbu, paulb, pctammela, netdev, kernel

The datapath can now find the block of the port in which the packet arrived
at.

In the next patch we show a possible usage of this patch in a new
version of mirred that multicasts to all ports except for the port in
which the packet arrived on.

Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 include/net/sch_generic.h | 2 ++
 net/sched/cls_api.c       | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index cefca55dd4f9..479bc195bb0f 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -487,6 +487,8 @@ struct tcf_block {
 	struct mutex proto_destroy_lock; /* Lock for proto_destroy hashtable. */
 };
 
+struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index);
+
 static inline bool lockdep_tcf_chain_is_locked(struct tcf_chain *chain)
 {
 	return lockdep_is_held(&chain->filter_chain_lock);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 6020a32ecff2..618f68733012 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1011,12 +1011,13 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	return block;
 }
 
-static struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
+struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
 {
 	struct tcf_net *tn = net_generic(net, tcf_net_id);
 
 	return idr_find(&tn->idr, block_index);
 }
+EXPORT_SYMBOL(tcf_block_lookup);
 
 static struct tcf_block *tcf_block_refcnt_get(struct net *net, u32 block_index)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block
  2023-12-14 14:10 [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Victor Nogueira
  2023-12-14 14:10 ` [PATCH net-next v6 1/3] net/sched: Introduce tc block netdev tracking infra Victor Nogueira
  2023-12-14 14:10 ` [PATCH net-next v6 2/3] net/sched: cls_api: Expose tc block to the datapath Victor Nogueira
@ 2023-12-14 14:10 ` Victor Nogueira
  2023-12-15  3:09   ` Jakub Kicinski
  2023-12-14 14:29 ` [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Marcelo Ricardo Leitner
  3 siblings, 1 reply; 7+ messages in thread
From: Victor Nogueira @ 2023-12-14 14:10 UTC (permalink / raw)
  To: jhs, davem, edumazet, kuba, pabeni, xiyou.wangcong, jiri
  Cc: mleitner, vladbu, paulb, pctammela, netdev, kernel

So far the mirred action has dealt with syntax that handles mirror/redirection for netdev.
A matching packet is redirected or mirrored to a target netdev.

In this patch we enable mirred to mirror to a tc block as well.
IOW, the new syntax looks as follows:
... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> >

Examples of mirroring or redirecting to a tc block:
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22

$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22

Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 include/net/tc_act/tc_mirred.h        |   1 +
 include/uapi/linux/tc_act/tc_mirred.h |   1 +
 net/sched/act_mirred.c                | 280 +++++++++++++++++++-------
 3 files changed, 208 insertions(+), 74 deletions(-)

diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
index 32ce8ea36950..75722d967bf2 100644
--- a/include/net/tc_act/tc_mirred.h
+++ b/include/net/tc_act/tc_mirred.h
@@ -8,6 +8,7 @@
 struct tcf_mirred {
 	struct tc_action	common;
 	int			tcfm_eaction;
+	u32                     tcfm_blockid;
 	bool			tcfm_mac_header_xmit;
 	struct net_device __rcu	*tcfm_dev;
 	netdevice_tracker	tcfm_dev_tracker;
diff --git a/include/uapi/linux/tc_act/tc_mirred.h b/include/uapi/linux/tc_act/tc_mirred.h
index 2500a0005d05..54df06658bc8 100644
--- a/include/uapi/linux/tc_act/tc_mirred.h
+++ b/include/uapi/linux/tc_act/tc_mirred.h
@@ -20,6 +20,7 @@ enum {
 	TCA_MIRRED_UNSPEC,
 	TCA_MIRRED_TM,
 	TCA_MIRRED_PARMS,
+	TCA_MIRRED_BLOCKID,
 	TCA_MIRRED_PAD,
 	__TCA_MIRRED_MAX
 };
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 0a711c184c29..3bdf6a9764e8 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -85,10 +85,20 @@ static void tcf_mirred_release(struct tc_action *a)
 
 static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
 	[TCA_MIRRED_PARMS]	= { .len = sizeof(struct tc_mirred) },
+	[TCA_MIRRED_BLOCKID]	= { .type = NLA_U32 },
 };
 
 static struct tc_action_ops act_mirred_ops;
 
+static void tcf_mirred_replace_dev(struct tcf_mirred *m, struct net_device *ndev)
+{
+	struct net_device *odev;
+
+	odev = rcu_replace_pointer(m->tcfm_dev, ndev,
+				   lockdep_is_held(&m->tcf_lock));
+	netdev_put(odev, &m->tcfm_dev_tracker);
+}
+
 static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			   struct nlattr *est, struct tc_action **a,
 			   struct tcf_proto *tp,
@@ -126,6 +136,13 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	if (exists && bind)
 		return 0;
 
+	if (tb[TCA_MIRRED_BLOCKID] && parm->ifindex) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Mustn't specify Block ID and dev simultaneously");
+		err = -EINVAL;
+		goto release_idr;
+	}
+
 	switch (parm->eaction) {
 	case TCA_EGRESS_MIRROR:
 	case TCA_EGRESS_REDIR:
@@ -142,9 +159,9 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	}
 
 	if (!exists) {
-		if (!parm->ifindex) {
+		if (!parm->ifindex && !tb[TCA_MIRRED_BLOCKID]) {
 			tcf_idr_cleanup(tn, index);
-			NL_SET_ERR_MSG_MOD(extack, "Specified device does not exist");
+			NL_SET_ERR_MSG_MOD(extack, "Must specify device or block");
 			return -EINVAL;
 		}
 		ret = tcf_idr_create_from_flags(tn, index, est, a,
@@ -170,7 +187,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	spin_lock_bh(&m->tcf_lock);
 
 	if (parm->ifindex) {
-		struct net_device *odev, *ndev;
+		struct net_device *ndev;
 
 		ndev = dev_get_by_index(net, parm->ifindex);
 		if (!ndev) {
@@ -179,11 +196,14 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 			goto put_chain;
 		}
 		mac_header_xmit = dev_is_mac_header_xmit(ndev);
-		odev = rcu_replace_pointer(m->tcfm_dev, ndev,
-					  lockdep_is_held(&m->tcf_lock));
-		netdev_put(odev, &m->tcfm_dev_tracker);
+		tcf_mirred_replace_dev(m, ndev);
 		netdev_tracker_alloc(ndev, &m->tcfm_dev_tracker, GFP_ATOMIC);
 		m->tcfm_mac_header_xmit = mac_header_xmit;
+		m->tcfm_blockid = 0;
+	} else if (tb[TCA_MIRRED_BLOCKID]) {
+		tcf_mirred_replace_dev(m, NULL);
+		m->tcfm_mac_header_xmit = false;
+		m->tcfm_blockid = nla_get_u32(tb[TCA_MIRRED_BLOCKID]);
 	}
 	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
 	m->tcfm_eaction = parm->eaction;
@@ -211,13 +231,14 @@ static bool is_mirred_nested(void)
 	return unlikely(__this_cpu_read(mirred_nest_level) > 1);
 }
 
-static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
+static int tcf_mirred_forward(bool want_ingress, bool nested_call,
+			      struct sk_buff *skb)
 {
 	int err;
 
 	if (!want_ingress)
 		err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
-	else if (is_mirred_nested())
+	else if (nested_call)
 		err = netif_rx(skb);
 	else
 		err = netif_receive_skb(skb);
@@ -225,110 +246,215 @@ static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
 	return err;
 }
 
-TC_INDIRECT_SCOPE int tcf_mirred_act(struct sk_buff *skb,
-				     const struct tc_action *a,
-				     struct tcf_result *res)
+static int tcf_redirect_act(struct sk_buff *skb, bool want_ingress)
 {
-	struct tcf_mirred *m = to_mirred(a);
-	struct sk_buff *skb2 = skb;
-	bool m_mac_header_xmit;
-	struct net_device *dev;
-	unsigned int nest_level;
-	int retval, err = 0;
-	bool use_reinsert;
+	skb_set_redirected(skb, skb->tc_at_ingress);
+
+	return tcf_mirred_forward(want_ingress, is_mirred_nested(), skb);
+}
+
+static int tcf_mirror_act(struct sk_buff *skb, bool want_ingress)
+{
+	return tcf_mirred_forward(want_ingress, is_mirred_nested(), skb);
+}
+
+static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m,
+			     struct net_device *dev,
+			     const bool m_mac_header_xmit, int m_eaction,
+			     int retval)
+{
+	struct sk_buff *skb_to_send = skb;
 	bool want_ingress;
 	bool is_redirect;
-	bool expects_nh;
 	bool at_ingress;
-	int m_eaction;
+	bool dont_clone;
+	bool expects_nh;
 	int mac_len;
 	bool at_nh;
+	int err;
 
-	nest_level = __this_cpu_inc_return(mirred_nest_level);
-	if (unlikely(nest_level > MIRRED_NEST_LIMIT)) {
-		net_warn_ratelimited("Packet exceeded mirred recursion limit on dev %s\n",
-				     netdev_name(skb->dev));
-		__this_cpu_dec(mirred_nest_level);
-		return TC_ACT_SHOT;
-	}
-
-	tcf_lastuse_update(&m->tcf_tm);
-	tcf_action_update_bstats(&m->common, skb);
+	is_redirect = tcf_mirred_is_act_redirect(m_eaction);
+	want_ingress = tcf_mirred_act_wants_ingress(m_eaction);
+	expects_nh = want_ingress || !m_mac_header_xmit;
 
-	m_mac_header_xmit = READ_ONCE(m->tcfm_mac_header_xmit);
-	m_eaction = READ_ONCE(m->tcfm_eaction);
-	retval = READ_ONCE(m->tcf_action);
-	dev = rcu_dereference_bh(m->tcfm_dev);
-	if (unlikely(!dev)) {
-		pr_notice_once("tc mirred: target device is gone\n");
-		goto out;
-	}
+	/* we could easily avoid the clone only if called by ingress and clsact;
+	 * since we can't easily detect the clsact caller, skip clone only for
+	 * ingress - that covers the TC S/W datapath.
+	 */
+	dont_clone = skb_at_tc_ingress(skb) && is_redirect &&
+		     tcf_mirred_can_reinsert(retval);
 
-	if (unlikely(!(dev->flags & IFF_UP)) || !netif_carrier_ok(dev)) {
+	if (unlikely(!(dev->flags & IFF_UP)) ||
+	    !netif_carrier_ok(dev)) {
 		net_notice_ratelimited("tc mirred to Houston: device %s is down\n",
 				       dev->name);
+		err = -ENODEV;
 		goto out;
 	}
 
-	/* we could easily avoid the clone only if called by ingress and clsact;
-	 * since we can't easily detect the clsact caller, skip clone only for
-	 * ingress - that covers the TC S/W datapath.
-	 */
-	is_redirect = tcf_mirred_is_act_redirect(m_eaction);
-	at_ingress = skb_at_tc_ingress(skb);
-	use_reinsert = at_ingress && is_redirect &&
-		       tcf_mirred_can_reinsert(retval);
-	if (!use_reinsert) {
-		skb2 = skb_clone(skb, GFP_ATOMIC);
-		if (!skb2)
+	if (!dont_clone) {
+		skb_to_send = skb_clone(skb, GFP_ATOMIC);
+		if (!skb_to_send) {
+			err =  -ENOMEM;
 			goto out;
+		}
 	}
 
-	want_ingress = tcf_mirred_act_wants_ingress(m_eaction);
+	at_ingress = skb_at_tc_ingress(skb);
 
 	/* All mirred/redirected skbs should clear previous ct info */
-	nf_reset_ct(skb2);
+	nf_reset_ct(skb_to_send);
 	if (want_ingress && !at_ingress) /* drop dst for egress -> ingress */
-		skb_dst_drop(skb2);
+		skb_dst_drop(skb_to_send);
 
-	expects_nh = want_ingress || !m_mac_header_xmit;
 	at_nh = skb->data == skb_network_header(skb);
 	if (at_nh != expects_nh) {
-		mac_len = skb_at_tc_ingress(skb) ? skb->mac_len :
+		mac_len = at_ingress ? skb->mac_len :
 			  skb_network_offset(skb);
 		if (expects_nh) {
 			/* target device/action expect data at nh */
-			skb_pull_rcsum(skb2, mac_len);
+			skb_pull_rcsum(skb_to_send, mac_len);
 		} else {
 			/* target device/action expect data at mac */
-			skb_push_rcsum(skb2, mac_len);
+			skb_push_rcsum(skb_to_send, mac_len);
 		}
 	}
 
-	skb2->skb_iif = skb->dev->ifindex;
-	skb2->dev = dev;
+	skb_to_send->skb_iif = skb->dev->ifindex;
+	skb_to_send->dev = dev;
 
-	/* mirror is always swallowed */
 	if (is_redirect) {
-		skb_set_redirected(skb2, skb2->tc_at_ingress);
-
-		/* let's the caller reinsert the packet, if possible */
-		if (use_reinsert) {
-			err = tcf_mirred_forward(want_ingress, skb);
-			if (err)
-				tcf_action_inc_overlimit_qstats(&m->common);
-			__this_cpu_dec(mirred_nest_level);
-			return TC_ACT_CONSUMED;
-		}
+		if (skb == skb_to_send)
+			retval = TC_ACT_CONSUMED;
+
+		err = tcf_redirect_act(skb_to_send, want_ingress);
+	} else {
+		err = tcf_mirror_act(skb_to_send, want_ingress);
 	}
 
-	err = tcf_mirred_forward(want_ingress, skb2);
-	if (err) {
 out:
+	if (err)
 		tcf_action_inc_overlimit_qstats(&m->common);
-		if (tcf_mirred_is_act_redirect(m_eaction))
-			retval = TC_ACT_SHOT;
+
+	return retval;
+}
+
+static int tcf_mirred_blockcast_redir(struct sk_buff *skb, struct tcf_mirred *m,
+				      struct tcf_block *block, int m_eaction,
+				      const u32 exception_ifindex, int retval)
+{
+	struct net_device *dev_prev = NULL;
+	struct net_device *dev = NULL;
+	unsigned long index;
+	int mirred_eaction;
+
+	mirred_eaction = tcf_mirred_act_wants_ingress(m_eaction) ?
+		TCA_INGRESS_MIRROR : TCA_EGRESS_MIRROR;
+
+	xa_for_each(&block->ports, index, dev) {
+		if (index == exception_ifindex)
+			continue;
+
+		if (dev_prev == NULL)
+			goto assign_prev;
+
+		tcf_mirred_to_dev(skb, m, dev_prev,
+				  dev_is_mac_header_xmit(dev),
+				  mirred_eaction, retval);
+assign_prev:
+		dev_prev = dev;
+	}
+
+	if (dev_prev)
+		return tcf_mirred_to_dev(skb, m, dev_prev,
+					 dev_is_mac_header_xmit(dev_prev),
+					 m_eaction, retval);
+
+	return retval;
+}
+
+static int tcf_mirred_blockcast(struct sk_buff *skb, struct tcf_mirred *m,
+				const u32 blockid, struct tcf_result *res,
+				int retval)
+{
+	const u32 exception_ifindex = skb->dev->ifindex;
+	struct net_device *dev = NULL;
+	struct tcf_block *block;
+	unsigned long index;
+	bool is_redirect;
+	int m_eaction;
+
+	m_eaction = READ_ONCE(m->tcfm_eaction);
+	is_redirect = tcf_mirred_is_act_redirect(m_eaction);
+
+	/* we are already under rcu protection, so can call block lookup directly */
+	block = tcf_block_lookup(dev_net(skb->dev), blockid);
+	if (!block || xa_empty(&block->ports)) {
+		tcf_action_inc_overlimit_qstats(&m->common);
+		return retval;
+	}
+
+	if (is_redirect)
+		return tcf_mirred_blockcast_redir(skb, m, block, m_eaction,
+						  exception_ifindex, retval);
+
+	xa_for_each(&block->ports, index, dev) {
+		if (index == exception_ifindex)
+			continue;
+
+		tcf_mirred_to_dev(skb, m, dev,
+				  dev_is_mac_header_xmit(dev),
+				  m_eaction, retval);
+	}
+
+	return retval;
+}
+
+TC_INDIRECT_SCOPE int tcf_mirred_act(struct sk_buff *skb,
+				     const struct tc_action *a,
+				     struct tcf_result *res)
+{
+	struct tcf_mirred *m = to_mirred(a);
+	int retval = READ_ONCE(m->tcf_action);
+	unsigned int nest_level;
+	bool m_mac_header_xmit;
+	struct net_device *dev;
+	int m_eaction;
+	u32 blockid;
+	int err = 0;
+
+	nest_level = __this_cpu_inc_return(mirred_nest_level);
+	if (unlikely(nest_level > MIRRED_NEST_LIMIT)) {
+		net_warn_ratelimited("Packet exceeded mirred recursion limit on dev %s\n",
+				     netdev_name(skb->dev));
+		retval = TC_ACT_SHOT;
+		goto dec_nest_level;
+	}
+
+	tcf_lastuse_update(&m->tcf_tm);
+	tcf_action_update_bstats(&m->common, skb);
+
+	blockid = READ_ONCE(m->tcfm_blockid);
+	if (blockid) {
+		retval = tcf_mirred_blockcast(skb, m, blockid, res, retval);
+		goto dec_nest_level;
+	}
+
+	dev = rcu_dereference_bh(m->tcfm_dev);
+	if (unlikely(!dev)) {
+		pr_notice_once("tc mirred: target device is gone\n");
+		err = -ENODEV;
+		tcf_action_inc_overlimit_qstats(&m->common);
+		goto dec_nest_level;
 	}
+
+	m_mac_header_xmit = READ_ONCE(m->tcfm_mac_header_xmit);
+	m_eaction = READ_ONCE(m->tcfm_eaction);
+
+	retval = tcf_mirred_to_dev(skb, m, dev, m_mac_header_xmit, m_eaction,
+				   retval);
+
+dec_nest_level:
 	__this_cpu_dec(mirred_nest_level);
 
 	return retval;
@@ -349,6 +475,7 @@ static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 {
 	unsigned char *b = skb_tail_pointer(skb);
 	struct tcf_mirred *m = to_mirred(a);
+	const u32 blockid = m->tcfm_blockid;
 	struct tc_mirred opt = {
 		.index   = m->tcf_index,
 		.refcnt  = refcount_read(&m->tcf_refcnt) - ref,
@@ -367,6 +494,9 @@ static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 	if (nla_put(skb, TCA_MIRRED_PARMS, sizeof(opt), &opt))
 		goto nla_put_failure;
 
+	if (blockid && nla_put_u32(skb, TCA_MIRRED_BLOCKID, m->tcfm_blockid))
+		goto nla_put_failure;
+
 	tcf_tm_dump(&t, &m->tcf_tm);
 	if (nla_put_64bit(skb, TCA_MIRRED_TM, sizeof(t), &t, TCA_MIRRED_PAD))
 		goto nla_put_failure;
@@ -397,6 +527,8 @@ static int mirred_device_event(struct notifier_block *unused,
 				 * net_device are already rcu protected.
 				 */
 				RCU_INIT_POINTER(m->tcfm_dev, NULL);
+			} else if (m->tcfm_blockid) {
+				m->tcfm_blockid = 0;
 			}
 			spin_unlock_bh(&m->tcf_lock);
 		}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use
  2023-12-14 14:10 [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Victor Nogueira
                   ` (2 preceding siblings ...)
  2023-12-14 14:10 ` [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block Victor Nogueira
@ 2023-12-14 14:29 ` Marcelo Ricardo Leitner
  3 siblings, 0 replies; 7+ messages in thread
From: Marcelo Ricardo Leitner @ 2023-12-14 14:29 UTC (permalink / raw)
  To: Victor Nogueira
  Cc: jhs, davem, edumazet, kuba, pabeni, xiyou.wangcong, jiri, vladbu,
	paulb, pctammela, netdev, kernel

On Thu, Dec 14, 2023 at 11:10:03AM -0300, Victor Nogueira wrote:

LGTM

To the patchset,
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block
  2023-12-14 14:10 ` [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block Victor Nogueira
@ 2023-12-15  3:09   ` Jakub Kicinski
  2023-12-15 11:12     ` Victor Nogueira
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2023-12-15  3:09 UTC (permalink / raw)
  To: Victor Nogueira
  Cc: jhs, davem, edumazet, pabeni, xiyou.wangcong, jiri, mleitner,
	vladbu, paulb, pctammela, netdev, kernel

On Thu, 14 Dec 2023 11:10:06 -0300 Victor Nogueira wrote:
> So far the mirred action has dealt with syntax that handles mirror/redirection for netdev.
> A matching packet is redirected or mirrored to a target netdev.
> 
> In this patch we enable mirred to mirror to a tc block as well.
> IOW, the new syntax looks as follows:
> ... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> >
> 
> Examples of mirroring or redirecting to a tc block:
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
> 
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22

net/sched/act_mirred.c:424:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]
  424 |         int err = 0;
      |             ^
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block
  2023-12-15  3:09   ` Jakub Kicinski
@ 2023-12-15 11:12     ` Victor Nogueira
  0 siblings, 0 replies; 7+ messages in thread
From: Victor Nogueira @ 2023-12-15 11:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: jhs, davem, edumazet, pabeni, xiyou.wangcong, jiri, mleitner,
	vladbu, paulb, pctammela, netdev, kernel



On 15/12/2023 00:09, Jakub Kicinski wrote:
> On Thu, 14 Dec 2023 11:10:06 -0300 Victor Nogueira wrote:
>> So far the mirred action has dealt with syntax that handles mirror/redirection for netdev.
>> A matching packet is redirected or mirrored to a target netdev.
>>
>> In this patch we enable mirred to mirror to a tc block as well.
>> IOW, the new syntax looks as follows:
>> ... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> >
>>
>> Examples of mirroring or redirecting to a tc block:
>> $ tc filter add block 22 protocol ip pref 25 \
>>    flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
>>
>> $ tc filter add block 22 protocol ip pref 25 \
>>    flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22
> 
> net/sched/act_mirred.c:424:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]
>    424 |         int err = 0;
>        |             ^

Thank you for the catch.
Sent a v7 fixing it.

cheers,
Victor

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-12-15 11:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-14 14:10 [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Victor Nogueira
2023-12-14 14:10 ` [PATCH net-next v6 1/3] net/sched: Introduce tc block netdev tracking infra Victor Nogueira
2023-12-14 14:10 ` [PATCH net-next v6 2/3] net/sched: cls_api: Expose tc block to the datapath Victor Nogueira
2023-12-14 14:10 ` [PATCH net-next v6 3/3] net/sched: act_mirred: Allow mirred to block Victor Nogueira
2023-12-15  3:09   ` Jakub Kicinski
2023-12-15 11:12     ` Victor Nogueira
2023-12-14 14:29 ` [PATCH net-next v6 0/3] net/sched: Introduce tc block ports tracking and use Marcelo Ricardo Leitner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).