netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Netfilter ingress support (v3)
@ 2015-05-04 10:50 Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 1/4] net: add minimalistic ingress filter hook and port sch_ingress on top of it Pablo Neira Ayuso
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2015-05-04 10:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kaber, jhs

Hi,

Another round of the patchset to add Netfilter ingress support. This new
patchset introduces the necessary updates in 2 steps:

1) Add minismalistic ingress hook infrastructure that allows to register one
   client at a time, so you hit -EBUSY in case the hook is in use. Basically,
   we have a function pointer that is rcu-protected to invoke the corresponding
   filter framework which has minimal performance impact in the critical ingress
   path and avoid more pollution in it. This patch also ports the ingress qdisc
   on top of this.

   This also results in most of the qdisc ingress code that used to be embedded
   into net/core/dev.c can now be placed in net/sched/sch_ingress.c, which
   should allow to get rid of the Qdisc->enqueue() call.

2) Add Netfilter ingress support using the minimalistic hook infrastructure.
   There is some extra memory consumption (24 bytes) in net_device but pahole
   reports here a hole due to ____cacheline_aligned_in_smp to get the transmit
   path area in a different cache line. So I'm not sure it's worth the effort
   to reduce this to 8 bytes at the cost of getting the hook code a bit more
   complicated.

As already said, this opens the window to existing nftables core features that
are not present in qdisc ingress and that can be used out-of-the-box, most
relevantly:

1) Multi-dimensional key dictionary lookups.
2) Arbitrary stateful flow tables.
3) Transactions.

Among others. You can find more on previous RFCs, see:

http://www.spinics.net/lists/netdev/msg325210.html
http://marc.info/?l=netfilter-devel&m=143033337020328&w=2

In summary, this provides the facility to keep both tc and netfilter in place,
while the user can select what they prefer to filter from ingress.

Thanks.

Pablo Neira Ayuso (4):
  net: add minimalistic ingress filter hook and port sch_ingress on top of it
  netfilter: cleanup struct nf_hook_ops indentation
  netfilter: add hook list to nf_hook_state
  net: add netfilter ingress hook

 include/linux/netdevice.h         |   16 ++++++++
 include/linux/netfilter.h         |   22 +++++-----
 include/linux/netfilter_ingress.h |   26 ++++++++++++
 include/uapi/linux/netfilter.h    |    6 +++
 net/Kconfig                       |    9 +++++
 net/core/dev.c                    |   81 ++++++++++++++++++-------------------
 net/netfilter/Makefile            |    1 +
 net/netfilter/core.c              |   29 ++++++++++---
 net/netfilter/ingress.c           |   41 +++++++++++++++++++
 net/sched/Kconfig                 |    1 +
 net/sched/sch_ingress.c           |   38 +++++++++++++++--
 11 files changed, 211 insertions(+), 59 deletions(-)
 create mode 100644 include/linux/netfilter_ingress.h
 create mode 100644 net/netfilter/ingress.c

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] net: add minimalistic ingress filter hook and port sch_ingress on top of it
  2015-05-04 10:50 [PATCH 0/4] Netfilter ingress support (v3) Pablo Neira Ayuso
@ 2015-05-04 10:50 ` Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 2/4] netfilter: cleanup struct nf_hook_ops indentation Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2015-05-04 10:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kaber, jhs

This patch adds a minimalistic hook infrastructure in netif_receive_core() that
allows you to attach one hook function at a time. In case that is already in
use, you will hit -EBUSY. The first client of this is sch_ingress that has been
ported on top of it. The abstraction is lightweight to avoid performance
concerns, and it is ruled by a global static key.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netdevice.h |   13 ++++++++
 net/Kconfig               |    3 ++
 net/core/dev.c            |   79 +++++++++++++++++++++------------------------
 net/sched/Kconfig         |    1 +
 net/sched/sch_ingress.c   |   38 ++++++++++++++++++++--
 5 files changed, 89 insertions(+), 45 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1899c74..18e1500 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -770,6 +770,15 @@ struct netdev_phys_item_id {
 typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
 				       struct sk_buff *skb);
 
+/* This allows you to register and to unregister a function the hook for
+ * ingress filtering.
+ */
+typedef struct sk_buff *ingress_hook_func_t(struct sk_buff *skb);
+
+int dev_ingress_hook_register(struct net_device *dev,
+			      ingress_hook_func_t *hookfn);
+void dev_ingress_hook_unregister(struct net_device *dev);
+
 /*
  * This structure defines the management hooks for network devices.
  * The following hooks can be defined; unless noted otherwise, they are
@@ -1655,7 +1664,11 @@ struct net_device {
 	rx_handler_func_t __rcu	*rx_handler;
 	void __rcu		*rx_handler_data;
 
+#ifdef CONFIG_NET_INGRESS_HOOK
+	ingress_hook_func_t __rcu *ingress_hook;
+#endif
 	struct netdev_queue __rcu *ingress_queue;
+
 	unsigned char		broadcast[MAX_ADDR_LEN];
 #ifdef CONFIG_RFS_ACCEL
 	struct cpu_rmap		*rx_cpu_rmap;
diff --git a/net/Kconfig b/net/Kconfig
index 44dd578..f0e2f3f 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -230,6 +230,9 @@ source "net/mpls/Kconfig"
 source "net/hsr/Kconfig"
 source "net/switchdev/Kconfig"
 
+config NET_INGRESS_HOOK
+	bool
+
 config RPS
 	bool
 	depends on SMP && SYSFS
diff --git a/net/core/dev.c b/net/core/dev.c
index 862875e..126d0b1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1630,20 +1630,41 @@ int call_netdevice_notifiers(unsigned long val, struct net_device *dev)
 }
 EXPORT_SYMBOL(call_netdevice_notifiers);
 
-#ifdef CONFIG_NET_CLS_ACT
+#ifdef CONFIG_NET_INGRESS_HOOK
 static struct static_key ingress_needed __read_mostly;
 
-void net_inc_ingress_queue(void)
+static DEFINE_MUTEX(ingress_hook_mutex);
+
+int dev_ingress_hook_register(struct net_device *dev,
+			      ingress_hook_func_t *hookfn)
 {
+	int ret = 0;
+
+	mutex_lock(&ingress_hook_mutex);
+	if (dev->ingress_hook != NULL) {
+		ret = -EBUSY;
+		goto err1;
+	}
+	rcu_assign_pointer(dev->ingress_hook, hookfn);
+	mutex_unlock(&ingress_hook_mutex);
+
 	static_key_slow_inc(&ingress_needed);
+	return 0;
+err1:
+	mutex_unlock(&ingress_hook_mutex);
+	return ret;
 }
-EXPORT_SYMBOL_GPL(net_inc_ingress_queue);
+EXPORT_SYMBOL_GPL(dev_ingress_hook_register);
 
-void net_dec_ingress_queue(void)
+void dev_ingress_hook_unregister(struct net_device *dev)
 {
+	mutex_lock(&ingress_hook_mutex);
+	rcu_assign_pointer(dev->ingress_hook, NULL);
+	mutex_unlock(&ingress_hook_mutex);
 	static_key_slow_dec(&ingress_needed);
+	synchronize_rcu();
 }
-EXPORT_SYMBOL_GPL(net_dec_ingress_queue);
+EXPORT_SYMBOL_GPL(dev_ingress_hook_unregister);
 #endif
 
 static struct static_key netstamp_needed __read_mostly;
@@ -3520,38 +3541,15 @@ int (*br_fdb_test_addr_hook)(struct net_device *dev,
 EXPORT_SYMBOL_GPL(br_fdb_test_addr_hook);
 #endif
 
-#ifdef CONFIG_NET_CLS_ACT
-/* TODO: Maybe we should just force sch_ingress to be compiled in
- * when CONFIG_NET_CLS_ACT is? otherwise some useless instructions
- * a compare and 2 stores extra right now if we dont have it on
- * but have CONFIG_NET_CLS_ACT
- * NOTE: This doesn't stop any functionality; if you dont have
- * the ingress scheduler, you just can't add policies on ingress.
- *
- */
-static int ing_filter(struct sk_buff *skb, struct netdev_queue *rxq)
-{
-	int result = TC_ACT_OK;
-	struct Qdisc *q;
-
-	skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
-
-	q = rcu_dereference(rxq->qdisc);
-	if (q != &noop_qdisc) {
-		if (likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
-			result = qdisc_enqueue_root(skb, q);
-	}
-
-	return result;
-}
-
+#ifdef CONFIG_NET_INGRESS_HOOK
 static inline struct sk_buff *handle_ing(struct sk_buff *skb,
 					 struct packet_type **pt_prev,
 					 int *ret, struct net_device *orig_dev)
 {
-	struct netdev_queue *rxq = rcu_dereference(skb->dev->ingress_queue);
+	ingress_hook_func_t *ingress_hook;
 
-	if (!rxq || rcu_access_pointer(rxq->qdisc) == &noop_qdisc)
+	ingress_hook = rcu_dereference(skb->dev->ingress_hook);
+	if (ingress_hook == NULL)
 		return skb;
 
 	if (*pt_prev) {
@@ -3559,14 +3557,7 @@ static inline struct sk_buff *handle_ing(struct sk_buff *skb,
 		*pt_prev = NULL;
 	}
 
-	switch (ing_filter(skb, rxq)) {
-	case TC_ACT_SHOT:
-	case TC_ACT_STOLEN:
-		kfree_skb(skb);
-		return NULL;
-	}
-
-	return skb;
+	return ingress_hook(skb);
 }
 #endif
 
@@ -3700,13 +3691,14 @@ another_round:
 	}
 
 skip_taps:
-#ifdef CONFIG_NET_CLS_ACT
+#ifdef CONFIG_NET_INGRESS_HOOK
 	if (static_key_false(&ingress_needed)) {
 		skb = handle_ing(skb, &pt_prev, &ret, orig_dev);
 		if (!skb)
 			goto unlock;
 	}
-
+#endif
+#ifdef CONFIG_NET_CLS_ACT
 	skb->tc_verd = 0;
 ncls:
 #endif
@@ -6846,6 +6838,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	if (netif_alloc_netdev_queues(dev))
 		goto free_all;
 
+#ifdef CONFIG_NET_INGRESS_HOOK
+	RCU_INIT_POINTER(dev->ingress_hook, NULL);
+#endif
 #ifdef CONFIG_SYSFS
 	dev->num_rx_queues = rxqs;
 	dev->real_num_rx_queues = rxqs;
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 2274e72..3cef39e 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -312,6 +312,7 @@ config NET_SCH_PIE
 config NET_SCH_INGRESS
 	tristate "Ingress Qdisc"
 	depends on NET_CLS_ACT
+	select NET_INGRESS_HOOK
 	---help---
 	  Say Y here if you want to use classifiers for incoming packets.
 	  If unsure, say Y.
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index a89cc32..38ddef7 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -88,12 +88,44 @@ static int ingress_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 /* ------------------------------------------------------------- */
 
+static int ingress_filter(struct sk_buff *skb, struct netdev_queue *rxq)
+{
+	int result = TC_ACT_OK;
+	struct Qdisc *q;
+
+	skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
+
+	q = rcu_dereference(rxq->qdisc);
+	if (q != &noop_qdisc) {
+		if (likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
+			result = qdisc_enqueue_root(skb, q);
+	}
+
+	return result;
+}
+
+static struct sk_buff *qdisc_ingress_hook(struct sk_buff *skb)
+{
+	struct netdev_queue *rxq = rcu_dereference(skb->dev->ingress_queue);
+
+	if (!rxq || rcu_access_pointer(rxq->qdisc) == &noop_qdisc)
+		return skb;
+
+	switch (ingress_filter(skb, rxq)) {
+	case TC_ACT_SHOT:
+	case TC_ACT_STOLEN:
+		kfree_skb(skb);
+		return 0;
+	}
+
+	return skb;
+}
+
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 {
-	net_inc_ingress_queue();
 	sch->flags |= TCQ_F_CPUSTATS;
 
-	return 0;
+	return dev_ingress_hook_register(qdisc_dev(sch), qdisc_ingress_hook);
 }
 
 static void ingress_destroy(struct Qdisc *sch)
@@ -101,7 +133,7 @@ static void ingress_destroy(struct Qdisc *sch)
 	struct ingress_qdisc_data *p = qdisc_priv(sch);
 
 	tcf_destroy_chain(&p->filter_list);
-	net_dec_ingress_queue();
+	dev_ingress_hook_unregister(qdisc_dev(sch));
 }
 
 static int ingress_dump(struct Qdisc *sch, struct sk_buff *skb)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] netfilter: cleanup struct nf_hook_ops indentation
  2015-05-04 10:50 [PATCH 0/4] Netfilter ingress support (v3) Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 1/4] net: add minimalistic ingress filter hook and port sch_ingress on top of it Pablo Neira Ayuso
@ 2015-05-04 10:50 ` Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 3/4] netfilter: add hook list to nf_hook_state Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2015-05-04 10:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kaber, jhs

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 63560d0..83be4a3 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -79,16 +79,16 @@ typedef unsigned int nf_hookfn(const struct nf_hook_ops *ops,
 			       const struct nf_hook_state *state);
 
 struct nf_hook_ops {
-	struct list_head list;
+	struct list_head 	list;
 
 	/* User fills in from here down. */
-	nf_hookfn	*hook;
-	struct module	*owner;
-	void		*priv;
-	u_int8_t	pf;
-	unsigned int	hooknum;
+	nf_hookfn		*hook;
+	struct module		*owner;
+	void			*priv;
+	u_int8_t		pf;
+	unsigned int		hooknum;
 	/* Hooks are ordered in ascending priority. */
-	int		priority;
+	int			priority;
 };
 
 struct nf_sockopt_ops {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] netfilter: add hook list to nf_hook_state
  2015-05-04 10:50 [PATCH 0/4] Netfilter ingress support (v3) Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 1/4] net: add minimalistic ingress filter hook and port sch_ingress on top of it Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 2/4] netfilter: cleanup struct nf_hook_ops indentation Pablo Neira Ayuso
@ 2015-05-04 10:50 ` Pablo Neira Ayuso
  2015-05-04 10:50 ` [PATCH 4/4] net: add netfilter ingress hook Pablo Neira Ayuso
  2015-05-04 15:56 ` [PATCH 0/4] Netfilter ingress support (v3) Alexei Starovoitov
  4 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2015-05-04 10:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kaber, jhs

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h |    7 +++++--
 net/netfilter/core.c      |    6 ++----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 83be4a3..388ed19 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -54,10 +54,12 @@ struct nf_hook_state {
 	struct net_device *in;
 	struct net_device *out;
 	struct sock *sk;
+	struct list_head *hook_list;
 	int (*okfn)(struct sock *, struct sk_buff *);
 };
 
 static inline void nf_hook_state_init(struct nf_hook_state *p,
+				      struct list_head *hook_list,
 				      unsigned int hook,
 				      int thresh, u_int8_t pf,
 				      struct net_device *indev,
@@ -71,6 +73,7 @@ static inline void nf_hook_state_init(struct nf_hook_state *p,
 	p->in = indev;
 	p->out = outdev;
 	p->sk = sk;
+	p->hook_list = hook_list;
 	p->okfn = okfn;
 }
 
@@ -166,8 +169,8 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned int hook,
 	if (nf_hooks_active(pf, hook)) {
 		struct nf_hook_state state;
 
-		nf_hook_state_init(&state, hook, thresh, pf,
-				   indev, outdev, sk, okfn);
+		nf_hook_state_init(&state, &nf_hooks[pf][hook], hook, thresh,
+				   pf, indev, outdev, sk, okfn);
 		return nf_hook_slow(skb, &state);
 	}
 	return 1;
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index e616301..e418cfd 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -166,11 +166,9 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state)
 	/* We may already have this, but read-locks nest anyway */
 	rcu_read_lock();
 
-	elem = list_entry_rcu(&nf_hooks[state->pf][state->hook],
-			      struct nf_hook_ops, list);
+	elem = list_entry_rcu(state->hook_list, struct nf_hook_ops, list);
 next_hook:
-	verdict = nf_iterate(&nf_hooks[state->pf][state->hook], skb, state,
-			     &elem);
+	verdict = nf_iterate(state->hook_list, skb, state, &elem);
 	if (verdict == NF_ACCEPT || verdict == NF_STOP) {
 		ret = 1;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] net: add netfilter ingress hook
  2015-05-04 10:50 [PATCH 0/4] Netfilter ingress support (v3) Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2015-05-04 10:50 ` [PATCH 3/4] netfilter: add hook list to nf_hook_state Pablo Neira Ayuso
@ 2015-05-04 10:50 ` Pablo Neira Ayuso
  2015-05-04 15:56 ` [PATCH 0/4] Netfilter ingress support (v3) Alexei Starovoitov
  4 siblings, 0 replies; 14+ messages in thread
From: Pablo Neira Ayuso @ 2015-05-04 10:50 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kaber, jhs

This patch adds a new NFPROTO_NETDEV family that allows you to register
per-device hooks from the ingress path. This is built upon the minimalistic
ingress hook infrastructure.

The caller is responsible for holding/putting the reference on the net_device
that is attached to nf_hook_ops.
---
 include/linux/netdevice.h         |    3 +++
 include/linux/netfilter.h         |    1 +
 include/linux/netfilter_ingress.h |   26 +++++++++++++++++++++++
 include/uapi/linux/netfilter.h    |    6 ++++++
 net/Kconfig                       |    6 ++++++
 net/core/dev.c                    |    3 +++
 net/netfilter/Makefile            |    1 +
 net/netfilter/core.c              |   23 ++++++++++++++++++++-
 net/netfilter/ingress.c           |   41 +++++++++++++++++++++++++++++++++++++
 9 files changed, 109 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/netfilter_ingress.h
 create mode 100644 net/netfilter/ingress.c

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 18e1500..8333feb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1668,6 +1668,9 @@ struct net_device {
 	ingress_hook_func_t __rcu *ingress_hook;
 #endif
 	struct netdev_queue __rcu *ingress_queue;
+#ifdef CONFIG_NETFILTER_INGRESS
+	struct list_head 	nf_hooks_ingress;
+#endif
 
 	unsigned char		broadcast[MAX_ADDR_LEN];
 #ifdef CONFIG_RFS_ACCEL
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 388ed19..f91715a 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -90,6 +90,7 @@ struct nf_hook_ops {
 	void			*priv;
 	u_int8_t		pf;
 	unsigned int		hooknum;
+	struct net_device	*dev;
 	/* Hooks are ordered in ascending priority. */
 	int			priority;
 };
diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h
new file mode 100644
index 0000000..5d94872
--- /dev/null
+++ b/include/linux/netfilter_ingress.h
@@ -0,0 +1,26 @@
+#ifndef _NETFILTER_INGRESS_H_
+#define _NETFILTER_INGRESS_H_
+
+#include <linux/netdevice.h>
+
+#ifdef CONFIG_NETFILTER_INGRESS
+struct list_head *nf_register_ingress_hook(struct nf_hook_ops *reg);
+void nf_unregister_ingress_hook(struct nf_hook_ops *reg);
+
+static inline void nf_hook_ingress_init(struct net_device *dev)
+{
+	INIT_LIST_HEAD(&dev->nf_hooks_ingress);
+}
+#else
+static inline struct list_head *
+nf_register_ingress_hook(struct nf_hook_ops *reg)
+{
+	return &nf_hooks[reg->pf][reg->hooknum];
+}
+
+static inline void nf_unregister_ingress_hook(struct nf_hook_ops *reg) {}
+
+static inline void nf_hook_ingress_init(struct net_device *dev) {}
+#endif
+
+#endif /* _NETFILTER_INGRESS_H_ */
diff --git a/include/uapi/linux/netfilter.h b/include/uapi/linux/netfilter.h
index ef1b1f8..177027c 100644
--- a/include/uapi/linux/netfilter.h
+++ b/include/uapi/linux/netfilter.h
@@ -51,11 +51,17 @@ enum nf_inet_hooks {
 	NF_INET_NUMHOOKS
 };
 
+enum nf_dev_hooks {
+	NF_NETDEV_INGRESS,
+	NF_NETDEV_NUMHOOKS
+};
+
 enum {
 	NFPROTO_UNSPEC =  0,
 	NFPROTO_INET   =  1,
 	NFPROTO_IPV4   =  2,
 	NFPROTO_ARP    =  3,
+	NFPROTO_NETDEV =  5,
 	NFPROTO_BRIDGE =  7,
 	NFPROTO_IPV6   = 10,
 	NFPROTO_DECNET = 12,
diff --git a/net/Kconfig b/net/Kconfig
index f0e2f3f..78d58c9 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -191,6 +191,12 @@ config BRIDGE_NETFILTER
 
 	  If unsure, say N.
 
+config NETFILTER_INGRESS
+	bool "Netfilter ingress hooks"
+	select NET_INGRESS_HOOK
+	help
+	  You can say Y here if you want to enable Netfilter ingress hook.
+
 source "net/netfilter/Kconfig"
 source "net/ipv4/netfilter/Kconfig"
 source "net/ipv6/netfilter/Kconfig"
diff --git a/net/core/dev.c b/net/core/dev.c
index 126d0b1..99d8728 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -135,6 +135,7 @@
 #include <linux/if_macvlan.h>
 #include <linux/errqueue.h>
 #include <linux/hrtimer.h>
+#include <linux/netfilter_ingress.h>
 
 #include "net-sysfs.h"
 
@@ -6841,6 +6842,8 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 #ifdef CONFIG_NET_INGRESS_HOOK
 	RCU_INIT_POINTER(dev->ingress_hook, NULL);
 #endif
+	nf_hook_ingress_init(dev);
+
 #ifdef CONFIG_SYSFS
 	dev->num_rx_queues = rxqs;
 	dev->real_num_rx_queues = rxqs;
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index a87d8b8..f6923e2 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -1,4 +1,5 @@
 netfilter-objs := core.o nf_log.o nf_queue.o nf_sockopt.o
+netfilter-$(CONFIG_NETFILTER_INGRESS) += ingress.o
 
 nf_conntrack-y	:= nf_conntrack_core.o nf_conntrack_standalone.o nf_conntrack_expect.o nf_conntrack_helper.o nf_conntrack_proto.o nf_conntrack_l3proto_generic.o nf_conntrack_proto_generic.o nf_conntrack_proto_tcp.o nf_conntrack_proto_udp.o nf_conntrack_extend.o nf_conntrack_acct.o nf_conntrack_seqadj.o
 nf_conntrack-$(CONFIG_NF_CONNTRACK_TIMEOUT) += nf_conntrack_timeout.o
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index e418cfd..370ea06 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -22,6 +22,7 @@
 #include <linux/proc_fs.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
+#include <linux/netfilter_ingress.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 
@@ -64,10 +65,23 @@ static DEFINE_MUTEX(nf_hook_mutex);
 
 int nf_register_hook(struct nf_hook_ops *reg)
 {
+	struct list_head *nf_hook_list;
 	struct nf_hook_ops *elem;
 
 	mutex_lock(&nf_hook_mutex);
-	list_for_each_entry(elem, &nf_hooks[reg->pf][reg->hooknum], list) {
+	switch (reg->pf) {
+	case NFPROTO_NETDEV:
+		nf_hook_list = nf_register_ingress_hook(reg);
+		if (IS_ERR(nf_hook_list)) {
+			mutex_unlock(&nf_hook_mutex);
+			return PTR_ERR(nf_hook_list);
+		}
+		break;
+	default:
+		nf_hook_list = &nf_hooks[reg->pf][reg->hooknum];
+		break;
+	}
+	list_for_each_entry(elem, nf_hook_list, list) {
 		if (reg->priority < elem->priority)
 			break;
 	}
@@ -84,6 +98,13 @@ void nf_unregister_hook(struct nf_hook_ops *reg)
 {
 	mutex_lock(&nf_hook_mutex);
 	list_del_rcu(&reg->list);
+	switch (reg->pf) {
+	case NFPROTO_NETDEV:
+		nf_unregister_ingress_hook(reg);
+		break;
+	default:
+		break;
+	}
 	mutex_unlock(&nf_hook_mutex);
 #ifdef HAVE_JUMP_LABEL
 	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
diff --git a/net/netfilter/ingress.c b/net/netfilter/ingress.c
new file mode 100644
index 0000000..82bcfd1
--- /dev/null
+++ b/net/netfilter/ingress.c
@@ -0,0 +1,41 @@
+#include <linux/skbuff.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter_ingress.h>
+
+static struct sk_buff *nf_hook_ingress(struct sk_buff *skb)
+{
+	struct nf_hook_state state;
+
+	nf_hook_state_init(&state, &skb->dev->nf_hooks_ingress,
+			   NF_NETDEV_INGRESS, INT_MIN, NFPROTO_NETDEV, NULL,
+			   skb->dev, NULL, NULL);
+	if (nf_hook_slow(skb, &state) < 0)
+		return NULL;
+
+	return skb;
+}
+
+struct list_head *nf_register_ingress_hook(struct nf_hook_ops *reg)
+{
+	int ret;
+
+	BUG_ON(reg->dev == NULL);
+
+	if (reg->hooknum == NF_NETDEV_INGRESS &&
+	    list_empty(&reg->dev->nf_hooks_ingress)) {
+		ret = dev_ingress_hook_register(reg->dev, nf_hook_ingress);
+		if (ret < 0)
+			return ERR_PTR(ret);
+	}
+
+	return &reg->dev->nf_hooks_ingress;
+}
+
+void nf_unregister_ingress_hook(struct nf_hook_ops *reg)
+{
+	WARN_ON(reg->dev == NULL);
+
+	if (reg->hooknum == NF_NETDEV_INGRESS &&
+	    list_empty(&reg->dev->nf_hooks_ingress))
+		dev_ingress_hook_unregister(reg->dev);
+}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 10:50 [PATCH 0/4] Netfilter ingress support (v3) Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2015-05-04 10:50 ` [PATCH 4/4] net: add netfilter ingress hook Pablo Neira Ayuso
@ 2015-05-04 15:56 ` Alexei Starovoitov
  2015-05-04 16:19   ` Florian Westphal
  4 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2015-05-04 15:56 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev, kaber, jhs

On Mon, May 04, 2015 at 12:50:45PM +0200, Pablo Neira Ayuso wrote:
> Hi,
> 
> Another round of the patchset to add Netfilter ingress support. This new
> patchset introduces the necessary updates in 2 steps:
> 
> 1) Add minismalistic ingress hook infrastructure that allows to register one
>    client at a time, so you hit -EBUSY in case the hook is in use. Basically,
>    we have a function pointer that is rcu-protected to invoke the corresponding
>    filter framework which has minimal performance impact in the critical ingress
>    path and avoid more pollution in it. This patch also ports the ingress qdisc
>    on top of this.
...
> In summary, this provides the facility to keep both tc and netfilter in place,
> while the user can select what they prefer to filter from ingress.

wow, I have to say I'm impressed. That's the most genius way to
really kill TC.
Patch 1 looks good, patch 2,3,4 are nicely building on top...
until somebody starts asking how patch 5 will look.
In the future netfilter ingress module will be loaded along with
other iptables modules just like conntrack is today and users
who would want to use ingress tc would have to _unload_
netfilter_ingress module, but if it has interesting dependencies
it may mean to unload iptables and the rest.
So at the end the users will have a binary choice either to use
iptables/nft or use tc, because they won't be able to co-exist
because ingress_hook is the only one.
I don't understand this 'tc hate'. Why go out of the way to
make TC more difficult to use ?
Just add _new_ hook for netfilter ingress and both subsystems
can happily co-exist.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 15:56 ` [PATCH 0/4] Netfilter ingress support (v3) Alexei Starovoitov
@ 2015-05-04 16:19   ` Florian Westphal
  2015-05-04 17:21     ` Jamal Hadi Salim
  0 siblings, 1 reply; 14+ messages in thread
From: Florian Westphal @ 2015-05-04 16:19 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Pablo Neira Ayuso, netfilter-devel, davem, netdev, kaber, jhs

Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > Another round of the patchset to add Netfilter ingress support. This new
> > patchset introduces the necessary updates in 2 steps:
> > 
> > 1) Add minismalistic ingress hook infrastructure that allows to register one
> >    client at a time, so you hit -EBUSY in case the hook is in use. Basically,
> >    we have a function pointer that is rcu-protected to invoke the corresponding
> >    filter framework which has minimal performance impact in the critical ingress
> >    path and avoid more pollution in it. This patch also ports the ingress qdisc
> >    on top of this.
> ...
> > In summary, this provides the facility to keep both tc and netfilter in place,
> > while the user can select what they prefer to filter from ingress.
> 
> wow, I have to say I'm impressed. That's the most genius way to
> really kill TC.
> Patch 1 looks good, patch 2,3,4 are nicely building on top...
> until somebody starts asking how patch 5 will look.
> In the future netfilter ingress module will be loaded along with
> other iptables modules just like conntrack is today and users
> who would want to use ingress tc would have to _unload_
> netfilter_ingress module, but if it has interesting dependencies
> it may mean to unload iptables and the rest.

FWIW while I think this is a valid concern, I believe its unfounded.

netfilter_ingress must not force run-time
dependencies like 'oh, you want tc, too bad, no conntrack for you)'.

(and i don't see any need for such a dependency).

> So at the end the users will have a binary choice either to use
> iptables/nft or use tc, because they won't be able to co-exist
> because ingress_hook is the only one.
> I don't understand this 'tc hate'. Why go out of the way to
> make TC more difficult to use ?

I don't see this having anything to do with 'TC hate'.

Lets face it, tc and netfilter are horrid when it comes to play
nice with each other, and functionality provided by both partially
overlaps.

Just look at all the hacks we have in tc to use netfilter (iptables,
conntrack) functionality, or even iptables CLASSIFY target (aka
"tc filters are too hard for me").

In case there is later some really good reason why you need both
tc ingress and nft ingress simultaneously then we can always revisit this
and add a second hook for coexistence, at the expense of more
complexity.

> Just add _new_ hook for netfilter ingress and both subsystems
> can happily co-exist.

I don't see a use case where you'd need both tc ingress and nft
ingress at the same time.  Can you elaborate?

At the moment I think having two hooks provides no advantage, it only
complicates code.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 16:19   ` Florian Westphal
@ 2015-05-04 17:21     ` Jamal Hadi Salim
  2015-05-04 17:43       ` Florian Westphal
  0 siblings, 1 reply; 14+ messages in thread
From: Jamal Hadi Salim @ 2015-05-04 17:21 UTC (permalink / raw)
  To: Florian Westphal, Alexei Starovoitov
  Cc: Pablo Neira Ayuso, netfilter-devel, davem, netdev, kaber

On 05/04/15 12:19, Florian Westphal wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

>> wow, I have to say I'm impressed. That's the most genius way to
>> really kill TC.
>> Patch 1 looks good, patch 2,3,4 are nicely building on top...
>> until somebody starts asking how patch 5 will look.
>> In the future netfilter ingress module will be loaded along with
>> other iptables modules just like conntrack is today and users
>> who would want to use ingress tc would have to _unload_
>> netfilter_ingress module, but if it has interesting dependencies
>> it may mean to unload iptables and the rest.
>
> FWIW while I think this is a valid concern, I believe its unfounded.
>
> netfilter_ingress must not force run-time
> dependencies like 'oh, you want tc, too bad, no conntrack for you)'.
>
> (and i don't see any need for such a dependency).
>

It is an either-or choice. You cant have both. The _evil genius_ part i
think is that distros which ship with iptables rules and conntracking
on are going to not even turn on tc and my scripts now have to go
unload one.
But even if the scripts worked (perhaps there are plans to make sure
all scripts continue to work transparently), i care about performance
and  youve suddenly taken that away from me.

So i would agree with adding the two hooks. Priority should be given
to tc in the code path.

cheers,
jamal

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 17:21     ` Jamal Hadi Salim
@ 2015-05-04 17:43       ` Florian Westphal
  2015-05-04 18:47         ` Jamal Hadi Salim
  0 siblings, 1 reply; 14+ messages in thread
From: Florian Westphal @ 2015-05-04 17:43 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Florian Westphal, Alexei Starovoitov, Pablo Neira Ayuso,
	netfilter-devel, davem, netdev, kaber

Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 05/04/15 12:19, Florian Westphal wrote:
> >Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> >>wow, I have to say I'm impressed. That's the most genius way to
> >>really kill TC.
> >>Patch 1 looks good, patch 2,3,4 are nicely building on top...
> >>until somebody starts asking how patch 5 will look.
> >>In the future netfilter ingress module will be loaded along with
> >>other iptables modules just like conntrack is today and users
> >>who would want to use ingress tc would have to _unload_
> >>netfilter_ingress module, but if it has interesting dependencies
> >>it may mean to unload iptables and the rest.
> >
> >FWIW while I think this is a valid concern, I believe its unfounded.
> >
> >netfilter_ingress must not force run-time
> >dependencies like 'oh, you want tc, too bad, no conntrack for you)'.
> >
> >(and i don't see any need for such a dependency).
> >
> 
> It is an either-or choice. You cant have both.

Again, why is there a need to have both (at same time)?

> The _evil genius_ part i

Please don't accuse anyone of being 'evil'.

I think that tc and netfilter are both broken by design in the sense
that we have two different systems with partially overlapping
functionality while interaction between them consists of hacks.

It works both ways, iptables CLASSIFY target is also not very
elegant from a design point of view, it bolts the packet/connection matching
functionality provided by iptables to qdisc hierarchy.

> think is that distros which ship with iptables rules and conntracking
> on are going to not even turn on tc and my scripts now have to go
> unload one.

You mean add 'rmmod nft_ingress' or something similar?

That might be a problem.  One possible way out would be to
make 'tc qdisc add dev eth0 ingress' silently unregister nft ingress
on kernel side (but not vice versa).

Not nice, but it would keep compatibility with tc ingress scripts.

> But even if the scripts worked (perhaps there are plans to make sure
> all scripts continue to work transparently), i care about performance
> and  youve suddenly taken that away from me.

I don't think thats an unsolveable problem, currently we have the
qdisc->enqueue() indirection, now we have hook + qdisc->enqueue() BUT
AFAIU the latter could now be avoided by having ingress hook
interact with sch_ingress directly.

That being said, I am not opposed to two hooks, I just don't see any
technical reason whatsoever why one would need two different ingress
classification engines in use at same time.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 17:43       ` Florian Westphal
@ 2015-05-04 18:47         ` Jamal Hadi Salim
  2015-05-04 18:59           ` Florian Westphal
  0 siblings, 1 reply; 14+ messages in thread
From: Jamal Hadi Salim @ 2015-05-04 18:47 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Alexei Starovoitov, Pablo Neira Ayuso, netfilter-devel, davem,
	netdev, kaber

On 05/04/15 13:43, Florian Westphal wrote:
> Jamal Hadi Salim <jhs@mojatatu.com> wrote:

>> It is an either-or choice. You cant have both.
>
> Again, why is there a need to have both (at same time)?
>

Pick your favorite distro. Do you seriously believe my tc scripts
will work by default as they used to when the distro ships
with iptables turned on? Get a freshly minted distro
and try to check what the defaults are.
I have no idea what some of these things are for but they
are there.

>> The _evil genius_ part i
>
> Please don't accuse anyone of being 'evil'.
>

It is a figure of speech and was intended to be humorous.
I consider Pablo a friend, sorry if that came out wrong.

> I think that tc and netfilter are both broken by design in the sense
> that we have two different systems with partially overlapping
> functionality while interaction between them consists of hacks.
>
> It works both ways, iptables CLASSIFY target is also not very
> elegant from a design point of view, it bolts the packet/connection matching
> functionality provided by iptables to qdisc hierarchy.
>

Well, from the tc perspective the angle has been one of laziness
and avoiding to rewrite any features that netfilter has. Essentially
a gateway-to-iptables. It has not been easy.
It does look silly if we copy things netfilter does in our view.

>> think is that distros which ship with iptables rules and conntracking
>> on are going to not even turn on tc and my scripts now have to go
>> unload one.
>
> You mean add 'rmmod nft_ingress' or something similar?
>
> That might be a problem.  One possible way out would be to
> make 'tc qdisc add dev eth0 ingress' silently unregister nft ingress
> on kernel side (but not vice versa).
>
> Not nice, but it would keep compatibility with tc ingress scripts.
>

Assuming there is something else using the nft side, now they break
because tc has taken over.

>> But even if the scripts worked (perhaps there are plans to make sure
>> all scripts continue to work transparently), i care about performance
>> and  youve suddenly taken that away from me.
>
> I don't think thats an unsolveable problem, currently we have the
> qdisc->enqueue() indirection, now we have hook + qdisc->enqueue() BUT
> AFAIU the latter could now be avoided by having ingress hook
> interact with sch_ingress directly.
>
> That being said, I am not opposed to two hooks, I just don't see any
> technical reason whatsoever why one would need two different ingress
> classification engines in use at same time.
>

What i described above.

cheers,
jamal

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 18:47         ` Jamal Hadi Salim
@ 2015-05-04 18:59           ` Florian Westphal
  2015-05-04 20:05             ` Alexei Starovoitov
  0 siblings, 1 reply; 14+ messages in thread
From: Florian Westphal @ 2015-05-04 18:59 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Florian Westphal, Alexei Starovoitov, Pablo Neira Ayuso,
	netfilter-devel, davem, netdev, kaber

Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 05/04/15 13:43, Florian Westphal wrote:
> >Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >Please don't accuse anyone of being 'evil'.
> >
> 
> It is a figure of speech and was intended to be humorous.
> I consider Pablo a friend, sorry if that came out wrong.

Ugh.  I'm sorry Jamal, seems I lack a sense of humor.
In any case, no offense taken.

> >I think that tc and netfilter are both broken by design in the sense
> >that we have two different systems with partially overlapping
> >functionality while interaction between them consists of hacks.
> >
> >It works both ways, iptables CLASSIFY target is also not very
> >elegant from a design point of view, it bolts the packet/connection matching
> >functionality provided by iptables to qdisc hierarchy.
> >
> 
> Well, from the tc perspective the angle has been one of laziness
> and avoiding to rewrite any features that netfilter has. Essentially
> a gateway-to-iptables. It has not been easy.

Right, I can imagine that.

> It does look silly if we copy things netfilter does in our view.

Sure, that would not be good either (perhaps even worse).

> >>think is that distros which ship with iptables rules and conntracking
> >>on are going to not even turn on tc and my scripts now have to go
> >>unload one.
> >
> >You mean add 'rmmod nft_ingress' or something similar?
> >
> >That might be a problem.  One possible way out would be to
> >make 'tc qdisc add dev eth0 ingress' silently unregister nft ingress
> >on kernel side (but not vice versa).
> >
> >Not nice, but it would keep compatibility with tc ingress scripts.
> >
> 
> Assuming there is something else using the nft side, now they break
> because tc has taken over.

Well, thats true.  But I would not care at this point (in the 'nft
ingress breaks when tc takes over') case.

I'd first like to see what distros would do (by default, that is).

I'd prefer to not base a 'we need to support both at same time' decision
on speculation as to what some distribution might do at some point.

Anything that makes sure that tc ingress doesn't break is fine.
If that turns out to be 'two hooks' then, alas, so be it.

I'd just like some evidence first 8-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 18:59           ` Florian Westphal
@ 2015-05-04 20:05             ` Alexei Starovoitov
  2015-05-04 22:21               ` Pablo Neira Ayuso
  0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2015-05-04 20:05 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Jamal Hadi Salim, Pablo Neira Ayuso, netfilter-devel, davem,
	netdev, kaber

On Mon, May 04, 2015 at 08:59:41PM +0200, Florian Westphal wrote:

> At the moment I think having two hooks provides no advantage, it only
> complicates code.

It's exactly the opposite which this thread is showing.
Single hook creates complicated relationship between tc/nft.
There is no code shared. TC critical path becomes slower with
extra dereference and indirect jump. We have to think how
'tc qdisc add ingress' can auto-unload nft_ingress, etc
That is real complexity that can be avoided with two hooks.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 20:05             ` Alexei Starovoitov
@ 2015-05-04 22:21               ` Pablo Neira Ayuso
  2015-05-04 23:04                 ` Thomas Graf
  0 siblings, 1 reply; 14+ messages in thread
From: Pablo Neira Ayuso @ 2015-05-04 22:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Florian Westphal, Jamal Hadi Salim, netfilter-devel, davem,
	netdev, kaber

On Mon, May 04, 2015 at 01:05:08PM -0700, Alexei Starovoitov wrote:
> [...] TC critical path becomes slower with extra dereference and
> indirect jump.

You complained on the performance impact of the initial generic hook
infrastructure that allows qdisc ingress and nft to co-exist, fair
enough.

I came back with an even more simple hook infrastructure, to address
performance concerns, and you indicate that an *extra dereference and
indirect jump* makes this slowier and that both cannot co-exist with
this.

Well, any kind of hook generalization will need at least those minimal
changes. And that allows us to move code that belongs to sch_ingress.
That got embedded into net/core/dev.c because at the time this code
was made when there was no RCU around.

This sounds as an excuse to me.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Netfilter ingress support (v3)
  2015-05-04 22:21               ` Pablo Neira Ayuso
@ 2015-05-04 23:04                 ` Thomas Graf
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Graf @ 2015-05-04 23:04 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Alexei Starovoitov, Florian Westphal, Jamal Hadi Salim,
	netfilter-devel, davem, netdev, kaber

On 05/05/15 at 12:21am, Pablo Neira Ayuso wrote:
> On Mon, May 04, 2015 at 01:05:08PM -0700, Alexei Starovoitov wrote:
> > [...] TC critical path becomes slower with extra dereference and
> > indirect jump.
> 
> You complained on the performance impact of the initial generic hook
> infrastructure that allows qdisc ingress and nft to co-exist, fair
> enough.
> 
> I came back with an even more simple hook infrastructure, to address
> performance concerns, and you indicate that an *extra dereference and
> indirect jump* makes this slowier and that both cannot co-exist with
> this.
> 
> Well, any kind of hook generalization will need at least those minimal
> changes. And that allows us to move code that belongs to sch_ingress.
> That got embedded into net/core/dev.c because at the time this code
> was made when there was no RCU around.

Does anything speak against a separate hook with a static key?

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-05-04 23:04 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-04 10:50 [PATCH 0/4] Netfilter ingress support (v3) Pablo Neira Ayuso
2015-05-04 10:50 ` [PATCH 1/4] net: add minimalistic ingress filter hook and port sch_ingress on top of it Pablo Neira Ayuso
2015-05-04 10:50 ` [PATCH 2/4] netfilter: cleanup struct nf_hook_ops indentation Pablo Neira Ayuso
2015-05-04 10:50 ` [PATCH 3/4] netfilter: add hook list to nf_hook_state Pablo Neira Ayuso
2015-05-04 10:50 ` [PATCH 4/4] net: add netfilter ingress hook Pablo Neira Ayuso
2015-05-04 15:56 ` [PATCH 0/4] Netfilter ingress support (v3) Alexei Starovoitov
2015-05-04 16:19   ` Florian Westphal
2015-05-04 17:21     ` Jamal Hadi Salim
2015-05-04 17:43       ` Florian Westphal
2015-05-04 18:47         ` Jamal Hadi Salim
2015-05-04 18:59           ` Florian Westphal
2015-05-04 20:05             ` Alexei Starovoitov
2015-05-04 22:21               ` Pablo Neira Ayuso
2015-05-04 23:04                 ` Thomas Graf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).