netfilter-devel.vger.kernel.org archive mirror
* [PATCH nf-next] netfilter: remove need for skb_clone in nf_fwd_netdev_egress
@ 2016-11-28 10:40 Florian Westphal
  2016-11-28 10:40 ` [PATCH nf-next] netfilter: ingress: translate 0 nf_hook_slow retval to -EINPROGRESS Florian Westphal
  2016-11-28 10:40 ` [PATCH nf-next] netfilter: add and use nf_fwd_netdev_egress Florian Westphal
  0 siblings, 2 replies; 4+ messages in thread
From: Florian Westphal @ 2016-11-28 10:40 UTC (permalink / raw)
  To: netfilter-devel

The forward expression currently clones the skb, forwards the clone, and
returns NF_DROP to the core.

The first patch allows using NF_STOLEN from the ingress/netdev hook; the
second patch adds a new helper that forwards the original skb instead.



* [PATCH nf-next] netfilter: ingress: translate 0 nf_hook_slow retval to -EINPROGRESS
  2016-11-28 10:40 [PATCH nf-next] netfilter: remove need for skb_clone in nf_fwd_netdev_egress Florian Westphal
@ 2016-11-28 10:40 ` Florian Westphal
  2016-12-06 10:35   ` Pablo Neira Ayuso
  2016-11-28 10:40 ` [PATCH nf-next] netfilter: add and use nf_fwd_netdev_egress Florian Westphal
  1 sibling, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2016-11-28 10:40 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

The caller assumes that a return value < 0 means the skb was stolen (or freed).

All other return values continue skb processing.

nf_hook_slow() returns one of three kinds of value:

A) a (negative) errno value: the skb was dropped (NF_DROP, e.g.
by iptables '-j DROP' rule).

B) 0. The skb was stolen by the hook or queued to userspace.

C) 1. all hooks returned NF_ACCEPT so the caller should invoke
   the okfn so packet processing can continue.

nft ingress facility currently doesn't have the 'okfn' that
the NF_HOOK() macros use; there is no nfqueue support either.

So 1 means that nf_hook_ingress() caller should go on processing the skb.

To allow use of NF_STOLEN from ingress we need to translate this 0
to a negative errno value, else we'd crash because we'd continue with an
already-freed (or about to be freed) skb.

A random dice roll chose EINPROGRESS.  The errno value isn't checked;
it's just important that it's less than 0.
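
The return convention above can be modelled in user space.  This is only
a sketch: enum hook_result, model_hook_slow(), model_hook_ingress() and
caller_may_touch_skb() are hypothetical names, not kernel API.  -EPERM is
assumed as the errno for a plain NF_DROP.

```c
#include <errno.h>

/* Hypothetical stand-ins for the three result classes of nf_hook_slow(). */
enum hook_result {
	HOOK_DROPPED,	/* A) skb was dropped, negative errno returned */
	HOOK_STOLEN,	/* B) a hook took ownership, 0 returned */
	HOOK_ACCEPTED,	/* C) all hooks accepted, 1 returned */
};

/* Models the nf_hook_slow() return convention (assumption: -EPERM on drop). */
static int model_hook_slow(enum hook_result res)
{
	switch (res) {
	case HOOK_DROPPED:
		return -EPERM;
	case HOOK_STOLEN:
		return 0;
	default:
		return 1;
	}
}

/* Models the patched nf_hook_ingress(): 0 is turned into a negative value. */
static int model_hook_ingress(enum hook_result res)
{
	int ret = model_hook_slow(res);

	if (ret == 0)
		return -EINPROGRESS;

	return ret;
}

/* Caller-side rule: only a non-negative result lets processing continue. */
static int caller_may_touch_skb(int ret)
{
	return ret >= 0;
}
```

Without the translation, caller_may_touch_skb(model_hook_slow(HOOK_STOLEN))
is true, which is the use-after-free case this patch avoids.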

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/netfilter_ingress.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h
index 2dc3b49b804a..d5027e470a24 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -19,6 +19,7 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
 {
 	struct nf_hook_entry *e = rcu_dereference(skb->dev->nf_hooks_ingress);
 	struct nf_hook_state state;
+	int ret;
 
 	/* Must recheck the ingress hook head, in the event it became NULL
 	 * after the check in nf_hook_ingress_active evaluated to true.
@@ -29,7 +30,11 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
 	nf_hook_state_init(&state, NF_NETDEV_INGRESS,
 			   NFPROTO_NETDEV, skb->dev, NULL, NULL,
 			   dev_net(skb->dev), NULL);
-	return nf_hook_slow(skb, &state, e);
+	ret = nf_hook_slow(skb, &state, e);
+	if (unlikely(ret == 0))
+		return -EINPROGRESS;
+
+	return ret;
 }
 
 static inline void nf_hook_ingress_init(struct net_device *dev)
-- 
2.7.3



* [PATCH nf-next] netfilter: add and use nf_fwd_netdev_egress
  2016-11-28 10:40 [PATCH nf-next] netfilter: remove need for skb_clone in nf_fwd_netdev_egress Florian Westphal
  2016-11-28 10:40 ` [PATCH nf-next] netfilter: ingress: translate 0 nf_hook_slow retval to -EINPROGRESS Florian Westphal
@ 2016-11-28 10:40 ` Florian Westphal
  1 sibling, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2016-11-28 10:40 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

Add nf_fwd_netdev_egress() so the forward expression can transmit the
current skb instead of working with a clone.
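
The ownership difference can be sketched in user space.  Everything here
is hypothetical (struct buf, model_xmit(), the counters); it only counts
the extra allocation and does not model how skb_clone() shares packet data.

```c
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct sk_buff. */
struct buf {
	int len;
};

static int nr_alloc;	/* extra buffers allocated just for transmission */
static int nr_xmit;	/* buffers handed to the modelled device */

/* Models dev_queue_xmit(): takes ownership of the buffer and consumes it. */
static void model_xmit(struct buf *b)
{
	nr_xmit++;
	free(b);
}

/* Old behaviour: transmit a copy, then the original is dropped. */
static void model_dup_then_drop(struct buf *orig)
{
	struct buf *copy = malloc(sizeof(*copy));

	if (copy) {
		nr_alloc++;
		*copy = *orig;
		model_xmit(copy);
	}

	free(orig);		/* NF_DROP: the core frees the original */
}

/* New behaviour: transmit the original; the caller must not touch it again. */
static void model_fwd_steal(struct buf *orig)
{
	model_xmit(orig);	/* NF_STOLEN: ownership moved to the xmit path */
}
```

In this model both paths transmit one buffer per packet, but only the old
one pays for an additional allocation and an additional free.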

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_dup_netdev.h |  1 +
 net/netfilter/nf_dup_netdev.c         | 33 +++++++++++++++++++++++++--------
 net/netfilter/nft_fwd_netdev.c        |  4 ++--
 3 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/include/net/netfilter/nf_dup_netdev.h b/include/net/netfilter/nf_dup_netdev.h
index 397dcae349f9..3e919356bedf 100644
--- a/include/net/netfilter/nf_dup_netdev.h
+++ b/include/net/netfilter/nf_dup_netdev.h
@@ -2,5 +2,6 @@
 #define _NF_DUP_NETDEV_H_
 
 void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif);
+void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif);
 
 #endif
diff --git a/net/netfilter/nf_dup_netdev.c b/net/netfilter/nf_dup_netdev.c
index 44ae986c383f..49c38855f547 100644
--- a/net/netfilter/nf_dup_netdev.c
+++ b/net/netfilter/nf_dup_netdev.c
@@ -14,6 +14,29 @@
 #include <linux/netfilter/nf_tables.h>
 #include <net/netfilter/nf_tables.h>
 
+static void nf_do_netdev_egress(struct sk_buff *skb, struct net_device *dev)
+{
+	if (skb_mac_header_was_set(skb))
+		skb_push(skb, skb->mac_len);
+
+	skb->dev = dev;
+	dev_queue_xmit(skb);
+}
+
+void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif)
+{
+	struct net_device *dev;
+
+	dev = dev_get_by_index_rcu(nft_net(pkt), oif);
+	if (!dev) {
+		kfree_skb(pkt->skb);
+		return;
+	}
+
+	nf_do_netdev_egress(pkt->skb, dev);
+}
+EXPORT_SYMBOL_GPL(nf_fwd_netdev_egress);
+
 void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif)
 {
 	struct net_device *dev;
@@ -24,14 +47,8 @@ void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif)
 		return;
 
 	skb = skb_clone(pkt->skb, GFP_ATOMIC);
-	if (skb == NULL)
-		return;
-
-	if (skb_mac_header_was_set(skb))
-		skb_push(skb, skb->mac_len);
-
-	skb->dev = dev;
-	dev_queue_xmit(skb);
+	if (skb)
+		nf_do_netdev_egress(skb, dev);
 }
 EXPORT_SYMBOL_GPL(nf_dup_netdev_egress);
 
diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c
index 763ebc3e0b2b..ce13a50b9189 100644
--- a/net/netfilter/nft_fwd_netdev.c
+++ b/net/netfilter/nft_fwd_netdev.c
@@ -26,8 +26,8 @@ static void nft_fwd_netdev_eval(const struct nft_expr *expr,
 	struct nft_fwd_netdev *priv = nft_expr_priv(expr);
 	int oif = regs->data[priv->sreg_dev];
 
-	nf_dup_netdev_egress(pkt, oif);
-	regs->verdict.code = NF_DROP;
+	nf_fwd_netdev_egress(pkt, oif);
+	regs->verdict.code = NF_STOLEN;
 }
 
 static const struct nla_policy nft_fwd_netdev_policy[NFTA_FWD_MAX + 1] = {
-- 
2.7.3



* Re: [PATCH nf-next] netfilter: ingress: translate 0 nf_hook_slow retval to -EINPROGRESS
  2016-11-28 10:40 ` [PATCH nf-next] netfilter: ingress: translate 0 nf_hook_slow retval to -EINPROGRESS Florian Westphal
@ 2016-12-06 10:35   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 4+ messages in thread
From: Pablo Neira Ayuso @ 2016-12-06 10:35 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Nov 28, 2016 at 11:40:05AM +0100, Florian Westphal wrote:
> The caller assumes that a return value < 0 means the skb was stolen (or freed).
> 
> All other return values continue skb processing.
> 
> nf_hook_slow() returns one of three kinds of value:
> 
> A) a (negative) errno value: the skb was dropped (NF_DROP, e.g.
> by iptables '-j DROP' rule).
> 
> B) 0. The skb was stolen by the hook or queued to userspace.
> 
> C) 1. all hooks returned NF_ACCEPT so the caller should invoke
>    the okfn so packet processing can continue.
> 
> nft ingress facility currently doesn't have the 'okfn' that
> the NF_HOOK() macros use; there is no nfqueue support either.
> 
> So 1 means that nf_hook_ingress() caller should go on processing the skb.
> 
> To allow use of NF_STOLEN from ingress we need to translate this 0
> to a negative errno value, else we'd crash because we'd continue with an
> already-freed (or about to be freed) skb.
> 
> A random dice roll chose EINPROGRESS.  The errno value isn't checked;
> it's just important that it's less than 0.

These comments aren't meant to block anything, so take them lightly.

We could probably just return -1. I know this maps to some errno value
too, but at least someone reading the code won't be led to think that we
really meant to return EINPROGRESS. If you agree, I can just mangle the
patch.

One more question below.

> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  include/linux/netfilter_ingress.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h
> index 2dc3b49b804a..d5027e470a24 100644
> --- a/include/linux/netfilter_ingress.h
> +++ b/include/linux/netfilter_ingress.h
> @@ -19,6 +19,7 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
>  {
>  	struct nf_hook_entry *e = rcu_dereference(skb->dev->nf_hooks_ingress);
>  	struct nf_hook_state state;
> +	int ret;
>  
>  	/* Must recheck the ingress hook head, in the event it became NULL
>  	 * after the check in nf_hook_ingress_active evaluated to true.
> @@ -29,7 +30,11 @@ static inline int nf_hook_ingress(struct sk_buff *skb)
>  	nf_hook_state_init(&state, NF_NETDEV_INGRESS,
>  			   NFPROTO_NETDEV, skb->dev, NULL, NULL,
>  			   dev_net(skb->dev), NULL);
> -	return nf_hook_slow(skb, &state, e);
> +	ret = nf_hook_slow(skb, &state, e);
> +	if (unlikely(ret == 0))
> +		return -EINPROGRESS;

Do you prefer to keep this branch as unlikely? I understand most
people are not using fwd much so far, but we're aiming to enrich
ingress, so I would expect users to rely on this more and more.
Anyway, we can revisit this later on.
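
For reference, the annotation only affects code layout, not results. A
minimal user-space sketch, assuming GCC or Clang; the macro mirrors the
usual kernel definition, and translate_hinted() / translate_plain() are
hypothetical names:

```c
#include <errno.h>

/* Mirrors the usual kernel definition; requires GCC or Clang. */
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Translation with the branch hinted as rarely taken. */
static int translate_hinted(int ret)
{
	if (unlikely(ret == 0))
		return -EINPROGRESS;

	return ret;
}

/* Same translation without any hint. */
static int translate_plain(int ret)
{
	if (ret == 0)
		return -EINPROGRESS;

	return ret;
}
```

Both functions return the same value for every input, so dropping the
hint later would be a pure performance decision.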
