netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFQUEUE v2 target with 'queue bypass' support
@ 2010-12-26 23:58 Florian Westphal
  2010-12-26 23:58 ` [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE Florian Westphal
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel

Following patch series (for net-next) adds a NFQUEUE v2 target revision
that introduces a "--queue-bypass" flag.

If the flag is used with a -j NFQUEUE rule, then NFQUEUE will behave
like ACCEPT instead of DROP iff no program has opened the queue.

I will send the userspace patch for iptables in a couple of days.

The patch series is also available via git, but beware:
the tree is based on net-next-2.6 and NOT nf-next, because the former
includes Eric Paris' selinux netfilter changes which would
cause merge conflicts with these patches.

The following changes since commit 041110a439e21cd40709ead4ffbfa8034619ad77:

  Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6 (2010-12-25 19:20:38 -0800)

are available in the git repository at:

  git://git.breakpoint.cc/fw/net-next-2.6.git nfq_bypass

Florian Westphal (6):
      netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE
      netfilter: nfnetlink_queue: return error number to caller
      netfilter: nfnetlink_queue: do not free skb on error
      netfilter: reduce NF_VERDICT_MASK to 0xff
      netfilter: allow NFQUEUE bypass if no listener is available
      netfilter: do not omit re-route check on NF_QUEUE verdict

 include/linux/netfilter.h            |   21 ++++++++---
 include/linux/netfilter/xt_NFQUEUE.h |    6 +++
 net/ipv4/netfilter/iptable_mangle.c  |    2 +-
 net/netfilter/Kconfig                |    1 +
 net/netfilter/core.c                 |   14 +++++--
 net/netfilter/nf_queue.c             |   66 +++++++++++++++++++++------------
 net/netfilter/nfnetlink_queue.c      |   22 +++++++----
 net/netfilter/xt_NFQUEUE.c           |   28 +++++++++++++--
 8 files changed, 115 insertions(+), 45 deletions(-)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE
  2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
@ 2010-12-26 23:58 ` Florian Westphal
  2010-12-27  8:41   ` Jan Engelhardt
  2010-12-26 23:58 ` [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller Florian Westphal
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

NFLOG already does the same thing for NETFILTER_NETLINK_LOG.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/Kconfig |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 1534f2b..c617ec9 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -477,6 +477,7 @@ config NETFILTER_XT_TARGET_NFLOG
 config NETFILTER_XT_TARGET_NFQUEUE
 	tristate '"NFQUEUE" target Support'
 	depends on NETFILTER_ADVANCED
+	select NETFILTER_NETLINK_QUEUE
 	help
 	  This target replaced the old obsolete QUEUE target.
 
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller
  2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
  2010-12-26 23:58 ` [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE Florian Westphal
@ 2010-12-26 23:58 ` Florian Westphal
  2011-01-12 18:56   ` Pablo Neira Ayuso
  2010-12-26 23:58 ` [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error Florian Westphal
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

instead of returning -1 on error, return an error number to allow the
caller to handle some errors differently.

To make things simpler try_module_get() failure is no longer special-cased.

The new return values will be used in followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/core.c            |    5 +--
 net/netfilter/nf_queue.c        |   50 ++++++++++++++++++++++----------------
 net/netfilter/nfnetlink_queue.c |   22 +++++++++++------
 3 files changed, 45 insertions(+), 32 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 32fcbe2..518a2e3 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -179,9 +179,8 @@ next_hook:
 		if (ret == 0)
 			ret = -EPERM;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
-		if (!nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
-			      verdict >> NF_VERDICT_BITS))
-			goto next_hook;
+		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
+			       verdict >> NF_VERDICT_BITS);
 	}
 	rcu_read_unlock();
 	return ret;
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 74aebed..a417458 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -115,7 +115,7 @@ static int __nf_queue(struct sk_buff *skb,
 		      int (*okfn)(struct sk_buff *),
 		      unsigned int queuenum)
 {
-	int status;
+	int error = -EINVAL;
 	struct nf_queue_entry *entry = NULL;
 #ifdef CONFIG_BRIDGE_NETFILTER
 	struct net_device *physindev;
@@ -136,8 +136,10 @@ static int __nf_queue(struct sk_buff *skb,
 		goto err_unlock;
 
 	entry = kmalloc(sizeof(*entry) + afinfo->route_key_size, GFP_ATOMIC);
-	if (!entry)
+	if (!entry) {
+		error = -ENOMEM;
 		goto err_unlock;
+	}
 
 	*entry = (struct nf_queue_entry) {
 		.skb	= skb,
@@ -149,12 +151,8 @@ static int __nf_queue(struct sk_buff *skb,
 		.okfn	= okfn,
 	};
 
-	/* If it's going away, ignore hook. */
-	if (!try_module_get(entry->elem->owner)) {
-		rcu_read_unlock();
-		kfree(entry);
-		return 0;
-	}
+	if (!try_module_get(entry->elem->owner))
+		goto err_unlock;
 
 	/* Bump dev refs so they don't vanish while packet is out */
 	if (indev)
@@ -173,23 +171,23 @@ static int __nf_queue(struct sk_buff *skb,
 #endif
 	skb_dst_force(skb);
 	afinfo->saveroute(skb, entry);
-	status = qh->outfn(entry, queuenum);
+	error = qh->outfn(entry, queuenum);
 
 	rcu_read_unlock();
 
-	if (status < 0) {
+	if (error < 0) {
 		nf_queue_entry_release_refs(entry);
 		goto err;
 	}
 
-	return 1;
+	return 0;
 
 err_unlock:
 	rcu_read_unlock();
 err:
 	kfree_skb(skb);
 	kfree(entry);
-	return 1;
+	return error;
 }
 
 int nf_queue(struct sk_buff *skb,
@@ -201,6 +199,8 @@ int nf_queue(struct sk_buff *skb,
 	     unsigned int queuenum)
 {
 	struct sk_buff *segs;
+	int error = 0;
+	unsigned int queued;
 
 	if (!skb_is_gso(skb))
 		return __nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
@@ -218,18 +218,26 @@ int nf_queue(struct sk_buff *skb,
 	segs = skb_gso_segment(skb, 0);
 	kfree_skb(skb);
 	if (IS_ERR(segs))
-		return 1;
+		return -ENOMEM;
 
+	queued = 0;
 	do {
 		struct sk_buff *nskb = segs->next;
 
 		segs->next = NULL;
-		if (!__nf_queue(segs, elem, pf, hook, indev, outdev, okfn,
-				queuenum))
-			kfree_skb(segs);
+		if (error == 0) {
+			error = __nf_queue(segs, elem, pf, hook, indev,
+					   outdev, okfn, queuenum);
+			if (error == 0)
+				queued++;
+		}
+		kfree_skb(segs);
 		segs = nskb;
 	} while (segs);
-	return 1;
+
+	if (unlikely(error && queued))
+		error = 0;
+	return error;
 }
 
 void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
@@ -237,6 +245,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 	struct sk_buff *skb = entry->skb;
 	struct list_head *elem = &entry->elem->list;
 	const struct nf_afinfo *afinfo;
+	int err;
 
 	rcu_read_lock();
 
@@ -270,10 +279,9 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 		local_bh_enable();
 		break;
 	case NF_QUEUE:
-		if (!__nf_queue(skb, elem, entry->pf, entry->hook,
-				entry->indev, entry->outdev, entry->okfn,
-				verdict >> NF_VERDICT_BITS))
-			goto next_hook;
+		__nf_queue(skb, elem, entry->pf, entry->hook,
+				 entry->indev, entry->outdev, entry->okfn,
+				 verdict >> NF_VERDICT_BITS);
 		break;
 	case NF_STOLEN:
 	default:
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 68e67d1..b83123f 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -387,25 +387,31 @@ nfqnl_enqueue_packet(struct nf_queue_entry *entry, unsigned int queuenum)
 {
 	struct sk_buff *nskb;
 	struct nfqnl_instance *queue;
-	int err;
+	int err = -ENOBUFS;
 
 	/* rcu_read_lock()ed by nf_hook_slow() */
 	queue = instance_lookup(queuenum);
-	if (!queue)
+	if (!queue) {
+		err = -ESRCH;
 		goto err_out;
+	}
 
-	if (queue->copy_mode == NFQNL_COPY_NONE)
+	if (queue->copy_mode == NFQNL_COPY_NONE) {
+		err = -EINVAL;
 		goto err_out;
+	}
 
 	nskb = nfqnl_build_packet_message(queue, entry);
-	if (nskb == NULL)
+	if (nskb == NULL) {
+		err = -ENOMEM;
 		goto err_out;
-
+	}
 	spin_lock_bh(&queue->lock);
 
-	if (!queue->peer_pid)
+	if (!queue->peer_pid) {
+		err = -EINVAL;
 		goto err_out_free_nskb;
-
+	}
 	if (queue->queue_total >= queue->queue_maxlen) {
 		queue->queue_dropped++;
 		if (net_ratelimit())
@@ -432,7 +438,7 @@ err_out_free_nskb:
 err_out_unlock:
 	spin_unlock_bh(&queue->lock);
 err_out:
-	return -1;
+	return err;
 }
 
 static int
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error
  2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
  2010-12-26 23:58 ` [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE Florian Westphal
  2010-12-26 23:58 ` [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller Florian Westphal
@ 2010-12-26 23:58 ` Florian Westphal
  2011-01-12 19:01   ` Pablo Neira Ayuso
  2010-12-26 23:58 ` [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff Florian Westphal
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

Move free responsibility from nf_queue to caller.
This enables more flexible error handling; we could
now accept the skb instead of freeing it.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/core.c     |    5 ++++-
 net/netfilter/nf_queue.c |   10 +++++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 518a2e3..73e44c4 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -179,8 +179,11 @@ next_hook:
 		if (ret == 0)
 			ret = -EPERM;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
-		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
+		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
 			       verdict >> NF_VERDICT_BITS);
+		if (ret < 0)
+			kfree_skb(skb);
+		ret = 0;
 	}
 	rcu_read_unlock();
 	return ret;
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index a417458..2a9b80c 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -185,7 +185,6 @@ static int __nf_queue(struct sk_buff *skb,
 err_unlock:
 	rcu_read_unlock();
 err:
-	kfree_skb(skb);
 	kfree(entry);
 	return error;
 }
@@ -216,7 +215,6 @@ int nf_queue(struct sk_buff *skb,
 	}
 
 	segs = skb_gso_segment(skb, 0);
-	kfree_skb(skb);
 	if (IS_ERR(segs))
 		return -ENOMEM;
 
@@ -235,8 +233,12 @@ int nf_queue(struct sk_buff *skb,
 		segs = nskb;
 	} while (segs);
 
+	/* also free orig skb if only some segments were queued */
 	if (unlikely(error && queued))
 		error = 0;
+
+	if (error == 0)
+		kfree_skb(skb);
 	return error;
 }
 
@@ -279,9 +281,11 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 		local_bh_enable();
 		break;
 	case NF_QUEUE:
-		__nf_queue(skb, elem, entry->pf, entry->hook,
+		err = __nf_queue(skb, elem, entry->pf, entry->hook,
 				 entry->indev, entry->outdev, entry->okfn,
 				 verdict >> NF_VERDICT_BITS);
+		if (err < 0)
+			kfree_skb(skb);
 		break;
 	case NF_STOLEN:
 	default:
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
                   ` (2 preceding siblings ...)
  2010-12-26 23:58 ` [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error Florian Westphal
@ 2010-12-26 23:58 ` Florian Westphal
  2011-01-12 19:02   ` Pablo Neira Ayuso
  2010-12-26 23:58 ` [PATCH 5/6] netfilter: allow NFQUEUE bypass if no listener is available Florian Westphal
  2010-12-26 23:58 ` [PATCH 6/6] netfilter: do not omit re-route check on NF_QUEUE verdict Florian Westphal
  5 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

NF_VERDICT_MASK is currently 0xffff. This is because the upper
16 bits are used to store errno (for NF_DROP) or the queue number
(NF_QUEUE verdict).

As there are up to 0xffff different queues available, there is
no more room to store additional flags.

At the moment there are only 6 different verdicts, i.e. we can
reduce NF_VERDICT_MASK to 0xff to allow storing additional flags
in the 0xff00 space.

NF_VERDICT_BITS would then be reduced to 8, but because the value
is exported to userspace, this might cause breakage; e.g.:

e.g. 'queuenr = (1 << NF_VERDICT_BITS) | NF_QUEUE'  would now break.

Thus, remove NF_VERDICT_BITS usage in the kernel and move the old
value to the 'userspace compat' section.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/netfilter.h |   20 +++++++++++++++-----
 net/netfilter/core.c      |    4 ++--
 net/netfilter/nf_queue.c  |    2 +-
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 1893837..231277f 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -24,16 +24,19 @@
 #define NF_MAX_VERDICT NF_STOP
 
 /* we overload the higher bits for encoding auxiliary data such as the queue
- * number. Not nice, but better than additional function arguments. */
-#define NF_VERDICT_MASK 0x0000ffff
-#define NF_VERDICT_BITS 16
+ * number or errno values. Not nice, but better than additional function
+ * arguments. */
+#define NF_VERDICT_MASK 0x000000ff
+
+/* extra verdict flags have mask 0x0000ff00 */
 
+/* queue number (NF_QUEUE) or errno (NF_DROP) */
 #define NF_VERDICT_QMASK 0xffff0000
 #define NF_VERDICT_QBITS 16
 
-#define NF_QUEUE_NR(x) ((((x) << NF_VERDICT_BITS) & NF_VERDICT_QMASK) | NF_QUEUE)
+#define NF_QUEUE_NR(x) ((((x) << 16) & NF_VERDICT_QMASK) | NF_QUEUE)
 
-#define NF_DROP_ERR(x) (((-x) << NF_VERDICT_BITS) | NF_DROP)
+#define NF_DROP_ERR(x) (((-x) << 16) | NF_DROP)
 
 /* only for userspace compatibility */
 #ifndef __KERNEL__
@@ -41,6 +44,9 @@
    <= 0x2000 is used for protocol-flags. */
 #define NFC_UNKNOWN 0x4000
 #define NFC_ALTERED 0x8000
+
+/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
+#define NF_VERDICT_BITS 16
 #endif
 
 enum nf_inet_hooks {
@@ -72,6 +78,10 @@ union nf_inet_addr {
 
 #ifdef __KERNEL__
 #ifdef CONFIG_NETFILTER
+static inline int NF_DROP_GETERR(int verdict)
+{
+	return -(verdict >> NF_VERDICT_QBITS);
+}
 
 static inline int nf_inet_addr_cmp(const union nf_inet_addr *a1,
 				   const union nf_inet_addr *a2)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 73e44c4..18ee9b9 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -175,12 +175,12 @@ next_hook:
 		ret = 1;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
 		kfree_skb(skb);
-		ret = -(verdict >> NF_VERDICT_BITS);
+		ret = NF_DROP_GETERR(verdict);
 		if (ret == 0)
 			ret = -EPERM;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
 		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
-			       verdict >> NF_VERDICT_BITS);
+			       verdict >> NF_VERDICT_QBITS);
 		if (ret < 0)
 			kfree_skb(skb);
 		ret = 0;
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 2a9b80c..1a5067c 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -283,7 +283,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 	case NF_QUEUE:
 		err = __nf_queue(skb, elem, entry->pf, entry->hook,
 				 entry->indev, entry->outdev, entry->okfn,
-				 verdict >> NF_VERDICT_BITS);
+				 verdict >> NF_VERDICT_QBITS);
 		if (err < 0)
 			kfree_skb(skb);
 		break;
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/6] netfilter: allow NFQUEUE bypass if no listener is available
  2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
                   ` (3 preceding siblings ...)
  2010-12-26 23:58 ` [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff Florian Westphal
@ 2010-12-26 23:58 ` Florian Westphal
  2011-01-12 19:03   ` Pablo Neira Ayuso
  2010-12-26 23:58 ` [PATCH 6/6] netfilter: do not omit re-route check on NF_QUEUE verdict Florian Westphal
  5 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal, Florian Westphal

If an skb is to be NF_QUEUE'd, but no program has opened the queue, the
packet is dropped.

This adds a v2 target revision of xt_NFQUEUE that allows packets to
continue through the ruleset instead.

Because the actual queueing happens outside of the target context, the
'bypass' flag has to be communicated back to the netfilter core.

Unfortunately the only choice to do this without adding a new function
argument is to use the target function return value (i.e. the verdict).

In the NF_QUEUE case, the upper 16bit already contain the queue number
to use.  The previous patch reduced NF_VERDICT_MASK to 0xff, i.e.
we now have extra room for a new flag.

If a hook issued a NF_QUEUE verdict, then the netfilter core will
continue packet processing if the queueing hook
returns -ESRCH (== "this queue does not exist") and the new
NF_VERDICT_FLAG_QUEUE_BYPASS flag is set in the verdict value.

Note: If the queue exists, but userspace does not consume packets fast
enough, the skb will still be dropped.

Signed-off-by: Florian Westphal <fwestphal@astaro.com>
---
 include/linux/netfilter.h            |    1 +
 include/linux/netfilter/xt_NFQUEUE.h |    6 ++++++
 net/netfilter/core.c                 |    6 +++++-
 net/netfilter/nf_queue.c             |   10 ++++++++--
 net/netfilter/xt_NFQUEUE.c           |   28 +++++++++++++++++++++++++---
 5 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 231277f..041ec8f 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -29,6 +29,7 @@
 #define NF_VERDICT_MASK 0x000000ff
 
 /* extra verdict flags have mask 0x0000ff00 */
+#define NF_VERDICT_FLAG_QUEUE_BYPASS	0x00008000
 
 /* queue number (NF_QUEUE) or errno (NF_DROP) */
 #define NF_VERDICT_QMASK 0xffff0000
diff --git a/include/linux/netfilter/xt_NFQUEUE.h b/include/linux/netfilter/xt_NFQUEUE.h
index 2584f4a..9eafdbb 100644
--- a/include/linux/netfilter/xt_NFQUEUE.h
+++ b/include/linux/netfilter/xt_NFQUEUE.h
@@ -20,4 +20,10 @@ struct xt_NFQ_info_v1 {
 	__u16 queues_total;
 };
 
+struct xt_NFQ_info_v2 {
+	__u16 queuenum;
+	__u16 queues_total;
+	__u16 bypass;
+};
+
 #endif /* _XT_NFQ_TARGET_H */
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 18ee9b9..7dabbdd 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -181,8 +181,12 @@ next_hook:
 	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
 		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
 			       verdict >> NF_VERDICT_QBITS);
-		if (ret < 0)
+		if (ret < 0) {
+			if (ret == -ESRCH &&
+			   (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
+				goto next_hook;
 			kfree_skb(skb);
+		}
 		ret = 0;
 	}
 	rcu_read_unlock();
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 1a5067c..f32d097 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -128,8 +128,10 @@ static int __nf_queue(struct sk_buff *skb,
 	rcu_read_lock();
 
 	qh = rcu_dereference(queue_handler[pf]);
-	if (!qh)
+	if (!qh) {
+		error = -ESRCH;
 		goto err_unlock;
+	}
 
 	afinfo = nf_get_afinfo(pf);
 	if (!afinfo)
@@ -284,8 +286,12 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 		err = __nf_queue(skb, elem, entry->pf, entry->hook,
 				 entry->indev, entry->outdev, entry->okfn,
 				 verdict >> NF_VERDICT_QBITS);
-		if (err < 0)
+		if (err < 0) {
+			if (err == -ESRCH &&
+			   (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
+				goto next_hook;
 			kfree_skb(skb);
+		}
 		break;
 	case NF_STOLEN:
 	default:
diff --git a/net/netfilter/xt_NFQUEUE.c b/net/netfilter/xt_NFQUEUE.c
index 039cce1..41da6c3 100644
--- a/net/netfilter/xt_NFQUEUE.c
+++ b/net/netfilter/xt_NFQUEUE.c
@@ -81,9 +81,20 @@ nfqueue_tg_v1(struct sk_buff *skb, const struct xt_action_param *par)
 	return NF_QUEUE_NR(queue);
 }
 
-static int nfqueue_tg_v1_check(const struct xt_tgchk_param *par)
+static unsigned int
+nfqueue_tg_v2(struct sk_buff *skb, const struct xt_action_param *par)
 {
-	const struct xt_NFQ_info_v1 *info = par->targinfo;
+	const struct xt_NFQ_info_v2 *info = par->targinfo;
+	unsigned int ret = nfqueue_tg_v1(skb, par);
+
+	if (info->bypass)
+		ret |= NF_VERDICT_FLAG_QUEUE_BYPASS;
+	return ret;
+}
+
+static int nfqueue_tg_check(const struct xt_tgchk_param *par)
+{
+	const struct xt_NFQ_info_v2 *info = par->targinfo;
 	u32 maxid;
 
 	if (unlikely(!rnd_inited)) {
@@ -100,6 +111,8 @@ static int nfqueue_tg_v1_check(const struct xt_tgchk_param *par)
 		       info->queues_total, maxid);
 		return -ERANGE;
 	}
+	if (par->target->revision == 2 && info->bypass > 1)
+		return -EINVAL;
 	return 0;
 }
 
@@ -115,11 +128,20 @@ static struct xt_target nfqueue_tg_reg[] __read_mostly = {
 		.name		= "NFQUEUE",
 		.revision	= 1,
 		.family		= NFPROTO_UNSPEC,
-		.checkentry	= nfqueue_tg_v1_check,
+		.checkentry	= nfqueue_tg_check,
 		.target		= nfqueue_tg_v1,
 		.targetsize	= sizeof(struct xt_NFQ_info_v1),
 		.me		= THIS_MODULE,
 	},
+	{
+		.name		= "NFQUEUE",
+		.revision	= 2,
+		.family		= NFPROTO_UNSPEC,
+		.checkentry	= nfqueue_tg_check,
+		.target		= nfqueue_tg_v2,
+		.targetsize	= sizeof(struct xt_NFQ_info_v2),
+		.me		= THIS_MODULE,
+	},
 };
 
 static int __init nfqueue_tg_init(void)
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/6] netfilter: do not omit re-route check on NF_QUEUE verdict
  2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
                   ` (4 preceding siblings ...)
  2010-12-26 23:58 ` [PATCH 5/6] netfilter: allow NFQUEUE bypass if no listener is available Florian Westphal
@ 2010-12-26 23:58 ` Florian Westphal
  5 siblings, 0 replies; 23+ messages in thread
From: Florian Westphal @ 2010-12-26 23:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

ret != NF_QUEUE only works in the "--queue-num 0" case; for
queues > 0 the test should be '(ret & NF_VERDICT_MASK) != NF_QUEUE'.

However, NF_QUEUE no longer DROPs the skb unconditionally if queueing
fails (due to NF_VERDICT_FLAG_QUEUE_BYPASS verdict flag), so the
re-route test should also be performed if this flag is set in the
verdict.

The full test would then look something like

&& ((ret & NF_VERDICT_MASK) == NF_QUEUE && (ret & NF_VERDICT_FLAG_QUEUE_BYPASS))

This is rather ugly, so just remove the NF_QUEUE test altogether.

The only effect is that we might perform an unnecessary route lookup
in the NF_QUEUE case.

ip6table_mangle did not have such a check.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/ipv4/netfilter/iptable_mangle.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c
index 294a2a3..aef5d1f 100644
--- a/net/ipv4/netfilter/iptable_mangle.c
+++ b/net/ipv4/netfilter/iptable_mangle.c
@@ -60,7 +60,7 @@ ipt_mangle_out(struct sk_buff *skb, const struct net_device *out)
 	ret = ipt_do_table(skb, NF_INET_LOCAL_OUT, NULL, out,
 			   dev_net(out)->ipv4.iptable_mangle);
 	/* Reroute for ANY change. */
-	if (ret != NF_DROP && ret != NF_STOLEN && ret != NF_QUEUE) {
+	if (ret != NF_DROP && ret != NF_STOLEN) {
 		iph = ip_hdr(skb);
 
 		if (iph->saddr != saddr ||
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE
  2010-12-26 23:58 ` [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE Florian Westphal
@ 2010-12-27  8:41   ` Jan Engelhardt
  2010-12-27  8:47     ` Michał Mirosław
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Engelhardt @ 2010-12-27  8:41 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Monday 2010-12-27 00:58, Florian Westphal wrote:

>NFLOG already does the same thing for NETFILTER_NETLINK_LOG.
>diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
>index 1534f2b..c617ec9 100644
>--- a/net/netfilter/Kconfig
>+++ b/net/netfilter/Kconfig
>@@ -477,6 +477,7 @@ config NETFILTER_XT_TARGET_NFLOG
> config NETFILTER_XT_TARGET_NFQUEUE
> 	tristate '"NFQUEUE" target Support'
> 	depends on NETFILTER_ADVANCED
>+	select NETFILTER_NETLINK_QUEUE
> 	help
> 	  This target replaced the old obsolete QUEUE target.
> 
>-- 

There once was sort of a small debate on whether we should not reduce 
the usage of "select" (and instead use: depends on 
NETFILTER_NETLINK_QUEUE, in this case). What do people think about it?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE
  2010-12-27  8:41   ` Jan Engelhardt
@ 2010-12-27  8:47     ` Michał Mirosław
  0 siblings, 0 replies; 23+ messages in thread
From: Michał Mirosław @ 2010-12-27  8:47 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Florian Westphal, netfilter-devel

On Mon, Dec 27, 2010 at 09:41:29AM +0100, Jan Engelhardt wrote:
> On Monday 2010-12-27 00:58, Florian Westphal wrote:
> >NFLOG already does the same thing for NETFILTER_NETLINK_LOG.
> >diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> >index 1534f2b..c617ec9 100644
> >--- a/net/netfilter/Kconfig
> >+++ b/net/netfilter/Kconfig
> >@@ -477,6 +477,7 @@ config NETFILTER_XT_TARGET_NFLOG
> > config NETFILTER_XT_TARGET_NFQUEUE
> > 	tristate '"NFQUEUE" target Support'
> > 	depends on NETFILTER_ADVANCED
> >+	select NETFILTER_NETLINK_QUEUE
> > 	help
> > 	  This target replaced the old obsolete QUEUE target.
> There once was sort of a small debate on whether we should not reduce 
> the usage of "select" (and instead use: depends on 
> NETFILTER_NETLINK_QUEUE, in this case). What do people think about it?

For users, NETFILTER_XT_TARGET_NFQUEUE is a real thing they want and
NETFILTER_NETLINK_QUEUE just an implementation detail. So I suggest
select here, like the patch does.

Best Regards,
Michał Mirosław
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller
  2010-12-26 23:58 ` [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller Florian Westphal
@ 2011-01-12 18:56   ` Pablo Neira Ayuso
  2011-01-12 20:49     ` Florian Westphal
  0 siblings, 1 reply; 23+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-12 18:56 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

Hi Florian,

I've got some questions and comments on this patchset, please see below.

On 27/12/10 00:58, Florian Westphal wrote:
> instead of returning -1 on error, return an error number to allow the
> caller to handle some errors differently.
> 
> To make things simpler try_module_get() failure is no longer special-cased.
> 
> The new return values will be used in followup patch.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/netfilter/core.c            |    5 +--
>  net/netfilter/nf_queue.c        |   50 ++++++++++++++++++++++----------------
>  net/netfilter/nfnetlink_queue.c |   22 +++++++++++------
>  3 files changed, 45 insertions(+), 32 deletions(-)
> 
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index 32fcbe2..518a2e3 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -179,9 +179,8 @@ next_hook:
>  		if (ret == 0)
>  			ret = -EPERM;
>  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
> -		if (!nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> -			      verdict >> NF_VERDICT_BITS))
> -			goto next_hook;
> +		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> +			       verdict >> NF_VERDICT_BITS);

You have to remove the next_hook label if you want to do this.

It's not clear to me why we need to remove the goto jump, could you
clarify this?

>  	}
>  	rcu_read_unlock();
>  	return ret;
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 74aebed..a417458 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -115,7 +115,7 @@ static int __nf_queue(struct sk_buff *skb,
>  		      int (*okfn)(struct sk_buff *),
>  		      unsigned int queuenum)
>  {
> -	int status;
> +	int error = -EINVAL;

Please, don't rename this variable unless that you have a good reason to
do so. Better keep the same name to reduce the number of lines that has
changed in this patch.

>  	struct nf_queue_entry *entry = NULL;
>  #ifdef CONFIG_BRIDGE_NETFILTER
>  	struct net_device *physindev;
> @@ -136,8 +136,10 @@ static int __nf_queue(struct sk_buff *skb,
>  		goto err_unlock;

here above there's the following:

        qh = rcu_dereference(queue_handler[pf]);
        if (!qh)
                goto err_unlock;

        afinfo = nf_get_afinfo(pf);
        if (!afinfo)
                goto err_unlock;

With your patch, we're returning -EINVAL. I think it's better to return
-ENOENT.

>  	entry = kmalloc(sizeof(*entry) + afinfo->route_key_size, GFP_ATOMIC);
> -	if (!entry)
> +	if (!entry) {
> +		error = -ENOMEM;
>  		goto err_unlock;
> +	}
>  
>  	*entry = (struct nf_queue_entry) {
>  		.skb	= skb,
> @@ -149,12 +151,8 @@ static int __nf_queue(struct sk_buff *skb,
>  		.okfn	= okfn,
>  	};
>  
> -	/* If it's going away, ignore hook. */
> -	if (!try_module_get(entry->elem->owner)) {
> -		rcu_read_unlock();
> -		kfree(entry);
> -		return 0;
> -	}
> +	if (!try_module_get(entry->elem->owner))
> +		goto err_unlock;

With this change, we're now releasing the skb, is this deliberate?

>  	/* Bump dev refs so they don't vanish while packet is out */
>  	if (indev)
> @@ -173,23 +171,23 @@ static int __nf_queue(struct sk_buff *skb,
>  #endif
>  	skb_dst_force(skb);
>  	afinfo->saveroute(skb, entry);
> -	status = qh->outfn(entry, queuenum);
> +	error = qh->outfn(entry, queuenum);
>  
>  	rcu_read_unlock();
>  
> -	if (status < 0) {
> +	if (error < 0) {
>  		nf_queue_entry_release_refs(entry);
>  		goto err;
>  	}
>  
> -	return 1;
> +	return 0;
>  
>  err_unlock:
>  	rcu_read_unlock();
>  err:
>  	kfree_skb(skb);
>  	kfree(entry);
> -	return 1;
> +	return error;
>  }
>  
>  int nf_queue(struct sk_buff *skb,
> @@ -201,6 +199,8 @@ int nf_queue(struct sk_buff *skb,
>  	     unsigned int queuenum)
>  {
>  	struct sk_buff *segs;
> +	int error = 0;
> +	unsigned int queued;
>  
>  	if (!skb_is_gso(skb))
>  		return __nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> @@ -218,18 +218,26 @@ int nf_queue(struct sk_buff *skb,
>  	segs = skb_gso_segment(skb, 0);
>  	kfree_skb(skb);
>  	if (IS_ERR(segs))
> -		return 1;
> +		return -ENOMEM;

Better propagate the error of skb_gso_segment(...) with PTR_ERR()

> +	queued = 0;
>  	do {
>  		struct sk_buff *nskb = segs->next;
>  
>  		segs->next = NULL;
> -		if (!__nf_queue(segs, elem, pf, hook, indev, outdev, okfn,
> -				queuenum))
> -			kfree_skb(segs);
> +		if (error == 0) {
> +			error = __nf_queue(segs, elem, pf, hook, indev,
> +					   outdev, okfn, queuenum);
> +			if (error == 0)
> +				queued++;
> +		}
> +		kfree_skb(segs);

Before this patch, segs were only released if try_module_get() failed.
Now it always release it?

>  		segs = nskb;
>  	} while (segs);
> -	return 1;
> +
> +	if (unlikely(error && queued))
> +		error = 0;
> +	return error;
>  }
>  
>  void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
> @@ -237,6 +245,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>  	struct sk_buff *skb = entry->skb;
>  	struct list_head *elem = &entry->elem->list;
>  	const struct nf_afinfo *afinfo;
> +	int err;
>  
>  	rcu_read_lock();
>  
> @@ -270,10 +279,9 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>  		local_bh_enable();
>  		break;
>  	case NF_QUEUE:
> -		if (!__nf_queue(skb, elem, entry->pf, entry->hook,
> -				entry->indev, entry->outdev, entry->okfn,
> -				verdict >> NF_VERDICT_BITS))
> -			goto next_hook;
> +		__nf_queue(skb, elem, entry->pf, entry->hook,
> +				 entry->indev, entry->outdev, entry->okfn,
> +				 verdict >> NF_VERDICT_BITS);

Again, the next_label was not removed.

>  		break;
>  	case NF_STOLEN:
>  	default:
> diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
> index 68e67d1..b83123f 100644
> --- a/net/netfilter/nfnetlink_queue.c
> +++ b/net/netfilter/nfnetlink_queue.c
> @@ -387,25 +387,31 @@ nfqnl_enqueue_packet(struct nf_queue_entry *entry, unsigned int queuenum)
>  {
>  	struct sk_buff *nskb;
>  	struct nfqnl_instance *queue;
> -	int err;
> +	int err = -ENOBUFS;
>  
>  	/* rcu_read_lock()ed by nf_hook_slow() */
>  	queue = instance_lookup(queuenum);
> -	if (!queue)
> +	if (!queue) {
> +		err = -ESRCH;
>  		goto err_out;
> +	}
>  
> -	if (queue->copy_mode == NFQNL_COPY_NONE)
> +	if (queue->copy_mode == NFQNL_COPY_NONE) {
> +		err = -EINVAL;
>  		goto err_out;
> +	}
>  
>  	nskb = nfqnl_build_packet_message(queue, entry);
> -	if (nskb == NULL)
> +	if (nskb == NULL) {
> +		err = -ENOMEM;
>  		goto err_out;
> -
> +	}
>  	spin_lock_bh(&queue->lock);
>  
> -	if (!queue->peer_pid)
> +	if (!queue->peer_pid) {
> +		err = -EINVAL;
>  		goto err_out_free_nskb;
> -
> +	}
>  	if (queue->queue_total >= queue->queue_maxlen) {
>  		queue->queue_dropped++;
>  		if (net_ratelimit())
> @@ -432,7 +438,7 @@ err_out_free_nskb:
>  err_out_unlock:
>  	spin_unlock_bh(&queue->lock);
>  err_out:
> -	return -1;
> +	return err;
>  }
>  
>  static int


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error
  2010-12-26 23:58 ` [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error Florian Westphal
@ 2011-01-12 19:01   ` Pablo Neira Ayuso
  2011-01-12 20:50     ` Florian Westphal
  0 siblings, 1 reply; 23+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-12 19:01 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 27/12/10 00:58, Florian Westphal wrote:
> Move free responsibility from nf_queue to caller.
> This enables more flexible error handling; we could
> now accept the skb instead of freeing it.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/netfilter/core.c     |    5 ++++-
>  net/netfilter/nf_queue.c |   10 +++++++---
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index 518a2e3..73e44c4 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -179,8 +179,11 @@ next_hook:
>  		if (ret == 0)
>  			ret = -EPERM;
>  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
> -		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> +		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
>  			       verdict >> NF_VERDICT_BITS);
> +		if (ret < 0)
> +			kfree_skb(skb);
> +		ret = 0;
>  	}

Suggestion: Better put this patch on top of the pile. In the previous,
you remove the return value of nf_queue, and again you reintroduce it.

>  	rcu_read_unlock();
>  	return ret;
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index a417458..2a9b80c 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -185,7 +185,6 @@ static int __nf_queue(struct sk_buff *skb,
>  err_unlock:
>  	rcu_read_unlock();
>  err:
> -	kfree_skb(skb);
>  	kfree(entry);
>  	return error;
>  }
> @@ -216,7 +215,6 @@ int nf_queue(struct sk_buff *skb,
>  	}
>  
>  	segs = skb_gso_segment(skb, 0);
> -	kfree_skb(skb);
>  	if (IS_ERR(segs))
>  		return -ENOMEM;
>  
> @@ -235,8 +233,12 @@ int nf_queue(struct sk_buff *skb,
>  		segs = nskb;
>  	} while (segs);
>  
> +	/* also free orig skb if only some segments were queued */
>  	if (unlikely(error && queued))
>  		error = 0;
> +
> +	if (error == 0)
> +		kfree_skb(skb);
>  	return error;
>  }
>  
> @@ -279,9 +281,11 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>  		local_bh_enable();
>  		break;
>  	case NF_QUEUE:
> -		__nf_queue(skb, elem, entry->pf, entry->hook,
> +		err = __nf_queue(skb, elem, entry->pf, entry->hook,
>  				 entry->indev, entry->outdev, entry->okfn,
>  				 verdict >> NF_VERDICT_BITS);
> +		if (err < 0)
> +			kfree_skb(skb);
>  		break;
>  	case NF_STOLEN:
>  	default:


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2010-12-26 23:58 ` [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff Florian Westphal
@ 2011-01-12 19:02   ` Pablo Neira Ayuso
  2011-01-12 20:52     ` Florian Westphal
  0 siblings, 1 reply; 23+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-12 19:02 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 27/12/10 00:58, Florian Westphal wrote:
> NF_VERDICT_MASK is currently 0xffff. This is because the upper
> 16 bits are used to store errno (for NF_DROP) or the queue number
> (NF_QUEUE verdict).
> 
> As there are up to 0xffff different queues available, there is
> no more room to store additional flags.
> 
> At the moment there are only 6 different verdicts, i.e. we can
> reduce NF_VERDICT_MASK to 0xff to allow storing additional flags
> in the 0xff00 space.
> 
> NF_VERDICT_BITS would then be reduced to 8, but because the value
> is exported to userspace, this might cause breakage; e.g.:
> 
> e.g. 'queuenr = (1 << NF_VERDICT_BITS) | NF_QUEUE'  would now break.
> 
> Thus, remove NF_VERDICT_BITS usage in the kernel and move the old
> value to the 'userspace compat' section.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  include/linux/netfilter.h |   20 +++++++++++++++-----
>  net/netfilter/core.c      |    4 ++--
>  net/netfilter/nf_queue.c  |    2 +-
>  3 files changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
> index 1893837..231277f 100644
> --- a/include/linux/netfilter.h
> +++ b/include/linux/netfilter.h
> @@ -24,16 +24,19 @@
>  #define NF_MAX_VERDICT NF_STOP
>  
>  /* we overload the higher bits for encoding auxiliary data such as the queue
> - * number. Not nice, but better than additional function arguments. */
> -#define NF_VERDICT_MASK 0x0000ffff
> -#define NF_VERDICT_BITS 16
> + * number or errno values. Not nice, but better than additional function
> + * arguments. */
> +#define NF_VERDICT_MASK 0x000000ff
> +
> +/* extra verdict flags have mask 0x0000ff00 */
>  
> +/* queue number (NF_QUEUE) or errno (NF_DROP) */
>  #define NF_VERDICT_QMASK 0xffff0000
>  #define NF_VERDICT_QBITS 16
>  
> -#define NF_QUEUE_NR(x) ((((x) << NF_VERDICT_BITS) & NF_VERDICT_QMASK) | NF_QUEUE)
> +#define NF_QUEUE_NR(x) ((((x) << 16) & NF_VERDICT_QMASK) | NF_QUEUE)
>  
> -#define NF_DROP_ERR(x) (((-x) << NF_VERDICT_BITS) | NF_DROP)
> +#define NF_DROP_ERR(x) (((-x) << 16) | NF_DROP)
>  
>  /* only for userspace compatibility */
>  #ifndef __KERNEL__
> @@ -41,6 +44,9 @@
>     <= 0x2000 is used for protocol-flags. */
>  #define NFC_UNKNOWN 0x4000
>  #define NFC_ALTERED 0x8000
> +
> +/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
> +#define NF_VERDICT_BITS 16
>  #endif

i don't find any use of this in user-space.

>  
>  enum nf_inet_hooks {
> @@ -72,6 +78,10 @@ union nf_inet_addr {
>  
>  #ifdef __KERNEL__
>  #ifdef CONFIG_NETFILTER
> +static inline int NF_DROP_GETERR(int verdict)
> +{
> +	return -(verdict >> NF_VERDICT_QBITS);
> +}
>  
>  static inline int nf_inet_addr_cmp(const union nf_inet_addr *a1,
>  				   const union nf_inet_addr *a2)
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index 73e44c4..18ee9b9 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -175,12 +175,12 @@ next_hook:
>  		ret = 1;
>  	} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
>  		kfree_skb(skb);
> -		ret = -(verdict >> NF_VERDICT_BITS);
> +		ret = NF_DROP_GETERR(verdict);
>  		if (ret == 0)
>  			ret = -EPERM;
>  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
>  		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> -			       verdict >> NF_VERDICT_BITS);
> +			       verdict >> NF_VERDICT_QBITS);
>  		if (ret < 0)
>  			kfree_skb(skb);
>  		ret = 0;
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 2a9b80c..1a5067c 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -283,7 +283,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>  	case NF_QUEUE:
>  		err = __nf_queue(skb, elem, entry->pf, entry->hook,
>  				 entry->indev, entry->outdev, entry->okfn,
> -				 verdict >> NF_VERDICT_BITS);
> +				 verdict >> NF_VERDICT_QBITS);
>  		if (err < 0)
>  			kfree_skb(skb);
>  		break;


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] netfilter: allow NFQUEUE bypass if no listener is available
  2010-12-26 23:58 ` [PATCH 5/6] netfilter: allow NFQUEUE bypass if no listener is available Florian Westphal
@ 2011-01-12 19:03   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 23+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-12 19:03 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, Florian Westphal

On 27/12/10 00:58, Florian Westphal wrote:
> If an skb is to be NF_QUEUE'd, but no program has opened the queue, the
> packet is dropped.
> 
> This adds a v2 target revision of xt_NFQUEUE that allows packets to
> continue through the ruleset instead.
> 
> Because the actual queueing happens outside of the target context, the
> 'bypass' flag has to be communicated back to the netfilter core.
> 
> Unfortunately the only choice to do this without adding a new function
> argument is to use the target function return value (i.e. the verdict).
> 
> In the NF_QUEUE case, the upper 16bit already contain the queue number
> to use.  The previous patch reduced NF_VERDICT_MASK to 0xff, i.e.
> we now have extra room for a new flag.
> 
> If a hook issued a NF_QUEUE verdict, then the netfilter core will
> continue packet processing if the queueing hook
> returns -ESRCH (== "this queue does not exist") and the new
> NF_VERDICT_FLAG_QUEUE_BYPASS flag is set in the verdict value.
> 
> Note: If the queue exists, but userspace does not consume packets fast
> enough, the skb will still be dropped.
> 
> Signed-off-by: Florian Westphal <fwestphal@astaro.com>
> ---
>  include/linux/netfilter.h            |    1 +
>  include/linux/netfilter/xt_NFQUEUE.h |    6 ++++++
>  net/netfilter/core.c                 |    6 +++++-
>  net/netfilter/nf_queue.c             |   10 ++++++++--
>  net/netfilter/xt_NFQUEUE.c           |   28 +++++++++++++++++++++++++---
>  5 files changed, 45 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
> index 231277f..041ec8f 100644
> --- a/include/linux/netfilter.h
> +++ b/include/linux/netfilter.h
> @@ -29,6 +29,7 @@
>  #define NF_VERDICT_MASK 0x000000ff
>  
>  /* extra verdict flags have mask 0x0000ff00 */
> +#define NF_VERDICT_FLAG_QUEUE_BYPASS	0x00008000
>  
>  /* queue number (NF_QUEUE) or errno (NF_DROP) */
>  #define NF_VERDICT_QMASK 0xffff0000
> diff --git a/include/linux/netfilter/xt_NFQUEUE.h b/include/linux/netfilter/xt_NFQUEUE.h
> index 2584f4a..9eafdbb 100644
> --- a/include/linux/netfilter/xt_NFQUEUE.h
> +++ b/include/linux/netfilter/xt_NFQUEUE.h
> @@ -20,4 +20,10 @@ struct xt_NFQ_info_v1 {
>  	__u16 queues_total;
>  };
>  
> +struct xt_NFQ_info_v2 {
> +	__u16 queuenum;
> +	__u16 queues_total;
> +	__u16 bypass;
> +};
> +
>  #endif /* _XT_NFQ_TARGET_H */
> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> index 18ee9b9..7dabbdd 100644
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
> @@ -181,8 +181,12 @@ next_hook:
>  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
>  		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
>  			       verdict >> NF_VERDICT_QBITS);
> -		if (ret < 0)
> +		if (ret < 0) {
> +			if (ret == -ESRCH &&
> +			   (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
> +				goto next_hook;

and then, next_hook appears again :-). Please, fix this patchset.

>  			kfree_skb(skb);
> +		}
>  		ret = 0;
>  	}
>  	rcu_read_unlock();
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 1a5067c..f32d097 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -128,8 +128,10 @@ static int __nf_queue(struct sk_buff *skb,
>  	rcu_read_lock();
>  
>  	qh = rcu_dereference(queue_handler[pf]);
> -	if (!qh)
> +	if (!qh) {
> +		error = -ESRCH;
>  		goto err_unlock;
> +	}
>  
>  	afinfo = nf_get_afinfo(pf);
>  	if (!afinfo)
> @@ -284,8 +286,12 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
>  		err = __nf_queue(skb, elem, entry->pf, entry->hook,
>  				 entry->indev, entry->outdev, entry->okfn,
>  				 verdict >> NF_VERDICT_QBITS);
> -		if (err < 0)
> +		if (err < 0) {
> +			if (err == -ESRCH &&
> +			   (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
> +				goto next_hook;
>  			kfree_skb(skb);
> +		}
>  		break;
>  	case NF_STOLEN:
>  	default:
> diff --git a/net/netfilter/xt_NFQUEUE.c b/net/netfilter/xt_NFQUEUE.c
> index 039cce1..41da6c3 100644
> --- a/net/netfilter/xt_NFQUEUE.c
> +++ b/net/netfilter/xt_NFQUEUE.c
> @@ -81,9 +81,20 @@ nfqueue_tg_v1(struct sk_buff *skb, const struct xt_action_param *par)
>  	return NF_QUEUE_NR(queue);
>  }
>  
> -static int nfqueue_tg_v1_check(const struct xt_tgchk_param *par)
> +static unsigned int
> +nfqueue_tg_v2(struct sk_buff *skb, const struct xt_action_param *par)
>  {
> -	const struct xt_NFQ_info_v1 *info = par->targinfo;
> +	const struct xt_NFQ_info_v2 *info = par->targinfo;
> +	unsigned int ret = nfqueue_tg_v1(skb, par);
> +
> +	if (info->bypass)
> +		ret |= NF_VERDICT_FLAG_QUEUE_BYPASS;
> +	return ret;
> +}
> +
> +static int nfqueue_tg_check(const struct xt_tgchk_param *par)
> +{
> +	const struct xt_NFQ_info_v2 *info = par->targinfo;
>  	u32 maxid;
>  
>  	if (unlikely(!rnd_inited)) {
> @@ -100,6 +111,8 @@ static int nfqueue_tg_v1_check(const struct xt_tgchk_param *par)
>  		       info->queues_total, maxid);
>  		return -ERANGE;
>  	}
> +	if (par->target->revision == 2 && info->bypass > 1)
> +		return -EINVAL;
>  	return 0;
>  }
>  
> @@ -115,11 +128,20 @@ static struct xt_target nfqueue_tg_reg[] __read_mostly = {
>  		.name		= "NFQUEUE",
>  		.revision	= 1,
>  		.family		= NFPROTO_UNSPEC,
> -		.checkentry	= nfqueue_tg_v1_check,
> +		.checkentry	= nfqueue_tg_check,
>  		.target		= nfqueue_tg_v1,
>  		.targetsize	= sizeof(struct xt_NFQ_info_v1),
>  		.me		= THIS_MODULE,
>  	},
> +	{
> +		.name		= "NFQUEUE",
> +		.revision	= 2,
> +		.family		= NFPROTO_UNSPEC,
> +		.checkentry	= nfqueue_tg_check,
> +		.target		= nfqueue_tg_v2,
> +		.targetsize	= sizeof(struct xt_NFQ_info_v2),
> +		.me		= THIS_MODULE,
> +	},
>  };
>  
>  static int __init nfqueue_tg_init(void)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller
  2011-01-12 18:56   ` Pablo Neira Ayuso
@ 2011-01-12 20:49     ` Florian Westphal
  2011-01-12 21:59       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2011-01-12 20:49 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> I've got some questions and comments on this patchset, please see below.

Thank you for spending time on reviewing this patchset!

> On 27/12/10 00:58, Florian Westphal wrote:
> > instead of returning -1 on error, return an error number to allow the
> > caller to handle some errors differently.
> > 
> > To make things simpler try_module_get() failure is no longer special-cased.
> > 
> > The new return values will be used in followup patch.
> > 
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > --- a/net/netfilter/core.c
> > +++ b/net/netfilter/core.c
> > @@ -179,9 +179,8 @@ next_hook:
> >  		if (ret == 0)
> >  			ret = -EPERM;
> >  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
> > -		if (!nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> > -			      verdict >> NF_VERDICT_BITS))
> > -			goto next_hook;
> > +		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> > +			       verdict >> NF_VERDICT_BITS);
> 
> You have to remove the next_hook label if you want to do this.
> 
> It's not clear to me why we need to remove the goto jump, could you
> clarify this?

Sure. The only place where nf_queue returns 0 is this snippet in __nf_queue():

> > -	/* If it's going away, ignore hook. */
> > -	if (!try_module_get(entry->elem->owner)) {
> > -		rcu_read_unlock();
> > -		kfree(entry);
> > -		return 0;
> > -	}
> > +	if (!try_module_get(entry->elem->owner))
> > +		goto err_unlock;
> 
> With this change, we're now releasing the skb, is this deliberate?

I think its not necessary to make this a special case (i.e. just
kfree_skb(skb) instead).

An alternative is to decide on a meaningful errno value that
could be used as return value to signal this condition.

> >  #ifdef CONFIG_BRIDGE_NETFILTER
> >  	struct net_device *physindev;
> > @@ -136,8 +136,10 @@ static int __nf_queue(struct sk_buff *skb,
> >  		goto err_unlock;
> 
> here above there's the following:
> 
>         qh = rcu_dereference(queue_handler[pf]);
>         if (!qh)
>                 goto err_unlock;
> 
>         afinfo = nf_get_afinfo(pf);
>         if (!afinfo)
>                 goto err_unlock;
> 
> With your patch, we're returning -EINVAL. I think it's better to return
> -ENOENT.

Agreed.

> > @@ -218,18 +218,26 @@ int nf_queue(struct sk_buff *skb,
> >  	segs = skb_gso_segment(skb, 0);
> >  	kfree_skb(skb);
> >  	if (IS_ERR(segs))
> > -		return 1;
> > +		return -ENOMEM;
> 
> Better propagate the error of skb_gso_segment(...) with PTR_ERR()

The reason I did not do that is because I wanted to limit the number
of code one has to check for possible errno values that can be returned
by nf_queue.

Its probably reasonable to assume that skb_gso_segment() won't return
ESRCH (which I use to determine that qeueuing failed because the queue number did
not exist), so it might be safe to use PTR_ERR here.

What do you think?

> > +	queued = 0;
> >  	do {
> >  		struct sk_buff *nskb = segs->next;
> >  
> >  		segs->next = NULL;
> > -		if (!__nf_queue(segs, elem, pf, hook, indev, outdev, okfn,
> > -				queuenum))
> > -			kfree_skb(segs);
> > +		if (error == 0) {
> > +			error = __nf_queue(segs, elem, pf, hook, indev,
> > +					   outdev, okfn, queuenum);
> > +			if (error == 0)
> > +				queued++;
> > +		}
> > +		kfree_skb(segs);
> 
> Before this patch, segs were only released if try_module_get() failed.
> Now it always release it?

Yes, my bad.
I'll add it to the list of things to fix up. Thanks for catching this.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error
  2011-01-12 19:01   ` Pablo Neira Ayuso
@ 2011-01-12 20:50     ` Florian Westphal
  0 siblings, 0 replies; 23+ messages in thread
From: Florian Westphal @ 2011-01-12 20:50 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > --- a/net/netfilter/core.c
> > +++ b/net/netfilter/core.c
> > @@ -179,8 +179,11 @@ next_hook:
> >  		if (ret == 0)
> >  			ret = -EPERM;
> >  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
> > -		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> > +		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
> >  			       verdict >> NF_VERDICT_BITS);
> > +		if (ret < 0)
> > +			kfree_skb(skb);
> > +		ret = 0;
> >  	}
> 
> Suggestion: Better put this patch on top of the pile. In the previous,
> you remove the return value of nf_queue, and again you reintroduce it.

Yes, I'll either rebase this or just merge the two patches into one
to avoid this confusion.  Sorry about that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2011-01-12 19:02   ` Pablo Neira Ayuso
@ 2011-01-12 20:52     ` Florian Westphal
  2011-01-14 14:05       ` Patrick McHardy
  0 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2011-01-12 20:52 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >  /* only for userspace compatibility */
> >  #ifndef __KERNEL__
> > @@ -41,6 +44,9 @@
> >     <= 0x2000 is used for protocol-flags. */
> >  #define NFC_UNKNOWN 0x4000
> >  #define NFC_ALTERED 0x8000
> > +
> > +/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
> > +#define NF_VERDICT_BITS 16
> >  #endif
> 
> i don't find any use of this in user-space.

Alright, I will simply remove it, then :-)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller
  2011-01-12 20:49     ` Florian Westphal
@ 2011-01-12 21:59       ` Pablo Neira Ayuso
  2011-01-13  0:14         ` Florian Westphal
  0 siblings, 1 reply; 23+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-12 21:59 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 12/01/11 21:49, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>> I've got some questions and comments on this patchset, please see below.
> 
> Thank you for spending time on reviewing this patchset!

welcome Florian ;-)

>> On 27/12/10 00:58, Florian Westphal wrote:
>>> instead of returning -1 on error, return an error number to allow the
>>> caller to handle some errors differently.
>>>
>>> To make things simpler try_module_get() failure is no longer special-cased.
>>>
>>> The new return values will be used in followup patch.
>>>
>>> Signed-off-by: Florian Westphal <fw@strlen.de>
>>> --- a/net/netfilter/core.c
>>> +++ b/net/netfilter/core.c
>>> @@ -179,9 +179,8 @@ next_hook:
>>>  		if (ret == 0)
>>>  			ret = -EPERM;
>>>  	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
>>> -		if (!nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
>>> -			      verdict >> NF_VERDICT_BITS))
>>> -			goto next_hook;
>>> +		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
>>> +			       verdict >> NF_VERDICT_BITS);
>>
>> You have to remove the next_hook label if you want to do this.
>>
>> It's not clear to me why we need to remove the goto jump, could you
>> clarify this?
> 
> Sure. The only place where nf_queue returns 0 is this snippet in __nf_queue():

I see, only the try_module_get part.

>>> -	/* If it's going away, ignore hook. */
>>> -	if (!try_module_get(entry->elem->owner)) {
>>> -		rcu_read_unlock();
>>> -		kfree(entry);
>>> -		return 0;
>>> -	}
>>> +	if (!try_module_get(entry->elem->owner))
>>> +		goto err_unlock;
>>
>> With this change, we're now releasing the skb, is this deliberate?
> 
> I think its not necessary to make this a special case (i.e. just
> kfree_skb(skb) instead).

This case is rare but I'd prefer not to drop packets if the hook is
going away.

[...]
>> Better propagate the error of skb_gso_segment(...) with PTR_ERR()
> 
> The reason I did not do that is because I wanted to limit the number
> of code one has to check for possible errno values that can be returned
> by nf_queue.
> 
> Its probably reasonable to assume that skb_gso_segment() won't return
> ESRCH (which I use to determine that qeueuing failed because the queue number did
> not exist), so it might be safe to use PTR_ERR here.

Currently, it only return -EINVAL. Anyway, it makes sense what you
mention. Better use -EINVAL and add a short comment upon it that tells
why we're doing this.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller
  2011-01-12 21:59       ` Pablo Neira Ayuso
@ 2011-01-13  0:14         ` Florian Westphal
  0 siblings, 0 replies; 23+ messages in thread
From: Florian Westphal @ 2011-01-13  0:14 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On 12/01/11 21:49, Florian Westphal wrote:
> > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >>> -	/* If it's going away, ignore hook. */
> >>> -	if (!try_module_get(entry->elem->owner)) {
> >>> -		rcu_read_unlock();
> >>> -		kfree(entry);
> >>> -		return 0;
> >>> -	}
> >>> +	if (!try_module_get(entry->elem->owner))
> >>> +		goto err_unlock;
> >>
> >> With this change, we're now releasing the skb, is this deliberate?
> > 
> > I think its not necessary to make this a special case (i.e. just
> > kfree_skb(skb) instead).
> 
> This case is rare but I'd prefer not to drop packets if the hook is
> going away.

Okay. I used -ECANCELED here, its rarely used in the kernel and
i think it comes closest.

> >> Better propagate the error of skb_gso_segment(...) with PTR_ERR()
> > 
> > The reason I did not do that is because I wanted to limit the number
> > of code one has to check for possible errno values that can be returned
> > by nf_queue.
> > 
> > Its probably reasonable to assume that skb_gso_segment() won't return
> > ESRCH (which I use to determine that qeueuing failed because the queue number did
> > not exist), so it might be safe to use PTR_ERR here.
> 
> Currently, it only return -EINVAL. Anyway, it makes sense what you
> mention. Better use -EINVAL and add a short comment upon it that tells
> why we're doing this.

I added a comment about this and plan to send v2 of the patch set on
sunday.

Thanks again for reviewing,
Florian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2011-01-12 20:52     ` Florian Westphal
@ 2011-01-14 14:05       ` Patrick McHardy
  2011-01-15 14:29         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 23+ messages in thread
From: Patrick McHardy @ 2011-01-14 14:05 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

On 12.01.2011 21:52, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>>>  /* only for userspace compatibility */
>>>  #ifndef __KERNEL__
>>> @@ -41,6 +44,9 @@
>>>     <= 0x2000 is used for protocol-flags. */
>>>  #define NFC_UNKNOWN 0x4000
>>>  #define NFC_ALTERED 0x8000
>>> +
>>> +/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
>>> +#define NF_VERDICT_BITS 16
>>>  #endif
>>
>> i don't find any use of this in user-space.
> 
> Alright, I will simply remove it, then :-)

I think we should still keep it around in the compat section,
just because we don't know of any userspace code using this
doesn't mean it doesn't exist.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2011-01-14 14:05       ` Patrick McHardy
@ 2011-01-15 14:29         ` Pablo Neira Ayuso
  2011-01-15 14:33           ` Patrick McHardy
  0 siblings, 1 reply; 23+ messages in thread
From: Pablo Neira Ayuso @ 2011-01-15 14:29 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Florian Westphal, netfilter-devel

On 14/01/11 15:05, Patrick McHardy wrote:
> On 12.01.2011 21:52, Florian Westphal wrote:
>> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>>>>  /* only for userspace compatibility */
>>>>  #ifndef __KERNEL__
>>>> @@ -41,6 +44,9 @@
>>>>     <= 0x2000 is used for protocol-flags. */
>>>>  #define NFC_UNKNOWN 0x4000
>>>>  #define NFC_ALTERED 0x8000
>>>> +
>>>> +/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
>>>> +#define NF_VERDICT_BITS 16
>>>>  #endif
>>>
>>> i don't find any use of this in user-space.
>>
>> Alright, I will simply remove it, then :-)
> 
> I think we should still keep it around in the compat section,
> just because we don't know of any userspace code using this
> doesn't mean it doesn't exist.

Yes, that's true, but I don't fine any sane use of that macro in userspace.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2011-01-15 14:29         ` Pablo Neira Ayuso
@ 2011-01-15 14:33           ` Patrick McHardy
  0 siblings, 0 replies; 23+ messages in thread
From: Patrick McHardy @ 2011-01-15 14:33 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Am 15.01.2011 15:29, schrieb Pablo Neira Ayuso:
> On 14/01/11 15:05, Patrick McHardy wrote:
>>>>> >>>> +/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
>>>>> >>>> +#define NF_VERDICT_BITS 16
>>>>> >>>>  #endif
>>>> >>>
>>>> >>> i don't find any use of this in user-space.
>>> >>
>>> >> Alright, I will simply remove it, then :-)
>> > 
>> > I think we should still keep it around in the compat section,
>> > just because we don't know of any userspace code using this
>> > doesn't mean it doesn't exist.
> Yes, that's true, but I don't fine any sane use of that macro in userspace.
> 

Well, as Florian said, using (queue_nr << NF_VERDICT_BITS) in userspace
is a valid use when reinjecting packets.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2011-01-16 13:19 [PATCH v2] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
@ 2011-01-16 13:19 ` Florian Westphal
  2011-01-18 14:52   ` Patrick McHardy
  0 siblings, 1 reply; 23+ messages in thread
From: Florian Westphal @ 2011-01-16 13:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

NF_VERDICT_MASK is currently 0xffff. This is because the upper
16 bits are used to store errno (for NF_DROP) or the queue number
(NF_QUEUE verdict).

As there are up to 0xffff different queues available, there is no more
room to store additional flags.

At the moment there are only 6 different verdicts, i.e. we can reduce
NF_VERDICT_MASK to 0xff to allow storing additional flags in the 0xff00 space.

NF_VERDICT_BITS would then be reduced to 8, but because the value is
exported to userspace, this might cause breakage; e.g.:

e.g. 'queuenr = (1 << NF_VERDICT_BITS) | NF_QUEUE'  would now break.

Thus, remove NF_VERDICT_BITS usage in the kernel and move the old value
to the 'userspace compat' section.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 no changes since v1.  In particular, I've kept NF_VERDICT_BITS define
 in the userspace compat section, as Patrick did prefer to keept it.

 include/linux/netfilter.h |   20 +++++++++++++++-----
 net/netfilter/core.c      |    4 ++--
 net/netfilter/nf_queue.c  |    2 +-
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 0ab7ca7..78b73cc 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -24,16 +24,19 @@
 #define NF_MAX_VERDICT NF_STOP
 
 /* we overload the higher bits for encoding auxiliary data such as the queue
- * number. Not nice, but better than additional function arguments. */
-#define NF_VERDICT_MASK 0x0000ffff
-#define NF_VERDICT_BITS 16
+ * number or errno values. Not nice, but better than additional function
+ * arguments. */
+#define NF_VERDICT_MASK 0x000000ff
+
+/* extra verdict flags have mask 0x0000ff00 */
 
+/* queue number (NF_QUEUE) or errno (NF_DROP) */
 #define NF_VERDICT_QMASK 0xffff0000
 #define NF_VERDICT_QBITS 16
 
-#define NF_QUEUE_NR(x) ((((x) << NF_VERDICT_BITS) & NF_VERDICT_QMASK) | NF_QUEUE)
+#define NF_QUEUE_NR(x) ((((x) << 16) & NF_VERDICT_QMASK) | NF_QUEUE)
 
-#define NF_DROP_ERR(x) (((-x) << NF_VERDICT_BITS) | NF_DROP)
+#define NF_DROP_ERR(x) (((-x) << 16) | NF_DROP)
 
 /* only for userspace compatibility */
 #ifndef __KERNEL__
@@ -41,6 +44,9 @@
    <= 0x2000 is used for protocol-flags. */
 #define NFC_UNKNOWN 0x4000
 #define NFC_ALTERED 0x8000
+
+/* NF_VERDICT_BITS should be 8 now, but userspace might break if this changes */
+#define NF_VERDICT_BITS 16
 #endif
 
 enum nf_inet_hooks {
@@ -72,6 +78,10 @@ union nf_inet_addr {
 
 #ifdef __KERNEL__
 #ifdef CONFIG_NETFILTER
+static inline int NF_DROP_GETERR(int verdict)
+{
+	return -(verdict >> NF_VERDICT_QBITS);
+}
 
 static inline int nf_inet_addr_cmp(const union nf_inet_addr *a1,
 				   const union nf_inet_addr *a2)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 0c5b796..4d88e45 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -175,12 +175,12 @@ next_hook:
 		ret = 1;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
 		kfree_skb(skb);
-		ret = -(verdict >> NF_VERDICT_BITS);
+		ret = NF_DROP_GETERR(verdict);
 		if (ret == 0)
 			ret = -EPERM;
 	} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
 		ret = nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
-			       verdict >> NF_VERDICT_BITS);
+			       verdict >> NF_VERDICT_QBITS);
 		if (ret < 0) {
 			if (ret == -ECANCELED)
 				goto next_hook;
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 2a9a6fc..4800649 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -299,7 +299,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 	case NF_QUEUE:
 		err = __nf_queue(skb, elem, entry->pf, entry->hook,
 				 entry->indev, entry->outdev, entry->okfn,
-				 verdict >> NF_VERDICT_BITS);
+				 verdict >> NF_VERDICT_QBITS);
 		if (err < 0) {
 			if (err == -ECANCELED)
 				goto next_hook;
-- 
1.7.2.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff
  2011-01-16 13:19 ` [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff Florian Westphal
@ 2011-01-18 14:52   ` Patrick McHardy
  0 siblings, 0 replies; 23+ messages in thread
From: Patrick McHardy @ 2011-01-18 14:52 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 16.01.2011 14:19, Florian Westphal wrote:
> NF_VERDICT_MASK is currently 0xffff. This is because the upper
> 16 bits are used to store errno (for NF_DROP) or the queue number
> (NF_QUEUE verdict).
> 
> As there are up to 0xffff different queues available, there is no more
> room to store additional flags.
> 
> At the moment there are only 6 different verdicts, i.e. we can reduce
> NF_VERDICT_MASK to 0xff to allow storing additional flags in the 0xff00 space.
> 
> NF_VERDICT_BITS would then be reduced to 8, but because the value is
> exported to userspace, this might cause breakage; e.g.:
> 
> e.g. 'queuenr = (1 << NF_VERDICT_BITS) | NF_QUEUE'  would now break.
> 
> Thus, remove NF_VERDICT_BITS usage in the kernel and move the old value
> to the 'userspace compat' section.

Applied, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2011-01-18 14:52 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-12-26 23:58 [PATCH] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
2010-12-26 23:58 ` [PATCH 1/6] netfilter: kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE Florian Westphal
2010-12-27  8:41   ` Jan Engelhardt
2010-12-27  8:47     ` Michał Mirosław
2010-12-26 23:58 ` [PATCH 2/6] netfilter: nfnetlink_queue: return error number to caller Florian Westphal
2011-01-12 18:56   ` Pablo Neira Ayuso
2011-01-12 20:49     ` Florian Westphal
2011-01-12 21:59       ` Pablo Neira Ayuso
2011-01-13  0:14         ` Florian Westphal
2010-12-26 23:58 ` [PATCH 3/6] netfilter: nfnetlink_queue: do not free skb on error Florian Westphal
2011-01-12 19:01   ` Pablo Neira Ayuso
2011-01-12 20:50     ` Florian Westphal
2010-12-26 23:58 ` [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff Florian Westphal
2011-01-12 19:02   ` Pablo Neira Ayuso
2011-01-12 20:52     ` Florian Westphal
2011-01-14 14:05       ` Patrick McHardy
2011-01-15 14:29         ` Pablo Neira Ayuso
2011-01-15 14:33           ` Patrick McHardy
2010-12-26 23:58 ` [PATCH 5/6] netfilter: allow NFQUEUE bypass if no listener is available Florian Westphal
2011-01-12 19:03   ` Pablo Neira Ayuso
2010-12-26 23:58 ` [PATCH 6/6] netfilter: do not omit re-route check on NF_QUEUE verdict Florian Westphal
  -- strict thread matches above, loose matches on Subject: below --
2011-01-16 13:19 [PATCH v2] NFQUEUE v2 target with 'queue bypass' support Florian Westphal
2011-01-16 13:19 ` [PATCH 4/6] netfilter: reduce NF_VERDICT_MASK to 0xff Florian Westphal
2011-01-18 14:52   ` Patrick McHardy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).