* nf-next: TEE only
@ 2010-04-13 23:21 Jan Engelhardt
2010-04-13 23:21 ` [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE Jan Engelhardt
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-13 23:21 UTC (permalink / raw)
To: kaber; +Cc: netfilter-devel
Hi,
in this round:
- use IP6SKB_REROUTED in v6 code
- pick_net function: use skb->dev or skb->dst->dev when available
(or completely fall back to init_net in case there's something
going on)
----
The following changes since commit 9c6eb28aca52d562f3ffbaebaa56385df9972a43:
Jan Engelhardt (1):
netfilter: ipv6: add IPSKB_REROUTED exclusion to NF_HOOK/POSTROUTING invocation
are available in the git repository at:
git://dev.medozas.de/linux master
Jan Engelhardt (4):
netfilter: xtables: inclusion of xt_TEE
netfilter: xtables2: make ip_tables reentrant
netfilter: xt_TEE: have cloned packet travel through Xtables too
netfilter: xtables: remove old comments about reentrancy
include/linux/netfilter/Kbuild | 1 +
include/linux/netfilter/x_tables.h | 7 +
include/linux/netfilter/xt_TEE.h | 8 ++
net/ipv4/netfilter/arp_tables.c | 6 +-
net/ipv4/netfilter/ip_tables.c | 67 ++++++-----
net/ipv4/netfilter/ipt_REJECT.c | 3 -
net/ipv6/netfilter/ip6_tables.c | 58 ++++------
net/ipv6/netfilter/ip6t_REJECT.c | 3 -
net/netfilter/Kconfig | 7 +
net/netfilter/Makefile | 1 +
net/netfilter/x_tables.c | 77 ++++++++++++
net/netfilter/xt_TEE.c | 228 ++++++++++++++++++++++++++++++++++++
12 files changed, 390 insertions(+), 76 deletions(-)
create mode 100644 include/linux/netfilter/xt_TEE.h
create mode 100644 net/netfilter/xt_TEE.c
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE
2010-04-13 23:21 nf-next: TEE only Jan Engelhardt
@ 2010-04-13 23:21 ` Jan Engelhardt
2010-04-14 5:52 ` Eric Dumazet
2010-04-13 23:21 ` [PATCH 2/4] netfilter: xtables2: make ip_tables reentrant Jan Engelhardt
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-13 23:21 UTC (permalink / raw)
To: kaber; +Cc: netfilter-devel
xt_TEE can be used to clone and reroute a packet. This can for
example be used to copy traffic at a router for logging purposes
to another dedicated machine.
References: http://www.gossamer-threads.com/lists/iptables/devel/68781
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
include/linux/netfilter/Kbuild | 1 +
include/linux/netfilter/xt_TEE.h | 8 ++
net/ipv4/ip_output.c | 1 +
net/ipv6/ip6_output.c | 1 +
net/netfilter/Kconfig | 7 +
net/netfilter/Makefile | 1 +
net/netfilter/xt_TEE.c | 232 ++++++++++++++++++++++++++++++++++++++
7 files changed, 251 insertions(+), 0 deletions(-)
create mode 100644 include/linux/netfilter/xt_TEE.h
create mode 100644 net/netfilter/xt_TEE.c
diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index a5a63e4..48767cd 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -16,6 +16,7 @@ header-y += xt_RATEEST.h
header-y += xt_SECMARK.h
header-y += xt_TCPMSS.h
header-y += xt_TCPOPTSTRIP.h
+header-y += xt_TEE.h
header-y += xt_TPROXY.h
header-y += xt_comment.h
header-y += xt_connbytes.h
diff --git a/include/linux/netfilter/xt_TEE.h b/include/linux/netfilter/xt_TEE.h
new file mode 100644
index 0000000..83fa768
--- /dev/null
+++ b/include/linux/netfilter/xt_TEE.h
@@ -0,0 +1,8 @@
+#ifndef _XT_TEE_TARGET_H
+#define _XT_TEE_TARGET_H
+
+struct xt_tee_tginfo {
+ union nf_inet_addr gw;
+};
+
+#endif /* _XT_TEE_TARGET_H */
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index f09135e..0abfdde 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -309,6 +309,7 @@ int ip_output(struct sk_buff *skb)
ip_finish_output,
!(IPCB(skb)->flags & IPSKB_REROUTED));
}
+EXPORT_SYMBOL_GPL(ip_output);
int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
{
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c10a38a..d09be7f 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -176,6 +176,7 @@ int ip6_output(struct sk_buff *skb)
ip6_finish_output,
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
}
+EXPORT_SYMBOL_GPL(ip6_output);
/*
* xmit an sk_buff (used by TCP)
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 8055786..673a6c8 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -502,6 +502,13 @@ config NETFILTER_XT_TARGET_RATEEST
To compile it as a module, choose M here. If unsure, say N.
+config NETFILTER_XT_TARGET_TEE
+ tristate '"TEE" - packet cloning to alternate destiantion'
+ depends on NETFILTER_ADVANCED
+ ---help---
+ This option adds a "TEE" target with which a packet can be cloned and
+ this clone be rerouted to another nexthop.
+
config NETFILTER_XT_TARGET_TPROXY
tristate '"TPROXY" target support (EXPERIMENTAL)'
depends on EXPERIMENTAL
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index cd31afe..14e3a8f 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_SECMARK) += xt_SECMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_TPROXY) += xt_TPROXY.o
obj-$(CONFIG_NETFILTER_XT_TARGET_TCPMSS) += xt_TCPMSS.o
obj-$(CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP) += xt_TCPOPTSTRIP.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_TEE) += xt_TEE.o
obj-$(CONFIG_NETFILTER_XT_TARGET_TRACE) += xt_TRACE.o
# matches
diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
new file mode 100644
index 0000000..96e3785
--- /dev/null
+++ b/net/netfilter/xt_TEE.c
@@ -0,0 +1,232 @@
+/*
+ * "TEE" target extension for Xtables
+ * Copyright © Sebastian ClaÃen, 2007
+ * Jan Engelhardt, 2007-2010
+ *
+ * based on ipt_ROUTE.c from Cédric de Launois
+ * <delaunois@info.ucl.be>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 or later, as published by the Free Software Foundation.
+ */
+#include <linux/ip.h>
+#include <linux/module.h>
+#include <linux/route.h>
+#include <linux/skbuff.h>
+#include <net/checksum.h>
+#include <net/icmp.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+#include <net/ip6_route.h>
+#include <net/route.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_TEE.h>
+
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+# define WITH_CONNTRACK 1
+# include <net/netfilter/nf_conntrack.h>
+#endif
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+# define WITH_IPV6 1
+#endif
+
+static const union nf_inet_addr tee_zero_address;
+
+static struct net *pick_net(struct sk_buff *skb)
+{
+ const struct net_device *dev;
+ const struct dst_entry *dst;
+
+ if (skb->dev != NULL)
+ return dev_net(skb->dev);
+ dst = skb_dst(skb);
+ if (dst != NULL && dst->dev != NULL)
+ return dev_net(dst->dev);
+ return &init_net;
+}
+
+static bool
+tee_tg_route4(struct sk_buff *skb, const struct xt_tee_tginfo *info)
+{
+ const struct iphdr *iph = ip_hdr(skb);
+ struct rtable *rt;
+ struct flowi fl;
+
+ memset(&fl, 0, sizeof(fl));
+ fl.nl_u.ip4_u.daddr = info->gw.ip;
+ fl.nl_u.ip4_u.tos = RT_TOS(iph->tos);
+ fl.nl_u.ip4_u.scope = RT_SCOPE_UNIVERSE;
+
+ if (ip_route_output_key(pick_net(skb), &rt, &fl) != 0) {
+ kfree_skb(skb);
+ return false;
+ }
+ dst_release(skb_dst(skb));
+ skb_dst_set(skb, &rt->u.dst);
+ skb->dev = rt->u.dst.dev;
+ skb->protocol = htons(ETH_P_IP);
+ return true;
+}
+
+static unsigned int
+tee_tg4(struct sk_buff *skb, const struct xt_target_param *par)
+{
+ const struct xt_tee_tginfo *info = par->targinfo;
+ struct iphdr *iph;
+
+ /*
+ * Copy the skb, and route the copy. Will later return %XT_CONTINUE for
+ * the original skb, which should continue on its way as if nothing has
+ * happened. The copy should be independently delivered to the TEE
+ * --gateway.
+ */
+ skb = skb_copy(skb, GFP_ATOMIC);
+ if (skb == NULL)
+ return XT_CONTINUE;
+
+#ifdef WITH_CONNTRACK
+ /* Avoid counting cloned packets towards the original connection. */
+ nf_conntrack_put(skb->nfct);
+ skb->nfct = &nf_conntrack_untracked.ct_general;
+ skb->nfctinfo = IP_CT_NEW;
+ nf_conntrack_get(skb->nfct);
+#endif
+ /*
+ * If we are in PREROUTING/INPUT, the checksum must be recalculated
+ * since the length could have changed as a result of defragmentation.
+ *
+ * We also decrease the TTL to mitigate potential TEE loops
+ * between two hosts.
+ *
+ * Set %IP_DF so that the original source is notified of a potentially
+ * decreased MTU on the clone route. IPv6 does this too.
+ */
+ iph = ip_hdr(skb);
+ iph->frag_off |= htons(IP_DF);
+ if (par->hooknum == NF_INET_PRE_ROUTING ||
+ par->hooknum == NF_INET_LOCAL_IN)
+ --iph->ttl;
+ ip_send_check(iph);
+
+ /*
+ * Xtables is not reentrant currently, so a choice has to be made:
+ * 1. return absolute verdict for the original and let the cloned
+ * packet travel through the chains
+ * 2. let the original continue travelling and not pass the clone
+ * to Xtables.
+ * #2 is chosen. Normally, we would use ip_local_out for the clone.
+ * Because iph->check is already correct and we don't pass it to
+ * Xtables anyway, a shortcut to dst_output [forwards to ip_output] can
+ * be taken. %IPSKB_REROUTED needs to be set so that ip_output does not
+ * invoke POSTROUTING on the cloned packet.
+ */
+ IPCB(skb)->flags |= IPSKB_REROUTED;
+ if (tee_tg_route4(skb, info))
+ ip_output(skb);
+
+ return XT_CONTINUE;
+}
+
+#ifdef WITH_IPV6
+static bool
+tee_tg_route6(struct sk_buff *skb, const struct xt_tee_tginfo *info)
+{
+ const struct ipv6hdr *iph = ipv6_hdr(skb);
+ struct dst_entry *dst;
+ struct flowi fl;
+
+ memset(&fl, 0, sizeof(fl));
+ fl.nl_u.ip6_u.daddr = info->gw.in6;
+ fl.nl_u.ip6_u.flowlabel = ((iph->flow_lbl[0] & 0xF) << 16) |
+ (iph->flow_lbl[1] << 8) | iph->flow_lbl[2];
+
+ dst = ip6_route_output(pick_net(skb), NULL, &fl);
+ if (dst == NULL) {
+ kfree_skb(skb);
+ return false;
+ }
+ dst_release(skb_dst(skb));
+ skb_dst_set(skb, dst);
+ skb->dev = dst->dev;
+ skb->protocol = htons(ETH_P_IPV6);
+ return true;
+}
+
+static unsigned int
+tee_tg6(struct sk_buff *skb, const struct xt_target_param *par)
+{
+ const struct xt_tee_tginfo *info = par->targinfo;
+
+ if ((skb = skb_copy(skb, GFP_ATOMIC)) == NULL)
+ return XT_CONTINUE;
+
+#ifdef WITH_CONNTRACK
+ nf_conntrack_put(skb->nfct);
+ skb->nfct = &nf_conntrack_untracked.ct_general;
+ skb->nfctinfo = IP_CT_NEW;
+ nf_conntrack_get(skb->nfct);
+#endif
+ if (par->hooknum == NF_INET_PRE_ROUTING ||
+ par->hooknum == NF_INET_LOCAL_IN) {
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+ --iph->hop_limit;
+ }
+ IP6CB(skb)->flags |= IP6SKB_REROUTED;
+ if (tee_tg_route6(skb, info))
+ ip6_output(skb);
+
+ return XT_CONTINUE;
+}
+#endif /* WITH_IPV6 */
+
+static int tee_tg_check(const struct xt_tgchk_param *par)
+{
+ const struct xt_tee_tginfo *info = par->targinfo;
+
+ /* 0.0.0.0 and :: not allowed */
+ return (memcmp(&info->gw, &tee_zero_address,
+ sizeof(tee_zero_address)) == 0) ? -EINVAL : 0;
+}
+
+static struct xt_target tee_tg_reg[] __read_mostly = {
+ {
+ .name = "TEE",
+ .revision = 0,
+ .family = NFPROTO_IPV4,
+ .target = tee_tg4,
+ .targetsize = sizeof(struct xt_tee_tginfo),
+ .checkentry = tee_tg_check,
+ .me = THIS_MODULE,
+ },
+#ifdef WITH_IPV6
+ {
+ .name = "TEE",
+ .revision = 0,
+ .family = NFPROTO_IPV6,
+ .target = tee_tg6,
+ .targetsize = sizeof(struct xt_tee_tginfo),
+ .checkentry = tee_tg_check,
+ .me = THIS_MODULE,
+ },
+#endif
+};
+
+static int __init tee_tg_init(void)
+{
+ return xt_register_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
+}
+
+static void __exit tee_tg_exit(void)
+{
+ xt_unregister_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
+}
+
+module_init(tee_tg_init);
+module_exit(tee_tg_exit);
+MODULE_AUTHOR("Sebastian ClaÃen <sebastian.classen@freenet.ag>");
+MODULE_AUTHOR("Jan Engelhardt <jengelh@medozas.de>");
+MODULE_DESCRIPTION("Xtables: Reroute packet copy");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("ipt_TEE");
+MODULE_ALIAS("ip6t_TEE");
--
1.7.0.4
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/4] netfilter: xtables2: make ip_tables reentrant
2010-04-13 23:21 nf-next: TEE only Jan Engelhardt
2010-04-13 23:21 ` [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE Jan Engelhardt
@ 2010-04-13 23:21 ` Jan Engelhardt
2010-04-13 23:21 ` [PATCH 3/4] netfilter: xt_TEE: have cloned packet travel through Xtables too Jan Engelhardt
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-13 23:21 UTC (permalink / raw)
To: kaber; +Cc: netfilter-devel
Currently, the table traverser stores return addresses in the ruleset
itself (struct ip6t_entry->comefrom). This has a well-known drawback:
the jumpstack is overwritten on reentry, making it necessary for
targets to return absolute verdicts. Also, the ruleset (which might
be heavy memory-wise) needs to be replicated for each CPU that can
possibly invoke ip6t_do_table.
This patch decouples the jumpstack from struct ip6t_entry and instead
puts it into xt_table_info. Not being restricted by 'comefrom'
anymore, we can set up a stack as needed. By default, there is room
allocated for two entries into the traverser. The setting is
configurable at runtime through sysfs and will take effect when a
table is replaced by a new one.
arp_tables is not touched though, because there is just one/two
modules and further patches seek to collapse the table traverser
anyhow.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
include/linux/netfilter/x_tables.h | 7 +++
net/ipv4/netfilter/arp_tables.c | 6 ++-
net/ipv4/netfilter/ip_tables.c | 65 ++++++++++++++++--------------
net/ipv6/netfilter/ip6_tables.c | 56 ++++++++++----------------
net/netfilter/x_tables.c | 77 ++++++++++++++++++++++++++++++++++++
5 files changed, 145 insertions(+), 66 deletions(-)
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 26ced0c..50c8672 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -401,6 +401,13 @@ struct xt_table_info {
unsigned int hook_entry[NF_INET_NUMHOOKS];
unsigned int underflow[NF_INET_NUMHOOKS];
+ /*
+ * Number of user chains. Since tables cannot have loops, at most
+ * @stacksize jumps (number of user chains) can possibly be made.
+ */
+ unsigned int stacksize;
+ unsigned int *stackptr;
+ void ***jumpstack;
/* ipt_entry tables: one per CPU */
/* Note : this field MUST be the last one, see XT_TABLE_INFO_SZ */
void *entries[1];
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index e8e363d..07a6990 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -649,6 +649,9 @@ static int translate_table(struct xt_table_info *newinfo, void *entry0,
if (ret != 0)
break;
++i;
+ if (strcmp(arpt_get_target(iter)->u.user.name,
+ XT_ERROR_TARGET) == 0)
+ ++newinfo->stacksize;
}
duprintf("translate_table: ARPT_ENTRY_ITERATE gives %d\n", ret);
if (ret != 0)
@@ -1774,8 +1777,7 @@ struct xt_table *arpt_register_table(struct net *net,
{
int ret;
struct xt_table_info *newinfo;
- struct xt_table_info bootstrap
- = { 0, 0, 0, { 0 }, { 0 }, { } };
+ struct xt_table_info bootstrap = {0};
void *loc_cpu_entry;
struct xt_table *new_table;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 18c5b15..70900ec 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -321,8 +321,6 @@ ipt_do_table(struct sk_buff *skb,
const struct net_device *out,
struct xt_table *table)
{
-#define tb_comefrom ((struct ipt_entry *)table_base)->comefrom
-
static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long))));
const struct iphdr *ip;
bool hotdrop = false;
@@ -330,7 +328,8 @@ ipt_do_table(struct sk_buff *skb,
unsigned int verdict = NF_DROP;
const char *indev, *outdev;
const void *table_base;
- struct ipt_entry *e, *back;
+ struct ipt_entry *e, **jumpstack;
+ unsigned int *stackptr, origptr, cpu;
const struct xt_table_info *private;
struct xt_match_param mtpar;
struct xt_target_param tgpar;
@@ -356,19 +355,23 @@ ipt_do_table(struct sk_buff *skb,
IP_NF_ASSERT(table->valid_hooks & (1 << hook));
xt_info_rdlock_bh();
private = table->private;
- table_base = private->entries[smp_processor_id()];
+ cpu = smp_processor_id();
+ table_base = private->entries[cpu];
+ jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
+ stackptr = &private->stackptr[cpu];
+ origptr = *stackptr;
e = get_entry(table_base, private->hook_entry[hook]);
- /* For return from builtin chain */
- back = get_entry(table_base, private->underflow[hook]);
+ pr_devel("Entering %s(hook %u); sp at %u (UF %p)\n",
+ table->name, hook, origptr,
+ get_entry(table_base, private->underflow[hook]));
do {
const struct ipt_entry_target *t;
const struct xt_entry_match *ematch;
IP_NF_ASSERT(e);
- IP_NF_ASSERT(back);
if (!ip_packet_match(ip, indev, outdev,
&e->ip, mtpar.fragoff)) {
no_match:
@@ -403,17 +406,28 @@ ipt_do_table(struct sk_buff *skb,
verdict = (unsigned)(-v) - 1;
break;
}
- e = back;
- back = get_entry(table_base, back->comefrom);
+ if (*stackptr == 0) {
+ e = get_entry(table_base,
+ private->underflow[hook]);
+ pr_devel("Underflow (this is normal) "
+ "to %p\n", e);
+ } else {
+ e = jumpstack[--*stackptr];
+ pr_devel("Pulled %p out from pos %u\n",
+ e, *stackptr);
+ e = ipt_next_entry(e);
+ }
continue;
}
if (table_base + v != ipt_next_entry(e) &&
!(e->ip.flags & IPT_F_GOTO)) {
- /* Save old back ptr in next entry */
- struct ipt_entry *next = ipt_next_entry(e);
- next->comefrom = (void *)back - table_base;
- /* set back pointer to next entry */
- back = next;
+ if (*stackptr >= private->stacksize) {
+ verdict = NF_DROP;
+ break;
+ }
+ jumpstack[(*stackptr)++] = e;
+ pr_devel("Pushed %p into pos %u\n",
+ e, *stackptr - 1);
}
e = get_entry(table_base, v);
@@ -426,18 +440,7 @@ ipt_do_table(struct sk_buff *skb,
tgpar.targinfo = t->data;
-#ifdef CONFIG_NETFILTER_DEBUG
- tb_comefrom = 0xeeeeeeec;
-#endif
verdict = t->u.kernel.target->target(skb, &tgpar);
-#ifdef CONFIG_NETFILTER_DEBUG
- if (tb_comefrom != 0xeeeeeeec && verdict == IPT_CONTINUE) {
- printk("Target %s reentered!\n",
- t->u.kernel.target->name);
- verdict = NF_DROP;
- }
- tb_comefrom = 0x57acc001;
-#endif
/* Target might have changed stuff. */
ip = ip_hdr(skb);
if (verdict == IPT_CONTINUE)
@@ -447,7 +450,9 @@ ipt_do_table(struct sk_buff *skb,
break;
} while (!hotdrop);
xt_info_rdunlock_bh();
-
+ pr_devel("Exiting %s; resetting sp from %u to %u\n",
+ __func__, *stackptr, origptr);
+ *stackptr = origptr;
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;
#else
@@ -455,8 +460,6 @@ ipt_do_table(struct sk_buff *skb,
return NF_DROP;
else return verdict;
#endif
-
-#undef tb_comefrom
}
/* Figures out from what hook each rule can be called: returns 0 if
@@ -838,6 +841,9 @@ translate_table(struct net *net, struct xt_table_info *newinfo, void *entry0,
if (ret != 0)
return ret;
++i;
+ if (strcmp(ipt_get_target(iter)->u.user.name,
+ XT_ERROR_TARGET) == 0)
+ ++newinfo->stacksize;
}
if (i != repl->num_entries) {
@@ -2086,8 +2092,7 @@ struct xt_table *ipt_register_table(struct net *net,
{
int ret;
struct xt_table_info *newinfo;
- struct xt_table_info bootstrap
- = { 0, 0, 0, { 0 }, { 0 }, { } };
+ struct xt_table_info bootstrap = {0};
void *loc_cpu_entry;
struct xt_table *new_table;
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index f2b815e..2a2770b 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -351,15 +351,14 @@ ip6t_do_table(struct sk_buff *skb,
const struct net_device *out,
struct xt_table *table)
{
-#define tb_comefrom ((struct ip6t_entry *)table_base)->comefrom
-
static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long))));
bool hotdrop = false;
/* Initializing verdict to NF_DROP keeps gcc happy. */
unsigned int verdict = NF_DROP;
const char *indev, *outdev;
const void *table_base;
- struct ip6t_entry *e, *back;
+ struct ip6t_entry *e, **jumpstack;
+ unsigned int *stackptr, origptr, cpu;
const struct xt_table_info *private;
struct xt_match_param mtpar;
struct xt_target_param tgpar;
@@ -383,19 +382,19 @@ ip6t_do_table(struct sk_buff *skb,
xt_info_rdlock_bh();
private = table->private;
- table_base = private->entries[smp_processor_id()];
+ cpu = smp_processor_id();
+ table_base = private->entries[cpu];
+ jumpstack = (struct ip6t_entry **)private->jumpstack[cpu];
+ stackptr = &private->stackptr[cpu];
+ origptr = *stackptr;
e = get_entry(table_base, private->hook_entry[hook]);
- /* For return from builtin chain */
- back = get_entry(table_base, private->underflow[hook]);
-
do {
const struct ip6t_entry_target *t;
const struct xt_entry_match *ematch;
IP_NF_ASSERT(e);
- IP_NF_ASSERT(back);
if (!ip6_packet_match(skb, indev, outdev, &e->ipv6,
&mtpar.thoff, &mtpar.fragoff, &hotdrop)) {
no_match:
@@ -432,17 +431,20 @@ ip6t_do_table(struct sk_buff *skb,
verdict = (unsigned)(-v) - 1;
break;
}
- e = back;
- back = get_entry(table_base, back->comefrom);
+ if (*stackptr == 0)
+ e = get_entry(table_base,
+ private->underflow[hook]);
+ else
+ e = ip6t_next_entry(jumpstack[--*stackptr]);
continue;
}
if (table_base + v != ip6t_next_entry(e) &&
!(e->ipv6.flags & IP6T_F_GOTO)) {
- /* Save old back ptr in next entry */
- struct ip6t_entry *next = ip6t_next_entry(e);
- next->comefrom = (void *)back - table_base;
- /* set back pointer to next entry */
- back = next;
+ if (*stackptr >= private->stacksize) {
+ verdict = NF_DROP;
+ break;
+ }
+ jumpstack[(*stackptr)++] = e;
}
e = get_entry(table_base, v);
@@ -454,19 +456,7 @@ ip6t_do_table(struct sk_buff *skb,
tgpar.target = t->u.kernel.target;
tgpar.targinfo = t->data;
-#ifdef CONFIG_NETFILTER_DEBUG
- tb_comefrom = 0xeeeeeeec;
-#endif
verdict = t->u.kernel.target->target(skb, &tgpar);
-
-#ifdef CONFIG_NETFILTER_DEBUG
- if (tb_comefrom != 0xeeeeeeec && verdict == IP6T_CONTINUE) {
- printk("Target %s reentered!\n",
- t->u.kernel.target->name);
- verdict = NF_DROP;
- }
- tb_comefrom = 0x57acc001;
-#endif
if (verdict == IP6T_CONTINUE)
e = ip6t_next_entry(e);
else
@@ -474,10 +464,8 @@ ip6t_do_table(struct sk_buff *skb,
break;
} while (!hotdrop);
-#ifdef CONFIG_NETFILTER_DEBUG
- tb_comefrom = NETFILTER_LINK_POISON;
-#endif
xt_info_rdunlock_bh();
+ *stackptr = origptr;
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;
@@ -486,8 +474,6 @@ ip6t_do_table(struct sk_buff *skb,
return NF_DROP;
else return verdict;
#endif
-
-#undef tb_comefrom
}
/* Figures out from what hook each rule can be called: returns 0 if
@@ -869,6 +855,9 @@ translate_table(struct net *net, struct xt_table_info *newinfo, void *entry0,
if (ret != 0)
return ret;
++i;
+ if (strcmp(ip6t_get_target(iter)->u.user.name,
+ XT_ERROR_TARGET) == 0)
+ ++newinfo->stacksize;
}
if (i != repl->num_entries) {
@@ -2120,8 +2109,7 @@ struct xt_table *ip6t_register_table(struct net *net,
{
int ret;
struct xt_table_info *newinfo;
- struct xt_table_info bootstrap
- = { 0, 0, 0, { 0 }, { 0 }, { } };
+ struct xt_table_info bootstrap = {0};
void *loc_cpu_entry;
struct xt_table *new_table;
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 8e23d8f..edde5c6 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -62,6 +62,9 @@ static const char *const xt_prefix[NFPROTO_NUMPROTO] = {
[NFPROTO_IPV6] = "ip6",
};
+/* Allow this many total (re)entries. */
+static const unsigned int xt_jumpstack_multiplier = 2;
+
/* Registration hooks for targets. */
int
xt_register_target(struct xt_target *target)
@@ -680,6 +683,26 @@ void xt_free_table_info(struct xt_table_info *info)
else
vfree(info->entries[cpu]);
}
+
+ if (info->jumpstack != NULL) {
+ if (sizeof(void *) * info->stacksize > PAGE_SIZE) {
+ for_each_possible_cpu(cpu)
+ vfree(info->jumpstack[cpu]);
+ } else {
+ for_each_possible_cpu(cpu)
+ kfree(info->jumpstack[cpu]);
+ }
+ }
+
+ if (sizeof(void **) * nr_cpu_ids > PAGE_SIZE)
+ vfree(info->jumpstack);
+ else
+ kfree(info->jumpstack);
+ if (sizeof(unsigned int) * nr_cpu_ids > PAGE_SIZE)
+ vfree(info->stackptr);
+ else
+ kfree(info->stackptr);
+
kfree(info);
}
EXPORT_SYMBOL(xt_free_table_info);
@@ -724,6 +747,49 @@ EXPORT_SYMBOL_GPL(xt_compat_unlock);
DEFINE_PER_CPU(struct xt_info_lock, xt_info_locks);
EXPORT_PER_CPU_SYMBOL_GPL(xt_info_locks);
+static int xt_jumpstack_alloc(struct xt_table_info *i)
+{
+ unsigned int size;
+ int cpu;
+
+ size = sizeof(unsigned int) * nr_cpu_ids;
+ if (size > PAGE_SIZE)
+ i->stackptr = vmalloc(size);
+ else
+ i->stackptr = kmalloc(size, GFP_KERNEL);
+ if (i->stackptr == NULL)
+ return -ENOMEM;
+ memset(i->stackptr, 0, size);
+
+ size = sizeof(void **) * nr_cpu_ids;
+ if (size > PAGE_SIZE)
+ i->jumpstack = vmalloc(size);
+ else
+ i->jumpstack = kmalloc(size, GFP_KERNEL);
+ if (i->jumpstack == NULL)
+ return -ENOMEM;
+ memset(i->jumpstack, 0, size);
+
+ i->stacksize *= xt_jumpstack_multiplier;
+ size = sizeof(void *) * i->stacksize;
+ for_each_possible_cpu(cpu) {
+ if (size > PAGE_SIZE)
+ i->jumpstack[cpu] = vmalloc_node(size,
+ cpu_to_node(cpu));
+ else
+ i->jumpstack[cpu] = kmalloc_node(size,
+ GFP_KERNEL, cpu_to_node(cpu));
+ if (i->jumpstack[cpu] == NULL)
+ /*
+ * Freeing will be done later on by the callers. The
+ * chain is: xt_replace_table -> __do_replace ->
+ * do_replace -> xt_free_table_info.
+ */
+ return -ENOMEM;
+ }
+
+ return 0;
+}
struct xt_table_info *
xt_replace_table(struct xt_table *table,
@@ -732,6 +798,7 @@ xt_replace_table(struct xt_table *table,
int *error)
{
struct xt_table_info *private;
+ int ret;
/* Do the substitution. */
local_bh_disable();
@@ -746,6 +813,12 @@ xt_replace_table(struct xt_table *table,
return NULL;
}
+ ret = xt_jumpstack_alloc(newinfo);
+ if (ret < 0) {
+ *error = ret;
+ return NULL;
+ }
+
table->private = newinfo;
newinfo->initial_entries = private->initial_entries;
@@ -770,6 +843,10 @@ struct xt_table *xt_register_table(struct net *net,
struct xt_table_info *private;
struct xt_table *t, *table;
+ ret = xt_jumpstack_alloc(newinfo);
+ if (ret < 0)
+ return ERR_PTR(ret);
+
/* Don't add one object to multiple lists. */
table = kmemdup(input_table, sizeof(struct xt_table), GFP_KERNEL);
if (!table) {
--
1.7.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/4] netfilter: xt_TEE: have cloned packet travel through Xtables too
2010-04-13 23:21 nf-next: TEE only Jan Engelhardt
2010-04-13 23:21 ` [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE Jan Engelhardt
2010-04-13 23:21 ` [PATCH 2/4] netfilter: xtables2: make ip_tables reentrant Jan Engelhardt
@ 2010-04-13 23:21 ` Jan Engelhardt
2010-04-13 23:21 ` [PATCH 4/4] netfilter: xtables: remove old comments about reentrancy Jan Engelhardt
2010-04-14 10:57 ` nf-next: TEE only Patrick McHardy
4 siblings, 0 replies; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-13 23:21 UTC (permalink / raw)
To: kaber; +Cc: netfilter-devel
Since Xtables is now reentrant/nestable, the cloned packet can also go
through Xtables and be subject to rules itself.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
net/ipv4/ip_output.c | 1 -
net/ipv6/ip6_output.c | 1 -
net/netfilter/xt_TEE.c | 38 +++++++++++++++++---------------------
3 files changed, 17 insertions(+), 23 deletions(-)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 0abfdde..f09135e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -309,7 +309,6 @@ int ip_output(struct sk_buff *skb)
ip_finish_output,
!(IPCB(skb)->flags & IPSKB_REROUTED));
}
-EXPORT_SYMBOL_GPL(ip_output);
int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
{
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d09be7f..c10a38a 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -176,7 +176,6 @@ int ip6_output(struct sk_buff *skb)
ip6_finish_output,
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
}
-EXPORT_SYMBOL_GPL(ip6_output);
/*
* xmit an sk_buff (used by TCP)
diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 96e3785..f193133 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -12,6 +12,7 @@
*/
#include <linux/ip.h>
#include <linux/module.h>
+#include <linux/percpu.h>
#include <linux/route.h>
#include <linux/skbuff.h>
#include <net/checksum.h>
@@ -32,6 +33,7 @@
#endif
static const union nf_inet_addr tee_zero_address;
+static DEFINE_PER_CPU(bool, tee_active);
static struct net *pick_net(struct sk_buff *skb)
{
@@ -75,13 +77,15 @@ tee_tg4(struct sk_buff *skb, const struct xt_target_param *par)
const struct xt_tee_tginfo *info = par->targinfo;
struct iphdr *iph;
+ if (percpu_read(tee_active))
+ return XT_CONTINUE;
/*
* Copy the skb, and route the copy. Will later return %XT_CONTINUE for
* the original skb, which should continue on its way as if nothing has
* happened. The copy should be independently delivered to the TEE
* --gateway.
*/
- skb = skb_copy(skb, GFP_ATOMIC);
+ skb = pskb_copy(skb, GFP_ATOMIC);
if (skb == NULL)
return XT_CONTINUE;
@@ -109,22 +113,11 @@ tee_tg4(struct sk_buff *skb, const struct xt_target_param *par)
--iph->ttl;
ip_send_check(iph);
- /*
- * Xtables is not reentrant currently, so a choice has to be made:
- * 1. return absolute verdict for the original and let the cloned
- * packet travel through the chains
- * 2. let the original continue travelling and not pass the clone
- * to Xtables.
- * #2 is chosen. Normally, we would use ip_local_out for the clone.
- * Because iph->check is already correct and we don't pass it to
- * Xtables anyway, a shortcut to dst_output [forwards to ip_output] can
- * be taken. %IPSKB_REROUTED needs to be set so that ip_output does not
- * invoke POSTROUTING on the cloned packet.
- */
- IPCB(skb)->flags |= IPSKB_REROUTED;
- if (tee_tg_route4(skb, info))
- ip_output(skb);
-
+ if (tee_tg_route4(skb, info)) {
+ percpu_write(tee_active, true);
+ ip_local_out(skb);
+ percpu_write(tee_active, false);
+ }
return XT_CONTINUE;
}
@@ -158,6 +151,8 @@ tee_tg6(struct sk_buff *skb, const struct xt_target_param *par)
{
const struct xt_tee_tginfo *info = par->targinfo;
+ if (percpu_read(tee_active))
+ return XT_CONTINUE;
if ((skb = skb_copy(skb, GFP_ATOMIC)) == NULL)
return XT_CONTINUE;
@@ -172,10 +167,11 @@ tee_tg6(struct sk_buff *skb, const struct xt_target_param *par)
struct ipv6hdr *iph = ipv6_hdr(skb);
--iph->hop_limit;
}
- IP6CB(skb)->flags |= IP6SKB_REROUTED;
- if (tee_tg_route6(skb, info))
- ip6_output(skb);
-
+ if (tee_tg_route6(skb, info)) {
+ percpu_write(tee_active, true);
+ ip6_local_out(skb);
+ percpu_write(tee_active, false);
+ }
return XT_CONTINUE;
}
#endif /* WITH_IPV6 */
--
1.7.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] netfilter: xtables: remove old comments about reentrancy
2010-04-13 23:21 nf-next: TEE only Jan Engelhardt
` (2 preceding siblings ...)
2010-04-13 23:21 ` [PATCH 3/4] netfilter: xt_TEE: have cloned packet travel through Xtables too Jan Engelhardt
@ 2010-04-13 23:21 ` Jan Engelhardt
2010-04-14 10:57 ` nf-next: TEE only Patrick McHardy
4 siblings, 0 replies; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-13 23:21 UTC (permalink / raw)
To: kaber; +Cc: netfilter-devel
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
net/ipv4/netfilter/ip_tables.c | 2 --
net/ipv4/netfilter/ipt_REJECT.c | 3 ---
net/ipv6/netfilter/ip6_tables.c | 2 --
net/ipv6/netfilter/ip6t_REJECT.c | 3 ---
4 files changed, 0 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 70900ec..bb5e0d9 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -434,8 +434,6 @@ ipt_do_table(struct sk_buff *skb,
continue;
}
- /* Targets which reenter must return
- abs. verdicts */
tgpar.target = t->u.kernel.target;
tgpar.targinfo = t->data;
diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c
index b026014..038fa0b 100644
--- a/net/ipv4/netfilter/ipt_REJECT.c
+++ b/net/ipv4/netfilter/ipt_REJECT.c
@@ -139,9 +139,6 @@ reject_tg(struct sk_buff *skb, const struct xt_target_param *par)
{
const struct ipt_reject_info *reject = par->targinfo;
- /* WARNING: This code causes reentry within iptables.
- This means that the iptables jump stack is now crap. We
- must return an absolute verdict. --RR */
switch (reject->with) {
case IPT_ICMP_NET_UNREACHABLE:
send_unreach(skb, ICMP_NET_UNREACH);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 2a2770b..7afa117 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -451,8 +451,6 @@ ip6t_do_table(struct sk_buff *skb,
continue;
}
- /* Targets which reenter must return
- abs. verdicts */
tgpar.target = t->u.kernel.target;
tgpar.targinfo = t->data;
diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c
index 55b9b2d..dad9762 100644
--- a/net/ipv6/netfilter/ip6t_REJECT.c
+++ b/net/ipv6/netfilter/ip6t_REJECT.c
@@ -179,9 +179,6 @@ reject_tg6(struct sk_buff *skb, const struct xt_target_param *par)
struct net *net = dev_net((par->in != NULL) ? par->in : par->out);
pr_debug("%s: medium point\n", __func__);
- /* WARNING: This code causes reentry within ip6tables.
- This means that the ip6tables jump stack is now crap. We
- must return an absolute verdict. --RR */
switch (reject->with) {
case IP6T_ICMP6_NO_ROUTE:
send_unreach(net, skb, ICMPV6_NOROUTE, par->hooknum);
--
1.7.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE
2010-04-13 23:21 ` [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE Jan Engelhardt
@ 2010-04-14 5:52 ` Eric Dumazet
2010-04-14 10:00 ` Jan Engelhardt
0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2010-04-14 5:52 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: kaber, netfilter-devel
Le mercredi 14 avril 2010 à 01:21 +0200, Jan Engelhardt a écrit :
> xt_TEE can be used to clone and reroute a packet. This can for
> example be used to copy traffic at a router for logging purposes
> to another dedicated machine.
>
> References: http://www.gossamer-threads.com/lists/iptables/devel/68781
> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
> ---
Lovely :)
> +
> +static const union nf_inet_addr tee_zero_address;
> +
> +static struct net *pick_net(struct sk_buff *skb)
> +{
#ifdef CONFIG_NET_NS
> + const struct net_device *dev;
> + const struct dst_entry *dst;
> +
> + if (skb->dev != NULL)
> + return dev_net(skb->dev);
> + dst = skb_dst(skb);
> + if (dst != NULL && dst->dev != NULL)
> + return dev_net(dst->dev);
#endif /* CONFIG_NET_NS */
> + return &init_net;
> +}
> +
> +
> +static unsigned int
> +tee_tg4(struct sk_buff *skb, const struct xt_target_param *par)
> +{
> + const struct xt_tee_tginfo *info = par->targinfo;
> + struct iphdr *iph;
> +
> + /*
> + * Copy the skb, and route the copy. Will later return %XT_CONTINUE for
> + * the original skb, which should continue on its way as if nothing has
> + * happened. The copy should be independently delivered to the TEE
> + * --gateway.
> + */
> + skb = skb_copy(skb, GFP_ATOMIC);
> + if (skb == NULL)
> + return XT_CONTINUE;
> +
> +#ifdef WITH_CONNTRACK
> + /* Avoid counting cloned packets towards the original connection. */
> + nf_conntrack_put(skb->nfct);
> + skb->nfct = &nf_conntrack_untracked.ct_general;
> + skb->nfctinfo = IP_CT_NEW;
> + nf_conntrack_get(skb->nfct);
This atomic increment on a global variable worries me... Would it be
possible to avoid it (and the associated decrement and test if null)
I would like to use this TEE facility but with xxx kpps for instance ;)
> +#endif
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE
2010-04-14 5:52 ` Eric Dumazet
@ 2010-04-14 10:00 ` Jan Engelhardt
2010-04-14 10:56 ` Patrick McHardy
0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-14 10:00 UTC (permalink / raw)
To: Eric Dumazet; +Cc: kaber, netfilter-devel
On Wednesday 2010-04-14 07:52, Eric Dumazet wrote:
>> +#ifdef WITH_CONNTRACK
>> + /* Avoid counting cloned packets towards the original connection. */
>> + nf_conntrack_put(skb->nfct);
>> + skb->nfct = &nf_conntrack_untracked.ct_general;
>> + skb->nfctinfo = IP_CT_NEW;
>> + nf_conntrack_get(skb->nfct);
>
>This atomic increment on a global variable worries me... Would it be
>possible to avoid it (and the associated decrement and test if null)
>
>I would like to use this TEE facility but with xxx kpps for instance ;)
Correctness before speed. You're free to send patches on top.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE
2010-04-14 10:00 ` Jan Engelhardt
@ 2010-04-14 10:56 ` Patrick McHardy
0 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2010-04-14 10:56 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Eric Dumazet, netfilter-devel
Jan Engelhardt wrote:
> On Wednesday 2010-04-14 07:52, Eric Dumazet wrote:
>>> +#ifdef WITH_CONNTRACK
>>> + /* Avoid counting cloned packets towards the original connection. */
>>> + nf_conntrack_put(skb->nfct);
>>> + skb->nfct = &nf_conntrack_untracked.ct_general;
>>> + skb->nfctinfo = IP_CT_NEW;
>>> + nf_conntrack_get(skb->nfct);
>> This atomic increment on a global variable worries me... Would it be
>> possible to avoid it (and the associated decrement and test if null)
>>
>> I would like to use this TEE facility but with xxx kpps for instance ;)
>
> Correctness before speed. You're free to send patches on top.
>
Pablo suggest to encode "untracked" in one of the nfctinfo bits a
while ago, which would avoid all atomic operations. Anyways, this
can be done later.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: nf-next: TEE only
2010-04-13 23:21 nf-next: TEE only Jan Engelhardt
` (3 preceding siblings ...)
2010-04-13 23:21 ` [PATCH 4/4] netfilter: xtables: remove old comments about reentrancy Jan Engelhardt
@ 2010-04-14 10:57 ` Patrick McHardy
2010-04-14 11:15 ` Jan Engelhardt
4 siblings, 1 reply; 14+ messages in thread
From: Patrick McHardy @ 2010-04-14 10:57 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: netfilter-devel
Jan Engelhardt wrote:
> in this round:
> - use IP6SKB_REROUTED in v6 code
> - pick_net function: use skb->dev or skb->dst->dev when available
> (or completely fall back to init_net in case there's something
> going on)
So what about oif routing which I asked for two times?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: nf-next: TEE only
2010-04-14 10:57 ` nf-next: TEE only Patrick McHardy
@ 2010-04-14 11:15 ` Jan Engelhardt
2010-04-14 11:24 ` Patrick McHardy
0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-14 11:15 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel
On Wednesday 2010-04-14 12:57, Patrick McHardy wrote:
>Jan Engelhardt wrote:
>> in this round:
>> - use IP6SKB_REROUTED in v6 code
>> - pick_net function: use skb->dev or skb->dst->dev when available
>> (or completely fall back to init_net in case there's something
>> going on)
>
>So what about oif routing which I asked for two times?
Guess it must have fallen off somewhere between the resends. We can
still add it as a patch on top.
>I guess you'd usually have a host for logging or IDS somewhere on a
>private network and TEE packets there. So specifying oif and gateway
>seems most useful to me.
The oif is already determined by the route to the gateway(logging
host). I'd also fear that people abuse TEE as a ROUTE replacement
when they see an --oif.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: nf-next: TEE only
2010-04-14 11:15 ` Jan Engelhardt
@ 2010-04-14 11:24 ` Patrick McHardy
2010-04-14 11:32 ` Jan Engelhardt
0 siblings, 1 reply; 14+ messages in thread
From: Patrick McHardy @ 2010-04-14 11:24 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: netfilter-devel
Jan Engelhardt wrote:
> On Wednesday 2010-04-14 12:57, Patrick McHardy wrote:
>
>> Jan Engelhardt wrote:
>>> in this round:
>>> - use IP6SKB_REROUTED in v6 code
>>> - pick_net function: use skb->dev or skb->dst->dev when available
>>> (or completely fall back to init_net in case there's something
>>> going on)
>> So what about oif routing which I asked for two times?
>
> Guess it must have fallen off somewhere between the resends. We can
> still add it as a patch on top.
Please add it before I apply it. Should be a fairly trivial change.
>> I guess you'd usually have a host for logging or IDS somewhere on a
>> private network and TEE packets there. So specifying oif and gateway
>> seems most useful to me.
>
> The oif is already determined by the route to the gateway(logging
> host). I'd also fear that people abuse TEE as a ROUTE replacement
> when they see an --oif.
That's something different. The oif forces use of a specific output
device, independant of the routing tables. F.i.:
# ip l l dummy0
4: dummy0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 96:7f:5e:d2:d6:c9 brd ff:ff:ff:ff:ff:ff
# ip a s dummy0
4: dummy0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
li# ip a s dummy0
4: dummy0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 96:7f:5e:d2:d6:c9 brd ff:ff:ff:ff:ff:ff
inet6 fe80::947f:5eff:fed2:d6c9/64 scope link
valid_lft forever preferred_lft forever
# ip r | grep dummy0
#
# ping 10.0.0.1 -I dummy0
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.0.100 > 10.0.0.1: ICMP echo request, id 25874, seq 1, length 64
IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.0.100 > 10.0.0.1: ICMP echo request, id 25874, seq 2, length 64
This is quite useful since your logging host doesn't have to be
reachable through normal routing.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: nf-next: TEE only
2010-04-14 11:24 ` Patrick McHardy
@ 2010-04-14 11:32 ` Jan Engelhardt
2010-04-15 11:41 ` Jan Engelhardt
0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-14 11:32 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel
On Wednesday 2010-04-14 13:24, Patrick McHardy wrote:
>>> So what about oif routing which I asked for two times?
>>
>> Guess it must have fallen off somewhere between the resends. We can
>> still add it as a patch on top.
>
>Please add it before I apply it. Should be a fairly trivial change.
>
>>> I guess you'd usually have a host for logging or IDS somewhere on a
>>> private network and TEE packets there. So specifying oif and gateway
>>> seems most useful to me.
>>
>> The oif is already determined by the route to the gateway(logging
>> host). I'd also fear that people abuse TEE as a ROUTE replacement
>> when they see an --oif.
>
>That's something different. The oif forces use of a specific output
>device, independant of the routing tables. F.i.:
You should be able to use a specific output device by use of a routing
table, and selecting that table with fwmark.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: nf-next: TEE only
2010-04-14 11:32 ` Jan Engelhardt
@ 2010-04-15 11:41 ` Jan Engelhardt
2010-04-15 11:56 ` Patrick McHardy
0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2010-04-15 11:41 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel
On Wednesday 2010-04-14 13:32, Jan Engelhardt wrote:
>On Wednesday 2010-04-14 13:24, Patrick McHardy wrote:
>>>> So what about oif routing which I asked for two times?
>>>
>>> Guess it must have fallen off somewhere between the resends. We can
>>> still add it as a patch on top.
>>
>>Please add it before I apply it. Should be a fairly trivial change.
>>
>>>> I guess you'd usually have a host for logging or IDS somewhere on a
>>>> private network and TEE packets there. So specifying oif and gateway
>>>> seems most useful to me.
>>>
>>> The oif is already determined by the route to the gateway(logging
>>> host). I'd also fear that people abuse TEE as a ROUTE replacement
>>> when they see an --oif.
>>
>>That's something different. The oif forces use of a specific output
>>device, independant of the routing tables. F.i.:
>
>You should be able to use a specific output device by use of a routing
>table, and selecting that table with fwmark.
>This is quite useful since your logging host doesn't have to be
>reachable through normal routing.
>4: dummy0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
> link/ether 96:7f:5e:d2:d6:c9 brd ff:ff:ff:ff:ff:ff
> inet6 fe80::947f:5eff:fed2:d6c9/64 scope link
> valid_lft forever preferred_lft forever
>
># ping 10.0.0.1 -I dummy0
>IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
> 192.168.0.100 > 10.0.0.1: ICMP echo request, id 25874, seq 1, length 64
>IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
> 192.168.0.100 > 10.0.0.1: ICMP echo request, id 25874, seq 2, length 64
I don't see why one would want the log server to be unreachable.
What _does_ look reasonable however is an encapsulation device for transporting
teed packets across multiple real hops:
14: gre1: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1476 qdisc noqueue state UNKNOWN
link/gre 5.6.7.8 peer 1.2.3.4
and this case can be solved with standard(/fwmarking) policy routing
('default via gre1').
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: nf-next: TEE only
2010-04-15 11:41 ` Jan Engelhardt
@ 2010-04-15 11:56 ` Patrick McHardy
0 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2010-04-15 11:56 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: netfilter-devel
Jan Engelhardt wrote:
> On Wednesday 2010-04-14 13:32, Jan Engelhardt wrote:
> I don't see why one would want the log server to be unreachable.
I don't want that, in fact I don't have use for TEE personally.
> What _does_ look reasonable however is an encapsulation device for transporting
> teed packets across multiple real hops:
>
> 14: gre1: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1476 qdisc noqueue state UNKNOWN
> link/gre 5.6.7.8 peer 1.2.3.4
>
> and this case can be solved with standard(/fwmarking) policy routing
> ('default via gre1').
Sure. And it can be solved using standard routing by specifying
the oif. I'm not interested in debatting this endlessly, this is
a standard routing key and there's no reason at all to not support
it.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2010-04-15 11:57 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-13 23:21 nf-next: TEE only Jan Engelhardt
2010-04-13 23:21 ` [PATCH 1/4] netfilter: xtables: inclusion of xt_TEE Jan Engelhardt
2010-04-14 5:52 ` Eric Dumazet
2010-04-14 10:00 ` Jan Engelhardt
2010-04-14 10:56 ` Patrick McHardy
2010-04-13 23:21 ` [PATCH 2/4] netfilter: xtables2: make ip_tables reentrant Jan Engelhardt
2010-04-13 23:21 ` [PATCH 3/4] netfilter: xt_TEE: have cloned packet travel through Xtables too Jan Engelhardt
2010-04-13 23:21 ` [PATCH 4/4] netfilter: xtables: remove old comments about reentrancy Jan Engelhardt
2010-04-14 10:57 ` nf-next: TEE only Patrick McHardy
2010-04-14 11:15 ` Jan Engelhardt
2010-04-14 11:24 ` Patrick McHardy
2010-04-14 11:32 ` Jan Engelhardt
2010-04-15 11:41 ` Jan Engelhardt
2010-04-15 11:56 ` Patrick McHardy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).