* netfilter 01/41: change generic l4 protocol number
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 02/41: remove unneeded goto Patrick McHardy
` (40 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit fe2a7ce4de07472ace0cdf460a41f462a4621687
Author: Christoph Paasch <christoph.paasch@student.uclouvain.be>
Date: Wed Feb 18 16:28:35 2009 +0100
netfilter: change generic l4 protocol number
0 is used by Hop-by-hop header and so this may cause confusion.
255 is stated as 'Reserved' by IANA.
Signed-off-by: Christoph Paasch <christoph.paasch@student.uclouvain.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_proto_generic.c b/net/netfilter/nf_conntrack_proto_generic.c
index 4be80d7..829374f 100644
--- a/net/netfilter/nf_conntrack_proto_generic.c
+++ b/net/netfilter/nf_conntrack_proto_generic.c
@@ -92,7 +92,7 @@ static struct ctl_table generic_compat_sysctl_table[] = {
struct nf_conntrack_l4proto nf_conntrack_l4proto_generic __read_mostly =
{
.l3proto = PF_UNSPEC,
- .l4proto = 0,
+ .l4proto = 255,
.name = "unknown",
.pkt_to_tuple = generic_pkt_to_tuple,
.invert_tuple = generic_invert_tuple,
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 02/41: remove unneeded goto
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
2009-03-24 14:03 ` netfilter 01/41: change generic l4 protocol number Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 03/41: x_tables: change elements in x_tables Patrick McHardy
` (39 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit fecea3a389c89de9afae2eda74fad894d5677229
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Wed Feb 18 16:29:08 2009 +0100
netfilter: remove unneeded goto
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index a90ac83..5bb3473 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -174,7 +174,6 @@ next_hook:
outdev, &elem, okfn, hook_thresh);
if (verdict == NF_ACCEPT || verdict == NF_STOP) {
ret = 1;
- goto unlock;
} else if (verdict == NF_DROP) {
kfree_skb(skb);
ret = -EPERM;
@@ -183,7 +182,6 @@ next_hook:
verdict >> NF_VERDICT_BITS))
goto next_hook;
}
-unlock:
rcu_read_unlock();
return ret;
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 03/41: x_tables: change elements in x_tables
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
2009-03-24 14:03 ` netfilter 01/41: change generic l4 protocol number Patrick McHardy
2009-03-24 14:03 ` netfilter 02/41: remove unneeded goto Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 04/41: x_tables: remove unneeded initializations Patrick McHardy
` (38 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 4a2f965ca5a4e2593744bf75425d85e0e8ff814a
Author: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed Feb 18 16:29:44 2009 +0100
netfilter: x_tables: change elements in x_tables
Change to proper type on private pointer rather than anonymous void.
Keep active elements on same cache line.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index c7ee874..9fac88f 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -349,9 +349,6 @@ struct xt_table
{
struct list_head list;
- /* A unique name... */
- const char name[XT_TABLE_MAXNAMELEN];
-
/* What hooks you will enter on */
unsigned int valid_hooks;
@@ -359,13 +356,15 @@ struct xt_table
rwlock_t lock;
/* Man behind the curtain... */
- //struct ip6t_table_info *private;
- void *private;
+ struct xt_table_info *private;
/* Set this to THIS_MODULE if you are a module, otherwise NULL */
struct module *me;
u_int8_t af; /* address/protocol family */
+
+ /* A unique name... */
+ const char name[XT_TABLE_MAXNAMELEN];
};
#include <linux/netfilter_ipv4.h>
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 04/41: x_tables: remove unneeded initializations
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (2 preceding siblings ...)
2009-03-24 14:03 ` netfilter 03/41: x_tables: change elements in x_tables Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 05/41: ebtables: " Patrick McHardy
` (37 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 9c8222b9e71b690c8388bb0ebe5c3e5a1469e884
Author: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed Feb 18 16:30:20 2009 +0100
netfilter: x_tables: remove unneeded initializations
Later patches change the locking on xt_table and the initialization of
the lock element is not needed since the lock is always initialized in
xt_table_register anyway.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/arptable_filter.c b/net/ipv4/netfilter/arptable_filter.c
index e091187..6ecfdae 100644
--- a/net/ipv4/netfilter/arptable_filter.c
+++ b/net/ipv4/netfilter/arptable_filter.c
@@ -48,8 +48,6 @@ static struct
static struct xt_table packet_filter = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_filter.lock),
- .private = NULL,
.me = THIS_MODULE,
.af = NFPROTO_ARP,
};
diff --git a/net/ipv4/netfilter/iptable_filter.c b/net/ipv4/netfilter/iptable_filter.c
index 52cb693..c30a969 100644
--- a/net/ipv4/netfilter/iptable_filter.c
+++ b/net/ipv4/netfilter/iptable_filter.c
@@ -56,7 +56,6 @@ static struct
static struct xt_table packet_filter = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_filter.lock),
.me = THIS_MODULE,
.af = AF_INET,
};
diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c
index 3929d20..4087614 100644
--- a/net/ipv4/netfilter/iptable_mangle.c
+++ b/net/ipv4/netfilter/iptable_mangle.c
@@ -67,7 +67,6 @@ static struct
static struct xt_table packet_mangler = {
.name = "mangle",
.valid_hooks = MANGLE_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_mangler.lock),
.me = THIS_MODULE,
.af = AF_INET,
};
diff --git a/net/ipv4/netfilter/iptable_raw.c b/net/ipv4/netfilter/iptable_raw.c
index 7f65d18..e5356da 100644
--- a/net/ipv4/netfilter/iptable_raw.c
+++ b/net/ipv4/netfilter/iptable_raw.c
@@ -39,7 +39,6 @@ static struct
static struct xt_table packet_raw = {
.name = "raw",
.valid_hooks = RAW_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_raw.lock),
.me = THIS_MODULE,
.af = AF_INET,
};
diff --git a/net/ipv4/netfilter/iptable_security.c b/net/ipv4/netfilter/iptable_security.c
index a52a35f..29ab630 100644
--- a/net/ipv4/netfilter/iptable_security.c
+++ b/net/ipv4/netfilter/iptable_security.c
@@ -60,7 +60,6 @@ static struct
static struct xt_table security_table = {
.name = "security",
.valid_hooks = SECURITY_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(security_table.lock),
.me = THIS_MODULE,
.af = AF_INET,
};
diff --git a/net/ipv4/netfilter/nf_nat_rule.c b/net/ipv4/netfilter/nf_nat_rule.c
index a7eb047..6348a79 100644
--- a/net/ipv4/netfilter/nf_nat_rule.c
+++ b/net/ipv4/netfilter/nf_nat_rule.c
@@ -61,7 +61,6 @@ static struct
static struct xt_table nat_table = {
.name = "nat",
.valid_hooks = NAT_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(nat_table.lock),
.me = THIS_MODULE,
.af = AF_INET,
};
diff --git a/net/ipv6/netfilter/ip6table_filter.c b/net/ipv6/netfilter/ip6table_filter.c
index 40d2e36..ef5a0a3 100644
--- a/net/ipv6/netfilter/ip6table_filter.c
+++ b/net/ipv6/netfilter/ip6table_filter.c
@@ -54,7 +54,6 @@ static struct
static struct xt_table packet_filter = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_filter.lock),
.me = THIS_MODULE,
.af = AF_INET6,
};
diff --git a/net/ipv6/netfilter/ip6table_mangle.c b/net/ipv6/netfilter/ip6table_mangle.c
index d0b31b2..ab0d398 100644
--- a/net/ipv6/netfilter/ip6table_mangle.c
+++ b/net/ipv6/netfilter/ip6table_mangle.c
@@ -60,7 +60,6 @@ static struct
static struct xt_table packet_mangler = {
.name = "mangle",
.valid_hooks = MANGLE_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_mangler.lock),
.me = THIS_MODULE,
.af = AF_INET6,
};
diff --git a/net/ipv6/netfilter/ip6table_raw.c b/net/ipv6/netfilter/ip6table_raw.c
index 109fab6..4b792b6 100644
--- a/net/ipv6/netfilter/ip6table_raw.c
+++ b/net/ipv6/netfilter/ip6table_raw.c
@@ -38,7 +38,6 @@ static struct
static struct xt_table packet_raw = {
.name = "raw",
.valid_hooks = RAW_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(packet_raw.lock),
.me = THIS_MODULE,
.af = AF_INET6,
};
diff --git a/net/ipv6/netfilter/ip6table_security.c b/net/ipv6/netfilter/ip6table_security.c
index 20bc52f..0ea37ff 100644
--- a/net/ipv6/netfilter/ip6table_security.c
+++ b/net/ipv6/netfilter/ip6table_security.c
@@ -59,7 +59,6 @@ static struct
static struct xt_table security_table = {
.name = "security",
.valid_hooks = SECURITY_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(security_table.lock),
.me = THIS_MODULE,
.af = AF_INET6,
};
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 05/41: ebtables: remove unneeded initializations
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (3 preceding siblings ...)
2009-03-24 14:03 ` netfilter 04/41: x_tables: remove unneeded initializations Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 06/41: log invalid new icmpv6 packet with nf_log_packet() Patrick McHardy
` (36 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 842bff366b536787b88c07cbf2416e2cb26cae67
Author: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed Feb 18 16:30:38 2009 +0100
netfilter: ebtables: remove unneeded initializations
The initialization of the lock element is not needed
since the lock is always initialized in ebt_register_table.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 8604dfc..c751111 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -46,7 +46,6 @@ static struct ebt_table broute_table =
.name = "broute",
.table = &initial_table,
.valid_hooks = 1 << NF_BR_BROUTING,
- .lock = __RW_LOCK_UNLOCKED(broute_table.lock),
.check = check,
.me = THIS_MODULE,
};
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index 2b2e804..a5eea72 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -55,7 +55,6 @@ static struct ebt_table frame_filter =
.name = "filter",
.table = &initial_table,
.valid_hooks = FILTER_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(frame_filter.lock),
.check = check,
.me = THIS_MODULE,
};
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 3fe1ae8..6024c55 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -55,7 +55,6 @@ static struct ebt_table frame_nat =
.name = "nat",
.table = &initial_table,
.valid_hooks = NAT_VALID_HOOKS,
- .lock = __RW_LOCK_UNLOCKED(frame_nat.lock),
.check = check,
.me = THIS_MODULE,
};
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 06/41: log invalid new icmpv6 packet with nf_log_packet()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (4 preceding siblings ...)
2009-03-24 14:03 ` netfilter 05/41: ebtables: " Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match() Patrick McHardy
` (35 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 55df4ac0c927c7f1f84e6d75532f0ca45d391e64
Author: Eric Leblond <eric@inl.fr>
Date: Wed Feb 18 16:30:56 2009 +0100
netfilter: log invalid new icmpv6 packet with nf_log_packet()
This patch adds a logging message for invalid new icmpv6 packet.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index c323643..165b256 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -126,6 +126,10 @@ static bool icmpv6_new(struct nf_conn *ct, const struct sk_buff *skb,
pr_debug("icmpv6: can't create new conn with type %u\n",
type + 128);
nf_ct_dump_tuple_ipv6(&ct->tuplehash[0].tuple);
+ if (LOG_INVALID(nf_ct_net(ct), IPPROTO_ICMPV6))
+ nf_log_packet(PF_INET6, 0, skb, NULL, NULL, NULL,
+ "nf_ct_icmpv6: invalid new with type %d ",
+ type + 128);
return false;
}
atomic_set(&ct->proto.icmp.count, 0);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (5 preceding siblings ...)
2009-03-24 14:03 ` netfilter 06/41: log invalid new icmpv6 packet with nf_log_packet() Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 20:29 ` David Miller
2009-03-24 14:03 ` netfilter 08/41: Combine ipt_TTL and ip6t_HL source Patrick McHardy
` (34 subsequent siblings)
41 siblings, 1 reply; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit ddc214c43a923e89741e04da2f10e3037a64e222
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed Feb 18 17:47:50 2009 +0100
netfilter: arp_tables: unfold two critical loops in arp_packet_match()
x86 and powerpc can perform long word accesses in an efficient maner.
We can use this to unroll two loops in arp_packet_match(), to
perform arithmetic on long words instead of bytes. This is a win
on x86_64 for example.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 7ea88b6..b5db463 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -73,6 +73,36 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
return (ret != 0);
}
+/*
+ * Unfortunatly, _b and _mask are not aligned to an int (or long int)
+ * Some arches dont care, unrolling the loop is a win on them.
+ */
+static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
+{
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+ const unsigned long *a = (const unsigned long *)_a;
+ const unsigned long *b = (const unsigned long *)_b;
+ const unsigned long *mask = (const unsigned long *)_mask;
+ unsigned long ret;
+
+ ret = (a[0] ^ b[0]) & mask[0];
+ if (IFNAMSIZ > sizeof(unsigned long))
+ ret |= (a[1] ^ b[1]) & mask[1];
+ if (IFNAMSIZ > 2 * sizeof(unsigned long))
+ ret |= (a[2] ^ b[2]) & mask[2];
+ if (IFNAMSIZ > 3 * sizeof(unsigned long))
+ ret |= (a[3] ^ b[3]) & mask[3];
+ BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+#else
+ unsigned long ret = 0;
+ int i;
+
+ for (i = 0; i < IFNAMSIZ; i++)
+ ret |= (_a[i] ^ _b[i]) & _mask[i];
+#endif
+ return ret;
+}
+
/* Returns whether packet matches rule or not. */
static inline int arp_packet_match(const struct arphdr *arphdr,
struct net_device *dev,
@@ -83,7 +113,7 @@ static inline int arp_packet_match(const struct arphdr *arphdr,
const char *arpptr = (char *)(arphdr + 1);
const char *src_devaddr, *tgt_devaddr;
__be32 src_ipaddr, tgt_ipaddr;
- int i, ret;
+ long ret;
#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg)))
@@ -156,10 +186,7 @@ static inline int arp_packet_match(const struct arphdr *arphdr,
}
/* Look for ifname matches. */
- for (i = 0, ret = 0; i < IFNAMSIZ; i++) {
- ret |= (indev[i] ^ arpinfo->iniface[i])
- & arpinfo->iniface_mask[i];
- }
+ ret = ifname_compare(indev, arpinfo->iniface, arpinfo->iniface_mask);
if (FWINV(ret != 0, ARPT_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -168,10 +195,7 @@ static inline int arp_packet_match(const struct arphdr *arphdr,
return 0;
}
- for (i = 0, ret = 0; i < IFNAMSIZ; i++) {
- ret |= (outdev[i] ^ arpinfo->outiface[i])
- & arpinfo->outiface_mask[i];
- }
+ ret = ifname_compare(outdev, arpinfo->outiface, arpinfo->outiface_mask);
if (FWINV(ret != 0, ARPT_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
@@ -221,7 +245,7 @@ unsigned int arpt_do_table(struct sk_buff *skb,
const struct net_device *out,
struct xt_table *table)
{
- static const char nulldevname[IFNAMSIZ];
+ static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long))));
unsigned int verdict = NF_DROP;
const struct arphdr *arp;
bool hotdrop = false;
^ permalink raw reply related [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 14:03 ` netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match() Patrick McHardy
@ 2009-03-24 20:29 ` David Miller
2009-03-24 21:06 ` Eric Dumazet
0 siblings, 1 reply; 58+ messages in thread
From: David Miller @ 2009-03-24 20:29 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel
From: Patrick McHardy <kaber@trash.net>
Date: Tue, 24 Mar 2009 15:03:16 +0100 (MET)
> +/*
> + * Unfortunatly, _b and _mask are not aligned to an int (or long int)
> + * Some arches dont care, unrolling the loop is a win on them.
> + */
> +static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
> +{
> +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + const unsigned long *a = (const unsigned long *)_a;
> + const unsigned long *b = (const unsigned long *)_b;
I think we can at least give some help for the platforms which
require alignment.
We can, for example, assume 16-bit alignment and thus loop
over u16's
^ permalink raw reply [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 20:29 ` David Miller
@ 2009-03-24 21:06 ` Eric Dumazet
2009-03-24 21:16 ` David Miller
2009-03-24 21:17 ` Jan Engelhardt
0 siblings, 2 replies; 58+ messages in thread
From: Eric Dumazet @ 2009-03-24 21:06 UTC (permalink / raw)
To: David Miller; +Cc: kaber, netdev, netfilter-devel
David Miller a écrit :
> From: Patrick McHardy <kaber@trash.net>
> Date: Tue, 24 Mar 2009 15:03:16 +0100 (MET)
>
>> +/*
>> + * Unfortunatly, _b and _mask are not aligned to an int (or long int)
>> + * Some arches dont care, unrolling the loop is a win on them.
>> + */
>> +static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
>> +{
>> +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>> + const unsigned long *a = (const unsigned long *)_a;
>> + const unsigned long *b = (const unsigned long *)_b;
>
> I think we can at least give some help for the platforms which
> require alignment.
>
> We can, for example, assume 16-bit alignment and thus loop
> over u16's
Right. How about this incremental patch ?
Thanks
[PATCH] arp_tables: ifname_compare() can assume 16bit alignment
Arches without efficient unaligned access can still perform a loop
assuming 16bit alignment in ifname_compare()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 64a7c6c..84b9c17 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -76,6 +76,7 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
/*
* Unfortunatly, _b and _mask are not aligned to an int (or long int)
* Some arches dont care, unrolling the loop is a win on them.
+ * For other arches, we only have a 16bit alignement.
*/
static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
{
@@ -95,10 +96,13 @@ static unsigned long ifname_compare(const char *_a, const char *_b, const char *
BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
#else
unsigned long ret = 0;
+ const u16 *a = (const u16 *)_a;
+ const u16 *b = (const u16 *)_b;
+ const u16 *mask = (const u16 *)_mask;
int i;
- for (i = 0; i < IFNAMSIZ; i++)
- ret |= (_a[i] ^ _b[i]) & _mask[i];
+ for (i = 0; i < IFNAMSIZ/sizeof(u16); i++)
+ ret |= (a[i] ^ b[i]) & mask[i];
#endif
return ret;
}
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:06 ` Eric Dumazet
@ 2009-03-24 21:16 ` David Miller
2009-03-24 21:17 ` Jan Engelhardt
1 sibling, 0 replies; 58+ messages in thread
From: David Miller @ 2009-03-24 21:16 UTC (permalink / raw)
To: dada1; +Cc: kaber, netdev, netfilter-devel
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 24 Mar 2009 22:06:50 +0100
> [PATCH] arp_tables: ifname_compare() can assume 16bit alignment
>
> Arches without efficient unaligned access can still perform a loop
> assuming 16bit alignment in ifname_compare()
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Looks great, applied, thanks Eric!
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:06 ` Eric Dumazet
2009-03-24 21:16 ` David Miller
@ 2009-03-24 21:17 ` Jan Engelhardt
2009-03-24 21:18 ` David Miller
1 sibling, 1 reply; 58+ messages in thread
From: Jan Engelhardt @ 2009-03-24 21:17 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, kaber, netdev, netfilter-devel
On Tuesday 2009-03-24 22:06, Eric Dumazet wrote:
>>> +/*
>>> + * Unfortunatly, _b and _mask are not aligned to an int (or long int)
>>> + * Some arches dont care, unrolling the loop is a win on them.
>>> + */
>>> +static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
>>> +{
>>> +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>>> + const unsigned long *a = (const unsigned long *)_a;
>>> + const unsigned long *b = (const unsigned long *)_b;
>>
>> I think we can at least give some help for the platforms which
>> require alignment.
>>
>> We can, for example, assume 16-bit alignment and thus loop
>> over u16's
>
>Right. How about this incremental patch ?
>
>Thanks
>
>[PATCH] arp_tables: ifname_compare() can assume 16bit alignment
>
>Arches without efficient unaligned access can still perform a loop
>assuming 16bit alignment in ifname_compare()
Allow me some skepticism, but the code looks pretty much like a
standard memcmp.
> unsigned long ret = 0;
>+ const u16 *a = (const u16 *)_a;
>+ const u16 *b = (const u16 *)_b;
>+ const u16 *mask = (const u16 *)_mask;
> int i;
>
>- for (i = 0; i < IFNAMSIZ; i++)
>- ret |= (_a[i] ^ _b[i]) & _mask[i];
>+ for (i = 0; i < IFNAMSIZ/sizeof(u16); i++)
>+ ret |= (a[i] ^ b[i]) & mask[i];
> #endif
> return ret;
> }
^ permalink raw reply [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:17 ` Jan Engelhardt
@ 2009-03-24 21:18 ` David Miller
2009-03-24 21:23 ` Jan Engelhardt
2009-03-25 10:33 ` netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match() Andi Kleen
0 siblings, 2 replies; 58+ messages in thread
From: David Miller @ 2009-03-24 21:18 UTC (permalink / raw)
To: jengelh; +Cc: dada1, kaber, netdev, netfilter-devel
From: Jan Engelhardt <jengelh@medozas.de>
Date: Tue, 24 Mar 2009 22:17:17 +0100 (CET)
>
> On Tuesday 2009-03-24 22:06, Eric Dumazet wrote:
> >>> +/*
> >>> + * Unfortunatly, _b and _mask are not aligned to an int (or long int)
> >>> + * Some arches dont care, unrolling the loop is a win on them.
> >>> + */
> >>> +static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
> >>> +{
> >>> +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> >>> + const unsigned long *a = (const unsigned long *)_a;
> >>> + const unsigned long *b = (const unsigned long *)_b;
> >>
> >> I think we can at least give some help for the platforms which
> >> require alignment.
> >>
> >> We can, for example, assume 16-bit alignment and thus loop
> >> over u16's
> >
> >Right. How about this incremental patch ?
> >
> >Thanks
> >
> >[PATCH] arp_tables: ifname_compare() can assume 16bit alignment
> >
> >Arches without efficient unaligned access can still perform a loop
> >assuming 16bit alignment in ifname_compare()
>
> Allow me some skepticism, but the code looks pretty much like a
> standard memcmp.
memcmp() can't make any assumptions about alignment.
Whereas we _know_ this thing is exactly 16-bit aligned.
All of the optimized memcmp() implementations look for
32-bit alignment and punt to byte at a time comparison
loops if things are not aligned enough.
^ permalink raw reply [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:18 ` David Miller
@ 2009-03-24 21:23 ` Jan Engelhardt
2009-03-24 21:25 ` David Miller
2009-03-24 21:39 ` Eric Dumazet
2009-03-25 10:33 ` netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match() Andi Kleen
1 sibling, 2 replies; 58+ messages in thread
From: Jan Engelhardt @ 2009-03-24 21:23 UTC (permalink / raw)
To: David Miller; +Cc: dada1, kaber, netdev, netfilter-devel
On Tuesday 2009-03-24 22:18, David Miller wrote:
>> >
>> >Arches without efficient unaligned access can still perform a loop
>> >assuming 16bit alignment in ifname_compare()
>>
>> Allow me some skepticism, but the code looks pretty much like a
>> standard memcmp.
>
>memcmp() can't make any assumptions about alignment.
>Whereas we _know_ this thing is exactly 16-bit aligned.
>
>All of the optimized memcmp() implementations look for
>32-bit alignment and punt to byte at a time comparison
>loops if things are not aligned enough.
Yes, I seem to remember glibc doing something like
if ((addr & 0x03) != 0) {
// process single bytes (increment addr as you go)
// until addr & 0x03 == 0.
}
/* optimized loop here. also increases addr */
if ((addr & 0x03) != 0)
// still bytes left after loop - process on a per-byte basis
Is the cost of testing for non-4-divisibility expensive enough
to warrant not usnig memcmp?
Irrespective of all that, I think putting the interface comparison
code should be agglomerated in a function/header so that it is
replicated across iptables, ip6tables, ebtables, arptables, etc.
^ permalink raw reply [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:23 ` Jan Engelhardt
@ 2009-03-24 21:25 ` David Miller
2009-03-24 21:39 ` Eric Dumazet
1 sibling, 0 replies; 58+ messages in thread
From: David Miller @ 2009-03-24 21:25 UTC (permalink / raw)
To: jengelh; +Cc: dada1, kaber, netdev, netfilter-devel
From: Jan Engelhardt <jengelh@medozas.de>
Date: Tue, 24 Mar 2009 22:23:23 +0100 (CET)
> Is the cost of testing for non-4-divisibility expensive enough
> to warrant not usnig memcmp?
I think so. This is the fast path of these things.
> Irrespective of all that, I think putting the interface comparison
> code should be agglomerated in a function/header so that it is
> replicated across iptables, ip6tables, ebtables, arptables, etc.
Agreed.
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:23 ` Jan Engelhardt
2009-03-24 21:25 ` David Miller
@ 2009-03-24 21:39 ` Eric Dumazet
2009-03-24 21:52 ` Jan Engelhardt
1 sibling, 1 reply; 58+ messages in thread
From: Eric Dumazet @ 2009-03-24 21:39 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: David Miller, kaber, netdev, netfilter-devel
Jan Engelhardt a écrit :
> On Tuesday 2009-03-24 22:18, David Miller wrote:
>>>> Arches without efficient unaligned access can still perform a loop
>>>> assuming 16bit alignment in ifname_compare()
>>> Allow me some skepticism, but the code looks pretty much like a
>>> standard memcmp.
>> memcmp() can't make any assumptions about alignment.
>> Whereas we _know_ this thing is exactly 16-bit aligned.
>>
>> All of the optimized memcmp() implementations look for
>> 32-bit alignment and punt to byte at a time comparison
>> loops if things are not aligned enough.
>
> Yes, I seem to remember glibc doing something like
>
> if ((addr & 0x03) != 0) {
> // process single bytes (increment addr as you go)
> // until addr & 0x03 == 0.
> }
>
> /* optimized loop here. also increases addr */
>
> if ((addr & 0x03) != 0)
> // still bytes left after loop - process on a per-byte basis
>
> Is the cost of testing for non-4-divisibility expensive enough
> to warrant not usnig memcmp?
>
> Irrespective of all that, I think putting the interface comparison
> code should be agglomerated in a function/header so that it is
> replicated across iptables, ip6tables, ebtables, arptables, etc.
memcmp() is fine, but how is it solving the masking problem we have ?
Also in the case of arp_tables, _a is long word aligned, while _b and _mask are not.
memcmp() in this case is slower, (and dont handle mask thing)
If you look various ifname_compare(), we have two different implementations.
So yes, a factorization is possible for three ip_tables.c, ip6_tables.c and xt_physdev.c
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 58+ messages in thread* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:39 ` Eric Dumazet
@ 2009-03-24 21:52 ` Jan Engelhardt
2009-03-25 11:27 ` [PATCH] netfilter: factorize ifname_compare() Eric Dumazet
0 siblings, 1 reply; 58+ messages in thread
From: Jan Engelhardt @ 2009-03-24 21:52 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, kaber, netdev, netfilter-devel
On Tuesday 2009-03-24 22:39, Eric Dumazet wrote:
>>> memcmp() can't make any assumptions about alignment.
>>> Whereas we _know_ this thing is exactly 16-bit aligned.
>>>
>>> All of the optimized memcmp() implementations look for
>>> 32-bit alignment and punt to byte at a time comparison
>>> loops if things are not aligned enough.
>>
>> Yes, I seem to remember glibc doing something like
>> [...]
>> Is the cost of testing for non-4-divisibility expensive enough
>> to warrant not usnig memcmp?
>>
>> Irrespective of all that, I think putting the interface comparison
>> code should be agglomerated in a function/header so that it is
>> replicated across iptables, ip6tables, ebtables, arptables, etc.
>
>memcmp() is fine, but how is it solving the masking problem we have ?
You are right; we would have to calculate the prefix length of the
mask first. (And I think we can assume that there will not be any 1s
in the mask after the first 0.) It would consume CPU indeed.
>Also in the case of arp_tables, _a is long word aligned, while _b and _mask
>are not.
>
>If you look various ifname_compare(), we have two different implementations.
>
>So yes, a factorization is possible for three ip_tables.c, ip6_tables.c and
> xt_physdev.c.
Very well.
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH] netfilter: factorize ifname_compare()
2009-03-24 21:52 ` Jan Engelhardt
@ 2009-03-25 11:27 ` Eric Dumazet
2009-03-25 16:32 ` Patrick McHardy
0 siblings, 1 reply; 58+ messages in thread
From: Eric Dumazet @ 2009-03-25 11:27 UTC (permalink / raw)
To: David Miller; +Cc: Jan Engelhardt, kaber, netdev, netfilter-devel
Jan Engelhardt a écrit :
> On Tuesday 2009-03-24 22:39, Eric Dumazet wrote:
>> memcmp() is fine, but how is it solving the masking problem we have ?
>
> You are right; we would have to calculate the prefix length of the
> mask first. (And I think we can assume that there will not be any 1s
> in the mask after the first 0.) It would consume CPU indeed.
>
>> Also in the case of arp_tables, _a is long word aligned, while _b and _mask
>> are not.
>>
>> If you look various ifname_compare(), we have two different implementations.
>>
>> So yes, a factorization is possible for three ip_tables.c, ip6_tables.c and
>> xt_physdev.c.
>
> Very well.
Here is a patch to factorize these functions, as suggested by Jan
Thank you
[PATCH] netfilter: factorize ifname_compare()
We use same not trivial helper function in four places. We can factorize it.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
include/linux/netfilter/x_tables.h | 23 +++++++++++++++++++++++
net/ipv4/netfilter/arp_tables.c | 14 +-------------
net/ipv4/netfilter/ip_tables.c | 23 ++---------------------
net/ipv6/netfilter/ip6_tables.c | 23 ++---------------------
net/netfilter/xt_physdev.c | 21 ++-------------------
5 files changed, 30 insertions(+), 74 deletions(-)
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index e8e08d0..72918b7 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -435,6 +435,29 @@ extern void xt_free_table_info(struct xt_table_info *info);
extern void xt_table_entry_swap_rcu(struct xt_table_info *old,
struct xt_table_info *new);
+/*
+ * This helper is performance critical and must be inlined
+ */
+static inline unsigned long ifname_compare_aligned(const char *_a,
+ const char *_b,
+ const char *_mask)
+{
+ const unsigned long *a = (const unsigned long *)_a;
+ const unsigned long *b = (const unsigned long *)_b;
+ const unsigned long *mask = (const unsigned long *)_mask;
+ unsigned long ret;
+
+ ret = (a[0] ^ b[0]) & mask[0];
+ if (IFNAMSIZ > sizeof(unsigned long))
+ ret |= (a[1] ^ b[1]) & mask[1];
+ if (IFNAMSIZ > 2 * sizeof(unsigned long))
+ ret |= (a[2] ^ b[2]) & mask[2];
+ if (IFNAMSIZ > 3 * sizeof(unsigned long))
+ ret |= (a[3] ^ b[3]) & mask[3];
+ BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ return ret;
+}
+
#ifdef CONFIG_COMPAT
#include <net/compat.h>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 84b9c17..ee1133c 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -81,19 +81,7 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
{
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ unsigned long ret = ifname_compare_aligned(_a, _b, _mask);
#else
unsigned long ret = 0;
const u16 *a = (const u16 *)_a;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index e5294ae..41c59e3 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -74,25 +74,6 @@ do { \
Hence the start of any table is given by get_table() below. */
-static unsigned long ifname_compare(const char *_a, const char *_b,
- const unsigned char *_mask)
-{
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
- return ret;
-}
-
/* Returns whether matches rule or not. */
/* Performance critical - called for every packet */
static inline bool
@@ -121,7 +102,7 @@ ip_packet_match(const struct iphdr *ip,
return false;
}
- ret = ifname_compare(indev, ipinfo->iniface, ipinfo->iniface_mask);
+ ret = ifname_compare_aligned(indev, ipinfo->iniface, ipinfo->iniface_mask);
if (FWINV(ret != 0, IPT_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -130,7 +111,7 @@ ip_packet_match(const struct iphdr *ip,
return false;
}
- ret = ifname_compare(outdev, ipinfo->outiface, ipinfo->outiface_mask);
+ ret = ifname_compare_aligned(outdev, ipinfo->outiface, ipinfo->outiface_mask);
if (FWINV(ret != 0, IPT_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 34af7bb..e59662b 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -89,25 +89,6 @@ ip6t_ext_hdr(u8 nexthdr)
(nexthdr == IPPROTO_DSTOPTS) );
}
-static unsigned long ifname_compare(const char *_a, const char *_b,
- const unsigned char *_mask)
-{
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
- return ret;
-}
-
/* Returns whether matches rule or not. */
/* Performance critical - called for every packet */
static inline bool
@@ -138,7 +119,7 @@ ip6_packet_match(const struct sk_buff *skb,
return false;
}
- ret = ifname_compare(indev, ip6info->iniface, ip6info->iniface_mask);
+ ret = ifname_compare_aligned(indev, ip6info->iniface, ip6info->iniface_mask);
if (FWINV(ret != 0, IP6T_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -147,7 +128,7 @@ ip6_packet_match(const struct sk_buff *skb,
return false;
}
- ret = ifname_compare(outdev, ip6info->outiface, ip6info->outiface_mask);
+ ret = ifname_compare_aligned(outdev, ip6info->outiface, ip6info->outiface_mask);
if (FWINV(ret != 0, IP6T_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c
index 44a234e..8d28ca5 100644
--- a/net/netfilter/xt_physdev.c
+++ b/net/netfilter/xt_physdev.c
@@ -20,23 +20,6 @@ MODULE_DESCRIPTION("Xtables: Bridge physical device match");
MODULE_ALIAS("ipt_physdev");
MODULE_ALIAS("ip6t_physdev");
-static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
-{
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
- return ret;
-}
static bool
physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
@@ -85,7 +68,7 @@ physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
if (!(info->bitmask & XT_PHYSDEV_OP_IN))
goto match_outdev;
indev = nf_bridge->physindev ? nf_bridge->physindev->name : nulldevname;
- ret = ifname_compare(indev, info->physindev, info->in_mask);
+ ret = ifname_compare_aligned(indev, info->physindev, info->in_mask);
if (!ret ^ !(info->invert & XT_PHYSDEV_OP_IN))
return false;
@@ -95,7 +78,7 @@ match_outdev:
return true;
outdev = nf_bridge->physoutdev ?
nf_bridge->physoutdev->name : nulldevname;
- ret = ifname_compare(outdev, info->physoutdev, info->out_mask);
+ ret = ifname_compare_aligned(outdev, info->physoutdev, info->out_mask);
return (!!ret ^ !(info->invert & XT_PHYSDEV_OP_OUT));
}
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match()
2009-03-24 21:18 ` David Miller
2009-03-24 21:23 ` Jan Engelhardt
@ 2009-03-25 10:33 ` Andi Kleen
1 sibling, 0 replies; 58+ messages in thread
From: Andi Kleen @ 2009-03-25 10:33 UTC (permalink / raw)
To: David Miller; +Cc: jengelh, dada1, kaber, netdev, netfilter-devel
David Miller <davem@davemloft.net> writes:
>
> memcmp() can't make any assumptions about alignment.
Newer gcc often[1] knows how to generate specialized aligned memcmp()
based on the type alignment.
However you need to make sure the input types are right.
So in theories some casts might be enough given a new enough
compiler.
[1] not always unfortunately, sometimes it loses this information.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 58+ messages in thread
* netfilter 08/41: Combine ipt_TTL and ip6t_HL source
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (6 preceding siblings ...)
2009-03-24 14:03 ` netfilter 07/41: arp_tables: unfold two critical loops in arp_packet_match() Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 09/41: Combine ipt_ttl and ip6t_hl source Patrick McHardy
` (33 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 563d36eb3fb22dd04da9aa6f12e1b9ba0ac115f3
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Wed Feb 18 18:38:40 2009 +0100
netfilter: Combine ipt_TTL and ip6t_HL source
Suggested by: James King <t.james.king@gmail.com>
Similarly to commit c9fd49680954714473d6cbd2546d6ff120f96840, merge
TTL and HL. Since HL does not depend on any IPv6-specific function,
no new module dependencies would arise.
With slight adjustments to the Kconfig help text.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 3816e1d..3ad9f43 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -322,21 +322,6 @@ config IP_NF_TARGET_ECN
To compile it as a module, choose M here. If unsure, say N.
-config IP_NF_TARGET_TTL
- tristate 'TTL target support'
- depends on IP_NF_MANGLE
- depends on NETFILTER_ADVANCED
- help
- This option adds a `TTL' target, which enables the user to modify
- the TTL value of the IP header.
-
- While it is safe to decrement/lower the TTL, this target also enables
- functionality to increment and set the TTL value of the IP header to
- arbitrary values. This is EXTREMELY DANGEROUS since you can easily
- create immortal packets that loop forever on the network.
-
- To compile it as a module, choose M here. If unsure, say N.
-
# raw + specific targets
config IP_NF_RAW
tristate 'raw table support (required for NOTRACK/TRACE)'
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 5f9b650..20b0c37 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -61,7 +61,6 @@ obj-$(CONFIG_IP_NF_TARGET_MASQUERADE) += ipt_MASQUERADE.o
obj-$(CONFIG_IP_NF_TARGET_NETMAP) += ipt_NETMAP.o
obj-$(CONFIG_IP_NF_TARGET_REDIRECT) += ipt_REDIRECT.o
obj-$(CONFIG_IP_NF_TARGET_REJECT) += ipt_REJECT.o
-obj-$(CONFIG_IP_NF_TARGET_TTL) += ipt_TTL.o
obj-$(CONFIG_IP_NF_TARGET_ULOG) += ipt_ULOG.o
# generic ARP tables
diff --git a/net/ipv4/netfilter/ipt_TTL.c b/net/ipv4/netfilter/ipt_TTL.c
deleted file mode 100644
index 6d76aae..0000000
--- a/net/ipv4/netfilter/ipt_TTL.c
+++ /dev/null
@@ -1,97 +0,0 @@
-/* TTL modification target for IP tables
- * (C) 2000,2005 by Harald Welte <laforge@netfilter.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- */
-
-#include <linux/module.h>
-#include <linux/skbuff.h>
-#include <linux/ip.h>
-#include <net/checksum.h>
-
-#include <linux/netfilter/x_tables.h>
-#include <linux/netfilter_ipv4/ipt_TTL.h>
-
-MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
-MODULE_DESCRIPTION("Xtables: IPv4 TTL field modification target");
-MODULE_LICENSE("GPL");
-
-static unsigned int
-ttl_tg(struct sk_buff *skb, const struct xt_target_param *par)
-{
- struct iphdr *iph;
- const struct ipt_TTL_info *info = par->targinfo;
- int new_ttl;
-
- if (!skb_make_writable(skb, skb->len))
- return NF_DROP;
-
- iph = ip_hdr(skb);
-
- switch (info->mode) {
- case IPT_TTL_SET:
- new_ttl = info->ttl;
- break;
- case IPT_TTL_INC:
- new_ttl = iph->ttl + info->ttl;
- if (new_ttl > 255)
- new_ttl = 255;
- break;
- case IPT_TTL_DEC:
- new_ttl = iph->ttl - info->ttl;
- if (new_ttl < 0)
- new_ttl = 0;
- break;
- default:
- new_ttl = iph->ttl;
- break;
- }
-
- if (new_ttl != iph->ttl) {
- csum_replace2(&iph->check, htons(iph->ttl << 8),
- htons(new_ttl << 8));
- iph->ttl = new_ttl;
- }
-
- return XT_CONTINUE;
-}
-
-static bool ttl_tg_check(const struct xt_tgchk_param *par)
-{
- const struct ipt_TTL_info *info = par->targinfo;
-
- if (info->mode > IPT_TTL_MAXMODE) {
- printk(KERN_WARNING "ipt_TTL: invalid or unknown Mode %u\n",
- info->mode);
- return false;
- }
- if (info->mode != IPT_TTL_SET && info->ttl == 0)
- return false;
- return true;
-}
-
-static struct xt_target ttl_tg_reg __read_mostly = {
- .name = "TTL",
- .family = NFPROTO_IPV4,
- .target = ttl_tg,
- .targetsize = sizeof(struct ipt_TTL_info),
- .table = "mangle",
- .checkentry = ttl_tg_check,
- .me = THIS_MODULE,
-};
-
-static int __init ttl_tg_init(void)
-{
- return xt_register_target(&ttl_tg_reg);
-}
-
-static void __exit ttl_tg_exit(void)
-{
- xt_unregister_target(&ttl_tg_reg);
-}
-
-module_init(ttl_tg_init);
-module_exit(ttl_tg_exit);
diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
index 53ea512..6a42a96 100644
--- a/net/ipv6/netfilter/Kconfig
+++ b/net/ipv6/netfilter/Kconfig
@@ -170,23 +170,6 @@ config IP6_NF_MANGLE
To compile it as a module, choose M here. If unsure, say N.
-config IP6_NF_TARGET_HL
- tristate 'HL (hoplimit) target support'
- depends on IP6_NF_MANGLE
- depends on NETFILTER_ADVANCED
- help
- This option adds a `HL' target, which enables the user to decrement
- the hoplimit value of the IPv6 header or set it to a given (lower)
- value.
-
- While it is safe to decrement the hoplimit value, this option also
- enables functionality to increment and set the hoplimit value of the
- IPv6 header to arbitrary values. This is EXTREMELY DANGEROUS since
- you can easily create immortal packets that loop forever on the
- network.
-
- To compile it as a module, choose M here. If unsure, say N.
-
config IP6_NF_RAW
tristate 'raw table support (required for TRACE)'
depends on NETFILTER_ADVANCED
diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile
index 3f17c94..61a4570 100644
--- a/net/ipv6/netfilter/Makefile
+++ b/net/ipv6/netfilter/Makefile
@@ -27,6 +27,5 @@ obj-$(CONFIG_IP6_NF_MATCH_OPTS) += ip6t_hbh.o
obj-$(CONFIG_IP6_NF_MATCH_RT) += ip6t_rt.o
# targets
-obj-$(CONFIG_IP6_NF_TARGET_HL) += ip6t_HL.o
obj-$(CONFIG_IP6_NF_TARGET_LOG) += ip6t_LOG.o
obj-$(CONFIG_IP6_NF_TARGET_REJECT) += ip6t_REJECT.o
diff --git a/net/ipv6/netfilter/ip6t_HL.c b/net/ipv6/netfilter/ip6t_HL.c
deleted file mode 100644
index 27b5adf..0000000
--- a/net/ipv6/netfilter/ip6t_HL.c
+++ /dev/null
@@ -1,95 +0,0 @@
-/*
- * Hop Limit modification target for ip6tables
- * Maciej Soltysiak <solt@dns.toxicfilms.tv>
- * Based on HW's TTL module
- *
- * This software is distributed under the terms of GNU GPL
- */
-
-#include <linux/module.h>
-#include <linux/skbuff.h>
-#include <linux/ip.h>
-#include <linux/ipv6.h>
-
-#include <linux/netfilter/x_tables.h>
-#include <linux/netfilter_ipv6/ip6t_HL.h>
-
-MODULE_AUTHOR("Maciej Soltysiak <solt@dns.toxicfilms.tv>");
-MODULE_DESCRIPTION("Xtables: IPv6 Hop Limit field modification target");
-MODULE_LICENSE("GPL");
-
-static unsigned int
-hl_tg6(struct sk_buff *skb, const struct xt_target_param *par)
-{
- struct ipv6hdr *ip6h;
- const struct ip6t_HL_info *info = par->targinfo;
- int new_hl;
-
- if (!skb_make_writable(skb, skb->len))
- return NF_DROP;
-
- ip6h = ipv6_hdr(skb);
-
- switch (info->mode) {
- case IP6T_HL_SET:
- new_hl = info->hop_limit;
- break;
- case IP6T_HL_INC:
- new_hl = ip6h->hop_limit + info->hop_limit;
- if (new_hl > 255)
- new_hl = 255;
- break;
- case IP6T_HL_DEC:
- new_hl = ip6h->hop_limit - info->hop_limit;
- if (new_hl < 0)
- new_hl = 0;
- break;
- default:
- new_hl = ip6h->hop_limit;
- break;
- }
-
- ip6h->hop_limit = new_hl;
-
- return XT_CONTINUE;
-}
-
-static bool hl_tg6_check(const struct xt_tgchk_param *par)
-{
- const struct ip6t_HL_info *info = par->targinfo;
-
- if (info->mode > IP6T_HL_MAXMODE) {
- printk(KERN_WARNING "ip6t_HL: invalid or unknown Mode %u\n",
- info->mode);
- return false;
- }
- if (info->mode != IP6T_HL_SET && info->hop_limit == 0) {
- printk(KERN_WARNING "ip6t_HL: increment/decrement doesn't "
- "make sense with value 0\n");
- return false;
- }
- return true;
-}
-
-static struct xt_target hl_tg6_reg __read_mostly = {
- .name = "HL",
- .family = NFPROTO_IPV6,
- .target = hl_tg6,
- .targetsize = sizeof(struct ip6t_HL_info),
- .table = "mangle",
- .checkentry = hl_tg6_check,
- .me = THIS_MODULE
-};
-
-static int __init hl_tg6_init(void)
-{
- return xt_register_target(&hl_tg6_reg);
-}
-
-static void __exit hl_tg6_exit(void)
-{
- xt_unregister_target(&hl_tg6_reg);
-}
-
-module_init(hl_tg6_init);
-module_exit(hl_tg6_exit);
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index c2bac9c..d99f29b 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -357,6 +357,21 @@ config NETFILTER_XT_TARGET_DSCP
To compile it as a module, choose M here. If unsure, say N.
+config NETFILTER_XT_TARGET_HL
+ tristate '"HL" hoplimit target support'
+ depends on IP_NF_MANGLE || IP6_NF_MANGLE
+ depends on NETFILTER_ADVANCED
+ ---help---
+ This option adds the "HL" (for IPv6) and "TTL" (for IPv4)
+ targets, which enable the user to change the
+ hoplimit/time-to-live value of the IP header.
+
+ While it is safe to decrement the hoplimit/TTL value, the
+ modules also allow to increment and set the hoplimit value of
+ the header to arbitrary values. This is EXTREMELY DANGEROUS
+ since you can easily create immortal packets that loop
+ forever on the network.
+
config NETFILTER_XT_TARGET_MARK
tristate '"MARK" target support'
default m if NETFILTER_ADVANCED=n
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index da3d909..6ebe048 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -45,6 +45,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_CLASSIFY) += xt_CLASSIFY.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CONNMARK) += xt_CONNMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_HL) += xt_HL.o
obj-$(CONFIG_NETFILTER_XT_TARGET_MARK) += xt_MARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_NFLOG) += xt_NFLOG.o
obj-$(CONFIG_NETFILTER_XT_TARGET_NFQUEUE) += xt_NFQUEUE.o
diff --git a/net/netfilter/xt_HL.c b/net/netfilter/xt_HL.c
new file mode 100644
index 0000000..10e789e
--- /dev/null
+++ b/net/netfilter/xt_HL.c
@@ -0,0 +1,171 @@
+/*
+ * TTL modification target for IP tables
+ * (C) 2000,2005 by Harald Welte <laforge@netfilter.org>
+ *
+ * Hop Limit modification target for ip6tables
+ * Maciej Soltysiak <solt@dns.toxicfilms.tv>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <net/checksum.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ipt_TTL.h>
+#include <linux/netfilter_ipv6/ip6t_HL.h>
+
+MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
+MODULE_AUTHOR("Maciej Soltysiak <solt@dns.toxicfilms.tv>");
+MODULE_DESCRIPTION("Xtables: Hoplimit/TTL Limit field modification target");
+MODULE_LICENSE("GPL");
+
+static unsigned int
+ttl_tg(struct sk_buff *skb, const struct xt_target_param *par)
+{
+ struct iphdr *iph;
+ const struct ipt_TTL_info *info = par->targinfo;
+ int new_ttl;
+
+ if (!skb_make_writable(skb, skb->len))
+ return NF_DROP;
+
+ iph = ip_hdr(skb);
+
+ switch (info->mode) {
+ case IPT_TTL_SET:
+ new_ttl = info->ttl;
+ break;
+ case IPT_TTL_INC:
+ new_ttl = iph->ttl + info->ttl;
+ if (new_ttl > 255)
+ new_ttl = 255;
+ break;
+ case IPT_TTL_DEC:
+ new_ttl = iph->ttl - info->ttl;
+ if (new_ttl < 0)
+ new_ttl = 0;
+ break;
+ default:
+ new_ttl = iph->ttl;
+ break;
+ }
+
+ if (new_ttl != iph->ttl) {
+ csum_replace2(&iph->check, htons(iph->ttl << 8),
+ htons(new_ttl << 8));
+ iph->ttl = new_ttl;
+ }
+
+ return XT_CONTINUE;
+}
+
+static unsigned int
+hl_tg6(struct sk_buff *skb, const struct xt_target_param *par)
+{
+ struct ipv6hdr *ip6h;
+ const struct ip6t_HL_info *info = par->targinfo;
+ int new_hl;
+
+ if (!skb_make_writable(skb, skb->len))
+ return NF_DROP;
+
+ ip6h = ipv6_hdr(skb);
+
+ switch (info->mode) {
+ case IP6T_HL_SET:
+ new_hl = info->hop_limit;
+ break;
+ case IP6T_HL_INC:
+ new_hl = ip6h->hop_limit + info->hop_limit;
+ if (new_hl > 255)
+ new_hl = 255;
+ break;
+ case IP6T_HL_DEC:
+ new_hl = ip6h->hop_limit - info->hop_limit;
+ if (new_hl < 0)
+ new_hl = 0;
+ break;
+ default:
+ new_hl = ip6h->hop_limit;
+ break;
+ }
+
+ ip6h->hop_limit = new_hl;
+
+ return XT_CONTINUE;
+}
+
+static bool ttl_tg_check(const struct xt_tgchk_param *par)
+{
+ const struct ipt_TTL_info *info = par->targinfo;
+
+ if (info->mode > IPT_TTL_MAXMODE) {
+ printk(KERN_WARNING "ipt_TTL: invalid or unknown Mode %u\n",
+ info->mode);
+ return false;
+ }
+ if (info->mode != IPT_TTL_SET && info->ttl == 0)
+ return false;
+ return true;
+}
+
+static bool hl_tg6_check(const struct xt_tgchk_param *par)
+{
+ const struct ip6t_HL_info *info = par->targinfo;
+
+ if (info->mode > IP6T_HL_MAXMODE) {
+ printk(KERN_WARNING "ip6t_HL: invalid or unknown Mode %u\n",
+ info->mode);
+ return false;
+ }
+ if (info->mode != IP6T_HL_SET && info->hop_limit == 0) {
+ printk(KERN_WARNING "ip6t_HL: increment/decrement doesn't "
+ "make sense with value 0\n");
+ return false;
+ }
+ return true;
+}
+
+static struct xt_target hl_tg_reg[] __read_mostly = {
+ {
+ .name = "TTL",
+ .revision = 0,
+ .family = NFPROTO_IPV4,
+ .target = ttl_tg,
+ .targetsize = sizeof(struct ipt_TTL_info),
+ .table = "mangle",
+ .checkentry = ttl_tg_check,
+ .me = THIS_MODULE,
+ },
+ {
+ .name = "HL",
+ .revision = 0,
+ .family = NFPROTO_IPV6,
+ .target = hl_tg6,
+ .targetsize = sizeof(struct ip6t_HL_info),
+ .table = "mangle",
+ .checkentry = hl_tg6_check,
+ .me = THIS_MODULE,
+ },
+};
+
+static int __init hl_tg_init(void)
+{
+ return xt_register_targets(hl_tg_reg, ARRAY_SIZE(hl_tg_reg));
+}
+
+static void __exit hl_tg_exit(void)
+{
+ xt_unregister_targets(hl_tg_reg, ARRAY_SIZE(hl_tg_reg));
+}
+
+module_init(hl_tg_init);
+module_exit(hl_tg_exit);
+MODULE_ALIAS("ipt_TTL");
+MODULE_ALIAS("ip6t_HL");
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 09/41: Combine ipt_ttl and ip6t_hl source
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (7 preceding siblings ...)
2009-03-24 14:03 ` netfilter 08/41: Combine ipt_TTL and ip6t_HL source Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 10/41: xt_physdev fixes Patrick McHardy
` (32 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit cfac5ef7b92a2d504563989ecd0beb563920444b
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Wed Feb 18 18:39:31 2009 +0100
netfilter: Combine ipt_ttl and ip6t_hl source
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 3ad9f43..40ad41f 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -92,15 +92,6 @@ config IP_NF_MATCH_ECN
To compile it as a module, choose M here. If unsure, say N.
-config IP_NF_MATCH_TTL
- tristate '"ttl" match support'
- depends on NETFILTER_ADVANCED
- help
- This adds CONFIG_IP_NF_MATCH_TTL option, which enabled the user
- to match packets by their TTL value.
-
- To compile it as a module, choose M here. If unsure, say N.
-
# `filter', generic and specific targets
config IP_NF_FILTER
tristate "Packet filtering"
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 20b0c37..4811159 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -51,7 +51,6 @@ obj-$(CONFIG_IP_NF_SECURITY) += iptable_security.o
obj-$(CONFIG_IP_NF_MATCH_ADDRTYPE) += ipt_addrtype.o
obj-$(CONFIG_IP_NF_MATCH_AH) += ipt_ah.o
obj-$(CONFIG_IP_NF_MATCH_ECN) += ipt_ecn.o
-obj-$(CONFIG_IP_NF_MATCH_TTL) += ipt_ttl.o
# targets
obj-$(CONFIG_IP_NF_TARGET_CLUSTERIP) += ipt_CLUSTERIP.o
diff --git a/net/ipv4/netfilter/ipt_ttl.c b/net/ipv4/netfilter/ipt_ttl.c
deleted file mode 100644
index 297f1cb..0000000
--- a/net/ipv4/netfilter/ipt_ttl.c
+++ /dev/null
@@ -1,63 +0,0 @@
-/* IP tables module for matching the value of the TTL
- *
- * (C) 2000,2001 by Harald Welte <laforge@netfilter.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/ip.h>
-#include <linux/module.h>
-#include <linux/skbuff.h>
-
-#include <linux/netfilter_ipv4/ipt_ttl.h>
-#include <linux/netfilter/x_tables.h>
-
-MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
-MODULE_DESCRIPTION("Xtables: IPv4 TTL field match");
-MODULE_LICENSE("GPL");
-
-static bool ttl_mt(const struct sk_buff *skb, const struct xt_match_param *par)
-{
- const struct ipt_ttl_info *info = par->matchinfo;
- const u8 ttl = ip_hdr(skb)->ttl;
-
- switch (info->mode) {
- case IPT_TTL_EQ:
- return ttl == info->ttl;
- case IPT_TTL_NE:
- return ttl != info->ttl;
- case IPT_TTL_LT:
- return ttl < info->ttl;
- case IPT_TTL_GT:
- return ttl > info->ttl;
- default:
- printk(KERN_WARNING "ipt_ttl: unknown mode %d\n",
- info->mode);
- return false;
- }
-
- return false;
-}
-
-static struct xt_match ttl_mt_reg __read_mostly = {
- .name = "ttl",
- .family = NFPROTO_IPV4,
- .match = ttl_mt,
- .matchsize = sizeof(struct ipt_ttl_info),
- .me = THIS_MODULE,
-};
-
-static int __init ttl_mt_init(void)
-{
- return xt_register_match(&ttl_mt_reg);
-}
-
-static void __exit ttl_mt_exit(void)
-{
- xt_unregister_match(&ttl_mt_reg);
-}
-
-module_init(ttl_mt_init);
-module_exit(ttl_mt_exit);
diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
index 6a42a96..4a8d7ec 100644
--- a/net/ipv6/netfilter/Kconfig
+++ b/net/ipv6/netfilter/Kconfig
@@ -94,15 +94,6 @@ config IP6_NF_MATCH_OPTS
To compile it as a module, choose M here. If unsure, say N.
-config IP6_NF_MATCH_HL
- tristate '"hl" match support'
- depends on NETFILTER_ADVANCED
- help
- HL matching allows you to match packets based on the hop
- limit of the packet.
-
- To compile it as a module, choose M here. If unsure, say N.
-
config IP6_NF_MATCH_IPV6HEADER
tristate '"ipv6header" IPv6 Extension Headers Match'
default m if NETFILTER_ADVANCED=n
diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile
index 61a4570..aafbba3 100644
--- a/net/ipv6/netfilter/Makefile
+++ b/net/ipv6/netfilter/Makefile
@@ -20,7 +20,6 @@ obj-$(CONFIG_NF_CONNTRACK_IPV6) += nf_conntrack_ipv6.o
obj-$(CONFIG_IP6_NF_MATCH_AH) += ip6t_ah.o
obj-$(CONFIG_IP6_NF_MATCH_EUI64) += ip6t_eui64.o
obj-$(CONFIG_IP6_NF_MATCH_FRAG) += ip6t_frag.o
-obj-$(CONFIG_IP6_NF_MATCH_HL) += ip6t_hl.o
obj-$(CONFIG_IP6_NF_MATCH_IPV6HEADER) += ip6t_ipv6header.o
obj-$(CONFIG_IP6_NF_MATCH_MH) += ip6t_mh.o
obj-$(CONFIG_IP6_NF_MATCH_OPTS) += ip6t_hbh.o
diff --git a/net/ipv6/netfilter/ip6t_hl.c b/net/ipv6/netfilter/ip6t_hl.c
deleted file mode 100644
index c964dca..0000000
--- a/net/ipv6/netfilter/ip6t_hl.c
+++ /dev/null
@@ -1,68 +0,0 @@
-/* Hop Limit matching module */
-
-/* (C) 2001-2002 Maciej Soltysiak <solt@dns.toxicfilms.tv>
- * Based on HW's ttl module
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/ipv6.h>
-#include <linux/module.h>
-#include <linux/skbuff.h>
-
-#include <linux/netfilter_ipv6/ip6t_hl.h>
-#include <linux/netfilter/x_tables.h>
-
-MODULE_AUTHOR("Maciej Soltysiak <solt@dns.toxicfilms.tv>");
-MODULE_DESCRIPTION("Xtables: IPv6 Hop Limit field match");
-MODULE_LICENSE("GPL");
-
-static bool hl_mt6(const struct sk_buff *skb, const struct xt_match_param *par)
-{
- const struct ip6t_hl_info *info = par->matchinfo;
- const struct ipv6hdr *ip6h = ipv6_hdr(skb);
-
- switch (info->mode) {
- case IP6T_HL_EQ:
- return ip6h->hop_limit == info->hop_limit;
- break;
- case IP6T_HL_NE:
- return ip6h->hop_limit != info->hop_limit;
- break;
- case IP6T_HL_LT:
- return ip6h->hop_limit < info->hop_limit;
- break;
- case IP6T_HL_GT:
- return ip6h->hop_limit > info->hop_limit;
- break;
- default:
- printk(KERN_WARNING "ip6t_hl: unknown mode %d\n",
- info->mode);
- return false;
- }
-
- return false;
-}
-
-static struct xt_match hl_mt6_reg __read_mostly = {
- .name = "hl",
- .family = NFPROTO_IPV6,
- .match = hl_mt6,
- .matchsize = sizeof(struct ip6t_hl_info),
- .me = THIS_MODULE,
-};
-
-static int __init hl_mt6_init(void)
-{
- return xt_register_match(&hl_mt6_reg);
-}
-
-static void __exit hl_mt6_exit(void)
-{
- xt_unregister_match(&hl_mt6_reg);
-}
-
-module_init(hl_mt6_init);
-module_exit(hl_mt6_exit);
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index d99f29b..0eb98b4 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -620,6 +620,14 @@ config NETFILTER_XT_MATCH_HELPER
To compile it as a module, choose M here. If unsure, say Y.
+config NETFILTER_XT_MATCH_HL
+ tristate '"hl" hoplimit/TTL match support'
+ depends on NETFILTER_ADVANCED
+ ---help---
+ HL matching allows you to match packets based on the hoplimit
+ in the IPv6 header, or the time-to-live field in the IPv4
+ header of the packet.
+
config NETFILTER_XT_MATCH_IPRANGE
tristate '"iprange" address range match support'
depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 6ebe048..da73ed2 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -68,6 +68,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_DSCP) += xt_dscp.o
obj-$(CONFIG_NETFILTER_XT_MATCH_ESP) += xt_esp.o
obj-$(CONFIG_NETFILTER_XT_MATCH_HASHLIMIT) += xt_hashlimit.o
obj-$(CONFIG_NETFILTER_XT_MATCH_HELPER) += xt_helper.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_HL) += xt_hl.o
obj-$(CONFIG_NETFILTER_XT_MATCH_IPRANGE) += xt_iprange.o
obj-$(CONFIG_NETFILTER_XT_MATCH_LENGTH) += xt_length.o
obj-$(CONFIG_NETFILTER_XT_MATCH_LIMIT) += xt_limit.o
diff --git a/net/netfilter/xt_hl.c b/net/netfilter/xt_hl.c
new file mode 100644
index 0000000..7726154
--- /dev/null
+++ b/net/netfilter/xt_hl.c
@@ -0,0 +1,108 @@
+/*
+ * IP tables module for matching the value of the TTL
+ * (C) 2000,2001 by Harald Welte <laforge@netfilter.org>
+ *
+ * Hop Limit matching module
+ * (C) 2001-2002 Maciej Soltysiak <solt@dns.toxicfilms.tv>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ipt_ttl.h>
+#include <linux/netfilter_ipv6/ip6t_hl.h>
+
+MODULE_AUTHOR("Maciej Soltysiak <solt@dns.toxicfilms.tv>");
+MODULE_DESCRIPTION("Xtables: Hoplimit/TTL field match");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("ipt_ttl");
+MODULE_ALIAS("ip6t_hl");
+
+static bool ttl_mt(const struct sk_buff *skb, const struct xt_match_param *par)
+{
+ const struct ipt_ttl_info *info = par->matchinfo;
+ const u8 ttl = ip_hdr(skb)->ttl;
+
+ switch (info->mode) {
+ case IPT_TTL_EQ:
+ return ttl == info->ttl;
+ case IPT_TTL_NE:
+ return ttl != info->ttl;
+ case IPT_TTL_LT:
+ return ttl < info->ttl;
+ case IPT_TTL_GT:
+ return ttl > info->ttl;
+ default:
+ printk(KERN_WARNING "ipt_ttl: unknown mode %d\n",
+ info->mode);
+ return false;
+ }
+
+ return false;
+}
+
+static bool hl_mt6(const struct sk_buff *skb, const struct xt_match_param *par)
+{
+ const struct ip6t_hl_info *info = par->matchinfo;
+ const struct ipv6hdr *ip6h = ipv6_hdr(skb);
+
+ switch (info->mode) {
+ case IP6T_HL_EQ:
+ return ip6h->hop_limit == info->hop_limit;
+ break;
+ case IP6T_HL_NE:
+ return ip6h->hop_limit != info->hop_limit;
+ break;
+ case IP6T_HL_LT:
+ return ip6h->hop_limit < info->hop_limit;
+ break;
+ case IP6T_HL_GT:
+ return ip6h->hop_limit > info->hop_limit;
+ break;
+ default:
+ printk(KERN_WARNING "ip6t_hl: unknown mode %d\n",
+ info->mode);
+ return false;
+ }
+
+ return false;
+}
+
+static struct xt_match hl_mt_reg[] __read_mostly = {
+ {
+ .name = "ttl",
+ .revision = 0,
+ .family = NFPROTO_IPV4,
+ .match = ttl_mt,
+ .matchsize = sizeof(struct ipt_ttl_info),
+ .me = THIS_MODULE,
+ },
+ {
+ .name = "hl",
+ .revision = 0,
+ .family = NFPROTO_IPV6,
+ .match = hl_mt6,
+ .matchsize = sizeof(struct ip6t_hl_info),
+ .me = THIS_MODULE,
+ },
+};
+
+static int __init hl_mt_init(void)
+{
+ return xt_register_matches(hl_mt_reg, ARRAY_SIZE(hl_mt_reg));
+}
+
+static void __exit hl_mt_exit(void)
+{
+ xt_unregister_matches(hl_mt_reg, ARRAY_SIZE(hl_mt_reg));
+}
+
+module_init(hl_mt_init);
+module_exit(hl_mt_exit);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 10/41: xt_physdev fixes
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (8 preceding siblings ...)
2009-03-24 14:03 ` netfilter 09/41: Combine ipt_ttl and ip6t_hl source Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 11/41: xtables: add backward-compat options Patrick McHardy
` (31 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 4f1c3b7e7ee4d841c8af3a074dc361d6a7a77803
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed Feb 18 19:11:39 2009 +0100
netfilter: xt_physdev fixes
1) physdev_mt() incorrectly assumes nulldevname[] is aligned on an int
2) It also uses word comparisons, while it could use long word ones.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c
index 1bcdfc1..4b13ef7 100644
--- a/net/netfilter/xt_physdev.c
+++ b/net/netfilter/xt_physdev.c
@@ -24,9 +24,9 @@ static bool
physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
{
int i;
- static const char nulldevname[IFNAMSIZ];
+ static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long))));
const struct xt_physdev_info *info = par->matchinfo;
- bool ret;
+ unsigned long ret;
const char *indev, *outdev;
const struct nf_bridge_info *nf_bridge;
@@ -68,10 +68,10 @@ physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
if (!(info->bitmask & XT_PHYSDEV_OP_IN))
goto match_outdev;
indev = nf_bridge->physindev ? nf_bridge->physindev->name : nulldevname;
- for (i = 0, ret = false; i < IFNAMSIZ/sizeof(unsigned int); i++) {
- ret |= (((const unsigned int *)indev)[i]
- ^ ((const unsigned int *)info->physindev)[i])
- & ((const unsigned int *)info->in_mask)[i];
+ for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
+ ret |= (((const unsigned long *)indev)[i]
+ ^ ((const unsigned long *)info->physindev)[i])
+ & ((const unsigned long *)info->in_mask)[i];
}
if (!ret ^ !(info->invert & XT_PHYSDEV_OP_IN))
@@ -82,13 +82,12 @@ match_outdev:
return true;
outdev = nf_bridge->physoutdev ?
nf_bridge->physoutdev->name : nulldevname;
- for (i = 0, ret = false; i < IFNAMSIZ/sizeof(unsigned int); i++) {
- ret |= (((const unsigned int *)outdev)[i]
- ^ ((const unsigned int *)info->physoutdev)[i])
- & ((const unsigned int *)info->out_mask)[i];
+ for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
+ ret |= (((const unsigned long *)outdev)[i]
+ ^ ((const unsigned long *)info->physoutdev)[i])
+ & ((const unsigned long *)info->out_mask)[i];
}
-
- return ret ^ !(info->invert & XT_PHYSDEV_OP_OUT);
+ return (!!ret ^ !(info->invert & XT_PHYSDEV_OP_OUT));
}
static bool physdev_mt_check(const struct xt_mtchk_param *par)
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 11/41: xtables: add backward-compat options
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (9 preceding siblings ...)
2009-03-24 14:03 ` netfilter 10/41: xt_physdev fixes Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 12/41: xt_physdev: unfold two loops in physdev_mt() Patrick McHardy
` (30 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 4323362e49bd10b8ff3fe5cf183fdd52662ff4a3
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Thu Feb 19 11:16:03 2009 +0100
netfilter: xtables: add backward-compat options
Concern has been expressed about the changing Kconfig options.
Provide the old options that forward-select.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 40ad41f..f8d6180 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -92,6 +92,15 @@ config IP_NF_MATCH_ECN
To compile it as a module, choose M here. If unsure, say N.
+config IP_NF_MATCH_TTL
+ tristate '"ttl" match support'
+ depends on NETFILTER_ADVANCED
+ select NETFILTER_XT_MATCH_HL
+ ---help---
+ This is a backwards-compat option for the user's convenience
+ (e.g. when running oldconfig). It selects
+ COFNIG_NETFILTER_XT_MATCH_HL.
+
# `filter', generic and specific targets
config IP_NF_FILTER
tristate "Packet filtering"
@@ -313,6 +322,15 @@ config IP_NF_TARGET_ECN
To compile it as a module, choose M here. If unsure, say N.
+config IP_NF_TARGET_TTL
+ tristate '"TTL" target support'
+ depends on NETFILTER_ADVANCED
+ select NETFILTER_XT_TARGET_HL
+ ---help---
+ This is a backwards-compat option for the user's convenience
+ (e.g. when running oldconfig). It selects
+ COFNIG_NETFILTER_XT_TARGET_HL.
+
# raw + specific targets
config IP_NF_RAW
tristate 'raw table support (required for NOTRACK/TRACE)'
diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
index 4a8d7ec..625353a 100644
--- a/net/ipv6/netfilter/Kconfig
+++ b/net/ipv6/netfilter/Kconfig
@@ -94,6 +94,15 @@ config IP6_NF_MATCH_OPTS
To compile it as a module, choose M here. If unsure, say N.
+config IP6_NF_MATCH_HL
+ tristate '"hl" hoplimit match support'
+ depends on NETFILTER_ADVANCED
+ select NETFILTER_XT_MATCH_HL
+ ---help---
+ This is a backwards-compat option for the user's convenience
+ (e.g. when running oldconfig). It selects
+ COFNIG_NETFILTER_XT_MATCH_HL.
+
config IP6_NF_MATCH_IPV6HEADER
tristate '"ipv6header" IPv6 Extension Headers Match'
default m if NETFILTER_ADVANCED=n
@@ -121,6 +130,15 @@ config IP6_NF_MATCH_RT
To compile it as a module, choose M here. If unsure, say N.
# The targets
+config IP6_NF_TARGET_HL
+ tristate '"HL" hoplimit target support'
+ depends on NETFILTER_ADVANCED
+ select NETFILTER_XT_TARGET_HL
+ ---help---
+ This is a backwards-compat option for the user's convenience
+ (e.g. when running oldconfig). It selects
+ COFNIG_NETFILTER_XT_TARGET_HL.
+
config IP6_NF_TARGET_LOG
tristate "LOG target support"
default m if NETFILTER_ADVANCED=n
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 12/41: xt_physdev: unfold two loops in physdev_mt()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (10 preceding siblings ...)
2009-03-24 14:03 ` netfilter 11/41: xtables: add backward-compat options Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 13/41: ip6_tables: unfold two loops in ip6_packet_match() Patrick McHardy
` (29 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit eacc17fb64f03b6c268aaf6cea320100d19d8af5
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Thu Feb 19 11:17:17 2009 +0100
netfilter: xt_physdev: unfold two loops in physdev_mt()
xt_physdev netfilter module can use an ifname_compare() helper
so that two loops are unfolded.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c
index 4b13ef7..44a234e 100644
--- a/net/netfilter/xt_physdev.c
+++ b/net/netfilter/xt_physdev.c
@@ -20,10 +20,27 @@ MODULE_DESCRIPTION("Xtables: Bridge physical device match");
MODULE_ALIAS("ipt_physdev");
MODULE_ALIAS("ip6t_physdev");
+static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
+{
+ const unsigned long *a = (const unsigned long *)_a;
+ const unsigned long *b = (const unsigned long *)_b;
+ const unsigned long *mask = (const unsigned long *)_mask;
+ unsigned long ret;
+
+ ret = (a[0] ^ b[0]) & mask[0];
+ if (IFNAMSIZ > sizeof(unsigned long))
+ ret |= (a[1] ^ b[1]) & mask[1];
+ if (IFNAMSIZ > 2 * sizeof(unsigned long))
+ ret |= (a[2] ^ b[2]) & mask[2];
+ if (IFNAMSIZ > 3 * sizeof(unsigned long))
+ ret |= (a[3] ^ b[3]) & mask[3];
+ BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ return ret;
+}
+
static bool
physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
{
- int i;
static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long))));
const struct xt_physdev_info *info = par->matchinfo;
unsigned long ret;
@@ -68,11 +85,7 @@ physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
if (!(info->bitmask & XT_PHYSDEV_OP_IN))
goto match_outdev;
indev = nf_bridge->physindev ? nf_bridge->physindev->name : nulldevname;
- for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
- ret |= (((const unsigned long *)indev)[i]
- ^ ((const unsigned long *)info->physindev)[i])
- & ((const unsigned long *)info->in_mask)[i];
- }
+ ret = ifname_compare(indev, info->physindev, info->in_mask);
if (!ret ^ !(info->invert & XT_PHYSDEV_OP_IN))
return false;
@@ -82,11 +95,8 @@ match_outdev:
return true;
outdev = nf_bridge->physoutdev ?
nf_bridge->physoutdev->name : nulldevname;
- for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
- ret |= (((const unsigned long *)outdev)[i]
- ^ ((const unsigned long *)info->physoutdev)[i])
- & ((const unsigned long *)info->out_mask)[i];
- }
+ ret = ifname_compare(outdev, info->physoutdev, info->out_mask);
+
return (!!ret ^ !(info->invert & XT_PHYSDEV_OP_OUT));
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 13/41: ip6_tables: unfold two loops in ip6_packet_match()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (11 preceding siblings ...)
2009-03-24 14:03 ` netfilter 12/41: xt_physdev: unfold two loops in physdev_mt() Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 14/41: iptables: lock free counters Patrick McHardy
` (28 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 323dbf96382f057d035afce0237f08e18571ac1d
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Thu Feb 19 11:18:23 2009 +0100
netfilter: ip6_tables: unfold two loops in ip6_packet_match()
ip6_tables netfilter module can use an ifname_compare() helper
so that two loops are unfolded.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index a33485d..d64594b 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -89,6 +89,25 @@ ip6t_ext_hdr(u8 nexthdr)
(nexthdr == IPPROTO_DSTOPTS) );
}
+static unsigned long ifname_compare(const char *_a, const char *_b,
+ const unsigned char *_mask)
+{
+ const unsigned long *a = (const unsigned long *)_a;
+ const unsigned long *b = (const unsigned long *)_b;
+ const unsigned long *mask = (const unsigned long *)_mask;
+ unsigned long ret;
+
+ ret = (a[0] ^ b[0]) & mask[0];
+ if (IFNAMSIZ > sizeof(unsigned long))
+ ret |= (a[1] ^ b[1]) & mask[1];
+ if (IFNAMSIZ > 2 * sizeof(unsigned long))
+ ret |= (a[2] ^ b[2]) & mask[2];
+ if (IFNAMSIZ > 3 * sizeof(unsigned long))
+ ret |= (a[3] ^ b[3]) & mask[3];
+ BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ return ret;
+}
+
/* Returns whether matches rule or not. */
/* Performance critical - called for every packet */
static inline bool
@@ -99,7 +118,6 @@ ip6_packet_match(const struct sk_buff *skb,
unsigned int *protoff,
int *fragoff, bool *hotdrop)
{
- size_t i;
unsigned long ret;
const struct ipv6hdr *ipv6 = ipv6_hdr(skb);
@@ -120,12 +138,7 @@ ip6_packet_match(const struct sk_buff *skb,
return false;
}
- /* Look for ifname matches; this should unroll nicely. */
- for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
- ret |= (((const unsigned long *)indev)[i]
- ^ ((const unsigned long *)ip6info->iniface)[i])
- & ((const unsigned long *)ip6info->iniface_mask)[i];
- }
+ ret = ifname_compare(indev, ip6info->iniface, ip6info->iniface_mask);
if (FWINV(ret != 0, IP6T_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -134,11 +147,7 @@ ip6_packet_match(const struct sk_buff *skb,
return false;
}
- for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
- ret |= (((const unsigned long *)outdev)[i]
- ^ ((const unsigned long *)ip6info->outiface)[i])
- & ((const unsigned long *)ip6info->outiface_mask)[i];
- }
+ ret = ifname_compare(outdev, ip6info->outiface, ip6info->outiface_mask);
if (FWINV(ret != 0, IP6T_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 14/41: iptables: lock free counters
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (12 preceding siblings ...)
2009-03-24 14:03 ` netfilter 13/41: ip6_tables: unfold two loops in ip6_packet_match() Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 15/41: nf_conntrack: table max size should hold at least table size Patrick McHardy
` (27 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 784544739a25c30637397ace5489eeb6e15d7d49
Author: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri Feb 20 10:35:32 2009 +0100
netfilter: iptables: lock free counters
The reader/writer lock in ip_tables is acquired in the critical path of
processing packets and is one of the reasons just loading iptables can cause
a 20% performance loss. The rwlock serves two functions:
1) it prevents changes to table state (xt_replace) while table is in use.
This is now handled by doing rcu on the xt_table. When table is
replaced, the new table(s) are put in and the old one table(s) are freed
after RCU period.
2) it provides synchronization when accesing the counter values.
This is now handled by swapping in new table_info entries for each cpu
then summing the old values, and putting the result back onto one
cpu. On a busy system it may cause sampling to occur at different
times on each cpu, but no packet/byte counts are lost in the process.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here)
BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago)
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 9fac88f..e8e08d0 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -353,7 +353,7 @@ struct xt_table
unsigned int valid_hooks;
/* Lock for the curtain */
- rwlock_t lock;
+ struct mutex lock;
/* Man behind the curtain... */
struct xt_table_info *private;
@@ -385,7 +385,7 @@ struct xt_table_info
/* ipt_entry tables: one per CPU */
/* Note : this field MUST be the last one, see XT_TABLE_INFO_SZ */
- char *entries[1];
+ void *entries[1];
};
#define XT_TABLE_INFO_SZ (offsetof(struct xt_table_info, entries) \
@@ -432,6 +432,8 @@ extern void xt_proto_fini(struct net *net, u_int8_t af);
extern struct xt_table_info *xt_alloc_table_info(unsigned int size);
extern void xt_free_table_info(struct xt_table_info *info);
+extern void xt_table_entry_swap_rcu(struct xt_table_info *old,
+ struct xt_table_info *new);
#ifdef CONFIG_COMPAT
#include <net/compat.h>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index b5db463..64a7c6c 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -261,9 +261,10 @@ unsigned int arpt_do_table(struct sk_buff *skb,
indev = in ? in->name : nulldevname;
outdev = out ? out->name : nulldevname;
- read_lock_bh(&table->lock);
- private = table->private;
- table_base = (void *)private->entries[smp_processor_id()];
+ rcu_read_lock();
+ private = rcu_dereference(table->private);
+ table_base = rcu_dereference(private->entries[smp_processor_id()]);
+
e = get_entry(table_base, private->hook_entry[hook]);
back = get_entry(table_base, private->underflow[hook]);
@@ -335,7 +336,8 @@ unsigned int arpt_do_table(struct sk_buff *skb,
e = (void *)e + e->next_offset;
}
} while (!hotdrop);
- read_unlock_bh(&table->lock);
+
+ rcu_read_unlock();
if (hotdrop)
return NF_DROP;
@@ -738,11 +740,65 @@ static void get_counters(const struct xt_table_info *t,
}
}
-static inline struct xt_counters *alloc_counters(struct xt_table *table)
+
+/* We're lazy, and add to the first CPU; overflow works its fey magic
+ * and everything is OK. */
+static int
+add_counter_to_entry(struct arpt_entry *e,
+ const struct xt_counters addme[],
+ unsigned int *i)
+{
+ ADD_COUNTER(e->counters, addme[*i].bcnt, addme[*i].pcnt);
+
+ (*i)++;
+ return 0;
+}
+
+/* Take values from counters and add them back onto the current cpu */
+static void put_counters(struct xt_table_info *t,
+ const struct xt_counters counters[])
+{
+ unsigned int i, cpu;
+
+ local_bh_disable();
+ cpu = smp_processor_id();
+ i = 0;
+ ARPT_ENTRY_ITERATE(t->entries[cpu],
+ t->size,
+ add_counter_to_entry,
+ counters,
+ &i);
+ local_bh_enable();
+}
+
+static inline int
+zero_entry_counter(struct arpt_entry *e, void *arg)
+{
+ e->counters.bcnt = 0;
+ e->counters.pcnt = 0;
+ return 0;
+}
+
+static void
+clone_counters(struct xt_table_info *newinfo, const struct xt_table_info *info)
+{
+ unsigned int cpu;
+ const void *loc_cpu_entry = info->entries[raw_smp_processor_id()];
+
+ memcpy(newinfo, info, offsetof(struct xt_table_info, entries));
+ for_each_possible_cpu(cpu) {
+ memcpy(newinfo->entries[cpu], loc_cpu_entry, info->size);
+ ARPT_ENTRY_ITERATE(newinfo->entries[cpu], newinfo->size,
+ zero_entry_counter, NULL);
+ }
+}
+
+static struct xt_counters *alloc_counters(struct xt_table *table)
{
unsigned int countersize;
struct xt_counters *counters;
- const struct xt_table_info *private = table->private;
+ struct xt_table_info *private = table->private;
+ struct xt_table_info *info;
/* We need atomic snapshot of counters: rest doesn't change
* (other than comefrom, which userspace doesn't care
@@ -752,14 +808,30 @@ static inline struct xt_counters *alloc_counters(struct xt_table *table)
counters = vmalloc_node(countersize, numa_node_id());
if (counters == NULL)
- return ERR_PTR(-ENOMEM);
+ goto nomem;
+
+ info = xt_alloc_table_info(private->size);
+ if (!info)
+ goto free_counters;
- /* First, sum counters... */
- write_lock_bh(&table->lock);
- get_counters(private, counters);
- write_unlock_bh(&table->lock);
+ clone_counters(info, private);
+
+ mutex_lock(&table->lock);
+ xt_table_entry_swap_rcu(private, info);
+ synchronize_net(); /* Wait until smoke has cleared */
+
+ get_counters(info, counters);
+ put_counters(private, counters);
+ mutex_unlock(&table->lock);
+
+ xt_free_table_info(info);
return counters;
+
+ free_counters:
+ vfree(counters);
+ nomem:
+ return ERR_PTR(-ENOMEM);
}
static int copy_entries_to_user(unsigned int total_size,
@@ -1099,20 +1171,6 @@ static int do_replace(struct net *net, void __user *user, unsigned int len)
return ret;
}
-/* We're lazy, and add to the first CPU; overflow works its fey magic
- * and everything is OK.
- */
-static inline int add_counter_to_entry(struct arpt_entry *e,
- const struct xt_counters addme[],
- unsigned int *i)
-{
-
- ADD_COUNTER(e->counters, addme[*i].bcnt, addme[*i].pcnt);
-
- (*i)++;
- return 0;
-}
-
static int do_add_counters(struct net *net, void __user *user, unsigned int len,
int compat)
{
@@ -1172,13 +1230,14 @@ static int do_add_counters(struct net *net, void __user *user, unsigned int len,
goto free;
}
- write_lock_bh(&t->lock);
+ mutex_lock(&t->lock);
private = t->private;
if (private->number != num_counters) {
ret = -EINVAL;
goto unlock_up_free;
}
+ preempt_disable();
i = 0;
/* Choose the copy that is on our node */
loc_cpu_entry = private->entries[smp_processor_id()];
@@ -1187,8 +1246,10 @@ static int do_add_counters(struct net *net, void __user *user, unsigned int len,
add_counter_to_entry,
paddc,
&i);
+ preempt_enable();
unlock_up_free:
- write_unlock_bh(&t->lock);
+ mutex_unlock(&t->lock);
+
xt_table_unlock(t);
module_put(t->me);
free:
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index ef8b6ca..08cde5b 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -347,10 +347,12 @@ ipt_do_table(struct sk_buff *skb,
mtpar.family = tgpar.family = NFPROTO_IPV4;
tgpar.hooknum = hook;
- read_lock_bh(&table->lock);
IP_NF_ASSERT(table->valid_hooks & (1 << hook));
- private = table->private;
- table_base = (void *)private->entries[smp_processor_id()];
+
+ rcu_read_lock();
+ private = rcu_dereference(table->private);
+ table_base = rcu_dereference(private->entries[smp_processor_id()]);
+
e = get_entry(table_base, private->hook_entry[hook]);
/* For return from builtin chain */
@@ -445,7 +447,7 @@ ipt_do_table(struct sk_buff *skb,
}
} while (!hotdrop);
- read_unlock_bh(&table->lock);
+ rcu_read_unlock();
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;
@@ -924,13 +926,68 @@ get_counters(const struct xt_table_info *t,
counters,
&i);
}
+
+}
+
+/* We're lazy, and add to the first CPU; overflow works its fey magic
+ * and everything is OK. */
+static int
+add_counter_to_entry(struct ipt_entry *e,
+ const struct xt_counters addme[],
+ unsigned int *i)
+{
+ ADD_COUNTER(e->counters, addme[*i].bcnt, addme[*i].pcnt);
+
+ (*i)++;
+ return 0;
+}
+
+/* Take values from counters and add them back onto the current cpu */
+static void put_counters(struct xt_table_info *t,
+ const struct xt_counters counters[])
+{
+ unsigned int i, cpu;
+
+ local_bh_disable();
+ cpu = smp_processor_id();
+ i = 0;
+ IPT_ENTRY_ITERATE(t->entries[cpu],
+ t->size,
+ add_counter_to_entry,
+ counters,
+ &i);
+ local_bh_enable();
+}
+
+
+static inline int
+zero_entry_counter(struct ipt_entry *e, void *arg)
+{
+ e->counters.bcnt = 0;
+ e->counters.pcnt = 0;
+ return 0;
+}
+
+static void
+clone_counters(struct xt_table_info *newinfo, const struct xt_table_info *info)
+{
+ unsigned int cpu;
+ const void *loc_cpu_entry = info->entries[raw_smp_processor_id()];
+
+ memcpy(newinfo, info, offsetof(struct xt_table_info, entries));
+ for_each_possible_cpu(cpu) {
+ memcpy(newinfo->entries[cpu], loc_cpu_entry, info->size);
+ IPT_ENTRY_ITERATE(newinfo->entries[cpu], newinfo->size,
+ zero_entry_counter, NULL);
+ }
}
static struct xt_counters * alloc_counters(struct xt_table *table)
{
unsigned int countersize;
struct xt_counters *counters;
- const struct xt_table_info *private = table->private;
+ struct xt_table_info *private = table->private;
+ struct xt_table_info *info;
/* We need atomic snapshot of counters: rest doesn't change
(other than comefrom, which userspace doesn't care
@@ -939,14 +996,30 @@ static struct xt_counters * alloc_counters(struct xt_table *table)
counters = vmalloc_node(countersize, numa_node_id());
if (counters == NULL)
- return ERR_PTR(-ENOMEM);
+ goto nomem;
- /* First, sum counters... */
- write_lock_bh(&table->lock);
- get_counters(private, counters);
- write_unlock_bh(&table->lock);
+ info = xt_alloc_table_info(private->size);
+ if (!info)
+ goto free_counters;
+
+ clone_counters(info, private);
+
+ mutex_lock(&table->lock);
+ xt_table_entry_swap_rcu(private, info);
+ synchronize_net(); /* Wait until smoke has cleared */
+
+ get_counters(info, counters);
+ put_counters(private, counters);
+ mutex_unlock(&table->lock);
+
+ xt_free_table_info(info);
return counters;
+
+ free_counters:
+ vfree(counters);
+ nomem:
+ return ERR_PTR(-ENOMEM);
}
static int
@@ -1312,27 +1385,6 @@ do_replace(struct net *net, void __user *user, unsigned int len)
return ret;
}
-/* We're lazy, and add to the first CPU; overflow works its fey magic
- * and everything is OK. */
-static int
-add_counter_to_entry(struct ipt_entry *e,
- const struct xt_counters addme[],
- unsigned int *i)
-{
-#if 0
- duprintf("add_counter: Entry %u %lu/%lu + %lu/%lu\n",
- *i,
- (long unsigned int)e->counters.pcnt,
- (long unsigned int)e->counters.bcnt,
- (long unsigned int)addme[*i].pcnt,
- (long unsigned int)addme[*i].bcnt);
-#endif
-
- ADD_COUNTER(e->counters, addme[*i].bcnt, addme[*i].pcnt);
-
- (*i)++;
- return 0;
-}
static int
do_add_counters(struct net *net, void __user *user, unsigned int len, int compat)
@@ -1393,13 +1445,14 @@ do_add_counters(struct net *net, void __user *user, unsigned int len, int compat
goto free;
}
- write_lock_bh(&t->lock);
+ mutex_lock(&t->lock);
private = t->private;
if (private->number != num_counters) {
ret = -EINVAL;
goto unlock_up_free;
}
+ preempt_disable();
i = 0;
/* Choose the copy that is on our node */
loc_cpu_entry = private->entries[raw_smp_processor_id()];
@@ -1408,8 +1461,9 @@ do_add_counters(struct net *net, void __user *user, unsigned int len, int compat
add_counter_to_entry,
paddc,
&i);
+ preempt_enable();
unlock_up_free:
- write_unlock_bh(&t->lock);
+ mutex_unlock(&t->lock);
xt_table_unlock(t);
module_put(t->me);
free:
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index d64594b..34af7bb 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -382,10 +382,12 @@ ip6t_do_table(struct sk_buff *skb,
mtpar.family = tgpar.family = NFPROTO_IPV6;
tgpar.hooknum = hook;
- read_lock_bh(&table->lock);
IP_NF_ASSERT(table->valid_hooks & (1 << hook));
- private = table->private;
- table_base = (void *)private->entries[smp_processor_id()];
+
+ rcu_read_lock();
+ private = rcu_dereference(table->private);
+ table_base = rcu_dereference(private->entries[smp_processor_id()]);
+
e = get_entry(table_base, private->hook_entry[hook]);
/* For return from builtin chain */
@@ -483,7 +485,7 @@ ip6t_do_table(struct sk_buff *skb,
#ifdef CONFIG_NETFILTER_DEBUG
((struct ip6t_entry *)table_base)->comefrom = NETFILTER_LINK_POISON;
#endif
- read_unlock_bh(&table->lock);
+ rcu_read_unlock();
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;
@@ -964,11 +966,64 @@ get_counters(const struct xt_table_info *t,
}
}
+/* We're lazy, and add to the first CPU; overflow works its fey magic
+ * and everything is OK. */
+static int
+add_counter_to_entry(struct ip6t_entry *e,
+ const struct xt_counters addme[],
+ unsigned int *i)
+{
+ ADD_COUNTER(e->counters, addme[*i].bcnt, addme[*i].pcnt);
+
+ (*i)++;
+ return 0;
+}
+
+/* Take values from counters and add them back onto the current cpu */
+static void put_counters(struct xt_table_info *t,
+ const struct xt_counters counters[])
+{
+ unsigned int i, cpu;
+
+ local_bh_disable();
+ cpu = smp_processor_id();
+ i = 0;
+ IP6T_ENTRY_ITERATE(t->entries[cpu],
+ t->size,
+ add_counter_to_entry,
+ counters,
+ &i);
+ local_bh_enable();
+}
+
+static inline int
+zero_entry_counter(struct ip6t_entry *e, void *arg)
+{
+ e->counters.bcnt = 0;
+ e->counters.pcnt = 0;
+ return 0;
+}
+
+static void
+clone_counters(struct xt_table_info *newinfo, const struct xt_table_info *info)
+{
+ unsigned int cpu;
+ const void *loc_cpu_entry = info->entries[raw_smp_processor_id()];
+
+ memcpy(newinfo, info, offsetof(struct xt_table_info, entries));
+ for_each_possible_cpu(cpu) {
+ memcpy(newinfo->entries[cpu], loc_cpu_entry, info->size);
+ IP6T_ENTRY_ITERATE(newinfo->entries[cpu], newinfo->size,
+ zero_entry_counter, NULL);
+ }
+}
+
static struct xt_counters *alloc_counters(struct xt_table *table)
{
unsigned int countersize;
struct xt_counters *counters;
- const struct xt_table_info *private = table->private;
+ struct xt_table_info *private = table->private;
+ struct xt_table_info *info;
/* We need atomic snapshot of counters: rest doesn't change
(other than comefrom, which userspace doesn't care
@@ -977,14 +1032,28 @@ static struct xt_counters *alloc_counters(struct xt_table *table)
counters = vmalloc_node(countersize, numa_node_id());
if (counters == NULL)
- return ERR_PTR(-ENOMEM);
+ goto nomem;
+
+ info = xt_alloc_table_info(private->size);
+ if (!info)
+ goto free_counters;
+
+ clone_counters(info, private);
+
+ mutex_lock(&table->lock);
+ xt_table_entry_swap_rcu(private, info);
+ synchronize_net(); /* Wait until smoke has cleared */
+
+ get_counters(info, counters);
+ put_counters(private, counters);
+ mutex_unlock(&table->lock);
- /* First, sum counters... */
- write_lock_bh(&table->lock);
- get_counters(private, counters);
- write_unlock_bh(&table->lock);
+ xt_free_table_info(info);
- return counters;
+ free_counters:
+ vfree(counters);
+ nomem:
+ return ERR_PTR(-ENOMEM);
}
static int
@@ -1351,28 +1420,6 @@ do_replace(struct net *net, void __user *user, unsigned int len)
return ret;
}
-/* We're lazy, and add to the first CPU; overflow works its fey magic
- * and everything is OK. */
-static inline int
-add_counter_to_entry(struct ip6t_entry *e,
- const struct xt_counters addme[],
- unsigned int *i)
-{
-#if 0
- duprintf("add_counter: Entry %u %lu/%lu + %lu/%lu\n",
- *i,
- (long unsigned int)e->counters.pcnt,
- (long unsigned int)e->counters.bcnt,
- (long unsigned int)addme[*i].pcnt,
- (long unsigned int)addme[*i].bcnt);
-#endif
-
- ADD_COUNTER(e->counters, addme[*i].bcnt, addme[*i].pcnt);
-
- (*i)++;
- return 0;
-}
-
static int
do_add_counters(struct net *net, void __user *user, unsigned int len,
int compat)
@@ -1433,13 +1480,14 @@ do_add_counters(struct net *net, void __user *user, unsigned int len,
goto free;
}
- write_lock_bh(&t->lock);
+ mutex_lock(&t->lock);
private = t->private;
if (private->number != num_counters) {
ret = -EINVAL;
goto unlock_up_free;
}
+ preempt_disable();
i = 0;
/* Choose the copy that is on our node */
loc_cpu_entry = private->entries[raw_smp_processor_id()];
@@ -1448,8 +1496,9 @@ do_add_counters(struct net *net, void __user *user, unsigned int len,
add_counter_to_entry,
paddc,
&i);
+ preempt_enable();
unlock_up_free:
- write_unlock_bh(&t->lock);
+ mutex_unlock(&t->lock);
xt_table_unlock(t);
module_put(t->me);
free:
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index bfbf521..bfcac92 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -625,6 +625,20 @@ void xt_free_table_info(struct xt_table_info *info)
}
EXPORT_SYMBOL(xt_free_table_info);
+void xt_table_entry_swap_rcu(struct xt_table_info *oldinfo,
+ struct xt_table_info *newinfo)
+{
+ unsigned int cpu;
+
+ for_each_possible_cpu(cpu) {
+ void *p = oldinfo->entries[cpu];
+ rcu_assign_pointer(oldinfo->entries[cpu], newinfo->entries[cpu]);
+ newinfo->entries[cpu] = p;
+ }
+
+}
+EXPORT_SYMBOL_GPL(xt_table_entry_swap_rcu);
+
/* Find table by name, grabs mutex & ref. Returns ERR_PTR() on error. */
struct xt_table *xt_find_table_lock(struct net *net, u_int8_t af,
const char *name)
@@ -671,21 +685,22 @@ xt_replace_table(struct xt_table *table,
struct xt_table_info *oldinfo, *private;
/* Do the substitution. */
- write_lock_bh(&table->lock);
+ mutex_lock(&table->lock);
private = table->private;
/* Check inside lock: is the old number correct? */
if (num_counters != private->number) {
duprintf("num_counters != table->private->number (%u/%u)\n",
num_counters, private->number);
- write_unlock_bh(&table->lock);
+ mutex_unlock(&table->lock);
*error = -EAGAIN;
return NULL;
}
oldinfo = private;
- table->private = newinfo;
+ rcu_assign_pointer(table->private, newinfo);
newinfo->initial_entries = oldinfo->initial_entries;
- write_unlock_bh(&table->lock);
+ mutex_unlock(&table->lock);
+ synchronize_net();
return oldinfo;
}
EXPORT_SYMBOL_GPL(xt_replace_table);
@@ -719,7 +734,8 @@ struct xt_table *xt_register_table(struct net *net, struct xt_table *table,
/* Simplifies replace_table code. */
table->private = bootstrap;
- rwlock_init(&table->lock);
+ mutex_init(&table->lock);
+
if (!xt_replace_table(table, 0, newinfo, &ret))
goto unlock;
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 15/41: nf_conntrack: table max size should hold at least table size
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (13 preceding siblings ...)
2009-03-24 14:03 ` netfilter 14/41: iptables: lock free counters Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 16/41: fix hardcoded size assumptions Patrick McHardy
` (26 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit e478075c6f07a383c378fb400edc1a7407a941b0
Author: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Fri Feb 20 10:47:09 2009 +0100
netfilter: nf_conntrack: table max size should hold at least table size
Table size is defined as unsigned, wheres the table maximum size is
defined as a signed integer. The calculation of max is 8 or 4,
multiplied the table size. Therefore the max value is aligned to
unsigned.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 2e0c536..4dfb793 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -287,7 +287,7 @@ static inline int nf_ct_is_untracked(const struct sk_buff *skb)
extern int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp);
extern unsigned int nf_conntrack_htable_size;
-extern int nf_conntrack_max;
+extern unsigned int nf_conntrack_max;
#define NF_CT_STAT_INC(net, count) \
(per_cpu_ptr((net)->ct.stat, raw_smp_processor_id())->count++)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 90ce9dd..f3aa4e6 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -54,7 +54,7 @@ EXPORT_SYMBOL_GPL(nf_conntrack_lock);
unsigned int nf_conntrack_htable_size __read_mostly;
EXPORT_SYMBOL_GPL(nf_conntrack_htable_size);
-int nf_conntrack_max __read_mostly;
+unsigned int nf_conntrack_max __read_mostly;
EXPORT_SYMBOL_GPL(nf_conntrack_max);
struct nf_conn nf_conntrack_untracked __read_mostly;
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 16/41: fix hardcoded size assumptions
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (14 preceding siblings ...)
2009-03-24 14:03 ` netfilter 15/41: nf_conntrack: table max size should hold at least table size Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 17/41: x_tables: add LED trigger target Patrick McHardy
` (25 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit af07d241dc76f0a52c7ff04df3a3970020fe6157
Author: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Fri Feb 20 10:48:06 2009 +0100
netfilter: fix hardcoded size assumptions
get_random_bytes() is sometimes called with a hard coded size assumption
of an integer. This could not be true for next centuries. This patch
replace it with a compile time statement.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index f3aa4e6..2235432 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -472,7 +472,8 @@ struct nf_conn *nf_conntrack_alloc(struct net *net,
struct nf_conn *ct;
if (unlikely(!nf_conntrack_hash_rnd_initted)) {
- get_random_bytes(&nf_conntrack_hash_rnd, 4);
+ get_random_bytes(&nf_conntrack_hash_rnd,
+ sizeof(nf_conntrack_hash_rnd));
nf_conntrack_hash_rnd_initted = 1;
}
@@ -1103,7 +1104,7 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
/* We have to rehahs for the new table anyway, so we also can
* use a newrandom seed */
- get_random_bytes(&rnd, 4);
+ get_random_bytes(&rnd, sizeof(rnd));
/* Lookups in the old hash might happen in parallel, which means we
* might get false negatives during connection lookup. New connections
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 3a8a34a..357ba39 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -72,7 +72,8 @@ static unsigned int nf_ct_expect_dst_hash(const struct nf_conntrack_tuple *tuple
unsigned int hash;
if (unlikely(!nf_ct_expect_hash_rnd_initted)) {
- get_random_bytes(&nf_ct_expect_hash_rnd, 4);
+ get_random_bytes(&nf_ct_expect_hash_rnd,
+ sizeof(nf_ct_expect_hash_rnd));
nf_ct_expect_hash_rnd_initted = 1;
}
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index f97fded..2482055 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -149,7 +149,7 @@ dsthash_alloc_init(struct xt_hashlimit_htable *ht,
/* initialize hash with random val at the time we allocate
* the first hashtable entry */
if (!ht->rnd_initialized) {
- get_random_bytes(&ht->rnd, 4);
+ get_random_bytes(&ht->rnd, sizeof(ht->rnd));
ht->rnd_initialized = 1;
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 17/41: x_tables: add LED trigger target
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (15 preceding siblings ...)
2009-03-24 14:03 ` netfilter 16/41: fix hardcoded size assumptions Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 18/41: ip_tables: unfold two critical loops in ip_packet_match() Patrick McHardy
` (24 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 268cb38e1802db560c73167e643f14a3dcb4b07c
Author: Adam Nielsen <a.nielsen@shikadi.net>
Date: Fri Feb 20 10:55:14 2009 +0100
netfilter: x_tables: add LED trigger target
Kernel module providing implementation of LED netfilter target. Each
instance of the target appears as a led-trigger device, which can be
associated with one or more LEDs in /sys/class/leds/
Signed-off-by: Adam Nielsen <a.nielsen@shikadi.net>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig
index 7427136..556aeca 100644
--- a/drivers/leds/Kconfig
+++ b/drivers/leds/Kconfig
@@ -223,4 +223,7 @@ config LEDS_TRIGGER_DEFAULT_ON
This allows LEDs to be initialised in the ON state.
If unsure, say Y.
+comment "iptables trigger is under Netfilter config (LED target)"
+ depends on LEDS_TRIGGERS
+
endif # NEW_LEDS
diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index 5a8af87..deeaee5 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -7,6 +7,7 @@ header-y += xt_CLASSIFY.h
header-y += xt_CONNMARK.h
header-y += xt_CONNSECMARK.h
header-y += xt_DSCP.h
+header-y += xt_LED.h
header-y += xt_MARK.h
header-y += xt_NFLOG.h
header-y += xt_NFQUEUE.h
diff --git a/include/linux/netfilter/xt_LED.h b/include/linux/netfilter/xt_LED.h
new file mode 100644
index 0000000..4c91a0d
--- /dev/null
+++ b/include/linux/netfilter/xt_LED.h
@@ -0,0 +1,13 @@
+#ifndef _XT_LED_H
+#define _XT_LED_H
+
+struct xt_led_info {
+ char id[27]; /* Unique ID for this trigger in the LED class */
+ __u8 always_blink; /* Blink even if the LED is already on */
+ __u32 delay; /* Delay until LED is switched off after trigger */
+
+ /* Kernel data used in the module */
+ void *internal_data __attribute__((aligned(8)));
+};
+
+#endif /* _XT_LED_H */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 0eb98b4..cdbaaff 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -372,6 +372,30 @@ config NETFILTER_XT_TARGET_HL
since you can easily create immortal packets that loop
forever on the network.
+config NETFILTER_XT_TARGET_LED
+ tristate '"LED" target support'
+ depends on LEDS_CLASS
+ depends on NETFILTER_ADVANCED
+ help
+ This option adds a `LED' target, which allows you to blink LEDs in
+ response to particular packets passing through your machine.
+
+ This can be used to turn a spare LED into a network activity LED,
+ which only flashes in response to FTP transfers, for example. Or
+ you could have an LED which lights up for a minute or two every time
+ somebody connects to your machine via SSH.
+
+ You will need support for the "led" class to make this work.
+
+ To create an LED trigger for incoming SSH traffic:
+ iptables -A INPUT -p tcp --dport 22 -j LED --led-trigger-id ssh --led-delay 1000
+
+ Then attach the new trigger to an LED on your system:
+ echo netfilter-ssh > /sys/class/leds/<ledname>/trigger
+
+ For more information on the LEDs available on your system, see
+ Documentation/leds-class.txt
+
config NETFILTER_XT_TARGET_MARK
tristate '"MARK" target support'
default m if NETFILTER_ADVANCED=n
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index da73ed2..7a9b839 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_CONNMARK) += xt_CONNMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o
obj-$(CONFIG_NETFILTER_XT_TARGET_HL) += xt_HL.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_LED) += xt_LED.o
obj-$(CONFIG_NETFILTER_XT_TARGET_MARK) += xt_MARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_NFLOG) += xt_NFLOG.o
obj-$(CONFIG_NETFILTER_XT_TARGET_NFQUEUE) += xt_NFQUEUE.o
diff --git a/net/netfilter/xt_LED.c b/net/netfilter/xt_LED.c
new file mode 100644
index 0000000..8ff7843
--- /dev/null
+++ b/net/netfilter/xt_LED.c
@@ -0,0 +1,161 @@
+/*
+ * xt_LED.c - netfilter target to make LEDs blink upon packet matches
+ *
+ * Copyright (C) 2008 Adam Nielsen <a.nielsen@shikadi.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301 USA.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/leds.h>
+#include <linux/mutex.h>
+
+#include <linux/netfilter/xt_LED.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Nielsen <a.nielsen@shikadi.net>");
+MODULE_DESCRIPTION("Xtables: trigger LED devices on packet match");
+
+/*
+ * This is declared in here (the kernel module) only, to avoid having these
+ * dependencies in userspace code. This is what xt_led_info.internal_data
+ * points to.
+ */
+struct xt_led_info_internal {
+ struct led_trigger netfilter_led_trigger;
+ struct timer_list timer;
+};
+
+static unsigned int
+led_tg(struct sk_buff *skb, const struct xt_target_param *par)
+{
+ const struct xt_led_info *ledinfo = par->targinfo;
+ struct xt_led_info_internal *ledinternal = ledinfo->internal_data;
+
+ /*
+ * If "always blink" is enabled, and there's still some time until the
+ * LED will switch off, briefly switch it off now.
+ */
+ if ((ledinfo->delay > 0) && ledinfo->always_blink &&
+ timer_pending(&ledinternal->timer))
+ led_trigger_event(&ledinternal->netfilter_led_trigger,LED_OFF);
+
+ led_trigger_event(&ledinternal->netfilter_led_trigger, LED_FULL);
+
+ /* If there's a positive delay, start/update the timer */
+ if (ledinfo->delay > 0) {
+ mod_timer(&ledinternal->timer,
+ jiffies + msecs_to_jiffies(ledinfo->delay));
+
+ /* Otherwise if there was no delay given, blink as fast as possible */
+ } else if (ledinfo->delay == 0) {
+ led_trigger_event(&ledinternal->netfilter_led_trigger, LED_OFF);
+ }
+
+ /* else the delay is negative, which means switch on and stay on */
+
+ return XT_CONTINUE;
+}
+
+static void led_timeout_callback(unsigned long data)
+{
+ struct xt_led_info *ledinfo = (struct xt_led_info *)data;
+ struct xt_led_info_internal *ledinternal = ledinfo->internal_data;
+
+ led_trigger_event(&ledinternal->netfilter_led_trigger, LED_OFF);
+}
+
+static bool led_tg_check(const struct xt_tgchk_param *par)
+{
+ struct xt_led_info *ledinfo = par->targinfo;
+ struct xt_led_info_internal *ledinternal;
+ int err;
+
+ if (ledinfo->id[0] == '\0') {
+ printk(KERN_ERR KBUILD_MODNAME ": No 'id' parameter given.\n");
+ return false;
+ }
+
+ ledinternal = kzalloc(sizeof(struct xt_led_info_internal), GFP_KERNEL);
+ if (!ledinternal) {
+ printk(KERN_CRIT KBUILD_MODNAME ": out of memory\n");
+ return false;
+ }
+
+ ledinternal->netfilter_led_trigger.name = ledinfo->id;
+
+ err = led_trigger_register(&ledinternal->netfilter_led_trigger);
+ if (err) {
+ printk(KERN_CRIT KBUILD_MODNAME
+ ": led_trigger_register() failed\n");
+ if (err == -EEXIST)
+ printk(KERN_ERR KBUILD_MODNAME
+ ": Trigger name is already in use.\n");
+ goto exit_alloc;
+ }
+
+ /* See if we need to set up a timer */
+ if (ledinfo->delay > 0)
+ setup_timer(&ledinternal->timer, led_timeout_callback,
+ (unsigned long)ledinfo);
+
+ ledinfo->internal_data = ledinternal;
+
+ return true;
+
+exit_alloc:
+ kfree(ledinternal);
+
+ return false;
+}
+
+static void led_tg_destroy(const struct xt_tgdtor_param *par)
+{
+ const struct xt_led_info *ledinfo = par->targinfo;
+ struct xt_led_info_internal *ledinternal = ledinfo->internal_data;
+
+ if (ledinfo->delay > 0)
+ del_timer_sync(&ledinternal->timer);
+
+ led_trigger_unregister(&ledinternal->netfilter_led_trigger);
+ kfree(ledinternal);
+}
+
+static struct xt_target led_tg_reg __read_mostly = {
+ .name = "LED",
+ .revision = 0,
+ .family = NFPROTO_UNSPEC,
+ .target = led_tg,
+ .targetsize = XT_ALIGN(sizeof(struct xt_led_info)),
+ .checkentry = led_tg_check,
+ .destroy = led_tg_destroy,
+ .me = THIS_MODULE,
+};
+
+static int __init led_tg_init(void)
+{
+ return xt_register_target(&led_tg_reg);
+}
+
+static void __exit led_tg_exit(void)
+{
+ xt_unregister_target(&led_tg_reg);
+}
+
+module_init(led_tg_init);
+module_exit(led_tg_exit);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 18/41: ip_tables: unfold two critical loops in ip_packet_match()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (16 preceding siblings ...)
2009-03-24 14:03 ` netfilter 17/41: x_tables: add LED trigger target Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 19/41: nf_conntrack: account packets drop by tcp_packet() Patrick McHardy
` (23 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 08361aa807ae5e5007cd226ca9e34287512de737
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri Feb 20 11:03:33 2009 +0100
netfilter: ip_tables: unfold two critical loops in ip_packet_match()
While doing oprofile tests I noticed two loops are not properly unrolled by gcc
Using a hand coded unrolled loop provides nice speedup : ipt_do_table
credited of 2.52 % of cpu instead of 3.29 % in tbench.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 08cde5b..e5294ae 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -74,6 +74,25 @@ do { \
Hence the start of any table is given by get_table() below. */
+static unsigned long ifname_compare(const char *_a, const char *_b,
+ const unsigned char *_mask)
+{
+ const unsigned long *a = (const unsigned long *)_a;
+ const unsigned long *b = (const unsigned long *)_b;
+ const unsigned long *mask = (const unsigned long *)_mask;
+ unsigned long ret;
+
+ ret = (a[0] ^ b[0]) & mask[0];
+ if (IFNAMSIZ > sizeof(unsigned long))
+ ret |= (a[1] ^ b[1]) & mask[1];
+ if (IFNAMSIZ > 2 * sizeof(unsigned long))
+ ret |= (a[2] ^ b[2]) & mask[2];
+ if (IFNAMSIZ > 3 * sizeof(unsigned long))
+ ret |= (a[3] ^ b[3]) & mask[3];
+ BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ return ret;
+}
+
/* Returns whether matches rule or not. */
/* Performance critical - called for every packet */
static inline bool
@@ -83,7 +102,6 @@ ip_packet_match(const struct iphdr *ip,
const struct ipt_ip *ipinfo,
int isfrag)
{
- size_t i;
unsigned long ret;
#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg)))
@@ -103,12 +121,7 @@ ip_packet_match(const struct iphdr *ip,
return false;
}
- /* Look for ifname matches; this should unroll nicely. */
- for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
- ret |= (((const unsigned long *)indev)[i]
- ^ ((const unsigned long *)ipinfo->iniface)[i])
- & ((const unsigned long *)ipinfo->iniface_mask)[i];
- }
+ ret = ifname_compare(indev, ipinfo->iniface, ipinfo->iniface_mask);
if (FWINV(ret != 0, IPT_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -117,11 +130,7 @@ ip_packet_match(const struct iphdr *ip,
return false;
}
- for (i = 0, ret = 0; i < IFNAMSIZ/sizeof(unsigned long); i++) {
- ret |= (((const unsigned long *)outdev)[i]
- ^ ((const unsigned long *)ipinfo->outiface)[i])
- & ((const unsigned long *)ipinfo->outiface_mask)[i];
- }
+ ret = ifname_compare(outdev, ipinfo->outiface, ipinfo->outiface_mask);
if (FWINV(ret != 0, IPT_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 19/41: nf_conntrack: account packets drop by tcp_packet()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (17 preceding siblings ...)
2009-03-24 14:03 ` netfilter 18/41: ip_tables: unfold two critical loops in ip_packet_match() Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 20/41: install missing headers Patrick McHardy
` (22 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 7d1e04598e5e92527840b6889fb75b4b30fdd33b
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue Feb 24 14:48:01 2009 +0100
netfilter: nf_conntrack: account packets drop by tcp_packet()
Since tcp_packet() may return -NF_DROP in two situations, the
packet-drop stats must be increased.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 2235432..ebc2756 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -734,6 +734,8 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
nf_conntrack_put(skb->nfct);
skb->nfct = NULL;
NF_CT_STAT_INC_ATOMIC(net, invalid);
+ if (ret == -NF_DROP)
+ NF_CT_STAT_INC_ATOMIC(net, drop);
return -ret;
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 20/41: install missing headers
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (18 preceding siblings ...)
2009-03-24 14:03 ` netfilter 19/41: nf_conntrack: account packets drop by tcp_packet() Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 21/41: xt_hashlimit fix Patrick McHardy
` (21 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit d060ffc1840e37100628f520e66600c5ae483b44
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Tue Feb 24 15:23:58 2009 +0100
netfilter: install missing headers
iptables imports headers from (the unifdefed headers of a)
kernel tree, but some headers happened to not be installed.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index deeaee5..947b47d 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -14,8 +14,11 @@ header-y += xt_NFQUEUE.h
header-y += xt_RATEEST.h
header-y += xt_SECMARK.h
header-y += xt_TCPMSS.h
+header-y += xt_TCPOPTSTRIP.h
+header-y += xt_TPROXY.h
header-y += xt_comment.h
header-y += xt_connbytes.h
+header-y += xt_connlimit.h
header-y += xt_connmark.h
header-y += xt_conntrack.h
header-y += xt_dccp.h
@@ -31,6 +34,7 @@ header-y += xt_mark.h
header-y += xt_multiport.h
header-y += xt_owner.h
header-y += xt_pkttype.h
+header-y += xt_quota.h
header-y += xt_rateest.h
header-y += xt_realm.h
header-y += xt_recent.h
@@ -40,6 +44,8 @@ header-y += xt_statistic.h
header-y += xt_string.h
header-y += xt_tcpmss.h
header-y += xt_tcpudp.h
+header-y += xt_time.h
+header-y += xt_u32.h
unifdef-y += nf_conntrack_common.h
unifdef-y += nf_conntrack_ftp.h
diff --git a/include/linux/netfilter_ipv6/Kbuild b/include/linux/netfilter_ipv6/Kbuild
index 8887a5f..aca4bd1 100644
--- a/include/linux/netfilter_ipv6/Kbuild
+++ b/include/linux/netfilter_ipv6/Kbuild
@@ -11,6 +11,7 @@ header-y += ip6t_length.h
header-y += ip6t_limit.h
header-y += ip6t_mac.h
header-y += ip6t_mark.h
+header-y += ip6t_mh.h
header-y += ip6t_multiport.h
header-y += ip6t_opts.h
header-y += ip6t_owner.h
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 21/41: xt_hashlimit fix
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (19 preceding siblings ...)
2009-03-24 14:03 ` netfilter 20/41: install missing headers Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 22/41: use a linked list of loggers Patrick McHardy
` (20 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 28337ff5438a640afa713d874d076e3a8a9150da
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue Feb 24 15:30:29 2009 +0100
netfilter: xt_hashlimit fix
Commit 784544739a25c30637397ace5489eeb6e15d7d49
(netfilter: iptables: lock free counters) broke xt_hashlimit netfilter module :
This module was storing a pointer inside its xt_hashlimit_info, and this pointer
is not relocated when we temporarly switch tables (iptables -L).
This hack is not not needed at all (probably a leftover from
ancient time), as each cpu should and can access to its own copy.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 2482055..a5b5369 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -565,8 +565,7 @@ hashlimit_init_dst(const struct xt_hashlimit_htable *hinfo,
static bool
hashlimit_mt_v0(const struct sk_buff *skb, const struct xt_match_param *par)
{
- const struct xt_hashlimit_info *r =
- ((const struct xt_hashlimit_info *)par->matchinfo)->u.master;
+ const struct xt_hashlimit_info *r = par->matchinfo;
struct xt_hashlimit_htable *hinfo = r->hinfo;
unsigned long now = jiffies;
struct dsthash_ent *dh;
@@ -702,8 +701,6 @@ static bool hashlimit_mt_check_v0(const struct xt_mtchk_param *par)
}
mutex_unlock(&hlimit_mutex);
- /* Ugly hack: For SMP, we only want to use one set */
- r->u.master = r;
return true;
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 22/41: use a linked list of loggers
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (20 preceding siblings ...)
2009-03-24 14:03 ` netfilter 21/41: xt_hashlimit fix Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 23/41: print the list of register loggers Patrick McHardy
` (19 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit ca735b3aaa945626ba65a3e51145bfe4ecd9e222
Author: Eric Leblond <eric@inl.fr>
Date: Mon Mar 16 14:54:21 2009 +0100
netfilter: use a linked list of loggers
This patch modifies nf_log to use a linked list of loggers for each
protocol. This list of loggers is read and write protected with a
mutex.
This patch separates registration and binding. To be used as
logging module, a module has to register calling nf_log_register()
and to bind to a protocol it has to call nf_log_bind_pf().
This patch also converts the logging modules to the new API. For nfnetlink_log,
it simply switchs call to register functions to call to bind function and
adds a call to nf_log_register() during init. For other modules, it just
remove a const flag from the logger structure and replace it with a
__read_mostly.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_log.h b/include/net/netfilter/nf_log.h
index 7182c06..920997f 100644
--- a/include/net/netfilter/nf_log.h
+++ b/include/net/netfilter/nf_log.h
@@ -1,6 +1,8 @@
#ifndef _NF_LOG_H
#define _NF_LOG_H
+#include <linux/netfilter.h>
+
/* those NF_LOG_* defines and struct nf_loginfo are legacy definitios that will
* disappear once iptables is replaced with pkttables. Please DO NOT use them
* for any new code! */
@@ -40,12 +42,15 @@ struct nf_logger {
struct module *me;
nf_logfn *logfn;
char *name;
+ struct list_head list[NFPROTO_NUMPROTO];
};
/* Function to register/unregister log function. */
-int nf_log_register(u_int8_t pf, const struct nf_logger *logger);
-void nf_log_unregister(const struct nf_logger *logger);
-void nf_log_unregister_pf(u_int8_t pf);
+int nf_log_register(u_int8_t pf, struct nf_logger *logger);
+void nf_log_unregister(struct nf_logger *logger);
+
+int nf_log_bind_pf(u_int8_t pf, const struct nf_logger *logger);
+void nf_log_unbind_pf(u_int8_t pf);
/* Calls the registered backend logging function */
void nf_log_packet(u_int8_t pf,
diff --git a/net/ipv4/netfilter/ipt_LOG.c b/net/ipv4/netfilter/ipt_LOG.c
index 27a78fb..acc44c6 100644
--- a/net/ipv4/netfilter/ipt_LOG.c
+++ b/net/ipv4/netfilter/ipt_LOG.c
@@ -464,7 +464,7 @@ static struct xt_target log_tg_reg __read_mostly = {
.me = THIS_MODULE,
};
-static const struct nf_logger ipt_log_logger ={
+static struct nf_logger ipt_log_logger __read_mostly = {
.name = "ipt_LOG",
.logfn = &ipt_log_packet,
.me = THIS_MODULE,
diff --git a/net/ipv4/netfilter/ipt_ULOG.c b/net/ipv4/netfilter/ipt_ULOG.c
index 18a2826..d32cc4b 100644
--- a/net/ipv4/netfilter/ipt_ULOG.c
+++ b/net/ipv4/netfilter/ipt_ULOG.c
@@ -379,7 +379,7 @@ static struct xt_target ulog_tg_reg __read_mostly = {
.me = THIS_MODULE,
};
-static struct nf_logger ipt_ulog_logger = {
+static struct nf_logger ipt_ulog_logger __read_mostly = {
.name = "ipt_ULOG",
.logfn = ipt_logfn,
.me = THIS_MODULE,
diff --git a/net/ipv6/netfilter/ip6t_LOG.c b/net/ipv6/netfilter/ip6t_LOG.c
index 37adf5a..7018cac 100644
--- a/net/ipv6/netfilter/ip6t_LOG.c
+++ b/net/ipv6/netfilter/ip6t_LOG.c
@@ -477,7 +477,7 @@ static struct xt_target log_tg6_reg __read_mostly = {
.me = THIS_MODULE,
};
-static const struct nf_logger ip6t_logger = {
+static struct nf_logger ip6t_logger __read_mostly = {
.name = "ip6t_LOG",
.logfn = &ip6t_log_packet,
.me = THIS_MODULE,
diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index fa8ae5d..a228b5f 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -16,56 +16,60 @@
#define NF_LOG_PREFIXLEN 128
static const struct nf_logger *nf_loggers[NFPROTO_NUMPROTO] __read_mostly;
+static struct list_head nf_loggers_l[NFPROTO_NUMPROTO] __read_mostly;
static DEFINE_MUTEX(nf_log_mutex);
-/* return EBUSY if somebody else is registered, EEXIST if the same logger
- * is registred, 0 on success. */
-int nf_log_register(u_int8_t pf, const struct nf_logger *logger)
+static struct nf_logger *__find_logger(int pf, const char *str_logger)
{
- int ret;
+ struct nf_logger *t;
- if (pf >= ARRAY_SIZE(nf_loggers))
- return -EINVAL;
-
- /* Any setup of logging members must be done before
- * substituting pointer. */
- ret = mutex_lock_interruptible(&nf_log_mutex);
- if (ret < 0)
- return ret;
-
- if (!nf_loggers[pf])
- rcu_assign_pointer(nf_loggers[pf], logger);
- else if (nf_loggers[pf] == logger)
- ret = -EEXIST;
- else
- ret = -EBUSY;
+ list_for_each_entry(t, &nf_loggers_l[pf], list[pf]) {
+ if (!strnicmp(str_logger, t->name, strlen(t->name)))
+ return t;
+ }
- mutex_unlock(&nf_log_mutex);
- return ret;
+ return NULL;
}
-EXPORT_SYMBOL(nf_log_register);
-void nf_log_unregister_pf(u_int8_t pf)
+/* return EEXIST if the same logger is registred, 0 on success. */
+int nf_log_register(u_int8_t pf, struct nf_logger *logger)
{
+ const struct nf_logger *llog;
+
if (pf >= ARRAY_SIZE(nf_loggers))
- return;
+ return -EINVAL;
+
mutex_lock(&nf_log_mutex);
- rcu_assign_pointer(nf_loggers[pf], NULL);
+
+ if (pf == NFPROTO_UNSPEC) {
+ int i;
+ for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++)
+ list_add_tail(&(logger->list[i]), &(nf_loggers_l[i]));
+ } else {
+ /* register at end of list to honor first register win */
+ list_add_tail(&logger->list[pf], &nf_loggers_l[pf]);
+ llog = rcu_dereference(nf_loggers[pf]);
+ if (llog == NULL)
+ rcu_assign_pointer(nf_loggers[pf], logger);
+ }
+
mutex_unlock(&nf_log_mutex);
- /* Give time to concurrent readers. */
- synchronize_rcu();
+ return 0;
}
-EXPORT_SYMBOL(nf_log_unregister_pf);
+EXPORT_SYMBOL(nf_log_register);
-void nf_log_unregister(const struct nf_logger *logger)
+void nf_log_unregister(struct nf_logger *logger)
{
+ const struct nf_logger *c_logger;
int i;
mutex_lock(&nf_log_mutex);
for (i = 0; i < ARRAY_SIZE(nf_loggers); i++) {
- if (nf_loggers[i] == logger)
+ c_logger = rcu_dereference(nf_loggers[i]);
+ if (c_logger == logger)
rcu_assign_pointer(nf_loggers[i], NULL);
+ list_del(&logger->list[i]);
}
mutex_unlock(&nf_log_mutex);
@@ -73,6 +77,27 @@ void nf_log_unregister(const struct nf_logger *logger)
}
EXPORT_SYMBOL(nf_log_unregister);
+int nf_log_bind_pf(u_int8_t pf, const struct nf_logger *logger)
+{
+ mutex_lock(&nf_log_mutex);
+ if (__find_logger(pf, logger->name) == NULL) {
+ mutex_unlock(&nf_log_mutex);
+ return -ENOENT;
+ }
+ rcu_assign_pointer(nf_loggers[pf], logger);
+ mutex_unlock(&nf_log_mutex);
+ return 0;
+}
+EXPORT_SYMBOL(nf_log_bind_pf);
+
+void nf_log_unbind_pf(u_int8_t pf)
+{
+ mutex_lock(&nf_log_mutex);
+ rcu_assign_pointer(nf_loggers[pf], NULL);
+ mutex_unlock(&nf_log_mutex);
+}
+EXPORT_SYMBOL(nf_log_unbind_pf);
+
void nf_log_packet(u_int8_t pf,
unsigned int hooknum,
const struct sk_buff *skb,
@@ -163,10 +188,15 @@ static const struct file_operations nflog_file_ops = {
int __init netfilter_log_init(void)
{
+ int i;
#ifdef CONFIG_PROC_FS
if (!proc_create("nf_log", S_IRUGO,
proc_net_netfilter, &nflog_file_ops))
return -1;
#endif
+
+ for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++)
+ INIT_LIST_HEAD(&(nf_loggers_l[i]));
+
return 0;
}
diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index fa49dc7..3eae3fc 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -691,7 +691,7 @@ nfulnl_recv_unsupp(struct sock *ctnl, struct sk_buff *skb,
return -ENOTSUPP;
}
-static const struct nf_logger nfulnl_logger = {
+static struct nf_logger nfulnl_logger __read_mostly = {
.name = "nfnetlink_log",
.logfn = &nfulnl_log_packet,
.me = THIS_MODULE,
@@ -723,9 +723,9 @@ nfulnl_recv_config(struct sock *ctnl, struct sk_buff *skb,
/* Commands without queue context */
switch (cmd->command) {
case NFULNL_CFG_CMD_PF_BIND:
- return nf_log_register(pf, &nfulnl_logger);
+ return nf_log_bind_pf(pf, &nfulnl_logger);
case NFULNL_CFG_CMD_PF_UNBIND:
- nf_log_unregister_pf(pf);
+ nf_log_unbind_pf(pf);
return 0;
}
}
@@ -950,17 +950,25 @@ static int __init nfnetlink_log_init(void)
goto cleanup_netlink_notifier;
}
+ status = nf_log_register(NFPROTO_UNSPEC, &nfulnl_logger);
+ if (status < 0) {
+ printk(KERN_ERR "log: failed to register logger\n");
+ goto cleanup_subsys;
+ }
+
#ifdef CONFIG_PROC_FS
if (!proc_create("nfnetlink_log", 0440,
proc_net_netfilter, &nful_file_ops))
- goto cleanup_subsys;
+ goto cleanup_logger;
#endif
return status;
#ifdef CONFIG_PROC_FS
+cleanup_logger:
+ nf_log_unregister(&nfulnl_logger);
+#endif
cleanup_subsys:
nfnetlink_subsys_unregister(&nfulnl_subsys);
-#endif
cleanup_netlink_notifier:
netlink_unregister_notifier(&nfulnl_rtnl_notifier);
return status;
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 23/41: print the list of register loggers
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (21 preceding siblings ...)
2009-03-24 14:03 ` netfilter 22/41: use a linked list of loggers Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 24/41: remove IPvX specific parts from nf_conntrack_l4proto.h Patrick McHardy
` (18 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit c7a913cd5535554d6f5d5e1f5ef46c4307cf2afc
Author: Eric Leblond <eric@inl.fr>
Date: Mon Mar 16 14:55:27 2009 +0100
netfilter: print the list of register loggers
This patch modifies the proc output to add display of registered
loggers. The content of /proc/net/netfilter/nf_log is modified. Instead
of displaying a protocol per line with format:
proto:logger
it now displays:
proto:logger (comma_separated_list_of_loggers)
NONE is used as keyword if no logger is used.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index a228b5f..4fcbcc7 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -154,13 +154,37 @@ static int seq_show(struct seq_file *s, void *v)
{
loff_t *pos = v;
const struct nf_logger *logger;
+ struct nf_logger *t;
+ int ret;
logger = rcu_dereference(nf_loggers[*pos]);
if (!logger)
- return seq_printf(s, "%2lld NONE\n", *pos);
+ ret = seq_printf(s, "%2lld NONE (", *pos);
+ else
+ ret = seq_printf(s, "%2lld %s (", *pos, logger->name);
+
+ if (ret < 0)
+ return ret;
+
+ mutex_lock(&nf_log_mutex);
+ list_for_each_entry(t, &nf_loggers_l[*pos], list[*pos]) {
+ ret = seq_printf(s, "%s", t->name);
+ if (ret < 0) {
+ mutex_unlock(&nf_log_mutex);
+ return ret;
+ }
+ if (&t->list[*pos] != nf_loggers_l[*pos].prev) {
+ ret = seq_printf(s, ",");
+ if (ret < 0) {
+ mutex_unlock(&nf_log_mutex);
+ return ret;
+ }
+ }
+ }
+ mutex_unlock(&nf_log_mutex);
- return seq_printf(s, "%2lld %s\n", *pos, logger->name);
+ return seq_printf(s, ")\n");
}
static const struct seq_operations nflog_seq_ops = {
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 24/41: remove IPvX specific parts from nf_conntrack_l4proto.h
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (22 preceding siblings ...)
2009-03-24 14:03 ` netfilter 23/41: print the list of register loggers Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 25/41: Kconfig spelling fixes (trivial) Patrick McHardy
` (17 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 9d2493f88f846b391a15a736efc7f4b97d6c4046
Author: Christoph Paasch <christoph.paasch@gmail.com>
Date: Mon Mar 16 15:15:35 2009 +0100
netfilter: remove IPvX specific parts from nf_conntrack_l4proto.h
Moving the structure definitions to the corresponding IPvX specific header files.
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index debdaf7..16ab604 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -90,10 +90,7 @@ struct nf_conntrack_l4proto
struct module *me;
};
-/* Existing built-in protocols */
-extern struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp6;
-extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp4;
-extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp6;
+/* Existing built-in generic protocol */
extern struct nf_conntrack_l4proto nf_conntrack_l4proto_generic;
#define MAX_NF_CT_PROTO 256
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index 727b953..e6852f6 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -26,6 +26,7 @@
#include <net/netfilter/nf_conntrack_l4proto.h>
#include <net/netfilter/nf_conntrack_l3proto.h>
#include <net/netfilter/nf_conntrack_core.h>
+#include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
static bool ipv6_pkt_to_tuple(const struct sk_buff *skb, unsigned int nhoff,
struct nf_conntrack_tuple *tuple)
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index a1edb9c..7d3944f 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -25,6 +25,8 @@
#include <net/netfilter/nf_conntrack_l4proto.h>
#include <net/netfilter/nf_conntrack_ecache.h>
#include <net/netfilter/nf_log.h>
+#include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
+#include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
/* Protects ct->proto.tcp */
static DEFINE_RWLOCK(tcp_lock);
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 2b8b1f5..d402117 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -22,6 +22,8 @@
#include <net/netfilter/nf_conntrack_l4proto.h>
#include <net/netfilter/nf_conntrack_ecache.h>
#include <net/netfilter/nf_log.h>
+#include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
+#include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
static unsigned int nf_ct_udp_timeout __read_mostly = 30*HZ;
static unsigned int nf_ct_udp_timeout_stream __read_mostly = 180*HZ;
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 25/41: Kconfig spelling fixes (trivial)
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (23 preceding siblings ...)
2009-03-24 14:03 ` netfilter 24/41: remove IPvX specific parts from nf_conntrack_l4proto.h Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:20 ` Jan Engelhardt
2009-03-24 14:03 ` netfilter 26/41: conntrack: increase drop stats if sequence adjustment fails Patrick McHardy
` (16 subsequent siblings)
41 siblings, 1 reply; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 67c0d57930ff9a24c6c34abee1b01f7716a9b0e2
Author: Stephen Hemminger <sheminger@vyatta.com>
Date: Mon Mar 16 15:17:23 2009 +0100
netfilter: Kconfig spelling fixes (trivial)
Signed-off-by: Stephen Hemminger <sheminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index f8d6180..1833bdb 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -31,7 +31,7 @@ config NF_CONNTRACK_PROC_COMPAT
default y
help
This option enables /proc and sysctl compatibility with the old
- layer 3 dependant connection tracking. This is needed to keep
+ layer 3 dependent connection tracking. This is needed to keep
old programs that have not been adapted to the new names working.
If unsure, say Y.
@@ -99,7 +99,7 @@ config IP_NF_MATCH_TTL
---help---
This is a backwards-compat option for the user's convenience
(e.g. when running oldconfig). It selects
- COFNIG_NETFILTER_XT_MATCH_HL.
+ CONFIG_NETFILTER_XT_MATCH_HL.
# `filter', generic and specific targets
config IP_NF_FILTER
@@ -329,7 +329,7 @@ config IP_NF_TARGET_TTL
---help---
This is a backwards-compat option for the user's convenience
(e.g. when running oldconfig). It selects
- COFNIG_NETFILTER_XT_TARGET_HL.
+ CONFIG_NETFILTER_XT_TARGET_HL.
# raw + specific targets
config IP_NF_RAW
^ permalink raw reply related [flat|nested] 58+ messages in thread* Re: netfilter 25/41: Kconfig spelling fixes (trivial)
2009-03-24 14:03 ` netfilter 25/41: Kconfig spelling fixes (trivial) Patrick McHardy
@ 2009-03-24 14:20 ` Jan Engelhardt
2009-03-24 20:35 ` David Miller
0 siblings, 1 reply; 58+ messages in thread
From: Jan Engelhardt @ 2009-03-24 14:20 UTC (permalink / raw)
To: Patrick McHardy; +Cc: davem, netdev, Netfilter Developer Mailing List
On Tuesday 2009-03-24 15:03, Patrick McHardy wrote:
>commit 67c0d57930ff9a24c6c34abee1b01f7716a9b0e2
>Author: Stephen Hemminger <sheminger@vyatta.com>
>Date: Mon Mar 16 15:17:23 2009 +0100
>
> netfilter: Kconfig spelling fixes (trivial)
>
> Signed-off-by: Stephen Hemminger <sheminger@vyatta.com>
> Signed-off-by: Patrick McHardy <kaber@trash.net>
Please add this patch as a followup commit.
parent 1d45209d89e647e9f27e4afa1f47338df73bc112 (v2.6.29-rc5-881-g1d45209)
commit 85982fcfd17fbb1000302e253f82edd88efa180c
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Tue Mar 24 15:18:17 2009 +0100
netfilter: trivial Kconfig spelling fixes
Supplements commit 67c0d57930ff9a24c6c34abee1b01f7716a9b0e2.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
net/ipv6/netfilter/Kconfig | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
index 625353a..29d643b 100644
--- a/net/ipv6/netfilter/Kconfig
+++ b/net/ipv6/netfilter/Kconfig
@@ -101,7 +101,7 @@ config IP6_NF_MATCH_HL
---help---
This is a backwards-compat option for the user's convenience
(e.g. when running oldconfig). It selects
- COFNIG_NETFILTER_XT_MATCH_HL.
+ CONFIG_NETFILTER_XT_MATCH_HL.
config IP6_NF_MATCH_IPV6HEADER
tristate '"ipv6header" IPv6 Extension Headers Match'
@@ -137,7 +137,7 @@ config IP6_NF_TARGET_HL
---help---
This is a backwards-compat option for the user's convenience
(e.g. when running oldconfig). It selects
- COFNIG_NETFILTER_XT_TARGET_HL.
+ CONFIG_NETFILTER_XT_TARGET_HL.
config IP6_NF_TARGET_LOG
tristate "LOG target support"
--
# Created with git-export-patch
^ permalink raw reply related [flat|nested] 58+ messages in thread
* netfilter 26/41: conntrack: increase drop stats if sequence adjustment fails
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (24 preceding siblings ...)
2009-03-24 14:03 ` netfilter 25/41: Kconfig spelling fixes (trivial) Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 27/41: ctnetlink: cleanup master conntrack assignation Patrick McHardy
` (15 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 1db7a748dfd50d7615913730763c024444900030
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Mar 16 15:18:50 2009 +0100
netfilter: conntrack: increase drop stats if sequence adjustment fails
This patch increases the statistics of packets drop if the sequence
adjustment fails in ipv4_confirm().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 4beb04f..8b681f2 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -120,8 +120,10 @@ static unsigned int ipv4_confirm(unsigned int hooknum,
typeof(nf_nat_seq_adjust_hook) seq_adjust;
seq_adjust = rcu_dereference(nf_nat_seq_adjust_hook);
- if (!seq_adjust || !seq_adjust(skb, ct, ctinfo))
+ if (!seq_adjust || !seq_adjust(skb, ct, ctinfo)) {
+ NF_CT_STAT_INC_ATOMIC(nf_ct_net(ct), drop);
return NF_DROP;
+ }
}
out:
/* We've seen it coming out the other side: confirm it */
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 27/41: ctnetlink: cleanup master conntrack assignation
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (25 preceding siblings ...)
2009-03-24 14:03 ` netfilter 26/41: conntrack: increase drop stats if sequence adjustment fails Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 28/41: ctnetlink: cleanup conntrack update preliminary checkings Patrick McHardy
` (14 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 7ec4749675bf33ea639bbcca8a5365ccc5091a6a
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Mar 16 15:25:46 2009 +0100
netfilter: ctnetlink: cleanup master conntrack assignation
This patch moves the assignation of the master conntrack to
ctnetlink_create_conntrack(), which is where it really belongs.
This patch is a cleanup.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index cb78aa0..cca22d5 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1128,9 +1128,9 @@ static int
ctnetlink_create_conntrack(struct nlattr *cda[],
struct nf_conntrack_tuple *otuple,
struct nf_conntrack_tuple *rtuple,
- struct nf_conn *master_ct,
u32 pid,
- int report)
+ int report,
+ u8 u3)
{
struct nf_conn *ct;
int err = -EINVAL;
@@ -1241,7 +1241,22 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
#endif
/* setup master conntrack: this is a confirmed expectation */
- if (master_ct) {
+ if (cda[CTA_TUPLE_MASTER]) {
+ struct nf_conntrack_tuple master;
+ struct nf_conntrack_tuple_hash *master_h;
+ struct nf_conn *master_ct;
+
+ err = ctnetlink_parse_tuple(cda, &master, CTA_TUPLE_MASTER, u3);
+ if (err < 0)
+ goto err;
+
+ master_h = __nf_conntrack_find(&init_net, &master);
+ if (master_h == NULL) {
+ err = -ENOENT;
+ goto err;
+ }
+ master_ct = nf_ct_tuplehash_to_ctrack(master_h);
+ nf_conntrack_get(&master_ct->ct_general);
__set_bit(IPS_EXPECTED_BIT, &ct->status);
ct->master = master_ct;
}
@@ -1289,39 +1304,15 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb,
h = __nf_conntrack_find(&init_net, &rtuple);
if (h == NULL) {
- struct nf_conntrack_tuple master;
- struct nf_conntrack_tuple_hash *master_h = NULL;
- struct nf_conn *master_ct = NULL;
-
- if (cda[CTA_TUPLE_MASTER]) {
- err = ctnetlink_parse_tuple(cda,
- &master,
- CTA_TUPLE_MASTER,
- u3);
- if (err < 0)
- goto out_unlock;
-
- master_h = __nf_conntrack_find(&init_net, &master);
- if (master_h == NULL) {
- err = -ENOENT;
- goto out_unlock;
- }
- master_ct = nf_ct_tuplehash_to_ctrack(master_h);
- nf_conntrack_get(&master_ct->ct_general);
- }
-
err = -ENOENT;
if (nlh->nlmsg_flags & NLM_F_CREATE)
err = ctnetlink_create_conntrack(cda,
&otuple,
&rtuple,
- master_ct,
NETLINK_CB(skb).pid,
- nlmsg_report(nlh));
+ nlmsg_report(nlh),
+ u3);
spin_unlock_bh(&nf_conntrack_lock);
- if (err < 0 && master_ct)
- nf_ct_put(master_ct);
-
return err;
}
/* implicit 'else' */
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 28/41: ctnetlink: cleanup conntrack update preliminary checkings
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (26 preceding siblings ...)
2009-03-24 14:03 ` netfilter 27/41: ctnetlink: cleanup master conntrack assignation Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 29/41: ctnetlink: move event reporting for new entries outside the lock Patrick McHardy
` (13 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit e098360f159b3358f085543eb6dc2eb500d6667c
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Mar 16 15:27:22 2009 +0100
netfilter: ctnetlink: cleanup conntrack update preliminary checkings
This patch moves the preliminary checkings that must be fulfilled
to update a conntrack, which are the following:
* NAT manglings cannot be updated
* Changing the master conntrack is not allowed.
This patch is a cleanup.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index cca22d5..b67db69 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1062,6 +1062,10 @@ ctnetlink_change_conntrack(struct nf_conn *ct, struct nlattr *cda[])
{
int err;
+ /* only allow NAT changes and master assignation for new conntracks */
+ if (cda[CTA_NAT_SRC] || cda[CTA_NAT_DST] || cda[CTA_TUPLE_MASTER])
+ return -EOPNOTSUPP;
+
if (cda[CTA_HELP]) {
err = ctnetlink_change_helper(ct, cda);
if (err < 0)
@@ -1323,17 +1327,6 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb,
if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
- /* we only allow nat config for new conntracks */
- if (cda[CTA_NAT_SRC] || cda[CTA_NAT_DST]) {
- err = -EOPNOTSUPP;
- goto out_unlock;
- }
- /* can't link an existing conntrack to a master */
- if (cda[CTA_TUPLE_MASTER]) {
- err = -EOPNOTSUPP;
- goto out_unlock;
- }
-
err = ctnetlink_change_conntrack(ct, cda);
if (err == 0) {
nf_conntrack_get(&ct->ct_general);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 29/41: ctnetlink: move event reporting for new entries outside the lock
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (27 preceding siblings ...)
2009-03-24 14:03 ` netfilter 28/41: ctnetlink: cleanup conntrack update preliminary checkings Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 30/41: auto-load ip6_queue module when socket opened Patrick McHardy
` (12 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit f0a3c0869f3b0ef93d9df044e9a41e40086d4c97
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Mar 16 15:28:09 2009 +0100
netfilter: ctnetlink: move event reporting for new entries outside the lock
This patch moves the event reporting outside the lock section. With
this patch, the creation and update of entries is homogeneous from
the event reporting perspective. Moreover, as the event reporting is
done outside the lock section, the netlink broadcast delivery can
benefit of the yield() call under congestion.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index b67db69..9fb7cf7 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1128,12 +1128,10 @@ ctnetlink_event_report(struct nf_conn *ct, u32 pid, int report)
report);
}
-static int
+static struct nf_conn *
ctnetlink_create_conntrack(struct nlattr *cda[],
struct nf_conntrack_tuple *otuple,
struct nf_conntrack_tuple *rtuple,
- u32 pid,
- int report,
u8 u3)
{
struct nf_conn *ct;
@@ -1142,7 +1140,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_ATOMIC);
if (IS_ERR(ct))
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
if (!cda[CTA_TIMEOUT])
goto err;
@@ -1265,18 +1263,14 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
ct->master = master_ct;
}
- nf_conntrack_get(&ct->ct_general);
add_timer(&ct->timeout);
nf_conntrack_hash_insert(ct);
rcu_read_unlock();
- ctnetlink_event_report(ct, pid, report);
- nf_ct_put(ct);
-
- return 0;
+ return ct;
err:
nf_conntrack_free(ct);
- return err;
+ return ERR_PTR(err);
}
static int
@@ -1309,14 +1303,25 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb,
if (h == NULL) {
err = -ENOENT;
- if (nlh->nlmsg_flags & NLM_F_CREATE)
- err = ctnetlink_create_conntrack(cda,
- &otuple,
- &rtuple,
- NETLINK_CB(skb).pid,
- nlmsg_report(nlh),
- u3);
- spin_unlock_bh(&nf_conntrack_lock);
+ if (nlh->nlmsg_flags & NLM_F_CREATE) {
+ struct nf_conn *ct;
+
+ ct = ctnetlink_create_conntrack(cda, &otuple,
+ &rtuple, u3);
+ if (IS_ERR(ct)) {
+ err = PTR_ERR(ct);
+ goto out_unlock;
+ }
+ err = 0;
+ nf_conntrack_get(&ct->ct_general);
+ spin_unlock_bh(&nf_conntrack_lock);
+ ctnetlink_event_report(ct,
+ NETLINK_CB(skb).pid,
+ nlmsg_report(nlh));
+ nf_ct_put(ct);
+ } else
+ spin_unlock_bh(&nf_conntrack_lock);
+
return err;
}
/* implicit 'else' */
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 30/41: auto-load ip6_queue module when socket opened
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (28 preceding siblings ...)
2009-03-24 14:03 ` netfilter 29/41: ctnetlink: move event reporting for new entries outside the lock Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 31/41: auto-load ip_queue " Patrick McHardy
` (11 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 26c3b6780618f09abb5f7e03b09b13dbb8e8aa24
Author: Scott James Remnant <scott@canonical.com>
Date: Mon Mar 16 15:30:14 2009 +0100
netfilter: auto-load ip6_queue module when socket opened
The ip6_queue module is missing the net-pf-16-proto-13 alias that would
cause it to be auto-loaded when a socket of that type is opened. This
patch adds the alias.
Signed-off-by: Scott James Remnant <scott@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv6/netfilter/ip6_queue.c b/net/ipv6/netfilter/ip6_queue.c
index 5859c04..b693f84 100644
--- a/net/ipv6/netfilter/ip6_queue.c
+++ b/net/ipv6/netfilter/ip6_queue.c
@@ -643,6 +643,7 @@ static void __exit ip6_queue_fini(void)
MODULE_DESCRIPTION("IPv6 packet queue handler");
MODULE_LICENSE("GPL");
+MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_IP6_FW);
module_init(ip6_queue_init);
module_exit(ip6_queue_fini);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 31/41: auto-load ip_queue module when socket opened
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (29 preceding siblings ...)
2009-03-24 14:03 ` netfilter 30/41: auto-load ip6_queue module when socket opened Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 32/41: xtables: avoid pointer to self Patrick McHardy
` (10 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 95ba434f898c3cb5c7457dce265bf0ab72ba8ce9
Author: Scott James Remnant <scott@canonical.com>
Date: Mon Mar 16 15:31:10 2009 +0100
netfilter: auto-load ip_queue module when socket opened
The ip_queue module is missing the net-pf-16-proto-3 alias that would
causae it to be auto-loaded when a socket of that type is opened. This
patch adds the alias.
Signed-off-by: Scott James Remnant <scott@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/ip_queue.c b/net/ipv4/netfilter/ip_queue.c
index 432ce9d..5f22c91 100644
--- a/net/ipv4/netfilter/ip_queue.c
+++ b/net/ipv4/netfilter/ip_queue.c
@@ -24,6 +24,7 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/security.h>
+#include <linux/net.h>
#include <linux/mutex.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -640,6 +641,7 @@ static void __exit ip_queue_fini(void)
MODULE_DESCRIPTION("IPv4 packet queue handler");
MODULE_AUTHOR("James Morris <jmorris@intercode.com.au>");
MODULE_LICENSE("GPL");
+MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_FIREWALL);
module_init(ip_queue_init);
module_exit(ip_queue_fini);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 32/41: xtables: avoid pointer to self
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (30 preceding siblings ...)
2009-03-24 14:03 ` netfilter 31/41: auto-load ip_queue " Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` net 33/41: sysctl_net - use net_eq to compare nets Patrick McHardy
` (9 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit acc738fec03bdaa5b77340c32a82fbfedaaabef0
Author: Jan Engelhardt <jengelh@medozas.de>
Date: Mon Mar 16 15:35:29 2009 +0100
netfilter: xtables: avoid pointer to self
Commit 784544739a25c30637397ace5489eeb6e15d7d49 (netfilter: iptables:
lock free counters) broke a number of modules whose rule data referenced
itself. A reallocation would not reestablish the correct references, so
it is best to use a separate struct that does not fall under RCU.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/xt_limit.h b/include/linux/netfilter/xt_limit.h
index b3ce653..fda222c 100644
--- a/include/linux/netfilter/xt_limit.h
+++ b/include/linux/netfilter/xt_limit.h
@@ -4,6 +4,8 @@
/* timings are in milliseconds. */
#define XT_LIMIT_SCALE 10000
+struct xt_limit_priv;
+
/* 1/10,000 sec period => max of 10,000/sec. Min rate is then 429490
seconds, or one every 59 hours. */
struct xt_rateinfo {
@@ -11,11 +13,10 @@ struct xt_rateinfo {
u_int32_t burst; /* Period multiplier for upper limit. */
/* Used internally by the kernel */
- unsigned long prev;
- u_int32_t credit;
+ unsigned long prev; /* moved to xt_limit_priv */
+ u_int32_t credit; /* moved to xt_limit_priv */
u_int32_t credit_cap, cost;
- /* Ugly, ugly fucker. */
- struct xt_rateinfo *master;
+ struct xt_limit_priv *master;
};
#endif /*_XT_RATE_H*/
diff --git a/include/linux/netfilter/xt_quota.h b/include/linux/netfilter/xt_quota.h
index 4c8368d..8dc89df 100644
--- a/include/linux/netfilter/xt_quota.h
+++ b/include/linux/netfilter/xt_quota.h
@@ -6,13 +6,15 @@ enum xt_quota_flags {
};
#define XT_QUOTA_MASK 0x1
+struct xt_quota_priv;
+
struct xt_quota_info {
u_int32_t flags;
u_int32_t pad;
/* Used internally by the kernel */
aligned_u64 quota;
- struct xt_quota_info *master;
+ struct xt_quota_priv *master;
};
#endif /* _XT_QUOTA_H */
diff --git a/include/linux/netfilter/xt_statistic.h b/include/linux/netfilter/xt_statistic.h
index 3d38bc9..8f521ab 100644
--- a/include/linux/netfilter/xt_statistic.h
+++ b/include/linux/netfilter/xt_statistic.h
@@ -13,6 +13,8 @@ enum xt_statistic_flags {
};
#define XT_STATISTIC_MASK 0x1
+struct xt_statistic_priv;
+
struct xt_statistic_info {
u_int16_t mode;
u_int16_t flags;
@@ -23,11 +25,10 @@ struct xt_statistic_info {
struct {
u_int32_t every;
u_int32_t packet;
- /* Used internally by the kernel */
- u_int32_t count;
+ u_int32_t count; /* unused */
} nth;
} u;
- struct xt_statistic_info *master __attribute__((aligned(8)));
+ struct xt_statistic_priv *master __attribute__((aligned(8)));
};
#endif /* _XT_STATISTIC_H */
diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
index c908d69..2e8089e 100644
--- a/net/netfilter/xt_limit.c
+++ b/net/netfilter/xt_limit.c
@@ -14,6 +14,11 @@
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter/xt_limit.h>
+struct xt_limit_priv {
+ unsigned long prev;
+ uint32_t credit;
+};
+
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Herve Eychenne <rv@wallfire.org>");
MODULE_DESCRIPTION("Xtables: rate-limit match");
@@ -60,18 +65,18 @@ static DEFINE_SPINLOCK(limit_lock);
static bool
limit_mt(const struct sk_buff *skb, const struct xt_match_param *par)
{
- struct xt_rateinfo *r =
- ((const struct xt_rateinfo *)par->matchinfo)->master;
+ const struct xt_rateinfo *r = par->matchinfo;
+ struct xt_limit_priv *priv = r->master;
unsigned long now = jiffies;
spin_lock_bh(&limit_lock);
- r->credit += (now - xchg(&r->prev, now)) * CREDITS_PER_JIFFY;
- if (r->credit > r->credit_cap)
- r->credit = r->credit_cap;
+ priv->credit += (now - xchg(&priv->prev, now)) * CREDITS_PER_JIFFY;
+ if (priv->credit > r->credit_cap)
+ priv->credit = r->credit_cap;
- if (r->credit >= r->cost) {
+ if (priv->credit >= r->cost) {
/* We're not limited. */
- r->credit -= r->cost;
+ priv->credit -= r->cost;
spin_unlock_bh(&limit_lock);
return true;
}
@@ -95,6 +100,7 @@ user2credits(u_int32_t user)
static bool limit_mt_check(const struct xt_mtchk_param *par)
{
struct xt_rateinfo *r = par->matchinfo;
+ struct xt_limit_priv *priv;
/* Check for overflow. */
if (r->burst == 0
@@ -104,19 +110,30 @@ static bool limit_mt_check(const struct xt_mtchk_param *par)
return false;
}
- /* For SMP, we only want to use one set of counters. */
- r->master = r;
+ priv = kmalloc(sizeof(*priv), GFP_KERNEL);
+ if (priv == NULL)
+ return -ENOMEM;
+
+ /* For SMP, we only want to use one set of state. */
+ r->master = priv;
if (r->cost == 0) {
/* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies *
128. */
- r->prev = jiffies;
- r->credit = user2credits(r->avg * r->burst); /* Credits full. */
+ priv->prev = jiffies;
+ priv->credit = user2credits(r->avg * r->burst); /* Credits full. */
r->credit_cap = user2credits(r->avg * r->burst); /* Credits full. */
r->cost = user2credits(r->avg);
}
return true;
}
+static void limit_mt_destroy(const struct xt_mtdtor_param *par)
+{
+ const struct xt_rateinfo *info = par->matchinfo;
+
+ kfree(info->master);
+}
+
#ifdef CONFIG_COMPAT
struct compat_xt_rateinfo {
u_int32_t avg;
@@ -167,6 +184,7 @@ static struct xt_match limit_mt_reg __read_mostly = {
.family = NFPROTO_UNSPEC,
.match = limit_mt,
.checkentry = limit_mt_check,
+ .destroy = limit_mt_destroy,
.matchsize = sizeof(struct xt_rateinfo),
#ifdef CONFIG_COMPAT
.compatsize = sizeof(struct compat_xt_rateinfo),
diff --git a/net/netfilter/xt_quota.c b/net/netfilter/xt_quota.c
index c84fce5..01dd07b 100644
--- a/net/netfilter/xt_quota.c
+++ b/net/netfilter/xt_quota.c
@@ -9,6 +9,10 @@
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter/xt_quota.h>
+struct xt_quota_priv {
+ uint64_t quota;
+};
+
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Sam Johnston <samj@samj.net>");
MODULE_DESCRIPTION("Xtables: countdown quota match");
@@ -20,18 +24,20 @@ static DEFINE_SPINLOCK(quota_lock);
static bool
quota_mt(const struct sk_buff *skb, const struct xt_match_param *par)
{
- struct xt_quota_info *q =
- ((const struct xt_quota_info *)par->matchinfo)->master;
+ struct xt_quota_info *q = (void *)par->matchinfo;
+ struct xt_quota_priv *priv = q->master;
bool ret = q->flags & XT_QUOTA_INVERT;
spin_lock_bh("a_lock);
- if (q->quota >= skb->len) {
- q->quota -= skb->len;
+ if (priv->quota >= skb->len) {
+ priv->quota -= skb->len;
ret = !ret;
} else {
/* we do not allow even small packets from now on */
- q->quota = 0;
+ priv->quota = 0;
}
+ /* Copy quota back to matchinfo so that iptables can display it */
+ q->quota = priv->quota;
spin_unlock_bh("a_lock);
return ret;
@@ -43,17 +49,28 @@ static bool quota_mt_check(const struct xt_mtchk_param *par)
if (q->flags & ~XT_QUOTA_MASK)
return false;
- /* For SMP, we only want to use one set of counters. */
- q->master = q;
+
+ q->master = kmalloc(sizeof(*q->master), GFP_KERNEL);
+ if (q->master == NULL)
+ return -ENOMEM;
+
return true;
}
+static void quota_mt_destroy(const struct xt_mtdtor_param *par)
+{
+ const struct xt_quota_info *q = par->matchinfo;
+
+ kfree(q->master);
+}
+
static struct xt_match quota_mt_reg __read_mostly = {
.name = "quota",
.revision = 0,
.family = NFPROTO_UNSPEC,
.match = quota_mt,
.checkentry = quota_mt_check,
+ .destroy = quota_mt_destroy,
.matchsize = sizeof(struct xt_quota_info),
.me = THIS_MODULE,
};
diff --git a/net/netfilter/xt_statistic.c b/net/netfilter/xt_statistic.c
index 0d75141..d8c0f8f 100644
--- a/net/netfilter/xt_statistic.c
+++ b/net/netfilter/xt_statistic.c
@@ -16,6 +16,10 @@
#include <linux/netfilter/xt_statistic.h>
#include <linux/netfilter/x_tables.h>
+struct xt_statistic_priv {
+ uint32_t count;
+};
+
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>");
MODULE_DESCRIPTION("Xtables: statistics-based matching (\"Nth\", random)");
@@ -27,7 +31,7 @@ static DEFINE_SPINLOCK(nth_lock);
static bool
statistic_mt(const struct sk_buff *skb, const struct xt_match_param *par)
{
- struct xt_statistic_info *info = (void *)par->matchinfo;
+ const struct xt_statistic_info *info = par->matchinfo;
bool ret = info->flags & XT_STATISTIC_INVERT;
switch (info->mode) {
@@ -36,10 +40,9 @@ statistic_mt(const struct sk_buff *skb, const struct xt_match_param *par)
ret = !ret;
break;
case XT_STATISTIC_MODE_NTH:
- info = info->master;
spin_lock_bh(&nth_lock);
- if (info->u.nth.count++ == info->u.nth.every) {
- info->u.nth.count = 0;
+ if (info->master->count++ == info->u.nth.every) {
+ info->master->count = 0;
ret = !ret;
}
spin_unlock_bh(&nth_lock);
@@ -56,16 +59,31 @@ static bool statistic_mt_check(const struct xt_mtchk_param *par)
if (info->mode > XT_STATISTIC_MODE_MAX ||
info->flags & ~XT_STATISTIC_MASK)
return false;
- info->master = info;
+
+ info->master = kzalloc(sizeof(*info->master), GFP_KERNEL);
+ if (info->master == NULL) {
+ printk(KERN_ERR KBUILD_MODNAME ": Out of memory\n");
+ return false;
+ }
+ info->master->count = info->u.nth.count;
+
return true;
}
+static void statistic_mt_destroy(const struct xt_mtdtor_param *par)
+{
+ const struct xt_statistic_info *info = par->matchinfo;
+
+ kfree(info->master);
+}
+
static struct xt_match xt_statistic_mt_reg __read_mostly = {
.name = "statistic",
.revision = 0,
.family = NFPROTO_UNSPEC,
.match = statistic_mt,
.checkentry = statistic_mt_check,
+ .destroy = statistic_mt_destroy,
.matchsize = sizeof(struct xt_statistic_info),
.me = THIS_MODULE,
};
^ permalink raw reply related [flat|nested] 58+ messages in thread* net 33/41: sysctl_net - use net_eq to compare nets
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (31 preceding siblings ...)
2009-03-24 14:03 ` netfilter 32/41: xtables: avoid pointer to self Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` net 34/41: netfilter conntrack - add per-net functionality for DCCP protocol Patrick McHardy
` (8 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 81a1d3c31e3517f9939b3e04d21cf4a6b0997419
Author: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Mon Mar 16 16:23:30 2009 +0100
net: sysctl_net - use net_eq to compare nets
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 972201c..0b15d72 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -61,7 +61,7 @@ static struct ctl_table_root net_sysctl_root = {
static int net_ctl_ro_header_perms(struct ctl_table_root *root,
struct nsproxy *namespaces, struct ctl_table *table)
{
- if (namespaces->net_ns == &init_net)
+ if (net_eq(namespaces->net_ns, &init_net))
return table->mode;
else
return table->mode & ~0222;
^ permalink raw reply related [flat|nested] 58+ messages in thread* net 34/41: netfilter conntrack - add per-net functionality for DCCP protocol
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (32 preceding siblings ...)
2009-03-24 14:03 ` net 33/41: sysctl_net - use net_eq to compare nets Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 35/41: xtables: add cluster match Patrick McHardy
` (7 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 1546000fe8db0d3f47b0ef1dd487ec23fbd95313
Author: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Mon Mar 16 16:30:49 2009 +0100
net: netfilter conntrack - add per-net functionality for DCCP protocol
Module specific data moved into per-net site and being allocated/freed
during net namespace creation/deletion.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c
index 8fcf176..d3d5a7f 100644
--- a/net/netfilter/nf_conntrack_proto_dccp.c
+++ b/net/netfilter/nf_conntrack_proto_dccp.c
@@ -16,6 +16,9 @@
#include <linux/skbuff.h>
#include <linux/dccp.h>
+#include <net/net_namespace.h>
+#include <net/netns/generic.h>
+
#include <linux/netfilter/nfnetlink_conntrack.h>
#include <net/netfilter/nf_conntrack.h>
#include <net/netfilter/nf_conntrack_l4proto.h>
@@ -23,8 +26,6 @@
static DEFINE_RWLOCK(dccp_lock);
-static int nf_ct_dccp_loose __read_mostly = 1;
-
/* Timeouts are based on values from RFC4340:
*
* - REQUEST:
@@ -72,16 +73,6 @@ static int nf_ct_dccp_loose __read_mostly = 1;
#define DCCP_MSL (2 * 60 * HZ)
-static unsigned int dccp_timeout[CT_DCCP_MAX + 1] __read_mostly = {
- [CT_DCCP_REQUEST] = 2 * DCCP_MSL,
- [CT_DCCP_RESPOND] = 4 * DCCP_MSL,
- [CT_DCCP_PARTOPEN] = 4 * DCCP_MSL,
- [CT_DCCP_OPEN] = 12 * 3600 * HZ,
- [CT_DCCP_CLOSEREQ] = 64 * HZ,
- [CT_DCCP_CLOSING] = 64 * HZ,
- [CT_DCCP_TIMEWAIT] = 2 * DCCP_MSL,
-};
-
static const char * const dccp_state_names[] = {
[CT_DCCP_NONE] = "NONE",
[CT_DCCP_REQUEST] = "REQUEST",
@@ -393,6 +384,22 @@ dccp_state_table[CT_DCCP_ROLE_MAX + 1][DCCP_PKT_SYNCACK + 1][CT_DCCP_MAX + 1] =
},
};
+/* this module per-net specifics */
+static int dccp_net_id;
+struct dccp_net {
+ int dccp_loose;
+ unsigned int dccp_timeout[CT_DCCP_MAX + 1];
+#ifdef CONFIG_SYSCTL
+ struct ctl_table_header *sysctl_header;
+ struct ctl_table *sysctl_table;
+#endif
+};
+
+static inline struct dccp_net *dccp_pernet(struct net *net)
+{
+ return net_generic(net, dccp_net_id);
+}
+
static bool dccp_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff,
struct nf_conntrack_tuple *tuple)
{
@@ -419,6 +426,7 @@ static bool dccp_new(struct nf_conn *ct, const struct sk_buff *skb,
unsigned int dataoff)
{
struct net *net = nf_ct_net(ct);
+ struct dccp_net *dn;
struct dccp_hdr _dh, *dh;
const char *msg;
u_int8_t state;
@@ -429,7 +437,8 @@ static bool dccp_new(struct nf_conn *ct, const struct sk_buff *skb,
state = dccp_state_table[CT_DCCP_ROLE_CLIENT][dh->dccph_type][CT_DCCP_NONE];
switch (state) {
default:
- if (nf_ct_dccp_loose == 0) {
+ dn = dccp_pernet(net);
+ if (dn->dccp_loose == 0) {
msg = "nf_ct_dccp: not picking up existing connection ";
goto out_invalid;
}
@@ -465,6 +474,7 @@ static int dccp_packet(struct nf_conn *ct, const struct sk_buff *skb,
u_int8_t pf, unsigned int hooknum)
{
struct net *net = nf_ct_net(ct);
+ struct dccp_net *dn;
enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
struct dccp_hdr _dh, *dh;
u_int8_t type, old_state, new_state;
@@ -542,7 +552,9 @@ static int dccp_packet(struct nf_conn *ct, const struct sk_buff *skb,
ct->proto.dccp.last_pkt = type;
ct->proto.dccp.state = new_state;
write_unlock_bh(&dccp_lock);
- nf_ct_refresh_acct(ct, ctinfo, skb, dccp_timeout[new_state]);
+
+ dn = dccp_pernet(net);
+ nf_ct_refresh_acct(ct, ctinfo, skb, dn->dccp_timeout[new_state]);
return NF_ACCEPT;
}
@@ -660,13 +672,11 @@ static int nlattr_to_dccp(struct nlattr *cda[], struct nf_conn *ct)
#endif
#ifdef CONFIG_SYSCTL
-static unsigned int dccp_sysctl_table_users;
-static struct ctl_table_header *dccp_sysctl_header;
-static ctl_table dccp_sysctl_table[] = {
+/* template, data assigned later */
+static struct ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_request",
- .data = &dccp_timeout[CT_DCCP_REQUEST],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -674,7 +684,6 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_respond",
- .data = &dccp_timeout[CT_DCCP_RESPOND],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -682,7 +691,6 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_partopen",
- .data = &dccp_timeout[CT_DCCP_PARTOPEN],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -690,7 +698,6 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_open",
- .data = &dccp_timeout[CT_DCCP_OPEN],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -698,7 +705,6 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_closereq",
- .data = &dccp_timeout[CT_DCCP_CLOSEREQ],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -706,7 +712,6 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_closing",
- .data = &dccp_timeout[CT_DCCP_CLOSING],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -714,7 +719,6 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_timeout_timewait",
- .data = &dccp_timeout[CT_DCCP_TIMEWAIT],
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -722,8 +726,7 @@ static ctl_table dccp_sysctl_table[] = {
{
.ctl_name = CTL_UNNUMBERED,
.procname = "nf_conntrack_dccp_loose",
- .data = &nf_ct_dccp_loose,
- .maxlen = sizeof(nf_ct_dccp_loose),
+ .maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec,
},
@@ -751,11 +754,6 @@ static struct nf_conntrack_l4proto dccp_proto4 __read_mostly = {
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
-#ifdef CONFIG_SYSCTL
- .ctl_table_users = &dccp_sysctl_table_users,
- .ctl_table_header = &dccp_sysctl_header,
- .ctl_table = dccp_sysctl_table,
-#endif
};
static struct nf_conntrack_l4proto dccp_proto6 __read_mostly = {
@@ -776,34 +774,107 @@ static struct nf_conntrack_l4proto dccp_proto6 __read_mostly = {
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
+};
+
+static __net_init int dccp_net_init(struct net *net)
+{
+ struct dccp_net *dn;
+ int err;
+
+ dn = kmalloc(sizeof(*dn), GFP_KERNEL);
+ if (!dn)
+ return -ENOMEM;
+
+ /* default values */
+ dn->dccp_loose = 1;
+ dn->dccp_timeout[CT_DCCP_REQUEST] = 2 * DCCP_MSL;
+ dn->dccp_timeout[CT_DCCP_RESPOND] = 4 * DCCP_MSL;
+ dn->dccp_timeout[CT_DCCP_PARTOPEN] = 4 * DCCP_MSL;
+ dn->dccp_timeout[CT_DCCP_OPEN] = 12 * 3600 * HZ;
+ dn->dccp_timeout[CT_DCCP_CLOSEREQ] = 64 * HZ;
+ dn->dccp_timeout[CT_DCCP_CLOSING] = 64 * HZ;
+ dn->dccp_timeout[CT_DCCP_TIMEWAIT] = 2 * DCCP_MSL;
+
+ err = net_assign_generic(net, dccp_net_id, dn);
+ if (err)
+ goto out;
+
#ifdef CONFIG_SYSCTL
- .ctl_table_users = &dccp_sysctl_table_users,
- .ctl_table_header = &dccp_sysctl_header,
- .ctl_table = dccp_sysctl_table,
+ err = -ENOMEM;
+ dn->sysctl_table = kmemdup(dccp_sysctl_table,
+ sizeof(dccp_sysctl_table), GFP_KERNEL);
+ if (!dn->sysctl_table)
+ goto out;
+
+ dn->sysctl_table[0].data = &dn->dccp_timeout[CT_DCCP_REQUEST];
+ dn->sysctl_table[1].data = &dn->dccp_timeout[CT_DCCP_RESPOND];
+ dn->sysctl_table[2].data = &dn->dccp_timeout[CT_DCCP_PARTOPEN];
+ dn->sysctl_table[3].data = &dn->dccp_timeout[CT_DCCP_OPEN];
+ dn->sysctl_table[4].data = &dn->dccp_timeout[CT_DCCP_CLOSEREQ];
+ dn->sysctl_table[5].data = &dn->dccp_timeout[CT_DCCP_CLOSING];
+ dn->sysctl_table[6].data = &dn->dccp_timeout[CT_DCCP_TIMEWAIT];
+ dn->sysctl_table[7].data = &dn->dccp_loose;
+
+ dn->sysctl_header = register_net_sysctl_table(net,
+ nf_net_netfilter_sysctl_path, dn->sysctl_table);
+ if (!dn->sysctl_header) {
+ kfree(dn->sysctl_table);
+ goto out;
+ }
#endif
+
+ return 0;
+
+out:
+ kfree(dn);
+ return err;
+}
+
+static __net_exit void dccp_net_exit(struct net *net)
+{
+ struct dccp_net *dn = dccp_pernet(net);
+#ifdef CONFIG_SYSCTL
+ unregister_net_sysctl_table(dn->sysctl_header);
+ kfree(dn->sysctl_table);
+#endif
+ kfree(dn);
+
+ net_assign_generic(net, dccp_net_id, NULL);
+}
+
+static struct pernet_operations dccp_net_ops = {
+ .init = dccp_net_init,
+ .exit = dccp_net_exit,
};
static int __init nf_conntrack_proto_dccp_init(void)
{
int err;
- err = nf_conntrack_l4proto_register(&dccp_proto4);
+ err = register_pernet_gen_subsys(&dccp_net_id, &dccp_net_ops);
if (err < 0)
goto err1;
- err = nf_conntrack_l4proto_register(&dccp_proto6);
+ err = nf_conntrack_l4proto_register(&dccp_proto4);
if (err < 0)
goto err2;
+
+ err = nf_conntrack_l4proto_register(&dccp_proto6);
+ if (err < 0)
+ goto err3;
return 0;
-err2:
+err3:
nf_conntrack_l4proto_unregister(&dccp_proto4);
+err2:
+ unregister_pernet_gen_subsys(dccp_net_id, &dccp_net_ops);
err1:
return err;
}
static void __exit nf_conntrack_proto_dccp_fini(void)
{
+ unregister_pernet_gen_subsys(dccp_net_id, &dccp_net_ops);
nf_conntrack_l4proto_unregister(&dccp_proto6);
nf_conntrack_l4proto_unregister(&dccp_proto4);
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 35/41: xtables: add cluster match
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (33 preceding siblings ...)
2009-03-24 14:03 ` net 34/41: netfilter conntrack - add per-net functionality for DCCP protocol Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 36/41: ctnetlink: remove remaining module refcounting Patrick McHardy
` (6 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 0269ea4937343536ec7e85649932bc8c9686ea78
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Mar 16 17:10:36 2009 +0100
netfilter: xtables: add cluster match
This patch adds the iptables cluster match. This match can be used
to deploy gateway and back-end load-sharing clusters. The cluster
can be composed of 32 nodes maximum (although I have only tested
this with two nodes, so I cannot tell what is the real scalability
limit of this solution in terms of cluster nodes).
Assuming that all the nodes see all packets (see below for an
example on how to do that if your switch does not allow this), the
cluster match decides if this node has to handle a packet given:
(jhash(source IP) % total_nodes) & node_mask
For related connections, the master conntrack is used. The following
is an example of its use to deploy a gateway cluster composed of two
nodes (where this is the node 1):
iptables -I PREROUTING -t mangle -i eth1 -m cluster \
--cluster-total-nodes 2 --cluster-local-node 1 \
--cluster-proc-name eth1 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth1 \
-m mark ! --mark 0xffff -j DROP
iptables -A PREROUTING -t mangle -i eth2 -m cluster \
--cluster-total-nodes 2 --cluster-local-node 1 \
--cluster-proc-name eth2 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth2 \
-m mark ! --mark 0xffff -j DROP
And the following commands to make all nodes see the same packets:
ip maddr add 01:00:5e:00:01:01 dev eth1
ip maddr add 01:00:5e:00:01:02 dev eth2
arptables -I OUTPUT -o eth1 --h-length 6 \
-j mangle --mangle-mac-s 01:00:5e:00:01:01
arptables -I INPUT -i eth1 --h-length 6 \
--destination-mac 01:00:5e:00:01:01 \
-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
arptables -I OUTPUT -o eth2 --h-length 6 \
-j mangle --mangle-mac-s 01:00:5e:00:01:02
arptables -I INPUT -i eth2 --h-length 6 \
--destination-mac 01:00:5e:00:01:02 \
-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
In the case of TCP connections, pickup facility has to be disabled
to avoid marking TCP ACK packets coming in the reply direction as
valid.
echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
BTW, some final notes:
* This match mangles the skbuff pkt_type in case that it detects
PACKET_MULTICAST for a non-multicast address. This may be done in
a PKTTYPE target for this sole purpose.
* This match supersedes the CLUSTERIP target.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index 947b47d..af9d2fb 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -21,6 +21,7 @@ header-y += xt_connbytes.h
header-y += xt_connlimit.h
header-y += xt_connmark.h
header-y += xt_conntrack.h
+header-y += xt_cluster.h
header-y += xt_dccp.h
header-y += xt_dscp.h
header-y += xt_esp.h
diff --git a/include/linux/netfilter/xt_cluster.h b/include/linux/netfilter/xt_cluster.h
new file mode 100644
index 0000000..5e0a0d0
--- /dev/null
+++ b/include/linux/netfilter/xt_cluster.h
@@ -0,0 +1,15 @@
+#ifndef _XT_CLUSTER_MATCH_H
+#define _XT_CLUSTER_MATCH_H
+
+enum xt_cluster_flags {
+ XT_CLUSTER_F_INV = (1 << 0)
+};
+
+struct xt_cluster_match_info {
+ u_int32_t total_nodes;
+ u_int32_t node_mask;
+ u_int32_t hash_seed;
+ u_int32_t flags;
+};
+
+#endif /* _XT_CLUSTER_MATCH_H */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index cdbaaff..2562d05 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -527,6 +527,22 @@ config NETFILTER_XT_TARGET_TCPOPTSTRIP
This option adds a "TCPOPTSTRIP" target, which allows you to strip
TCP options from TCP packets.
+config NETFILTER_XT_MATCH_CLUSTER
+ tristate '"cluster" match support'
+ depends on NF_CONNTRACK
+ depends on NETFILTER_ADVANCED
+ ---help---
+ This option allows you to build work-load-sharing clusters of
+ network servers/stateful firewalls without having a dedicated
+ load-balancing router/server/switch. Basically, this match returns
+ true when the packet must be handled by this cluster node. Thus,
+ all nodes see all packets and this match decides which node handles
+ what packets. The work-load sharing algorithm is based on source
+ address hashing.
+
+ If you say Y or M here, try `iptables -m cluster --help` for
+ more information.
+
config NETFILTER_XT_MATCH_COMMENT
tristate '"comment" match support'
depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 7a9b839..6282060 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP) += xt_TCPOPTSTRIP.o
obj-$(CONFIG_NETFILTER_XT_TARGET_TRACE) += xt_TRACE.o
# matches
+obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
obj-$(CONFIG_NETFILTER_XT_MATCH_COMMENT) += xt_comment.o
obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o
obj-$(CONFIG_NETFILTER_XT_MATCH_CONNLIMIT) += xt_connlimit.o
diff --git a/net/netfilter/xt_cluster.c b/net/netfilter/xt_cluster.c
new file mode 100644
index 0000000..ad5bd89
--- /dev/null
+++ b/net/netfilter/xt_cluster.c
@@ -0,0 +1,164 @@
+/*
+ * (C) 2008-2009 Pablo Neira Ayuso <pablo@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/jhash.h>
+#include <linux/ip.h>
+#include <net/ipv6.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <net/netfilter/nf_conntrack.h>
+#include <linux/netfilter/xt_cluster.h>
+
+static inline u_int32_t nf_ct_orig_ipv4_src(const struct nf_conn *ct)
+{
+ return ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip;
+}
+
+static inline const void *nf_ct_orig_ipv6_src(const struct nf_conn *ct)
+{
+ return ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip6;
+}
+
+static inline u_int32_t
+xt_cluster_hash_ipv4(u_int32_t ip, const struct xt_cluster_match_info *info)
+{
+ return jhash_1word(ip, info->hash_seed);
+}
+
+static inline u_int32_t
+xt_cluster_hash_ipv6(const void *ip, const struct xt_cluster_match_info *info)
+{
+ return jhash2(ip, NF_CT_TUPLE_L3SIZE / sizeof(__u32), info->hash_seed);
+}
+
+static inline u_int32_t
+xt_cluster_hash(const struct nf_conn *ct,
+ const struct xt_cluster_match_info *info)
+{
+ u_int32_t hash = 0;
+
+ switch(nf_ct_l3num(ct)) {
+ case AF_INET:
+ hash = xt_cluster_hash_ipv4(nf_ct_orig_ipv4_src(ct), info);
+ break;
+ case AF_INET6:
+ hash = xt_cluster_hash_ipv6(nf_ct_orig_ipv6_src(ct), info);
+ break;
+ default:
+ WARN_ON(1);
+ break;
+ }
+ return (((u64)hash * info->total_nodes) >> 32);
+}
+
+static inline bool
+xt_cluster_is_multicast_addr(const struct sk_buff *skb, u_int8_t family)
+{
+ bool is_multicast = false;
+
+ switch(family) {
+ case NFPROTO_IPV4:
+ is_multicast = ipv4_is_multicast(ip_hdr(skb)->daddr);
+ break;
+ case NFPROTO_IPV6:
+ is_multicast = ipv6_addr_type(&ipv6_hdr(skb)->daddr) &
+ IPV6_ADDR_MULTICAST;
+ break;
+ default:
+ WARN_ON(1);
+ break;
+ }
+ return is_multicast;
+}
+
+static bool
+xt_cluster_mt(const struct sk_buff *skb, const struct xt_match_param *par)
+{
+ struct sk_buff *pskb = (struct sk_buff *)skb;
+ const struct xt_cluster_match_info *info = par->matchinfo;
+ const struct nf_conn *ct;
+ enum ip_conntrack_info ctinfo;
+ unsigned long hash;
+
+ /* This match assumes that all nodes see the same packets. This can be
+ * achieved if the switch that connects the cluster nodes support some
+ * sort of 'port mirroring'. However, if your switch does not support
+ * this, your cluster nodes can reply ARP request using a multicast MAC
+ * address. Thus, your switch will flood the same packets to the
+ * cluster nodes with the same multicast MAC address. Using a multicast
+ * link address is a RFC 1812 (section 3.3.2) violation, but this works
+ * fine in practise.
+ *
+ * Unfortunately, if you use the multicast MAC address, the link layer
+ * sets skbuff's pkt_type to PACKET_MULTICAST, which is not accepted
+ * by TCP and others for packets coming to this node. For that reason,
+ * this match mangles skbuff's pkt_type if it detects a packet
+ * addressed to a unicast address but using PACKET_MULTICAST. Yes, I
+ * know, matches should not alter packets, but we are doing this here
+ * because we would need to add a PKTTYPE target for this sole purpose.
+ */
+ if (!xt_cluster_is_multicast_addr(skb, par->family) &&
+ skb->pkt_type == PACKET_MULTICAST) {
+ pskb->pkt_type = PACKET_HOST;
+ }
+
+ ct = nf_ct_get(skb, &ctinfo);
+ if (ct == NULL)
+ return false;
+
+ if (ct == &nf_conntrack_untracked)
+ return false;
+
+ if (ct->master)
+ hash = xt_cluster_hash(ct->master, info);
+ else
+ hash = xt_cluster_hash(ct, info);
+
+ return !!((1 << hash) & info->node_mask) ^
+ !!(info->flags & XT_CLUSTER_F_INV);
+}
+
+static bool xt_cluster_mt_checkentry(const struct xt_mtchk_param *par)
+{
+ struct xt_cluster_match_info *info = par->matchinfo;
+
+ if (info->node_mask >= (1 << info->total_nodes)) {
+ printk(KERN_ERR "xt_cluster: this node mask cannot be "
+ "higher than the total number of nodes\n");
+ return false;
+ }
+ return true;
+}
+
+static struct xt_match xt_cluster_match __read_mostly = {
+ .name = "cluster",
+ .family = NFPROTO_UNSPEC,
+ .match = xt_cluster_mt,
+ .checkentry = xt_cluster_mt_checkentry,
+ .matchsize = sizeof(struct xt_cluster_match_info),
+ .me = THIS_MODULE,
+};
+
+static int __init xt_cluster_mt_init(void)
+{
+ return xt_register_match(&xt_cluster_match);
+}
+
+static void __exit xt_cluster_mt_fini(void)
+{
+ xt_unregister_match(&xt_cluster_match);
+}
+
+MODULE_AUTHOR("Pablo Neira Ayuso <pablo@netfilter.org>");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Xtables: hash-based cluster match");
+MODULE_ALIAS("ipt_cluster");
+MODULE_ALIAS("ip6t_cluster");
+module_init(xt_cluster_mt_init);
+module_exit(xt_cluster_mt_fini);
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 36/41: ctnetlink: remove remaining module refcounting
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (34 preceding siblings ...)
2009-03-24 14:03 ` netfilter 35/41: xtables: add cluster match Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 37/41: remove nf_ct_l4proto_find_get/nf_ct_l4proto_put Patrick McHardy
` (5 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit cd91566e4bdbcb8841385e4b2eacc8d0c29c9208
Author: Florian Westphal <fw@strlen.de>
Date: Wed Mar 18 17:28:37 2009 +0100
netfilter: ctnetlink: remove remaining module refcounting
Convert the remaining refcount users.
As pointed out by Patrick McHardy, the protocols can be accessed safely using RCU.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 9fb7cf7..735ea9c 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -599,7 +599,8 @@ ctnetlink_parse_tuple_ip(struct nlattr *attr, struct nf_conntrack_tuple *tuple)
nla_parse_nested(tb, CTA_IP_MAX, attr, NULL);
- l3proto = nf_ct_l3proto_find_get(tuple->src.l3num);
+ rcu_read_lock();
+ l3proto = __nf_ct_l3proto_find(tuple->src.l3num);
if (likely(l3proto->nlattr_to_tuple)) {
ret = nla_validate_nested(attr, CTA_IP_MAX,
@@ -608,7 +609,7 @@ ctnetlink_parse_tuple_ip(struct nlattr *attr, struct nf_conntrack_tuple *tuple)
ret = l3proto->nlattr_to_tuple(tb, tuple);
}
- nf_ct_l3proto_put(l3proto);
+ rcu_read_unlock();
return ret;
}
@@ -633,7 +634,8 @@ ctnetlink_parse_tuple_proto(struct nlattr *attr,
return -EINVAL;
tuple->dst.protonum = nla_get_u8(tb[CTA_PROTO_NUM]);
- l4proto = nf_ct_l4proto_find_get(tuple->src.l3num, tuple->dst.protonum);
+ rcu_read_lock();
+ l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
if (likely(l4proto->nlattr_to_tuple)) {
ret = nla_validate_nested(attr, CTA_PROTO_MAX,
@@ -642,7 +644,7 @@ ctnetlink_parse_tuple_proto(struct nlattr *attr,
ret = l4proto->nlattr_to_tuple(tb, tuple);
}
- nf_ct_l4proto_put(l4proto);
+ rcu_read_unlock();
return ret;
}
@@ -989,10 +991,11 @@ ctnetlink_change_protoinfo(struct nf_conn *ct, struct nlattr *cda[])
nla_parse_nested(tb, CTA_PROTOINFO_MAX, attr, NULL);
- l4proto = nf_ct_l4proto_find_get(nf_ct_l3num(ct), nf_ct_protonum(ct));
+ rcu_read_lock();
+ l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
if (l4proto->from_nlattr)
err = l4proto->from_nlattr(tb, ct);
- nf_ct_l4proto_put(l4proto);
+ rcu_read_unlock();
return err;
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 37/41: remove nf_ct_l4proto_find_get/nf_ct_l4proto_put
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (35 preceding siblings ...)
2009-03-24 14:03 ` netfilter 36/41: ctnetlink: remove remaining module refcounting Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 38/41: ctnetlink: fix rcu context imbalance Patrick McHardy
` (4 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 711d60a9e7f88e394ccca10f5fc83f95f0cea5b1
Author: Florian Westphal <fw@strlen.de>
Date: Wed Mar 18 17:30:50 2009 +0100
netfilter: remove nf_ct_l4proto_find_get/nf_ct_l4proto_put
users have been moved to __nf_ct_l4proto_find.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index 16ab604..b01070b 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -98,11 +98,6 @@ extern struct nf_conntrack_l4proto nf_conntrack_l4proto_generic;
extern struct nf_conntrack_l4proto *
__nf_ct_l4proto_find(u_int16_t l3proto, u_int8_t l4proto);
-extern struct nf_conntrack_l4proto *
-nf_ct_l4proto_find_get(u_int16_t l3proto, u_int8_t protocol);
-
-extern void nf_ct_l4proto_put(struct nf_conntrack_l4proto *p);
-
/* Protocol registration. */
extern int nf_conntrack_l4proto_register(struct nf_conntrack_l4proto *proto);
extern void nf_conntrack_l4proto_unregister(struct nf_conntrack_l4proto *proto);
diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 592d733..9a62b4e 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -74,27 +74,6 @@ EXPORT_SYMBOL_GPL(__nf_ct_l4proto_find);
/* this is guaranteed to always return a valid protocol helper, since
* it falls back to generic_protocol */
-struct nf_conntrack_l4proto *
-nf_ct_l4proto_find_get(u_int16_t l3proto, u_int8_t l4proto)
-{
- struct nf_conntrack_l4proto *p;
-
- rcu_read_lock();
- p = __nf_ct_l4proto_find(l3proto, l4proto);
- if (!try_module_get(p->me))
- p = &nf_conntrack_l4proto_generic;
- rcu_read_unlock();
-
- return p;
-}
-EXPORT_SYMBOL_GPL(nf_ct_l4proto_find_get);
-
-void nf_ct_l4proto_put(struct nf_conntrack_l4proto *p)
-{
- module_put(p->me);
-}
-EXPORT_SYMBOL_GPL(nf_ct_l4proto_put);
-
struct nf_conntrack_l3proto *
nf_ct_l3proto_find_get(u_int16_t l3proto)
{
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 38/41: ctnetlink: fix rcu context imbalance
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (36 preceding siblings ...)
2009-03-24 14:03 ` netfilter 37/41: remove nf_ct_l4proto_find_get/nf_ct_l4proto_put Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:03 ` netfilter 39/41: sysctl support of logger choice Patrick McHardy
` (3 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 0f5b3e85a3716efebb0150ebb7c6d022e2bf17d7
Author: Patrick McHardy <kaber@trash.net>
Date: Wed Mar 18 17:36:40 2009 +0100
netfilter: ctnetlink: fix rcu context imbalance
Introduced by 7ec47496 (netfilter: ctnetlink: cleanup master conntrack assignation):
net/netfilter/nf_conntrack_netlink.c:1275:2: warning: context imbalance in 'ctnetlink_create_conntrack' - different lock contexts for basic block
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 735ea9c..d1fe9d1 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1146,7 +1146,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
return ERR_PTR(-ENOMEM);
if (!cda[CTA_TIMEOUT])
- goto err;
+ goto err1;
ct->timeout.expires = ntohl(nla_get_be32(cda[CTA_TIMEOUT]));
ct->timeout.expires = jiffies + ct->timeout.expires * HZ;
@@ -1157,10 +1157,8 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
char *helpname;
err = ctnetlink_parse_help(cda[CTA_HELP], &helpname);
- if (err < 0) {
- rcu_read_unlock();
- goto err;
- }
+ if (err < 0)
+ goto err2;
helper = __nf_conntrack_helper_find_byname(helpname);
if (helper == NULL) {
@@ -1168,28 +1166,26 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
#ifdef CONFIG_MODULES
if (request_module("nfct-helper-%s", helpname) < 0) {
err = -EOPNOTSUPP;
- goto err;
+ goto err1;
}
rcu_read_lock();
helper = __nf_conntrack_helper_find_byname(helpname);
if (helper) {
- rcu_read_unlock();
err = -EAGAIN;
- goto err;
+ goto err2;
}
rcu_read_unlock();
#endif
err = -EOPNOTSUPP;
- goto err;
+ goto err1;
} else {
struct nf_conn_help *help;
help = nf_ct_helper_ext_add(ct, GFP_ATOMIC);
if (help == NULL) {
- rcu_read_unlock();
err = -ENOMEM;
- goto err;
+ goto err2;
}
/* not in hash table yet so not strictly necessary */
@@ -1198,44 +1194,34 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
} else {
/* try an implicit helper assignation */
err = __nf_ct_try_assign_helper(ct, GFP_ATOMIC);
- if (err < 0) {
- rcu_read_unlock();
- goto err;
- }
+ if (err < 0)
+ goto err2;
}
if (cda[CTA_STATUS]) {
err = ctnetlink_change_status(ct, cda);
- if (err < 0) {
- rcu_read_unlock();
- goto err;
- }
+ if (err < 0)
+ goto err2;
}
if (cda[CTA_NAT_SRC] || cda[CTA_NAT_DST]) {
err = ctnetlink_change_nat(ct, cda);
- if (err < 0) {
- rcu_read_unlock();
- goto err;
- }
+ if (err < 0)
+ goto err2;
}
#ifdef CONFIG_NF_NAT_NEEDED
if (cda[CTA_NAT_SEQ_ADJ_ORIG] || cda[CTA_NAT_SEQ_ADJ_REPLY]) {
err = ctnetlink_change_nat_seq_adj(ct, cda);
- if (err < 0) {
- rcu_read_unlock();
- goto err;
- }
+ if (err < 0)
+ goto err2;
}
#endif
if (cda[CTA_PROTOINFO]) {
err = ctnetlink_change_protoinfo(ct, cda);
- if (err < 0) {
- rcu_read_unlock();
- goto err;
- }
+ if (err < 0)
+ goto err2;
}
nf_ct_acct_ext_add(ct, GFP_ATOMIC);
@@ -1253,12 +1239,12 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
err = ctnetlink_parse_tuple(cda, &master, CTA_TUPLE_MASTER, u3);
if (err < 0)
- goto err;
+ goto err2;
master_h = __nf_conntrack_find(&init_net, &master);
if (master_h == NULL) {
err = -ENOENT;
- goto err;
+ goto err2;
}
master_ct = nf_ct_tuplehash_to_ctrack(master_h);
nf_conntrack_get(&master_ct->ct_general);
@@ -1271,7 +1257,10 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
rcu_read_unlock();
return ct;
-err:
+
+err2:
+ rcu_read_unlock();
+err1:
nf_conntrack_free(ct);
return ERR_PTR(err);
}
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 39/41: sysctl support of logger choice
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (37 preceding siblings ...)
2009-03-24 14:03 ` netfilter 38/41: ctnetlink: fix rcu context imbalance Patrick McHardy
@ 2009-03-24 14:03 ` Patrick McHardy
2009-03-24 14:04 ` nefilter 40/41: nfnetlink: add nfnetlink_set_err and use it in ctnetlink Patrick McHardy
` (2 subsequent siblings)
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:03 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 176252746ebbc8db97e304345af1f2563c7dc139
Author: Eric Leblond <eric@inl.fr>
Date: Mon Mar 23 13:16:53 2009 +0100
netfilter: sysctl support of logger choice
This patchs adds support of modification of the used logger via sysctl.
It can be used to change the logger to module that can not use the bind
operation (ipt_LOG and ipt_ULOG). For this purpose, it creates a
directory /proc/sys/net/netfilter/nf_log which contains a file
per-protocol. The content of the file is the name current logger (NONE if
not set) and a logger can be setup by simply echoing its name to the file.
By echoing "NONE" to a /proc/sys/net/netfilter/nf_log/PROTO file, the
logger corresponding to this PROTO is set to NULL.
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 4fcbcc7..8bb998f 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -14,6 +14,7 @@
LOG target modules */
#define NF_LOG_PREFIXLEN 128
+#define NFLOGGER_NAME_LEN 64
static const struct nf_logger *nf_loggers[NFPROTO_NUMPROTO] __read_mostly;
static struct list_head nf_loggers_l[NFPROTO_NUMPROTO] __read_mostly;
@@ -207,18 +208,100 @@ static const struct file_operations nflog_file_ops = {
.release = seq_release,
};
+
#endif /* PROC_FS */
+#ifdef CONFIG_SYSCTL
+struct ctl_path nf_log_sysctl_path[] = {
+ { .procname = "net", .ctl_name = CTL_NET, },
+ { .procname = "netfilter", .ctl_name = NET_NETFILTER, },
+ { .procname = "nf_log", .ctl_name = CTL_UNNUMBERED, },
+ { }
+};
+
+static char nf_log_sysctl_fnames[NFPROTO_NUMPROTO-NFPROTO_UNSPEC][3];
+static struct ctl_table nf_log_sysctl_table[NFPROTO_NUMPROTO+1];
+static struct ctl_table_header *nf_log_dir_header;
-int __init netfilter_log_init(void)
+static int nf_log_proc_dostring(ctl_table *table, int write, struct file *filp,
+ void *buffer, size_t *lenp, loff_t *ppos)
+{
+ const struct nf_logger *logger;
+ int r = 0;
+ int tindex = (unsigned long)table->extra1;
+
+ if (write) {
+ if (!strcmp(buffer, "NONE")) {
+ nf_log_unbind_pf(tindex);
+ return 0;
+ }
+ mutex_lock(&nf_log_mutex);
+ logger = __find_logger(tindex, buffer);
+ if (logger == NULL) {
+ mutex_unlock(&nf_log_mutex);
+ return -ENOENT;
+ }
+ rcu_assign_pointer(nf_loggers[tindex], logger);
+ mutex_unlock(&nf_log_mutex);
+ } else {
+ rcu_read_lock();
+ logger = rcu_dereference(nf_loggers[tindex]);
+ if (!logger)
+ table->data = "NONE";
+ else
+ table->data = logger->name;
+ r = proc_dostring(table, write, filp, buffer, lenp, ppos);
+ rcu_read_unlock();
+ }
+
+ return r;
+}
+
+static __init int netfilter_log_sysctl_init(void)
{
int i;
+
+ for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++) {
+ snprintf(nf_log_sysctl_fnames[i-NFPROTO_UNSPEC], 3, "%d", i);
+ nf_log_sysctl_table[i].ctl_name = CTL_UNNUMBERED;
+ nf_log_sysctl_table[i].procname =
+ nf_log_sysctl_fnames[i-NFPROTO_UNSPEC];
+ nf_log_sysctl_table[i].data = NULL;
+ nf_log_sysctl_table[i].maxlen =
+ NFLOGGER_NAME_LEN * sizeof(char);
+ nf_log_sysctl_table[i].mode = 0644;
+ nf_log_sysctl_table[i].proc_handler = nf_log_proc_dostring;
+ nf_log_sysctl_table[i].extra1 = (void *)(unsigned long) i;
+ }
+
+ nf_log_dir_header = register_sysctl_paths(nf_log_sysctl_path,
+ nf_log_sysctl_table);
+ if (!nf_log_dir_header)
+ return -ENOMEM;
+
+ return 0;
+}
+#else
+static __init int netfilter_log_sysctl_init(void)
+{
+ return 0;
+}
+#endif /* CONFIG_SYSCTL */
+
+int __init netfilter_log_init(void)
+{
+ int i, r;
#ifdef CONFIG_PROC_FS
if (!proc_create("nf_log", S_IRUGO,
proc_net_netfilter, &nflog_file_ops))
return -1;
#endif
+ /* Errors will trigger panic, unroll on error is unnecessary. */
+ r = netfilter_log_sysctl_init();
+ if (r < 0)
+ return r;
+
for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++)
INIT_LIST_HEAD(&(nf_loggers_l[i]));
^ permalink raw reply related [flat|nested] 58+ messages in thread* nefilter 40/41: nfnetlink: add nfnetlink_set_err and use it in ctnetlink
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (38 preceding siblings ...)
2009-03-24 14:03 ` netfilter 39/41: sysctl support of logger choice Patrick McHardy
@ 2009-03-24 14:04 ` Patrick McHardy
2009-03-24 14:04 ` netfilter 41/41: nf_conntrack: Reduce conntrack count in nf_conntrack_free() Patrick McHardy
2009-03-24 20:26 ` netfilter 00/41: Netfilter update for 2.6.30 David Miller
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:04 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit dd5b6ce6fd465eab90357711c8e8124dc3a31ff0
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Mar 23 13:21:06 2009 +0100
nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlink
This patch adds nfnetlink_set_err() to propagate the error to netlink
broadcast listener in case of memory allocation errors in the
message building.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h
index 7d8e045..135e5cf 100644
--- a/include/linux/netfilter/nfnetlink.h
+++ b/include/linux/netfilter/nfnetlink.h
@@ -76,6 +76,7 @@ extern int nfnetlink_subsys_unregister(const struct nfnetlink_subsystem *n);
extern int nfnetlink_has_listeners(unsigned int group);
extern int nfnetlink_send(struct sk_buff *skb, u32 pid, unsigned group,
int echo);
+extern void nfnetlink_set_err(u32 pid, u32 group, int error);
extern int nfnetlink_unicast(struct sk_buff *skb, u_int32_t pid, int flags);
extern void nfnl_lock(void);
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index d1fe9d1..1b75c9e 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -518,6 +518,7 @@ static int ctnetlink_conntrack_event(struct notifier_block *this,
nla_put_failure:
rcu_read_unlock();
nlmsg_failure:
+ nfnetlink_set_err(0, group, -ENOBUFS);
kfree_skb(skb);
return NOTIFY_DONE;
}
@@ -1514,6 +1515,7 @@ static int ctnetlink_expect_event(struct notifier_block *this,
nla_put_failure:
rcu_read_unlock();
nlmsg_failure:
+ nfnetlink_set_err(0, 0, -ENOBUFS);
kfree_skb(skb);
return NOTIFY_DONE;
}
diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 9c0ba17..2785d66 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -113,6 +113,12 @@ int nfnetlink_send(struct sk_buff *skb, u32 pid, unsigned group, int echo)
}
EXPORT_SYMBOL_GPL(nfnetlink_send);
+void nfnetlink_set_err(u32 pid, u32 group, int error)
+{
+ netlink_set_err(nfnl, pid, group, error);
+}
+EXPORT_SYMBOL_GPL(nfnetlink_set_err);
+
int nfnetlink_unicast(struct sk_buff *skb, u_int32_t pid, int flags)
{
return netlink_unicast(nfnl, skb, pid, flags);
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 6ee69c2..5b33879 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1106,6 +1106,7 @@ void netlink_set_err(struct sock *ssk, u32 pid, u32 group, int code)
read_unlock(&nl_table_lock);
}
+EXPORT_SYMBOL(netlink_set_err);
/* must be called with netlink table grabbed */
static void netlink_update_socket_mc(struct netlink_sock *nlk,
^ permalink raw reply related [flat|nested] 58+ messages in thread* netfilter 41/41: nf_conntrack: Reduce conntrack count in nf_conntrack_free()
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (39 preceding siblings ...)
2009-03-24 14:04 ` nefilter 40/41: nfnetlink: add nfnetlink_set_err and use it in ctnetlink Patrick McHardy
@ 2009-03-24 14:04 ` Patrick McHardy
2009-03-24 20:26 ` netfilter 00/41: Netfilter update for 2.6.30 David Miller
41 siblings, 0 replies; 58+ messages in thread
From: Patrick McHardy @ 2009-03-24 14:04 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 1d45209d89e647e9f27e4afa1f47338df73bc112
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue Mar 24 14:26:50 2009 +0100
netfilter: nf_conntrack: Reduce conntrack count in nf_conntrack_free()
We use RCU to defer freeing of conntrack structures. In DOS situation, RCU might
accumulate about 10.000 elements per CPU in its internal queues. To get accurate
conntrack counts (at the expense of slightly more RAM used), we might consider
conntrack counter not taking into account "about to be freed elements, waiting
in RCU queues". We thus decrement it in nf_conntrack_free(), not in the RCU
callback.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index ebc2756..55befe5 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -517,16 +517,17 @@ EXPORT_SYMBOL_GPL(nf_conntrack_alloc);
static void nf_conntrack_free_rcu(struct rcu_head *head)
{
struct nf_conn *ct = container_of(head, struct nf_conn, rcu);
- struct net *net = nf_ct_net(ct);
nf_ct_ext_free(ct);
kmem_cache_free(nf_conntrack_cachep, ct);
- atomic_dec(&net->ct.count);
}
void nf_conntrack_free(struct nf_conn *ct)
{
+ struct net *net = nf_ct_net(ct);
+
nf_ct_ext_destroy(ct);
+ atomic_dec(&net->ct.count);
call_rcu(&ct->rcu, nf_conntrack_free_rcu);
}
EXPORT_SYMBOL_GPL(nf_conntrack_free);
^ permalink raw reply related [flat|nested] 58+ messages in thread* Re: netfilter 00/41: Netfilter update for 2.6.30
2009-03-24 14:03 netfilter 00/41: Netfilter update for 2.6.30 Patrick McHardy
` (40 preceding siblings ...)
2009-03-24 14:04 ` netfilter 41/41: nf_conntrack: Reduce conntrack count in nf_conntrack_free() Patrick McHardy
@ 2009-03-24 20:26 ` David Miller
2009-03-25 16:29 ` Patrick McHardy
41 siblings, 1 reply; 58+ messages in thread
From: David Miller @ 2009-03-24 20:26 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel
From: Patrick McHardy <kaber@trash.net>
Date: Tue, 24 Mar 2009 15:03:06 +0100 (MET)
> Please apply or pull from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6.git
I'm going to pull this all in, thanks Patrick.
I'll reply to a few patches for which I'd like to see some
followup discussion and change.
Thanks.
^ permalink raw reply [flat|nested] 58+ messages in thread