* netfilter 00/12: Netfilter fixes/2.6.30 update part II
@ 2009-03-26 19:02 Patrick McHardy
2009-03-26 19:02 ` netfilter 01/12: fix xt_LED build failure Patrick McHardy
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
Hi Dave,
following are a few late netfilter patches and fixes for 2.6.30, containing:
- Eric's patch to use SLAB_DESTROY_BY_RCU in conntrack, which reduces
the conntrack size and avoids temporarily exceeding the configured
maximum amount of entries before the RCU threshold kicks in.
- another patch from Eric to factorize the optimized ifname comparisons
- a fix from Eric to use hlist_add_head_rcu in nf_conntrack_set_hashsize()
to avoid a race condition
- a number of patches from Holger Eitzenberger to perform approximately
correct allocation (might overshoot by a bit) for ctnetlink event
messages to avoid reallocation in netlink_trim(). According to some
benchmarks by Pablo. this increases throughput by about 10% in an
connection intensive workload.
- a patch fixing a build-failure in the new LED target
- a patch from Francis Dupont to fix an old regression in the *tables
loop detection. Slightly modified and ported to ip6_tables and
arp_tables by myself.
Please apply or pull from:
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6.git
Thanks!
include/linux/netfilter/x_tables.h | 23 ++++
include/net/netfilter/nf_conntrack.h | 14 ++-
include/net/netfilter/nf_conntrack_helper.h | 2 +
include/net/netfilter/nf_conntrack_l3proto.h | 7 +
include/net/netfilter/nf_conntrack_l4proto.h | 7 +
include/net/netfilter/nf_conntrack_tuple.h | 6 +-
include/net/netlink.h | 1 +
include/net/netns/conntrack.h | 5 +-
net/ipv4/netfilter/arp_tables.c | 18 +--
net/ipv4/netfilter/ip_tables.c | 27 +----
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 6 +
.../netfilter/nf_conntrack_l3proto_ipv4_compat.c | 63 ++++++----
net/ipv4/netfilter/nf_conntrack_proto_icmp.c | 6 +
net/ipv4/netfilter/nf_nat_core.c | 2 +-
net/ipv6/netfilter/ip6_tables.c | 27 +----
net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 6 +
net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c | 6 +
net/netfilter/Kconfig | 2 +-
net/netfilter/nf_conntrack_core.c | 129 ++++++++++++--------
net/netfilter/nf_conntrack_expect.c | 2 +-
net/netfilter/nf_conntrack_helper.c | 8 +-
net/netfilter/nf_conntrack_netlink.c | 94 +++++++++++++--
net/netfilter/nf_conntrack_proto.c | 16 +++
net/netfilter/nf_conntrack_proto_dccp.c | 9 ++
net/netfilter/nf_conntrack_proto_gre.c | 1 +
net/netfilter/nf_conntrack_proto_sctp.c | 10 ++
net/netfilter/nf_conntrack_proto_tcp.c | 15 +++
net/netfilter/nf_conntrack_proto_udp.c | 2 +
net/netfilter/nf_conntrack_proto_udplite.c | 1 +
net/netfilter/nf_conntrack_standalone.c | 57 +++++----
net/netfilter/xt_connlimit.c | 6 +-
net/netfilter/xt_physdev.c | 21 +---
net/netlink/attr.c | 27 ++++
33 files changed, 416 insertions(+), 210 deletions(-)
Eric Dumazet (3):
netfilter: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize()
netfilter: factorize ifname_compare()
netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()
Holger Eitzenberger (7):
netfilter: ctnetlink: add callbacks to the per-proto nlattrs
netlink: add nla_policy_len()
netfilter: limit the length of the helper name
netfilter: ctnetlink: allocate right-sized ctnetlink skb
netfilter: nf_conntrack: add generic function to get len of generic policy
netfilter: nf_conntrack: calculate per-protocol nlattr size
ctnetlink: compute generic part of event more acurately
Patrick McHardy (2):
netfilter: fix xt_LED build failure
netfilter: {ip,ip6,arp}_tables: fix incorrect loop detection
^ permalink raw reply [flat|nested] 14+ messages in thread
* netfilter 01/12: fix xt_LED build failure
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 02/12: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize() Patrick McHardy
` (11 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit a9a9adfe2f99ddadfb574a098392a007970a1577
Author: Patrick McHardy <kaber@trash.net>
Date: Wed Mar 25 17:21:34 2009 +0100
netfilter: fix xt_LED build failure
net/netfilter/xt_LED.c:40: error: field netfilter_led_trigger has incomplete type
net/netfilter/xt_LED.c: In function led_timeout_callback:
net/netfilter/xt_LED.c:78: warning: unused variable ledinternal
net/netfilter/xt_LED.c: In function led_tg_check:
net/netfilter/xt_LED.c:102: error: implicit declaration of function led_trigger_register
net/netfilter/xt_LED.c: In function led_tg_destroy:
net/netfilter/xt_LED.c:135: error: implicit declaration of function led_trigger_unregister
Fix by adding a dependency on LED_TRIGGERS.
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Tested-by: Subrata Modak <tosubrata@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 2562d05..2c967e4 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -374,7 +374,7 @@ config NETFILTER_XT_TARGET_HL
config NETFILTER_XT_TARGET_LED
tristate '"LED" target support'
- depends on LEDS_CLASS
+ depends on LEDS_CLASS && LED_TRIGGERS
depends on NETFILTER_ADVANCED
help
This option adds a `LED' target, which allows you to blink LEDs in
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 02/12: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize()
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
2009-03-26 19:02 ` netfilter 01/12: fix xt_LED build failure Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 03/12: factorize ifname_compare() Patrick McHardy
` (10 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 78f3648601fdc7a8166748bbd6d0555a88efa24a
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed Mar 25 17:24:34 2009 +0100
netfilter: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize()
Using hlist_add_head() in nf_conntrack_set_hashsize() is quite dangerous.
Without any barrier, one CPU could see a loop while doing its lookup.
Its true new table cannot be seen by another cpu, but previous table is still
readable.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 55befe5..54e983f 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1121,7 +1121,7 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
struct nf_conntrack_tuple_hash, hnode);
hlist_del_rcu(&h->hnode);
bucket = __hash_conntrack(&h->tuple, hashsize, rnd);
- hlist_add_head(&h->hnode, &hash[bucket]);
+ hlist_add_head_rcu(&h->hnode, &hash[bucket]);
}
}
old_size = nf_conntrack_htable_size;
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 03/12: factorize ifname_compare()
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
2009-03-26 19:02 ` netfilter 01/12: fix xt_LED build failure Patrick McHardy
2009-03-26 19:02 ` netfilter 02/12: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize() Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 04/12: ctnetlink: add callbacks to the per-proto nlattrs Patrick McHardy
` (9 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit b8dfe498775de912116f275680ddb57c8799d9ef
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed Mar 25 17:31:52 2009 +0100
netfilter: factorize ifname_compare()
We use same not trivial helper function in four places. We can factorize it.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index e8e08d0..72918b7 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -435,6 +435,29 @@ extern void xt_free_table_info(struct xt_table_info *info);
extern void xt_table_entry_swap_rcu(struct xt_table_info *old,
struct xt_table_info *new);
+/*
+ * This helper is performance critical and must be inlined
+ */
+static inline unsigned long ifname_compare_aligned(const char *_a,
+ const char *_b,
+ const char *_mask)
+{
+ const unsigned long *a = (const unsigned long *)_a;
+ const unsigned long *b = (const unsigned long *)_b;
+ const unsigned long *mask = (const unsigned long *)_mask;
+ unsigned long ret;
+
+ ret = (a[0] ^ b[0]) & mask[0];
+ if (IFNAMSIZ > sizeof(unsigned long))
+ ret |= (a[1] ^ b[1]) & mask[1];
+ if (IFNAMSIZ > 2 * sizeof(unsigned long))
+ ret |= (a[2] ^ b[2]) & mask[2];
+ if (IFNAMSIZ > 3 * sizeof(unsigned long))
+ ret |= (a[3] ^ b[3]) & mask[3];
+ BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ return ret;
+}
+
#ifdef CONFIG_COMPAT
#include <net/compat.h>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 64a7c6c..4b35dba 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -80,19 +80,7 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
{
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+ unsigned long ret = ifname_compare_aligned(_a, _b, _mask);
#else
unsigned long ret = 0;
int i;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index e5294ae..41c59e3 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -74,25 +74,6 @@ do { \
Hence the start of any table is given by get_table() below. */
-static unsigned long ifname_compare(const char *_a, const char *_b,
- const unsigned char *_mask)
-{
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
- return ret;
-}
-
/* Returns whether matches rule or not. */
/* Performance critical - called for every packet */
static inline bool
@@ -121,7 +102,7 @@ ip_packet_match(const struct iphdr *ip,
return false;
}
- ret = ifname_compare(indev, ipinfo->iniface, ipinfo->iniface_mask);
+ ret = ifname_compare_aligned(indev, ipinfo->iniface, ipinfo->iniface_mask);
if (FWINV(ret != 0, IPT_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -130,7 +111,7 @@ ip_packet_match(const struct iphdr *ip,
return false;
}
- ret = ifname_compare(outdev, ipinfo->outiface, ipinfo->outiface_mask);
+ ret = ifname_compare_aligned(outdev, ipinfo->outiface, ipinfo->outiface_mask);
if (FWINV(ret != 0, IPT_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 34af7bb..e59662b 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -89,25 +89,6 @@ ip6t_ext_hdr(u8 nexthdr)
(nexthdr == IPPROTO_DSTOPTS) );
}
-static unsigned long ifname_compare(const char *_a, const char *_b,
- const unsigned char *_mask)
-{
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
- return ret;
-}
-
/* Returns whether matches rule or not. */
/* Performance critical - called for every packet */
static inline bool
@@ -138,7 +119,7 @@ ip6_packet_match(const struct sk_buff *skb,
return false;
}
- ret = ifname_compare(indev, ip6info->iniface, ip6info->iniface_mask);
+ ret = ifname_compare_aligned(indev, ip6info->iniface, ip6info->iniface_mask);
if (FWINV(ret != 0, IP6T_INV_VIA_IN)) {
dprintf("VIA in mismatch (%s vs %s).%s\n",
@@ -147,7 +128,7 @@ ip6_packet_match(const struct sk_buff *skb,
return false;
}
- ret = ifname_compare(outdev, ip6info->outiface, ip6info->outiface_mask);
+ ret = ifname_compare_aligned(outdev, ip6info->outiface, ip6info->outiface_mask);
if (FWINV(ret != 0, IP6T_INV_VIA_OUT)) {
dprintf("VIA out mismatch (%s vs %s).%s\n",
diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c
index 44a234e..8d28ca5 100644
--- a/net/netfilter/xt_physdev.c
+++ b/net/netfilter/xt_physdev.c
@@ -20,23 +20,6 @@ MODULE_DESCRIPTION("Xtables: Bridge physical device match");
MODULE_ALIAS("ipt_physdev");
MODULE_ALIAS("ip6t_physdev");
-static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
-{
- const unsigned long *a = (const unsigned long *)_a;
- const unsigned long *b = (const unsigned long *)_b;
- const unsigned long *mask = (const unsigned long *)_mask;
- unsigned long ret;
-
- ret = (a[0] ^ b[0]) & mask[0];
- if (IFNAMSIZ > sizeof(unsigned long))
- ret |= (a[1] ^ b[1]) & mask[1];
- if (IFNAMSIZ > 2 * sizeof(unsigned long))
- ret |= (a[2] ^ b[2]) & mask[2];
- if (IFNAMSIZ > 3 * sizeof(unsigned long))
- ret |= (a[3] ^ b[3]) & mask[3];
- BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
- return ret;
-}
static bool
physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
@@ -85,7 +68,7 @@ physdev_mt(const struct sk_buff *skb, const struct xt_match_param *par)
if (!(info->bitmask & XT_PHYSDEV_OP_IN))
goto match_outdev;
indev = nf_bridge->physindev ? nf_bridge->physindev->name : nulldevname;
- ret = ifname_compare(indev, info->physindev, info->in_mask);
+ ret = ifname_compare_aligned(indev, info->physindev, info->in_mask);
if (!ret ^ !(info->invert & XT_PHYSDEV_OP_IN))
return false;
@@ -95,7 +78,7 @@ match_outdev:
return true;
outdev = nf_bridge->physoutdev ?
nf_bridge->physoutdev->name : nulldevname;
- ret = ifname_compare(outdev, info->physoutdev, info->out_mask);
+ ret = ifname_compare_aligned(outdev, info->physoutdev, info->out_mask);
return (!!ret ^ !(info->invert & XT_PHYSDEV_OP_OUT));
}
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 04/12: ctnetlink: add callbacks to the per-proto nlattrs
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (2 preceding siblings ...)
2009-03-26 19:02 ` netfilter 03/12: factorize ifname_compare() Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netlink 05/12: add nla_policy_len() Patrick McHardy
` (8 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit d0dba7255b541f1651a88e75ebdb20dd45509c2f
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Wed Mar 25 18:24:48 2009 +0100
netfilter: ctnetlink: add callbacks to the per-proto nlattrs
There is added a single callback for the l3 proto helper. The two
callbacks for the l4 protos are necessary because of the general
structure of a ctnetlink event, which is in short:
CTA_TUPLE_ORIG
<l3/l4-proto-attributes>
CTA_TUPLE_REPLY
<l3/l4-proto-attributes>
CTA_ID
...
CTA_PROTOINFO
<l4-proto-attributes>
CTA_TUPLE_MASTER
<l3/l4-proto-attributes>
Therefore the formular is
size := sizeof(generic-nlas) + 3 * sizeof(tuple_nlas) + sizeof(protoinfo_nlas)
Some of the NLAs are optional, e. g. CTA_TUPLE_MASTER, which is only
set if it's an expected connection. But the number of optional NLAs is
small enough to prevent netlink_trim() from reallocating if calculated
properly.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack_l3proto.h b/include/net/netfilter/nf_conntrack_l3proto.h
index 0378676..9f99d36 100644
--- a/include/net/netfilter/nf_conntrack_l3proto.h
+++ b/include/net/netfilter/nf_conntrack_l3proto.h
@@ -53,10 +53,17 @@ struct nf_conntrack_l3proto
int (*tuple_to_nlattr)(struct sk_buff *skb,
const struct nf_conntrack_tuple *t);
+ /*
+ * Calculate size of tuple nlattr
+ */
+ int (*nlattr_tuple_size)(void);
+
int (*nlattr_to_tuple)(struct nlattr *tb[],
struct nf_conntrack_tuple *t);
const struct nla_policy *nla_policy;
+ size_t nla_size;
+
#ifdef CONFIG_SYSCTL
struct ctl_table_header *ctl_table_header;
struct ctl_path *ctl_table_path;
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index b01070b..a120990 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -64,16 +64,22 @@ struct nf_conntrack_l4proto
/* convert protoinfo to nfnetink attributes */
int (*to_nlattr)(struct sk_buff *skb, struct nlattr *nla,
const struct nf_conn *ct);
+ /* Calculate protoinfo nlattr size */
+ int (*nlattr_size)(void);
/* convert nfnetlink attributes to protoinfo */
int (*from_nlattr)(struct nlattr *tb[], struct nf_conn *ct);
int (*tuple_to_nlattr)(struct sk_buff *skb,
const struct nf_conntrack_tuple *t);
+ /* Calculate tuple nlattr size */
+ int (*nlattr_tuple_size)(void);
int (*nlattr_to_tuple)(struct nlattr *tb[],
struct nf_conntrack_tuple *t);
const struct nla_policy *nla_policy;
+ size_t nla_size;
+
#ifdef CONFIG_SYSCTL
struct ctl_table_header **ctl_table_header;
struct ctl_table *ctl_table;
diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 9a62b4e..1a4568b 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -167,6 +167,9 @@ int nf_conntrack_l3proto_register(struct nf_conntrack_l3proto *proto)
if (proto->l3proto >= AF_MAX)
return -EBUSY;
+ if (proto->tuple_to_nlattr && !proto->nlattr_tuple_size)
+ return -EINVAL;
+
mutex_lock(&nf_ct_proto_mutex);
if (nf_ct_l3protos[proto->l3proto] != &nf_conntrack_l3proto_generic) {
ret = -EBUSY;
@@ -177,6 +180,9 @@ int nf_conntrack_l3proto_register(struct nf_conntrack_l3proto *proto)
if (ret < 0)
goto out_unlock;
+ if (proto->nlattr_tuple_size)
+ proto->nla_size = 3 * proto->nlattr_tuple_size();
+
rcu_assign_pointer(nf_ct_l3protos[proto->l3proto], proto);
out_unlock:
@@ -263,6 +269,10 @@ int nf_conntrack_l4proto_register(struct nf_conntrack_l4proto *l4proto)
if (l4proto->l3proto >= PF_MAX)
return -EBUSY;
+ if ((l4proto->to_nlattr && !l4proto->nlattr_size)
+ || (l4proto->tuple_to_nlattr && !l4proto->nlattr_tuple_size))
+ return -EINVAL;
+
mutex_lock(&nf_ct_proto_mutex);
if (!nf_ct_protos[l4proto->l3proto]) {
/* l3proto may be loaded latter. */
@@ -290,6 +300,12 @@ int nf_conntrack_l4proto_register(struct nf_conntrack_l4proto *l4proto)
if (ret < 0)
goto out_unlock;
+ l4proto->nla_size = 0;
+ if (l4proto->nlattr_size)
+ l4proto->nla_size += l4proto->nlattr_size();
+ if (l4proto->nlattr_tuple_size)
+ l4proto->nla_size += 3 * l4proto->nlattr_tuple_size();
+
rcu_assign_pointer(nf_ct_protos[l4proto->l3proto][l4proto->l4proto],
l4proto);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netlink 05/12: add nla_policy_len()
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (3 preceding siblings ...)
2009-03-26 19:02 ` netfilter 04/12: ctnetlink: add callbacks to the per-proto nlattrs Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 06/12: limit the length of the helper name Patrick McHardy
` (7 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit e487eb99cf9381a4f8254fa01747a85818da612b
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Wed Mar 25 18:26:30 2009 +0100
netlink: add nla_policy_len()
It calculates the max. length of a Netlink policy, which is usefull
for allocating Netlink buffers roughly the size of the actual
message.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netlink.h b/include/net/netlink.h
index 8a6150a..eddb502 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -230,6 +230,7 @@ extern int nla_validate(struct nlattr *head, int len, int maxtype,
extern int nla_parse(struct nlattr *tb[], int maxtype,
struct nlattr *head, int len,
const struct nla_policy *policy);
+extern int nla_policy_len(const struct nla_policy *, int);
extern struct nlattr * nla_find(struct nlattr *head, int len, int attrtype);
extern size_t nla_strlcpy(char *dst, const struct nlattr *nla,
size_t dstsize);
diff --git a/net/netlink/attr.c b/net/netlink/attr.c
index 56c3ce7..ae32c57 100644
--- a/net/netlink/attr.c
+++ b/net/netlink/attr.c
@@ -133,6 +133,32 @@ errout:
}
/**
+ * nla_policy_len - Determin the max. length of a policy
+ * @policy: policy to use
+ * @n: number of policies
+ *
+ * Determines the max. length of the policy. It is currently used
+ * to allocated Netlink buffers roughly the size of the actual
+ * message.
+ *
+ * Returns 0 on success or a negative error code.
+ */
+int
+nla_policy_len(const struct nla_policy *p, int n)
+{
+ int i, len = 0;
+
+ for (i = 0; i < n; i++) {
+ if (p->len)
+ len += nla_total_size(p->len);
+ else if (nla_attr_minlen[p->type])
+ len += nla_total_size(nla_attr_minlen[p->type]);
+ }
+
+ return len;
+}
+
+/**
* nla_parse - Parse a stream of attributes into a tb buffer
* @tb: destination array with maxtype+1 elements
* @maxtype: maximum attribute type to be expected
@@ -456,6 +482,7 @@ int nla_append(struct sk_buff *skb, int attrlen, const void *data)
}
EXPORT_SYMBOL(nla_validate);
+EXPORT_SYMBOL(nla_policy_len);
EXPORT_SYMBOL(nla_parse);
EXPORT_SYMBOL(nla_find);
EXPORT_SYMBOL(nla_strlcpy);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 06/12: limit the length of the helper name
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (4 preceding siblings ...)
2009-03-26 19:02 ` netlink 05/12: add nla_policy_len() Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 07/12: {ip,ip6,arp}_tables: fix incorrect loop detection Patrick McHardy
` (6 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit af9d32ad6718b9a80fa89f557cc1fbb63a93ec15
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Wed Mar 25 18:44:01 2009 +0100
netfilter: limit the length of the helper name
This is necessary in order to have an upper bound for Netlink
message calculation, which is not a problem at all, as there
are no helpers with a longer name.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 66d65a7..ee2a4b3 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -14,6 +14,8 @@
struct module;
+#define NF_CT_HELPER_NAME_LEN 16
+
struct nf_conntrack_helper
{
struct hlist_node hnode; /* Internal use. */
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index a51bdac..805cfdd 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -142,6 +142,7 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
BUG_ON(me->expect_policy == NULL);
BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
+ BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
mutex_lock(&nf_ct_helper_mutex);
hlist_add_head_rcu(&me->hnode, &nf_ct_helper_hash[h]);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 07/12: {ip,ip6,arp}_tables: fix incorrect loop detection
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (5 preceding siblings ...)
2009-03-26 19:02 ` netfilter 06/12: limit the length of the helper name Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 08/12: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu() Patrick McHardy
` (5 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 1f9352ae2253a97b07b34dcf16ffa3b4ca12c558
Author: Patrick McHardy <kaber@trash.net>
Date: Wed Mar 25 19:26:35 2009 +0100
netfilter: {ip,ip6,arp}_tables: fix incorrect loop detection
Commit e1b4b9f ([NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case
search for loops) introduced a regression in the loop detection algorithm,
causing sporadic incorrectly detected loops.
When a chain has already been visited during the check, it is treated as
having a standard target containing a RETURN verdict directly at the
beginning in order to not check it again. The real target of the first
rule is then incorrectly treated as STANDARD target and checked not to
contain invalid verdicts.
Fix by making sure the rule does actually contain a standard target.
Based on patch by Francis Dupont <Francis_Dupont@isc.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 4b35dba..4f454ce 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -388,7 +388,9 @@ static int mark_source_chains(struct xt_table_info *newinfo,
&& unconditional(&e->arp)) || visited) {
unsigned int oldpos, size;
- if (t->verdict < -NF_MAX_VERDICT - 1) {
+ if ((strcmp(t->target.u.user.name,
+ ARPT_STANDARD_TARGET) == 0) &&
+ t->verdict < -NF_MAX_VERDICT - 1) {
duprintf("mark_source_chains: bad "
"negative verdict (%i)\n",
t->verdict);
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 41c59e3..82ee7c9 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -488,7 +488,9 @@ mark_source_chains(struct xt_table_info *newinfo,
&& unconditional(&e->ip)) || visited) {
unsigned int oldpos, size;
- if (t->verdict < -NF_MAX_VERDICT - 1) {
+ if ((strcmp(t->target.u.user.name,
+ IPT_STANDARD_TARGET) == 0) &&
+ t->verdict < -NF_MAX_VERDICT - 1) {
duprintf("mark_source_chains: bad "
"negative verdict (%i)\n",
t->verdict);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index e59662b..e89cfa3 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -517,7 +517,9 @@ mark_source_chains(struct xt_table_info *newinfo,
&& unconditional(&e->ipv6)) || visited) {
unsigned int oldpos, size;
- if (t->verdict < -NF_MAX_VERDICT - 1) {
+ if ((strcmp(t->target.u.user.name,
+ IP6T_STANDARD_TARGET) == 0) &&
+ t->verdict < -NF_MAX_VERDICT - 1) {
duprintf("mark_source_chains: bad "
"negative verdict (%i)\n",
t->verdict);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 08/12: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (6 preceding siblings ...)
2009-03-26 19:02 ` netfilter 07/12: {ip,ip6,arp}_tables: fix incorrect loop detection Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 09/12: ctnetlink: allocate right-sized ctnetlink skb Patrick McHardy
` (4 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit ea781f197d6a835cbb93a0bf88ee1696296ed8aa
Author: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed Mar 25 21:05:46 2009 +0100
netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()
Use "hlist_nulls" infrastructure we added in 2.6.29 for RCUification of UDP & TCP.
This permits an easy conversion from call_rcu() based hash lists to a
SLAB_DESTROY_BY_RCU one.
Avoiding call_rcu() delay at nf_conn freeing time has numerous gains.
First, it doesnt fill RCU queues (up to 10000 elements per cpu).
This reduces OOM possibility, if queued elements are not taken into account
This reduces latency problems when RCU queue size hits hilimit and triggers
emergency mode.
- It allows fast reuse of just freed elements, permitting better use of
CPU cache.
- We delete rcu_head from "struct nf_conn", shrinking size of this structure
by 8 or 16 bytes.
This patch only takes care of "struct nf_conn".
call_rcu() is still used for less critical conntrack parts, that may
be converted later if necessary.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 4dfb793..6c3f964 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -91,8 +91,7 @@ struct nf_conn_help {
#include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
#include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
-struct nf_conn
-{
+struct nf_conn {
/* Usage count in here is 1 for hash table/destruct timer, 1 per skb,
plus 1 for any connection(s) we are `master' for */
struct nf_conntrack ct_general;
@@ -126,7 +125,6 @@ struct nf_conn
#ifdef CONFIG_NET_NS
struct net *ct_net;
#endif
- struct rcu_head rcu;
};
static inline struct nf_conn *
@@ -190,9 +188,13 @@ static inline void nf_ct_put(struct nf_conn *ct)
extern int nf_ct_l3proto_try_module_get(unsigned short l3proto);
extern void nf_ct_l3proto_module_put(unsigned short l3proto);
-extern struct hlist_head *nf_ct_alloc_hashtable(unsigned int *sizep, int *vmalloced);
-extern void nf_ct_free_hashtable(struct hlist_head *hash, int vmalloced,
- unsigned int size);
+/*
+ * Allocate a hashtable of hlist_head (if nulls == 0),
+ * or hlist_nulls_head (if nulls == 1)
+ */
+extern void *nf_ct_alloc_hashtable(unsigned int *sizep, int *vmalloced, int nulls);
+
+extern void nf_ct_free_hashtable(void *hash, int vmalloced, unsigned int size);
extern struct nf_conntrack_tuple_hash *
__nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple *tuple);
diff --git a/include/net/netfilter/nf_conntrack_tuple.h b/include/net/netfilter/nf_conntrack_tuple.h
index f2f6aa7..2628c15 100644
--- a/include/net/netfilter/nf_conntrack_tuple.h
+++ b/include/net/netfilter/nf_conntrack_tuple.h
@@ -12,6 +12,7 @@
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter/nf_conntrack_tuple_common.h>
+#include <linux/list_nulls.h>
/* A `tuple' is a structure containing the information to uniquely
identify a connection. ie. if two packets have the same tuple, they
@@ -146,9 +147,8 @@ static inline void nf_ct_dump_tuple(const struct nf_conntrack_tuple *t)
((enum ip_conntrack_dir)(h)->tuple.dst.dir)
/* Connections have two entries in the hash table: one for each way */
-struct nf_conntrack_tuple_hash
-{
- struct hlist_node hnode;
+struct nf_conntrack_tuple_hash {
+ struct hlist_nulls_node hnnode;
struct nf_conntrack_tuple tuple;
};
diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
index f4498a6..9dc5840 100644
--- a/include/net/netns/conntrack.h
+++ b/include/net/netns/conntrack.h
@@ -2,6 +2,7 @@
#define __NETNS_CONNTRACK_H
#include <linux/list.h>
+#include <linux/list_nulls.h>
#include <asm/atomic.h>
struct ctl_table_header;
@@ -10,9 +11,9 @@ struct nf_conntrack_ecache;
struct netns_ct {
atomic_t count;
unsigned int expect_count;
- struct hlist_head *hash;
+ struct hlist_nulls_head *hash;
struct hlist_head *expect_hash;
- struct hlist_head unconfirmed;
+ struct hlist_nulls_head unconfirmed;
struct ip_conntrack_stat *stat;
#ifdef CONFIG_NF_CONNTRACK_EVENTS
struct nf_conntrack_ecache *ecache;
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4_compat.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4_compat.c
index 6ba5c55..8668a3d 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4_compat.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4_compat.c
@@ -25,40 +25,42 @@ struct ct_iter_state {
unsigned int bucket;
};
-static struct hlist_node *ct_get_first(struct seq_file *seq)
+static struct hlist_nulls_node *ct_get_first(struct seq_file *seq)
{
struct net *net = seq_file_net(seq);
struct ct_iter_state *st = seq->private;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
for (st->bucket = 0;
st->bucket < nf_conntrack_htable_size;
st->bucket++) {
n = rcu_dereference(net->ct.hash[st->bucket].first);
- if (n)
+ if (!is_a_nulls(n))
return n;
}
return NULL;
}
-static struct hlist_node *ct_get_next(struct seq_file *seq,
- struct hlist_node *head)
+static struct hlist_nulls_node *ct_get_next(struct seq_file *seq,
+ struct hlist_nulls_node *head)
{
struct net *net = seq_file_net(seq);
struct ct_iter_state *st = seq->private;
head = rcu_dereference(head->next);
- while (head == NULL) {
- if (++st->bucket >= nf_conntrack_htable_size)
- return NULL;
+ while (is_a_nulls(head)) {
+ if (likely(get_nulls_value(head) == st->bucket)) {
+ if (++st->bucket >= nf_conntrack_htable_size)
+ return NULL;
+ }
head = rcu_dereference(net->ct.hash[st->bucket].first);
}
return head;
}
-static struct hlist_node *ct_get_idx(struct seq_file *seq, loff_t pos)
+static struct hlist_nulls_node *ct_get_idx(struct seq_file *seq, loff_t pos)
{
- struct hlist_node *head = ct_get_first(seq);
+ struct hlist_nulls_node *head = ct_get_first(seq);
if (head)
while (pos && (head = ct_get_next(seq, head)))
@@ -87,69 +89,76 @@ static void ct_seq_stop(struct seq_file *s, void *v)
static int ct_seq_show(struct seq_file *s, void *v)
{
- const struct nf_conntrack_tuple_hash *hash = v;
- const struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(hash);
+ struct nf_conntrack_tuple_hash *hash = v;
+ struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(hash);
const struct nf_conntrack_l3proto *l3proto;
const struct nf_conntrack_l4proto *l4proto;
+ int ret = 0;
NF_CT_ASSERT(ct);
+ if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
+ return 0;
+
/* we only want to print DIR_ORIGINAL */
if (NF_CT_DIRECTION(hash))
- return 0;
+ goto release;
if (nf_ct_l3num(ct) != AF_INET)
- return 0;
+ goto release;
l3proto = __nf_ct_l3proto_find(nf_ct_l3num(ct));
NF_CT_ASSERT(l3proto);
l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
NF_CT_ASSERT(l4proto);
+ ret = -ENOSPC;
if (seq_printf(s, "%-8s %u %ld ",
l4proto->name, nf_ct_protonum(ct),
timer_pending(&ct->timeout)
? (long)(ct->timeout.expires - jiffies)/HZ : 0) != 0)
- return -ENOSPC;
+ goto release;
if (l4proto->print_conntrack && l4proto->print_conntrack(s, ct))
- return -ENOSPC;
+ goto release;
if (print_tuple(s, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
l3proto, l4proto))
- return -ENOSPC;
+ goto release;
if (seq_print_acct(s, ct, IP_CT_DIR_ORIGINAL))
- return -ENOSPC;
+ goto release;
if (!(test_bit(IPS_SEEN_REPLY_BIT, &ct->status)))
if (seq_printf(s, "[UNREPLIED] "))
- return -ENOSPC;
+ goto release;
if (print_tuple(s, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
l3proto, l4proto))
- return -ENOSPC;
+ goto release;
if (seq_print_acct(s, ct, IP_CT_DIR_REPLY))
- return -ENOSPC;
+ goto release;
if (test_bit(IPS_ASSURED_BIT, &ct->status))
if (seq_printf(s, "[ASSURED] "))
- return -ENOSPC;
+ goto release;
#ifdef CONFIG_NF_CONNTRACK_MARK
if (seq_printf(s, "mark=%u ", ct->mark))
- return -ENOSPC;
+ goto release;
#endif
#ifdef CONFIG_NF_CONNTRACK_SECMARK
if (seq_printf(s, "secmark=%u ", ct->secmark))
- return -ENOSPC;
+ goto release;
#endif
if (seq_printf(s, "use=%u\n", atomic_read(&ct->ct_general.use)))
- return -ENOSPC;
-
- return 0;
+ goto release;
+ ret = 0;
+release:
+ nf_ct_put(ct);
+ return ret;
}
static const struct seq_operations ct_seq_ops = {
diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index a65cf69..fe65187 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -679,7 +679,7 @@ nfnetlink_parse_nat_setup(struct nf_conn *ct,
static int __net_init nf_nat_net_init(struct net *net)
{
net->ipv4.nat_bysource = nf_ct_alloc_hashtable(&nf_nat_htable_size,
- &net->ipv4.nat_vmalloced);
+ &net->ipv4.nat_vmalloced, 0);
if (!net->ipv4.nat_bysource)
return -ENOMEM;
return 0;
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 54e983f..c55bbdc 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -29,6 +29,7 @@
#include <linux/netdevice.h>
#include <linux/socket.h>
#include <linux/mm.h>
+#include <linux/rculist_nulls.h>
#include <net/netfilter/nf_conntrack.h>
#include <net/netfilter/nf_conntrack_l3proto.h>
@@ -163,8 +164,8 @@ static void
clean_from_lists(struct nf_conn *ct)
{
pr_debug("clean_from_lists(%p)\n", ct);
- hlist_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode);
- hlist_del_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnode);
+ hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
+ hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode);
/* Destroy all pending expectations */
nf_ct_remove_expectations(ct);
@@ -204,8 +205,8 @@ destroy_conntrack(struct nf_conntrack *nfct)
/* We overload first tuple to link into unconfirmed list. */
if (!nf_ct_is_confirmed(ct)) {
- BUG_ON(hlist_unhashed(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode));
- hlist_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode);
+ BUG_ON(hlist_nulls_unhashed(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode));
+ hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
}
NF_CT_STAT_INC(net, delete);
@@ -242,18 +243,26 @@ static void death_by_timeout(unsigned long ul_conntrack)
nf_ct_put(ct);
}
+/*
+ * Warning :
+ * - Caller must take a reference on returned object
+ * and recheck nf_ct_tuple_equal(tuple, &h->tuple)
+ * OR
+ * - Caller must lock nf_conntrack_lock before calling this function
+ */
struct nf_conntrack_tuple_hash *
__nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple *tuple)
{
struct nf_conntrack_tuple_hash *h;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
unsigned int hash = hash_conntrack(tuple);
/* Disable BHs the entire time since we normally need to disable them
* at least once for the stats anyway.
*/
local_bh_disable();
- hlist_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnode) {
+begin:
+ hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
if (nf_ct_tuple_equal(tuple, &h->tuple)) {
NF_CT_STAT_INC(net, found);
local_bh_enable();
@@ -261,6 +270,13 @@ __nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple *tuple)
}
NF_CT_STAT_INC(net, searched);
}
+ /*
+ * if the nulls value we got at the end of this lookup is
+ * not the expected one, we must restart lookup.
+ * We probably met an item that was moved to another chain.
+ */
+ if (get_nulls_value(n) != hash)
+ goto begin;
local_bh_enable();
return NULL;
@@ -275,11 +291,18 @@ nf_conntrack_find_get(struct net *net, const struct nf_conntrack_tuple *tuple)
struct nf_conn *ct;
rcu_read_lock();
+begin:
h = __nf_conntrack_find(net, tuple);
if (h) {
ct = nf_ct_tuplehash_to_ctrack(h);
if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
h = NULL;
+ else {
+ if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple))) {
+ nf_ct_put(ct);
+ goto begin;
+ }
+ }
}
rcu_read_unlock();
@@ -293,9 +316,9 @@ static void __nf_conntrack_hash_insert(struct nf_conn *ct,
{
struct net *net = nf_ct_net(ct);
- hlist_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode,
+ hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
&net->ct.hash[hash]);
- hlist_add_head_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnode,
+ hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode,
&net->ct.hash[repl_hash]);
}
@@ -318,7 +341,7 @@ __nf_conntrack_confirm(struct sk_buff *skb)
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct;
struct nf_conn_help *help;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
enum ip_conntrack_info ctinfo;
struct net *net;
@@ -350,17 +373,17 @@ __nf_conntrack_confirm(struct sk_buff *skb)
/* See if there's one in the list already, including reverse:
NAT could have grabbed it without realizing, since we're
not in the hash. If there is, we lost race. */
- hlist_for_each_entry(h, n, &net->ct.hash[hash], hnode)
+ hlist_nulls_for_each_entry(h, n, &net->ct.hash[hash], hnnode)
if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
&h->tuple))
goto out;
- hlist_for_each_entry(h, n, &net->ct.hash[repl_hash], hnode)
+ hlist_nulls_for_each_entry(h, n, &net->ct.hash[repl_hash], hnnode)
if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple,
&h->tuple))
goto out;
/* Remove from unconfirmed list */
- hlist_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode);
+ hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
__nf_conntrack_hash_insert(ct, hash, repl_hash);
/* Timer relative to confirmation time, not original
@@ -399,14 +422,14 @@ nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple,
{
struct net *net = nf_ct_net(ignored_conntrack);
struct nf_conntrack_tuple_hash *h;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
unsigned int hash = hash_conntrack(tuple);
/* Disable BHs the entire time since we need to disable them at
* least once for the stats anyway.
*/
rcu_read_lock_bh();
- hlist_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnode) {
+ hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
if (nf_ct_tuplehash_to_ctrack(h) != ignored_conntrack &&
nf_ct_tuple_equal(tuple, &h->tuple)) {
NF_CT_STAT_INC(net, found);
@@ -430,14 +453,14 @@ static noinline int early_drop(struct net *net, unsigned int hash)
/* Use oldest entry, which is roughly LRU */
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct = NULL, *tmp;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
unsigned int i, cnt = 0;
int dropped = 0;
rcu_read_lock();
for (i = 0; i < nf_conntrack_htable_size; i++) {
- hlist_for_each_entry_rcu(h, n, &net->ct.hash[hash],
- hnode) {
+ hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash],
+ hnnode) {
tmp = nf_ct_tuplehash_to_ctrack(h);
if (!test_bit(IPS_ASSURED_BIT, &tmp->status))
ct = tmp;
@@ -508,27 +531,19 @@ struct nf_conn *nf_conntrack_alloc(struct net *net,
#ifdef CONFIG_NET_NS
ct->ct_net = net;
#endif
- INIT_RCU_HEAD(&ct->rcu);
return ct;
}
EXPORT_SYMBOL_GPL(nf_conntrack_alloc);
-static void nf_conntrack_free_rcu(struct rcu_head *head)
-{
- struct nf_conn *ct = container_of(head, struct nf_conn, rcu);
-
- nf_ct_ext_free(ct);
- kmem_cache_free(nf_conntrack_cachep, ct);
-}
-
void nf_conntrack_free(struct nf_conn *ct)
{
struct net *net = nf_ct_net(ct);
nf_ct_ext_destroy(ct);
atomic_dec(&net->ct.count);
- call_rcu(&ct->rcu, nf_conntrack_free_rcu);
+ nf_ct_ext_free(ct);
+ kmem_cache_free(nf_conntrack_cachep, ct);
}
EXPORT_SYMBOL_GPL(nf_conntrack_free);
@@ -594,7 +609,7 @@ init_conntrack(struct net *net,
}
/* Overload tuple linked list to put us in unconfirmed list. */
- hlist_add_head(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode,
+ hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
&net->ct.unconfirmed);
spin_unlock_bh(&nf_conntrack_lock);
@@ -934,17 +949,17 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
{
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
spin_lock_bh(&nf_conntrack_lock);
for (; *bucket < nf_conntrack_htable_size; (*bucket)++) {
- hlist_for_each_entry(h, n, &net->ct.hash[*bucket], hnode) {
+ hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket], hnnode) {
ct = nf_ct_tuplehash_to_ctrack(h);
if (iter(ct, data))
goto found;
}
}
- hlist_for_each_entry(h, n, &net->ct.unconfirmed, hnode) {
+ hlist_nulls_for_each_entry(h, n, &net->ct.unconfirmed, hnnode) {
ct = nf_ct_tuplehash_to_ctrack(h);
if (iter(ct, data))
set_bit(IPS_DYING_BIT, &ct->status);
@@ -992,7 +1007,7 @@ static int kill_all(struct nf_conn *i, void *data)
return 1;
}
-void nf_ct_free_hashtable(struct hlist_head *hash, int vmalloced, unsigned int size)
+void nf_ct_free_hashtable(void *hash, int vmalloced, unsigned int size)
{
if (vmalloced)
vfree(hash);
@@ -1060,26 +1075,28 @@ void nf_conntrack_cleanup(struct net *net)
}
}
-struct hlist_head *nf_ct_alloc_hashtable(unsigned int *sizep, int *vmalloced)
+void *nf_ct_alloc_hashtable(unsigned int *sizep, int *vmalloced, int nulls)
{
- struct hlist_head *hash;
- unsigned int size, i;
+ struct hlist_nulls_head *hash;
+ unsigned int nr_slots, i;
+ size_t sz;
*vmalloced = 0;
- size = *sizep = roundup(*sizep, PAGE_SIZE / sizeof(struct hlist_head));
- hash = (void*)__get_free_pages(GFP_KERNEL|__GFP_NOWARN,
- get_order(sizeof(struct hlist_head)
- * size));
+ BUILD_BUG_ON(sizeof(struct hlist_nulls_head) != sizeof(struct hlist_head));
+ nr_slots = *sizep = roundup(*sizep, PAGE_SIZE / sizeof(struct hlist_nulls_head));
+ sz = nr_slots * sizeof(struct hlist_nulls_head);
+ hash = (void *)__get_free_pages(GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO,
+ get_order(sz));
if (!hash) {
*vmalloced = 1;
printk(KERN_WARNING "nf_conntrack: falling back to vmalloc.\n");
- hash = vmalloc(sizeof(struct hlist_head) * size);
+ hash = __vmalloc(sz, GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
}
- if (hash)
- for (i = 0; i < size; i++)
- INIT_HLIST_HEAD(&hash[i]);
+ if (hash && nulls)
+ for (i = 0; i < nr_slots; i++)
+ INIT_HLIST_NULLS_HEAD(&hash[i], i);
return hash;
}
@@ -1090,7 +1107,7 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
int i, bucket, vmalloced, old_vmalloced;
unsigned int hashsize, old_size;
int rnd;
- struct hlist_head *hash, *old_hash;
+ struct hlist_nulls_head *hash, *old_hash;
struct nf_conntrack_tuple_hash *h;
/* On boot, we can set this without any fancy locking. */
@@ -1101,7 +1118,7 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
if (!hashsize)
return -EINVAL;
- hash = nf_ct_alloc_hashtable(&hashsize, &vmalloced);
+ hash = nf_ct_alloc_hashtable(&hashsize, &vmalloced, 1);
if (!hash)
return -ENOMEM;
@@ -1116,12 +1133,12 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
*/
spin_lock_bh(&nf_conntrack_lock);
for (i = 0; i < nf_conntrack_htable_size; i++) {
- while (!hlist_empty(&init_net.ct.hash[i])) {
- h = hlist_entry(init_net.ct.hash[i].first,
- struct nf_conntrack_tuple_hash, hnode);
- hlist_del_rcu(&h->hnode);
+ while (!hlist_nulls_empty(&init_net.ct.hash[i])) {
+ h = hlist_nulls_entry(init_net.ct.hash[i].first,
+ struct nf_conntrack_tuple_hash, hnnode);
+ hlist_nulls_del_rcu(&h->hnnode);
bucket = __hash_conntrack(&h->tuple, hashsize, rnd);
- hlist_add_head_rcu(&h->hnode, &hash[bucket]);
+ hlist_nulls_add_head_rcu(&h->hnnode, &hash[bucket]);
}
}
old_size = nf_conntrack_htable_size;
@@ -1172,7 +1189,7 @@ static int nf_conntrack_init_init_net(void)
nf_conntrack_cachep = kmem_cache_create("nf_conntrack",
sizeof(struct nf_conn),
- 0, 0, NULL);
+ 0, SLAB_DESTROY_BY_RCU, NULL);
if (!nf_conntrack_cachep) {
printk(KERN_ERR "Unable to create nf_conn slab cache\n");
ret = -ENOMEM;
@@ -1202,7 +1219,7 @@ static int nf_conntrack_init_net(struct net *net)
int ret;
atomic_set(&net->ct.count, 0);
- INIT_HLIST_HEAD(&net->ct.unconfirmed);
+ INIT_HLIST_NULLS_HEAD(&net->ct.unconfirmed, 0);
net->ct.stat = alloc_percpu(struct ip_conntrack_stat);
if (!net->ct.stat) {
ret = -ENOMEM;
@@ -1212,7 +1229,7 @@ static int nf_conntrack_init_net(struct net *net)
if (ret < 0)
goto err_ecache;
net->ct.hash = nf_ct_alloc_hashtable(&nf_conntrack_htable_size,
- &net->ct.hash_vmalloc);
+ &net->ct.hash_vmalloc, 1);
if (!net->ct.hash) {
ret = -ENOMEM;
printk(KERN_ERR "Unable to create nf_conntrack_hash\n");
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 357ba39..3940f99 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -604,7 +604,7 @@ int nf_conntrack_expect_init(struct net *net)
net->ct.expect_count = 0;
net->ct.expect_hash = nf_ct_alloc_hashtable(&nf_ct_expect_hsize,
- &net->ct.expect_vmalloc);
+ &net->ct.expect_vmalloc, 0);
if (net->ct.expect_hash == NULL)
goto err1;
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 805cfdd..30b8e90 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -159,6 +159,7 @@ static void __nf_conntrack_helper_unregister(struct nf_conntrack_helper *me,
struct nf_conntrack_tuple_hash *h;
struct nf_conntrack_expect *exp;
const struct hlist_node *n, *next;
+ const struct hlist_nulls_node *nn;
unsigned int i;
/* Get rid of expectations */
@@ -175,10 +176,10 @@ static void __nf_conntrack_helper_unregister(struct nf_conntrack_helper *me,
}
/* Get rid of expecteds, set helpers to NULL. */
- hlist_for_each_entry(h, n, &net->ct.unconfirmed, hnode)
+ hlist_for_each_entry(h, nn, &net->ct.unconfirmed, hnnode)
unhelp(h, me);
for (i = 0; i < nf_conntrack_htable_size; i++) {
- hlist_for_each_entry(h, n, &net->ct.hash[i], hnode)
+ hlist_nulls_for_each_entry(h, nn, &net->ct.hash[i], hnnode)
unhelp(h, me);
}
}
@@ -218,7 +219,7 @@ int nf_conntrack_helper_init(void)
nf_ct_helper_hsize = 1; /* gets rounded up to use one page */
nf_ct_helper_hash = nf_ct_alloc_hashtable(&nf_ct_helper_hsize,
- &nf_ct_helper_vmalloc);
+ &nf_ct_helper_vmalloc, 0);
if (!nf_ct_helper_hash)
return -ENOMEM;
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 1b75c9e..349bbef 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/rculist.h>
+#include <linux/rculist_nulls.h>
#include <linux/types.h>
#include <linux/timer.h>
#include <linux/skbuff.h>
@@ -536,7 +537,7 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
{
struct nf_conn *ct, *last;
struct nf_conntrack_tuple_hash *h;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
struct nfgenmsg *nfmsg = NLMSG_DATA(cb->nlh);
u_int8_t l3proto = nfmsg->nfgen_family;
@@ -544,27 +545,27 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
last = (struct nf_conn *)cb->args[1];
for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++) {
restart:
- hlist_for_each_entry_rcu(h, n, &init_net.ct.hash[cb->args[0]],
- hnode) {
+ hlist_nulls_for_each_entry_rcu(h, n, &init_net.ct.hash[cb->args[0]],
+ hnnode) {
if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
continue;
ct = nf_ct_tuplehash_to_ctrack(h);
+ if (!atomic_inc_not_zero(&ct->ct_general.use))
+ continue;
/* Dump entries of a given L3 protocol number.
* If it is not specified, ie. l3proto == 0,
* then dump everything. */
if (l3proto && nf_ct_l3num(ct) != l3proto)
- continue;
+ goto releasect;
if (cb->args[1]) {
if (ct != last)
- continue;
+ goto releasect;
cb->args[1] = 0;
}
if (ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid,
cb->nlh->nlmsg_seq,
IPCTNL_MSG_CT_NEW,
1, ct) < 0) {
- if (!atomic_inc_not_zero(&ct->ct_general.use))
- continue;
cb->args[1] = (unsigned long)ct;
goto out;
}
@@ -577,6 +578,8 @@ restart:
if (acct)
memset(acct, 0, sizeof(struct nf_conn_counter[IP_CT_DIR_MAX]));
}
+releasect:
+ nf_ct_put(ct);
}
if (cb->args[1]) {
cb->args[1] = 0;
@@ -1242,13 +1245,12 @@ ctnetlink_create_conntrack(struct nlattr *cda[],
if (err < 0)
goto err2;
- master_h = __nf_conntrack_find(&init_net, &master);
+ master_h = nf_conntrack_find_get(&init_net, &master);
if (master_h == NULL) {
err = -ENOENT;
goto err2;
}
master_ct = nf_ct_tuplehash_to_ctrack(master_h);
- nf_conntrack_get(&master_ct->ct_general);
__set_bit(IPS_EXPECTED_BIT, &ct->status);
ct->master = master_ct;
}
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 4da54b0..1935153 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -44,40 +44,42 @@ struct ct_iter_state {
unsigned int bucket;
};
-static struct hlist_node *ct_get_first(struct seq_file *seq)
+static struct hlist_nulls_node *ct_get_first(struct seq_file *seq)
{
struct net *net = seq_file_net(seq);
struct ct_iter_state *st = seq->private;
- struct hlist_node *n;
+ struct hlist_nulls_node *n;
for (st->bucket = 0;
st->bucket < nf_conntrack_htable_size;
st->bucket++) {
n = rcu_dereference(net->ct.hash[st->bucket].first);
- if (n)
+ if (!is_a_nulls(n))
return n;
}
return NULL;
}
-static struct hlist_node *ct_get_next(struct seq_file *seq,
- struct hlist_node *head)
+static struct hlist_nulls_node *ct_get_next(struct seq_file *seq,
+ struct hlist_nulls_node *head)
{
struct net *net = seq_file_net(seq);
struct ct_iter_state *st = seq->private;
head = rcu_dereference(head->next);
- while (head == NULL) {
- if (++st->bucket >= nf_conntrack_htable_size)
- return NULL;
+ while (is_a_nulls(head)) {
+ if (likely(get_nulls_value(head) == st->bucket)) {
+ if (++st->bucket >= nf_conntrack_htable_size)
+ return NULL;
+ }
head = rcu_dereference(net->ct.hash[st->bucket].first);
}
return head;
}
-static struct hlist_node *ct_get_idx(struct seq_file *seq, loff_t pos)
+static struct hlist_nulls_node *ct_get_idx(struct seq_file *seq, loff_t pos)
{
- struct hlist_node *head = ct_get_first(seq);
+ struct hlist_nulls_node *head = ct_get_first(seq);
if (head)
while (pos && (head = ct_get_next(seq, head)))
@@ -107,67 +109,74 @@ static void ct_seq_stop(struct seq_file *s, void *v)
/* return 0 on success, 1 in case of error */
static int ct_seq_show(struct seq_file *s, void *v)
{
- const struct nf_conntrack_tuple_hash *hash = v;
- const struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(hash);
+ struct nf_conntrack_tuple_hash *hash = v;
+ struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(hash);
const struct nf_conntrack_l3proto *l3proto;
const struct nf_conntrack_l4proto *l4proto;
+ int ret = 0;
NF_CT_ASSERT(ct);
+ if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
+ return 0;
/* we only want to print DIR_ORIGINAL */
if (NF_CT_DIRECTION(hash))
- return 0;
+ goto release;
l3proto = __nf_ct_l3proto_find(nf_ct_l3num(ct));
NF_CT_ASSERT(l3proto);
l4proto = __nf_ct_l4proto_find(nf_ct_l3num(ct), nf_ct_protonum(ct));
NF_CT_ASSERT(l4proto);
+ ret = -ENOSPC;
if (seq_printf(s, "%-8s %u %-8s %u %ld ",
l3proto->name, nf_ct_l3num(ct),
l4proto->name, nf_ct_protonum(ct),
timer_pending(&ct->timeout)
? (long)(ct->timeout.expires - jiffies)/HZ : 0) != 0)
- return -ENOSPC;
+ goto release;
if (l4proto->print_conntrack && l4proto->print_conntrack(s, ct))
- return -ENOSPC;
+ goto release;
if (print_tuple(s, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
l3proto, l4proto))
- return -ENOSPC;
+ goto release;
if (seq_print_acct(s, ct, IP_CT_DIR_ORIGINAL))
- return -ENOSPC;
+ goto release;
if (!(test_bit(IPS_SEEN_REPLY_BIT, &ct->status)))
if (seq_printf(s, "[UNREPLIED] "))
- return -ENOSPC;
+ goto release;
if (print_tuple(s, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
l3proto, l4proto))
- return -ENOSPC;
+ goto release;
if (seq_print_acct(s, ct, IP_CT_DIR_REPLY))
- return -ENOSPC;
+ goto release;
if (test_bit(IPS_ASSURED_BIT, &ct->status))
if (seq_printf(s, "[ASSURED] "))
- return -ENOSPC;
+ goto release;
#if defined(CONFIG_NF_CONNTRACK_MARK)
if (seq_printf(s, "mark=%u ", ct->mark))
- return -ENOSPC;
+ goto release;
#endif
#ifdef CONFIG_NF_CONNTRACK_SECMARK
if (seq_printf(s, "secmark=%u ", ct->secmark))
- return -ENOSPC;
+ goto release;
#endif
if (seq_printf(s, "use=%u\n", atomic_read(&ct->ct_general.use)))
- return -ENOSPC;
+ goto release;
+ ret = 0;
+release:
+ nf_ct_put(ct);
return 0;
}
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index 7f404cc..6809809 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -108,7 +108,7 @@ static int count_them(struct xt_connlimit_data *data,
const struct nf_conntrack_tuple_hash *found;
struct xt_connlimit_conn *conn;
struct xt_connlimit_conn *tmp;
- const struct nf_conn *found_ct;
+ struct nf_conn *found_ct;
struct list_head *hash;
bool addit = true;
int matches = 0;
@@ -123,7 +123,7 @@ static int count_them(struct xt_connlimit_data *data,
/* check the saved connections */
list_for_each_entry_safe(conn, tmp, hash, list) {
- found = __nf_conntrack_find(&init_net, &conn->tuple);
+ found = nf_conntrack_find_get(&init_net, &conn->tuple);
found_ct = NULL;
if (found != NULL)
@@ -151,6 +151,7 @@ static int count_them(struct xt_connlimit_data *data,
* we do not care about connections which are
* closed already -> ditch it
*/
+ nf_ct_put(found_ct);
list_del(&conn->list);
kfree(conn);
continue;
@@ -160,6 +161,7 @@ static int count_them(struct xt_connlimit_data *data,
match->family))
/* same source network -> be counted! */
++matches;
+ nf_ct_put(found_ct);
}
rcu_read_unlock();
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 09/12: ctnetlink: allocate right-sized ctnetlink skb
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (7 preceding siblings ...)
2009-03-26 19:02 ` netfilter 08/12: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu() Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 10/12: nf_conntrack: add generic function to get len of generic policy Patrick McHardy
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 2732c4e45bb67006fdc9ae6669be866762711ab5
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Wed Mar 25 21:50:59 2009 +0100
netfilter: ctnetlink: allocate right-sized ctnetlink skb
Try to allocate a Netlink skb roughly the size of the actual
message, with the help from the l3 and l4 protocol helpers.
This is all to prevent a reallocation in netlink_trim() later.
The overhead of allocating the right-sized skb is rather small, with
ctnetlink_alloc_skb() actually being inlined away on my x86_64 box.
The size of the per-proto space is determined at registration time of
the protocol helper.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 349bbef..03547c6 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -405,6 +405,69 @@ nla_put_failure:
}
#ifdef CONFIG_NF_CONNTRACK_EVENTS
+/*
+ * The general structure of a ctnetlink event is
+ *
+ * CTA_TUPLE_ORIG
+ * <l3/l4-proto-attributes>
+ * CTA_TUPLE_REPLY
+ * <l3/l4-proto-attributes>
+ * CTA_ID
+ * ...
+ * CTA_PROTOINFO
+ * <l4-proto-attributes>
+ * CTA_TUPLE_MASTER
+ * <l3/l4-proto-attributes>
+ *
+ * Therefore the formular is
+ *
+ * size = sizeof(headers) + sizeof(generic_nlas) + 3 * sizeof(tuple_nlas)
+ * + sizeof(protoinfo_nlas)
+ */
+static struct sk_buff *
+ctnetlink_alloc_skb(const struct nf_conntrack_tuple *tuple, gfp_t gfp)
+{
+ struct nf_conntrack_l3proto *l3proto;
+ struct nf_conntrack_l4proto *l4proto;
+ int len;
+
+#define NLA_TYPE_SIZE(type) nla_total_size(sizeof(type))
+
+ /* proto independant part */
+ len = NLMSG_SPACE(sizeof(struct nfgenmsg))
+ + 3 * nla_total_size(0) /* CTA_TUPLE_ORIG|REPL|MASTER */
+ + 3 * nla_total_size(0) /* CTA_TUPLE_IP */
+ + 3 * nla_total_size(0) /* CTA_TUPLE_PROTO */
+ + 3 * NLA_TYPE_SIZE(u_int8_t) /* CTA_PROTO_NUM */
+ + NLA_TYPE_SIZE(u_int32_t) /* CTA_ID */
+ + NLA_TYPE_SIZE(u_int32_t) /* CTA_STATUS */
+ + 2 * nla_total_size(0) /* CTA_COUNTERS_ORIG|REPL */
+ + 2 * NLA_TYPE_SIZE(uint64_t) /* CTA_COUNTERS_PACKETS */
+ + 2 * NLA_TYPE_SIZE(uint64_t) /* CTA_COUNTERS_BYTES */
+ + NLA_TYPE_SIZE(u_int32_t) /* CTA_TIMEOUT */
+ + nla_total_size(0) /* CTA_PROTOINFO */
+ + nla_total_size(0) /* CTA_HELP */
+ + nla_total_size(NF_CT_HELPER_NAME_LEN) /* CTA_HELP_NAME */
+ + NLA_TYPE_SIZE(u_int32_t) /* CTA_SECMARK */
+ + 2 * nla_total_size(0) /* CTA_NAT_SEQ_ADJ_ORIG|REPL */
+ + 2 * NLA_TYPE_SIZE(u_int32_t) /* CTA_NAT_SEQ_CORRECTION_POS */
+ + 2 * NLA_TYPE_SIZE(u_int32_t) /* CTA_NAT_SEQ_CORRECTION_BEFORE */
+ + 2 * NLA_TYPE_SIZE(u_int32_t) /* CTA_NAT_SEQ_CORRECTION_AFTER */
+ + NLA_TYPE_SIZE(u_int32_t); /* CTA_MARK */
+
+#undef NLA_TYPE_SIZE
+
+ rcu_read_lock();
+ l3proto = __nf_ct_l3proto_find(tuple->src.l3num);
+ len += l3proto->nla_size;
+
+ l4proto = __nf_ct_l4proto_find(tuple->src.l3num, tuple->dst.protonum);
+ len += l4proto->nla_size;
+ rcu_read_unlock();
+
+ return alloc_skb(len, gfp);
+}
+
static int ctnetlink_conntrack_event(struct notifier_block *this,
unsigned long events, void *ptr)
{
@@ -438,7 +501,7 @@ static int ctnetlink_conntrack_event(struct notifier_block *this,
if (!item->report && !nfnetlink_has_listeners(group))
return NOTIFY_DONE;
- skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);
+ skb = ctnetlink_alloc_skb(tuple(ct, IP_CT_DIR_ORIGINAL), GFP_ATOMIC);
if (!skb)
return NOTIFY_DONE;
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 10/12: nf_conntrack: add generic function to get len of generic policy
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (8 preceding siblings ...)
2009-03-26 19:02 ` netfilter 09/12: ctnetlink: allocate right-sized ctnetlink skb Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` netfilter 11/12: nf_conntrack: calculate per-protocol nlattr size Patrick McHardy
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit 5c0de29d06318ec8f6e3ba0d17d62529dbbdc1e8
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Wed Mar 25 21:52:17 2009 +0100
netfilter: nf_conntrack: add generic function to get len of generic policy
Usefull for all protocols which do not add additional data, such
as GRE or UDPlite.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index a120990..ba32ed7 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -113,6 +113,7 @@ extern int nf_ct_port_tuple_to_nlattr(struct sk_buff *skb,
const struct nf_conntrack_tuple *tuple);
extern int nf_ct_port_nlattr_to_tuple(struct nlattr *tb[],
struct nf_conntrack_tuple *t);
+extern int nf_ct_port_nlattr_tuple_size(void);
extern const struct nla_policy nf_ct_port_nla_policy[];
#ifdef CONFIG_SYSCTL
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index c55bbdc..b182b30 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -921,6 +921,12 @@ int nf_ct_port_nlattr_to_tuple(struct nlattr *tb[],
return 0;
}
EXPORT_SYMBOL_GPL(nf_ct_port_nlattr_to_tuple);
+
+int nf_ct_port_nlattr_tuple_size(void)
+{
+ return nla_policy_len(nf_ct_port_nla_policy, CTA_PROTO_MAX + 1);
+}
+EXPORT_SYMBOL_GPL(nf_ct_port_nlattr_tuple_size);
#endif
/* Used by ipt_REJECT and ip6t_REJECT. */
^ permalink raw reply related [flat|nested] 14+ messages in thread
* netfilter 11/12: nf_conntrack: calculate per-protocol nlattr size
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (9 preceding siblings ...)
2009-03-26 19:02 ` netfilter 10/12: nf_conntrack: add generic function to get len of generic policy Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-26 19:02 ` ctnetlink 12/12: compute generic part of event more acurately Patrick McHardy
2009-03-27 5:46 ` netfilter 00/12: Netfilter fixes/2.6.30 update part II David Miller
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit a400c30edb1958ceb53c4b8ce78989189b36df47
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Wed Mar 25 21:53:39 2009 +0100
netfilter: nf_conntrack: calculate per-protocol nlattr size
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 8b681f2..7d2ead7 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -328,6 +328,11 @@ static int ipv4_nlattr_to_tuple(struct nlattr *tb[],
return 0;
}
+
+static int ipv4_nlattr_tuple_size(void)
+{
+ return nla_policy_len(ipv4_nla_policy, CTA_IP_MAX + 1);
+}
#endif
static struct nf_sockopt_ops so_getorigdst = {
@@ -347,6 +352,7 @@ struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv4 __read_mostly = {
.get_l4proto = ipv4_get_l4proto,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = ipv4_tuple_to_nlattr,
+ .nlattr_tuple_size = ipv4_nlattr_tuple_size,
.nlattr_to_tuple = ipv4_nlattr_to_tuple,
.nla_policy = ipv4_nla_policy,
#endif
diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index 2a8bee2..23b2c2e 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -262,6 +262,11 @@ static int icmp_nlattr_to_tuple(struct nlattr *tb[],
return 0;
}
+
+static int icmp_nlattr_tuple_size(void)
+{
+ return nla_policy_len(icmp_nla_policy, CTA_PROTO_MAX + 1);
+}
#endif
#ifdef CONFIG_SYSCTL
@@ -309,6 +314,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_icmp __read_mostly =
.me = NULL,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = icmp_tuple_to_nlattr,
+ .nlattr_tuple_size = icmp_nlattr_tuple_size,
.nlattr_to_tuple = icmp_nlattr_to_tuple,
.nla_policy = icmp_nla_policy,
#endif
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index e6852f6..2a15c2d 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -342,6 +342,11 @@ static int ipv6_nlattr_to_tuple(struct nlattr *tb[],
return 0;
}
+
+static int ipv6_nlattr_tuple_size(void)
+{
+ return nla_policy_len(ipv6_nla_policy, CTA_IP_MAX + 1);
+}
#endif
struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv6 __read_mostly = {
@@ -353,6 +358,7 @@ struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv6 __read_mostly = {
.get_l4proto = ipv6_get_l4proto,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = ipv6_tuple_to_nlattr,
+ .nlattr_tuple_size = ipv6_nlattr_tuple_size,
.nlattr_to_tuple = ipv6_nlattr_to_tuple,
.nla_policy = ipv6_nla_policy,
#endif
diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index 165b256..032fdf4 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -268,6 +268,11 @@ static int icmpv6_nlattr_to_tuple(struct nlattr *tb[],
return 0;
}
+
+static int icmpv6_nlattr_tuple_size(void)
+{
+ return nla_policy_len(icmpv6_nla_policy, CTA_PROTO_MAX + 1);
+}
#endif
#ifdef CONFIG_SYSCTL
@@ -299,6 +304,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_icmpv6 __read_mostly =
.error = icmpv6_error,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = icmpv6_tuple_to_nlattr,
+ .nlattr_tuple_size = icmpv6_nlattr_tuple_size,
.nlattr_to_tuple = icmpv6_nlattr_to_tuple,
.nla_policy = icmpv6_nla_policy,
#endif
diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c
index d3d5a7f..50dac8d 100644
--- a/net/netfilter/nf_conntrack_proto_dccp.c
+++ b/net/netfilter/nf_conntrack_proto_dccp.c
@@ -669,6 +669,12 @@ static int nlattr_to_dccp(struct nlattr *cda[], struct nf_conn *ct)
write_unlock_bh(&dccp_lock);
return 0;
}
+
+static int dccp_nlattr_size(void)
+{
+ return nla_total_size(0) /* CTA_PROTOINFO_DCCP */
+ + nla_policy_len(dccp_nla_policy, CTA_PROTOINFO_DCCP_MAX + 1);
+}
#endif
#ifdef CONFIG_SYSCTL
@@ -749,8 +755,10 @@ static struct nf_conntrack_l4proto dccp_proto4 __read_mostly = {
.print_conntrack = dccp_print_conntrack,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.to_nlattr = dccp_to_nlattr,
+ .nlattr_size = dccp_nlattr_size,
.from_nlattr = nlattr_to_dccp,
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
@@ -771,6 +779,7 @@ static struct nf_conntrack_l4proto dccp_proto6 __read_mostly = {
.to_nlattr = dccp_to_nlattr,
.from_nlattr = nlattr_to_dccp,
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
diff --git a/net/netfilter/nf_conntrack_proto_gre.c b/net/netfilter/nf_conntrack_proto_gre.c
index 1b279f9..117b801 100644
--- a/net/netfilter/nf_conntrack_proto_gre.c
+++ b/net/netfilter/nf_conntrack_proto_gre.c
@@ -293,6 +293,7 @@ static struct nf_conntrack_l4proto nf_conntrack_l4proto_gre4 __read_mostly = {
.me = THIS_MODULE,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
index 74e0379..101b4ad 100644
--- a/net/netfilter/nf_conntrack_proto_sctp.c
+++ b/net/netfilter/nf_conntrack_proto_sctp.c
@@ -537,6 +537,12 @@ static int nlattr_to_sctp(struct nlattr *cda[], struct nf_conn *ct)
return 0;
}
+
+static int sctp_nlattr_size(void)
+{
+ return nla_total_size(0) /* CTA_PROTOINFO_SCTP */
+ + nla_policy_len(sctp_nla_policy, CTA_PROTOINFO_SCTP_MAX + 1);
+}
#endif
#ifdef CONFIG_SYSCTL
@@ -668,8 +674,10 @@ static struct nf_conntrack_l4proto nf_conntrack_l4proto_sctp4 __read_mostly = {
.me = THIS_MODULE,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.to_nlattr = sctp_to_nlattr,
+ .nlattr_size = sctp_nlattr_size,
.from_nlattr = nlattr_to_sctp,
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
@@ -696,8 +704,10 @@ static struct nf_conntrack_l4proto nf_conntrack_l4proto_sctp6 __read_mostly = {
.me = THIS_MODULE,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.to_nlattr = sctp_to_nlattr,
+ .nlattr_size = sctp_nlattr_size,
.from_nlattr = nlattr_to_sctp,
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 7d3944f..9b9e671 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1183,6 +1183,17 @@ static int nlattr_to_tcp(struct nlattr *cda[], struct nf_conn *ct)
return 0;
}
+
+static int tcp_nlattr_size(void)
+{
+ return nla_total_size(0) /* CTA_PROTOINFO_TCP */
+ + nla_policy_len(tcp_nla_policy, CTA_PROTOINFO_TCP_MAX + 1);
+}
+
+static int tcp_nlattr_tuple_size(void)
+{
+ return nla_policy_len(nf_ct_port_nla_policy, CTA_PROTO_MAX + 1);
+}
#endif
#ifdef CONFIG_SYSCTL
@@ -1398,9 +1409,11 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp4 __read_mostly =
.error = tcp_error,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.to_nlattr = tcp_to_nlattr,
+ .nlattr_size = tcp_nlattr_size,
.from_nlattr = nlattr_to_tcp,
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
+ .nlattr_tuple_size = tcp_nlattr_tuple_size,
.nla_policy = nf_ct_port_nla_policy,
#endif
#ifdef CONFIG_SYSCTL
@@ -1428,9 +1441,11 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_tcp6 __read_mostly =
.error = tcp_error,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.to_nlattr = tcp_to_nlattr,
+ .nlattr_size = tcp_nlattr_size,
.from_nlattr = nlattr_to_tcp,
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
+ .nlattr_tuple_size = tcp_nlattr_tuple_size,
.nla_policy = nf_ct_port_nla_policy,
#endif
#ifdef CONFIG_SYSCTL
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index d402117..70809d1 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -195,6 +195,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_udp4 __read_mostly =
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nla_policy = nf_ct_port_nla_policy,
#endif
#ifdef CONFIG_SYSCTL
@@ -222,6 +223,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_udp6 __read_mostly =
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nla_policy = nf_ct_port_nla_policy,
#endif
#ifdef CONFIG_SYSCTL
diff --git a/net/netfilter/nf_conntrack_proto_udplite.c b/net/netfilter/nf_conntrack_proto_udplite.c
index 4579d8d..4614696 100644
--- a/net/netfilter/nf_conntrack_proto_udplite.c
+++ b/net/netfilter/nf_conntrack_proto_udplite.c
@@ -180,6 +180,7 @@ static struct nf_conntrack_l4proto nf_conntrack_l4proto_udplite4 __read_mostly =
.error = udplite_error,
#if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
.tuple_to_nlattr = nf_ct_port_tuple_to_nlattr,
+ .nlattr_tuple_size = nf_ct_port_nlattr_tuple_size,
.nlattr_to_tuple = nf_ct_port_nlattr_to_tuple,
.nla_policy = nf_ct_port_nla_policy,
#endif
^ permalink raw reply related [flat|nested] 14+ messages in thread
* ctnetlink 12/12: compute generic part of event more acurately
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (10 preceding siblings ...)
2009-03-26 19:02 ` netfilter 11/12: nf_conntrack: calculate per-protocol nlattr size Patrick McHardy
@ 2009-03-26 19:02 ` Patrick McHardy
2009-03-27 5:46 ` netfilter 00/12: Netfilter fixes/2.6.30 update part II David Miller
12 siblings, 0 replies; 14+ messages in thread
From: Patrick McHardy @ 2009-03-26 19:02 UTC (permalink / raw)
To: davem; +Cc: netdev, Patrick McHardy, netfilter-devel
commit d271e8bd8c60ce059ee36d836ba063cfc61c3e21
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date: Thu Mar 26 13:37:14 2009 +0100
ctnetlink: compute generic part of event more acurately
On a box with most of the optional Netfilter switches turned off some
of the NLAs are never send, e. g. secmark, mark or the conntrack
byte/packet counters. As a worst case scenario this may possibly
still lead to ctnetlink skbs being reallocated in netlink_trim()
later, loosing all the nice effects from the previous patches.
I try to solve that (at least partly) by correctly #ifdef'ing the
NLAs in the computation.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 03547c6..2fb833b 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -441,19 +441,28 @@ ctnetlink_alloc_skb(const struct nf_conntrack_tuple *tuple, gfp_t gfp)
+ 3 * NLA_TYPE_SIZE(u_int8_t) /* CTA_PROTO_NUM */
+ NLA_TYPE_SIZE(u_int32_t) /* CTA_ID */
+ NLA_TYPE_SIZE(u_int32_t) /* CTA_STATUS */
+#ifdef CONFIG_NF_CT_ACCT
+ 2 * nla_total_size(0) /* CTA_COUNTERS_ORIG|REPL */
+ 2 * NLA_TYPE_SIZE(uint64_t) /* CTA_COUNTERS_PACKETS */
+ 2 * NLA_TYPE_SIZE(uint64_t) /* CTA_COUNTERS_BYTES */
+#endif
+ NLA_TYPE_SIZE(u_int32_t) /* CTA_TIMEOUT */
+ nla_total_size(0) /* CTA_PROTOINFO */
+ nla_total_size(0) /* CTA_HELP */
+ nla_total_size(NF_CT_HELPER_NAME_LEN) /* CTA_HELP_NAME */
+#ifdef CONFIG_NF_CONNTRACK_SECMARK
+ NLA_TYPE_SIZE(u_int32_t) /* CTA_SECMARK */
+#endif
+#ifdef CONFIG_NF_NAT_NEEDED
+ 2 * nla_total_size(0) /* CTA_NAT_SEQ_ADJ_ORIG|REPL */
+ 2 * NLA_TYPE_SIZE(u_int32_t) /* CTA_NAT_SEQ_CORRECTION_POS */
+ 2 * NLA_TYPE_SIZE(u_int32_t) /* CTA_NAT_SEQ_CORRECTION_BEFORE */
+ 2 * NLA_TYPE_SIZE(u_int32_t) /* CTA_NAT_SEQ_CORRECTION_AFTER */
- + NLA_TYPE_SIZE(u_int32_t); /* CTA_MARK */
+#endif
+#ifdef CONFIG_NF_CONNTRACK_MARK
+ + NLA_TYPE_SIZE(u_int32_t) /* CTA_MARK */
+#endif
+ ;
#undef NLA_TYPE_SIZE
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: netfilter 00/12: Netfilter fixes/2.6.30 update part II
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
` (11 preceding siblings ...)
2009-03-26 19:02 ` ctnetlink 12/12: compute generic part of event more acurately Patrick McHardy
@ 2009-03-27 5:46 ` David Miller
12 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2009-03-27 5:46 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel
From: Patrick McHardy <kaber@trash.net>
Date: Thu, 26 Mar 2009 20:02:30 +0100 (MET)
> Please apply or pull from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6.git
Pulled, thanks.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-03-27 5:46 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-26 19:02 netfilter 00/12: Netfilter fixes/2.6.30 update part II Patrick McHardy
2009-03-26 19:02 ` netfilter 01/12: fix xt_LED build failure Patrick McHardy
2009-03-26 19:02 ` netfilter 02/12: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize() Patrick McHardy
2009-03-26 19:02 ` netfilter 03/12: factorize ifname_compare() Patrick McHardy
2009-03-26 19:02 ` netfilter 04/12: ctnetlink: add callbacks to the per-proto nlattrs Patrick McHardy
2009-03-26 19:02 ` netlink 05/12: add nla_policy_len() Patrick McHardy
2009-03-26 19:02 ` netfilter 06/12: limit the length of the helper name Patrick McHardy
2009-03-26 19:02 ` netfilter 07/12: {ip,ip6,arp}_tables: fix incorrect loop detection Patrick McHardy
2009-03-26 19:02 ` netfilter 08/12: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu() Patrick McHardy
2009-03-26 19:02 ` netfilter 09/12: ctnetlink: allocate right-sized ctnetlink skb Patrick McHardy
2009-03-26 19:02 ` netfilter 10/12: nf_conntrack: add generic function to get len of generic policy Patrick McHardy
2009-03-26 19:02 ` netfilter 11/12: nf_conntrack: calculate per-protocol nlattr size Patrick McHardy
2009-03-26 19:02 ` ctnetlink 12/12: compute generic part of event more acurately Patrick McHardy
2009-03-27 5:46 ` netfilter 00/12: Netfilter fixes/2.6.30 update part II David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).