* [PATCH 00/20] Netfilter/IPVS updates for net-next
@ 2013-11-04 21:50 Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 01/20] ipvs: fix the IPVS_CMD_ATTR_MAX definition Pablo Neira Ayuso
` (20 more replies)
0 siblings, 21 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
Hi David,
This is another batch containing Netfilter/IPVS updates for your net-next
tree. They are:
* Six patches to make the ipt_CLUSTERIP target support net namespaces,
from Gao feng.
* Two cleanups for the nf_conntrack_acct infrastructure, introducing
a new structure to encapsulate conntrack counters, from Holger
Eitzenberger.
* Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann.
* Skip checksum recalculation in SCTP support for IPVS, also from
Daniel Borkmann.
* Fix behavioural change in xt_socket after IP early demux, from
Florian Westphal.
* Fix bogus large memory allocation in the bitmap port set type in ipset,
from Jozsef Kadlecsik.
* Fix possible compilation issues in the hash netnet set type in ipset,
also from Jozsef Kadlecsik.
* Define constants to identify netlink callback data in ipset dumps,
again from Jozsef Kadlecsik.
* Use sock_gen_put() in xt_socket to replace xt_socket_put_sk,
from Eric Dumazet.
* Improvements for the SH scheduler in IPVS, from Alexander Frolkin.
* Remove extra delay due to unneeded rcu barrier in IPVS net namespace
cleanup path, from Julian Anastasov.
* Save some cycles in ip6t_REJECT by skipping checksum validation in
packets leaving from our stack, from Stanislav Fomichev.
* Fix the IPVS_CMD_ATTR_MAX definition in IPVS, which was larger than
required, from Julian Anastasov.
You can pull these changes from:
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master
Thanks!
----------------------------------------------------------------
The following changes since commit 58308451e91974267e1f4a618346055342019e02:
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next (2013-10-10 15:29:44 -0400)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master
for you to fetch changes up to 4542fa4727f5f83faf9e1f28f35be0b9a2317aec:
netfilter: ctnetlink: account both directions in one step (2013-11-03 21:49:32 +0100)
----------------------------------------------------------------
Alexander Frolkin (1):
ipvs: improved SH fallback strategy
Daniel Borkmann (2):
net: ipvs: sctp: add missing verdict assignments in sctp_conn_schedule
net: ipvs: sctp: do not recalc sctp csum when ports didn't change
Eric Dumazet (1):
netfilter: xt_socket: use sock_gen_put()
Florian Westphal (1):
bridge: netfilter: orphan skb before invoking ip netfilter hooks
Gao feng (6):
netfilter: ipt_CLUSTERIP: make proc directory per net namespace
netfilter: ipt_CLUSTERIP: make clusterip_list per net namespace
netfilter: ipt_CLUSTERIP: make clusterip_lock per net namespace
netfilter: ipt_CLUSTERIP: add parameter net in clusterip_config_find_get
netfilter: ipt_CLUSTERIP: create proc entry under proper ipt_CLUSTERIP directory
netfilter: ipt_CLUSTERIP: use proper net namespace to operate CLUSTERIP
Holger Eitzenberger (2):
netfilter: introduce nf_conn_acct structure
netfilter: ctnetlink: account both directions in one step
Jozsef Kadlecsik (3):
netfilter: ipset: Use netlink callback dump args only
netfilter: ipset: The unnamed union initialization may lead to compilation error
netfilter:ipset: Fix memory allocation for bitmap:port
Julian Anastasov (2):
ipvs: fix the IPVS_CMD_ATTR_MAX definition
ipvs: avoid rcu_barrier during netns cleanup
Michael Opdenacker (1):
netfilter: ipset: remove duplicate define
Stanislav Fomichev (1):
netfilter: ip6t_REJECT: skip checksum verification for outgoing ipv6 packets
include/linux/netfilter/ipset/ip_set.h | 10 +++
include/net/ip_vs.h | 6 ++
include/net/netfilter/nf_conntrack_acct.h | 10 ++-
include/net/netfilter/nf_conntrack_extend.h | 2 +-
include/uapi/linux/ip_vs.h | 2 +-
net/bridge/br_netfilter.c | 2 +
net/ipv4/netfilter/ipt_CLUSTERIP.c | 110 ++++++++++++++++++--------
net/ipv6/netfilter/ip6t_REJECT.c | 7 +-
net/netfilter/ipset/ip_set_bitmap_gen.h | 11 +--
net/netfilter/ipset/ip_set_bitmap_port.c | 2 +-
net/netfilter/ipset/ip_set_core.c | 70 ++++++++--------
net/netfilter/ipset/ip_set_hash_gen.h | 21 ++---
net/netfilter/ipset/ip_set_hash_netnet.c | 22 +++---
net/netfilter/ipset/ip_set_hash_netportnet.c | 22 +++---
net/netfilter/ipset/ip_set_list_set.c | 11 +--
net/netfilter/ipvs/ip_vs_ctl.c | 6 +-
net/netfilter/ipvs/ip_vs_lblc.c | 2 +-
net/netfilter/ipvs/ip_vs_lblcr.c | 2 +-
net/netfilter/ipvs/ip_vs_proto_sctp.c | 48 +++++++++--
net/netfilter/ipvs/ip_vs_sh.c | 39 ++++++---
net/netfilter/nf_conntrack_acct.c | 12 +--
net/netfilter/nf_conntrack_core.c | 16 ++--
net/netfilter/nf_conntrack_netlink.c | 51 ++++++------
net/netfilter/xt_connbytes.c | 6 +-
net/netfilter/xt_socket.c | 13 +--
25 files changed, 305 insertions(+), 198 deletions(-)
* [PATCH 01/20] ipvs: fix the IPVS_CMD_ATTR_MAX definition
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Julian Anastasov <ja@ssi.bg>
The definition was wrong (larger than needed), but the problem is harmless.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
include/uapi/linux/ip_vs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h
index 2945822..fbcffe8 100644
--- a/include/uapi/linux/ip_vs.h
+++ b/include/uapi/linux/ip_vs.h
@@ -334,7 +334,7 @@ enum {
__IPVS_CMD_ATTR_MAX,
};
-#define IPVS_CMD_ATTR_MAX (__IPVS_SVC_ATTR_MAX - 1)
+#define IPVS_CMD_ATTR_MAX (__IPVS_CMD_ATTR_MAX - 1)
/*
* Attributes used to describe a service
--
1.7.10.4
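The one-line fix in patch 01/20 relies on the usual enum-sentinel idiom: the `_MAX` macro is derived from the sentinel that closes the *same* enum. The following is a minimal userspace sketch of that idiom, not the IPVS header. All `DEMO_*` names and `demo_table_slots()` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command attributes (illustrative, not the IPVS ones). */
enum {
	DEMO_CMD_ATTR_UNSPEC = 0,
	DEMO_CMD_ATTR_SERVICE,
	DEMO_CMD_ATTR_DEST,
	__DEMO_CMD_ATTR_MAX,	/* sentinel: one past the last real attribute */
};

/* Hypothetical service attributes; this enum happens to be longer. */
enum {
	DEMO_SVC_ATTR_UNSPEC = 0,
	DEMO_SVC_ATTR_AF,
	DEMO_SVC_ATTR_PROTOCOL,
	DEMO_SVC_ATTR_ADDR,
	DEMO_SVC_ATTR_PORT,
	__DEMO_SVC_ATTR_MAX,
};

/* Correct: derived from the sentinel of the enum it describes. */
#define DEMO_CMD_ATTR_MAX	(__DEMO_CMD_ATTR_MAX - 1)

/* The class of mistake the patch fixes: the wrong enum's sentinel. */
#define DEMO_CMD_ATTR_MAX_BUGGY	(__DEMO_SVC_ATTR_MAX - 1)

/* Slots needed by a table indexed from 0 to max_attr inclusive. */
static size_t demo_table_slots(int max_attr)
{
	assert(max_attr >= 0);
	return (size_t)max_attr + 1;
}
```

In this sketch the buggy macro yields a larger value than the correct one, so a table sized from it is merely oversized rather than too small, which matches the "harmless" assessment in the commit message.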
* [PATCH 02/20] ipvs: avoid rcu_barrier during netns cleanup
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Julian Anastasov <ja@ssi.bg>
commit 578bc3ef1e473a ("ipvs: reorganize dest trash") added
rcu_barrier() on cleanup to wait for dest users and schedulers
like LBLC and LBLCR to put their last dest reference.
Using rcu_barrier() with many namespaces is problematic.
Trying to fix it by freeing dest with kfree_rcu() is not
a solution: RCU callbacks can run in parallel and their execution
order is random.
Fix it by creating a new function, ip_vs_dest_put_and_free(),
which is heavier than ip_vs_dest_put(). We will use it only
for schedulers like LBLC and LBLCR that can delay their dest
release.
By default, a dest's reference count is above 0 while it is present in
a service, and it is 0 once it is deleted but still in the trash list.
Change the dest trash code to use ip_vs_dest_put_and_free(),
so that refcnt -1 can be used for freeing. As a result,
such checks remain in the slow path and the rcu_barrier() in
netns cleanup can be removed.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
include/net/ip_vs.h | 6 ++++++
net/netfilter/ipvs/ip_vs_ctl.c | 6 +-----
net/netfilter/ipvs/ip_vs_lblc.c | 2 +-
net/netfilter/ipvs/ip_vs_lblcr.c | 2 +-
4 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 1c2e1b9..cd7275f 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1442,6 +1442,12 @@ static inline void ip_vs_dest_put(struct ip_vs_dest *dest)
atomic_dec(&dest->refcnt);
}
+static inline void ip_vs_dest_put_and_free(struct ip_vs_dest *dest)
+{
+ if (atomic_dec_return(&dest->refcnt) < 0)
+ kfree(dest);
+}
+
/*
* IPVS sync daemon data and function prototypes
* (from ip_vs_sync.c)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index a3df9bd..62786a4 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -704,7 +704,7 @@ static void ip_vs_dest_free(struct ip_vs_dest *dest)
__ip_vs_dst_cache_reset(dest);
__ip_vs_svc_put(svc, false);
free_percpu(dest->stats.cpustats);
- kfree(dest);
+ ip_vs_dest_put_and_free(dest);
}
/*
@@ -3820,10 +3820,6 @@ void __net_exit ip_vs_control_net_cleanup(struct net *net)
{
struct netns_ipvs *ipvs = net_ipvs(net);
- /* Some dest can be in grace period even before cleanup, we have to
- * defer ip_vs_trash_cleanup until ip_vs_dest_wait_readers is called.
- */
- rcu_barrier();
ip_vs_trash_cleanup(net);
ip_vs_stop_estimator(net, &ipvs->tot_stats);
ip_vs_control_net_cleanup_sysctl(net);
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index eff13c9..ca056a3 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -136,7 +136,7 @@ static void ip_vs_lblc_rcu_free(struct rcu_head *head)
struct ip_vs_lblc_entry,
rcu_head);
- ip_vs_dest_put(en->dest);
+ ip_vs_dest_put_and_free(en->dest);
kfree(en);
}
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 0b85500..3f21a2f 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -130,7 +130,7 @@ static void ip_vs_lblcr_elem_rcu_free(struct rcu_head *head)
struct ip_vs_dest_set_elem *e;
e = container_of(head, struct ip_vs_dest_set_elem, rcu_head);
- ip_vs_dest_put(e->dest);
+ ip_vs_dest_put_and_free(e->dest);
kfree(e);
}
--
1.7.10.4
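The refcount convention described in patch 02/20 (above 0 while in a service, 0 while in the trash, freed when the count drops to -1) can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the kernel code: `struct demo_dest`, the `demo_*` helpers and the `demo_free_count` counter are hypothetical, and `free()` stands in for `kfree()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ip_vs_dest. */
struct demo_dest {
	atomic_int refcnt;	/* >0: in service, 0: in trash, -1: free it */
};

static int demo_free_count;	/* frees performed, for observation only */

static struct demo_dest *demo_dest_new(void)
{
	struct demo_dest *d = malloc(sizeof(*d));

	if (d)
		atomic_init(&d->refcnt, 1);	/* reference held by the service */
	return d;
}

static void demo_dest_hold(struct demo_dest *d)
{
	atomic_fetch_add(&d->refcnt, 1);
}

/* Cheap put for the fast path: never frees. */
static void demo_dest_put(struct demo_dest *d)
{
	atomic_fetch_sub(&d->refcnt, 1);
}

/* Heavier put for the slow path: whoever drops the count below 0 frees. */
static void demo_dest_put_and_free(struct demo_dest *d)
{
	/* atomic_fetch_sub() returns the old value; the new one is old - 1. */
	if (atomic_fetch_sub(&d->refcnt, 1) - 1 < 0) {
		demo_free_count++;
		free(d);
	}
}

/* Returns the number of frees seen right after the trash cleanup step. */
static int demo_scenario(void)
{
	struct demo_dest *d = demo_dest_new();
	int frees_after_cleanup;

	assert(d != NULL);
	demo_free_count = 0;
	demo_dest_hold(d);		/* a scheduler caches the dest: 2 */
	demo_dest_put(d);		/* dest leaves the service: 1 */
	demo_dest_put_and_free(d);	/* trash cleanup: 0, not freed yet */
	frees_after_cleanup = demo_free_count;
	demo_dest_put_and_free(d);	/* delayed scheduler release: -1, freed */
	return frees_after_cleanup;
}
```

In this sketch the two `demo_dest_put_and_free()` calls could run in either order; whichever runs last takes the count to -1 and performs the free, so neither side has to wait for the other.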
* [PATCH 03/20] ipvs: improved SH fallback strategy
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Alexander Frolkin <avf@eldamar.org.uk>
Improve the SH fallback realserver selection strategy.
With sh and sh-fallback, if a realserver is down, this attempts to
distribute the traffic that would have gone to that server evenly
among the remaining servers.
Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_sh.c | 39 +++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
index 3588fae..cc65b2f 100644
--- a/net/netfilter/ipvs/ip_vs_sh.c
+++ b/net/netfilter/ipvs/ip_vs_sh.c
@@ -115,27 +115,46 @@ ip_vs_sh_get(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
}
-/* As ip_vs_sh_get, but with fallback if selected server is unavailable */
+/* As ip_vs_sh_get, but with fallback if selected server is unavailable
+ *
+ * The fallback strategy loops around the table starting from a "random"
+ * point (in fact, it is chosen to be the original hash value to make the
+ * algorithm deterministic) to find a new server.
+ */
static inline struct ip_vs_dest *
ip_vs_sh_get_fallback(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
const union nf_inet_addr *addr, __be16 port)
{
- unsigned int offset;
- unsigned int hash;
+ unsigned int offset, roffset;
+ unsigned int hash, ihash;
struct ip_vs_dest *dest;
+ /* first try the dest it's supposed to go to */
+ ihash = ip_vs_sh_hashkey(svc->af, addr, port, 0);
+ dest = rcu_dereference(s->buckets[ihash].dest);
+ if (!dest)
+ return NULL;
+ if (!is_unavailable(dest))
+ return dest;
+
+ IP_VS_DBG_BUF(6, "SH: selected unavailable server %s:%d, reselecting",
+ IP_VS_DBG_ADDR(svc->af, &dest->addr), ntohs(dest->port));
+
+ /* if the original dest is unavailable, loop around the table
+ * starting from ihash to find a new dest
+ */
for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
- hash = ip_vs_sh_hashkey(svc->af, addr, port, offset);
+ roffset = (offset + ihash) % IP_VS_SH_TAB_SIZE;
+ hash = ip_vs_sh_hashkey(svc->af, addr, port, roffset);
dest = rcu_dereference(s->buckets[hash].dest);
if (!dest)
break;
- if (is_unavailable(dest))
- IP_VS_DBG_BUF(6, "SH: selected unavailable server "
- "%s:%d (offset %d)",
- IP_VS_DBG_ADDR(svc->af, &dest->addr),
- ntohs(dest->port), offset);
- else
+ if (!is_unavailable(dest))
return dest;
+ IP_VS_DBG_BUF(6, "SH: selected unavailable "
+ "server %s:%d (offset %d), reselecting",
+ IP_VS_DBG_ADDR(svc->af, &dest->addr),
+ ntohs(dest->port), roffset);
}
return NULL;
--
1.7.10.4
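The fallback loop from patch 03/20 can be exercised outside the kernel with a small table. The sketch below assumes a toy hash (`demo_hashkey()`), an 8-entry table and three servers assigned to buckets round-robin. None of these come from the patch, and only the control flow of `demo_sh_get_fallback()` mirrors the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TAB_SIZE 8u

struct demo_server {
	int id;
	bool available;
};

/* Hypothetical stand-in for ip_vs_sh_hashkey(): key mixed with an offset. */
static unsigned int demo_hashkey(uint32_t addr, unsigned int offset)
{
	return (addr * 31u + offset) % DEMO_TAB_SIZE;
}

/* Same control flow as the patched ip_vs_sh_get_fallback(). */
static const struct demo_server *
demo_sh_get_fallback(const struct demo_server *const *buckets, uint32_t addr)
{
	unsigned int offset, roffset, hash, ihash;
	const struct demo_server *dest;

	/* First try the server the key is supposed to go to. */
	ihash = demo_hashkey(addr, 0);
	dest = buckets[ihash];
	if (!dest)
		return NULL;
	if (dest->available)
		return dest;

	/* Otherwise walk the table, starting from the original hash value. */
	for (offset = 0; offset < DEMO_TAB_SIZE; offset++) {
		roffset = (offset + ihash) % DEMO_TAB_SIZE;
		hash = demo_hashkey(addr, roffset);
		dest = buckets[hash];
		if (!dest)
			break;
		if (dest->available)
			return dest;
	}
	return NULL;
}

/* Builds a 3-server table; bit i of up_mask marks server i as available.
 * Returns the selected server id, or -1 if none could be selected. */
static int demo_select(uint32_t addr, unsigned int up_mask)
{
	struct demo_server servers[3];
	const struct demo_server *buckets[DEMO_TAB_SIZE];
	const struct demo_server *dest;
	unsigned int i;

	assert(up_mask < 8u);
	for (i = 0; i < 3; i++) {
		servers[i].id = (int)i;
		servers[i].available = ((up_mask >> i) & 1u) != 0;
	}
	for (i = 0; i < DEMO_TAB_SIZE; i++)
		buckets[i] = &servers[i % 3];

	dest = demo_sh_get_fallback(buckets, addr);
	return dest ? dest->id : -1;
}
```

Because the starting point of the walk is derived from the key itself, the same key always falls back to the same server for a given availability pattern, while different keys start their walk at different points.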
* [PATCH 04/20] netfilter: xt_socket: use sock_gen_put()
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Eric Dumazet <edumazet@google.com>
TCP listener refactoring, part 7:
Use sock_gen_put() instead of xt_socket_put_sk() for future
SYN_RECV support.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/xt_socket.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
index 3dd0e37..1ba6793 100644
--- a/net/netfilter/xt_socket.c
+++ b/net/netfilter/xt_socket.c
@@ -35,15 +35,6 @@
#include <net/netfilter/nf_conntrack.h>
#endif
-static void
-xt_socket_put_sk(struct sock *sk)
-{
- if (sk->sk_state == TCP_TIME_WAIT)
- inet_twsk_put(inet_twsk(sk));
- else
- sock_put(sk);
-}
-
static int
extract_icmp4_fields(const struct sk_buff *skb,
u8 *protocol,
@@ -216,7 +207,7 @@ socket_match(const struct sk_buff *skb, struct xt_action_param *par,
inet_twsk(sk)->tw_transparent));
if (sk != skb->sk)
- xt_socket_put_sk(sk);
+ sock_gen_put(sk);
if (wildcard || !transparent)
sk = NULL;
@@ -381,7 +372,7 @@ socket_mt6_v1_v2(const struct sk_buff *skb, struct xt_action_param *par)
inet_twsk(sk)->tw_transparent));
if (sk != skb->sk)
- xt_socket_put_sk(sk);
+ sock_gen_put(sk);
if (wildcard || !transparent)
sk = NULL;
--
1.7.10.4
* [PATCH 05/20] netfilter: ipt_CLUSTERIP: make proc directory per net namespace
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Gao feng <gaofeng@cn.fujitsu.com>
Create a /proc/net/ipt_CLUSTERIP directory in each net namespace.
For now, entries can only be created under the ipt_CLUSTERIP
directory of the init net namespace.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 70 ++++++++++++++++++++++++++----------
1 file changed, 51 insertions(+), 19 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 0b732ef..e66b91b 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -28,6 +28,7 @@
#include <linux/netfilter_ipv4/ipt_CLUSTERIP.h>
#include <net/netfilter/nf_conntrack.h>
#include <net/net_namespace.h>
+#include <net/netns/generic.h>
#include <net/checksum.h>
#include <net/ip.h>
@@ -64,9 +65,16 @@ static DEFINE_SPINLOCK(clusterip_lock);
#ifdef CONFIG_PROC_FS
static const struct file_operations clusterip_proc_fops;
-static struct proc_dir_entry *clusterip_procdir;
#endif
+static int clusterip_net_id __read_mostly;
+
+struct clusterip_net {
+#ifdef CONFIG_PROC_FS
+ struct proc_dir_entry *procdir;
+#endif
+};
+
static inline void
clusterip_config_get(struct clusterip_config *c)
{
@@ -158,6 +166,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
struct net_device *dev)
{
struct clusterip_config *c;
+ struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
c = kzalloc(sizeof(*c), GFP_ATOMIC);
if (!c)
@@ -180,7 +189,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
/* create proc dir entry */
sprintf(buffer, "%pI4", &ip);
c->pde = proc_create_data(buffer, S_IWUSR|S_IRUSR,
- clusterip_procdir,
+ cn->procdir,
&clusterip_proc_fops, c);
if (!c->pde) {
kfree(c);
@@ -698,48 +707,71 @@ static const struct file_operations clusterip_proc_fops = {
#endif /* CONFIG_PROC_FS */
+static int clusterip_net_init(struct net *net)
+{
+#ifdef CONFIG_PROC_FS
+ struct clusterip_net *cn = net_generic(net, clusterip_net_id);
+
+ cn->procdir = proc_mkdir("ipt_CLUSTERIP", net->proc_net);
+ if (!cn->procdir) {
+ pr_err("Unable to proc dir entry\n");
+ return -ENOMEM;
+ }
+#endif /* CONFIG_PROC_FS */
+
+ return 0;
+}
+
+static void clusterip_net_exit(struct net *net)
+{
+#ifdef CONFIG_PROC_FS
+ struct clusterip_net *cn = net_generic(net, clusterip_net_id);
+ proc_remove(cn->procdir);
+#endif
+}
+
+static struct pernet_operations clusterip_net_ops = {
+ .init = clusterip_net_init,
+ .exit = clusterip_net_exit,
+ .id = &clusterip_net_id,
+ .size = sizeof(struct clusterip_net),
+};
+
static int __init clusterip_tg_init(void)
{
int ret;
- ret = xt_register_target(&clusterip_tg_reg);
+ ret = register_pernet_subsys(&clusterip_net_ops);
if (ret < 0)
return ret;
+ ret = xt_register_target(&clusterip_tg_reg);
+ if (ret < 0)
+ goto cleanup_subsys;
+
ret = nf_register_hook(&cip_arp_ops);
if (ret < 0)
goto cleanup_target;
-#ifdef CONFIG_PROC_FS
- clusterip_procdir = proc_mkdir("ipt_CLUSTERIP", init_net.proc_net);
- if (!clusterip_procdir) {
- pr_err("Unable to proc dir entry\n");
- ret = -ENOMEM;
- goto cleanup_hook;
- }
-#endif /* CONFIG_PROC_FS */
-
pr_info("ClusterIP Version %s loaded successfully\n",
CLUSTERIP_VERSION);
+
return 0;
-#ifdef CONFIG_PROC_FS
-cleanup_hook:
- nf_unregister_hook(&cip_arp_ops);
-#endif /* CONFIG_PROC_FS */
cleanup_target:
xt_unregister_target(&clusterip_tg_reg);
+cleanup_subsys:
+ unregister_pernet_subsys(&clusterip_net_ops);
return ret;
}
static void __exit clusterip_tg_exit(void)
{
pr_info("ClusterIP Version %s unloading\n", CLUSTERIP_VERSION);
-#ifdef CONFIG_PROC_FS
- proc_remove(clusterip_procdir);
-#endif
+
nf_unregister_hook(&cip_arp_ops);
xt_unregister_target(&clusterip_tg_reg);
+ unregister_pernet_subsys(&clusterip_net_ops);
/* Wait for completion of call_rcu_bh()'s (clusterip_config_rcu_free) */
rcu_barrier_bh();
--
1.7.10.4
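Patch 05/20 reorders module initialisation so that the per-namespace subsystem is registered first and torn down last, with `goto` labels unwinding in reverse order on failure. The userspace sketch below shows only that ordering. The `demo_*` functions and the `fail_mask` parameter used to simulate failures are hypothetical, not kernel APIs.

```c
#include <assert.h>

/* Hypothetical components, one bit each. */
enum { DEMO_SUBSYS = 1, DEMO_TARGET = 2, DEMO_HOOK = 4 };

static unsigned int demo_registered;	/* bitmask of what is registered */

/* Registers one component, or fails if fail_mask requests it. */
static int demo_register(unsigned int what, unsigned int fail_mask)
{
	if (fail_mask & what)
		return -1;
	demo_registered |= what;
	return 0;
}

static void demo_unregister(unsigned int what)
{
	assert(demo_registered & what);	/* never undo what was not done */
	demo_registered &= ~what;
}

/* Mirrors the order in clusterip_tg_init(): subsys, target, hook. */
static int demo_init(unsigned int fail_mask)
{
	int ret;

	ret = demo_register(DEMO_SUBSYS, fail_mask);
	if (ret < 0)
		return ret;

	ret = demo_register(DEMO_TARGET, fail_mask);
	if (ret < 0)
		goto cleanup_subsys;

	ret = demo_register(DEMO_HOOK, fail_mask);
	if (ret < 0)
		goto cleanup_target;

	return 0;

cleanup_target:
	demo_unregister(DEMO_TARGET);
cleanup_subsys:
	demo_unregister(DEMO_SUBSYS);
	return ret;
}

/* Tears down in reverse order; returns what is still registered. */
static unsigned int demo_exit(void)
{
	demo_unregister(DEMO_HOOK);
	demo_unregister(DEMO_TARGET);
	demo_unregister(DEMO_SUBSYS);
	return demo_registered;
}
```

Each label undoes exactly the steps that succeeded before the failing one, so a failed `demo_init()` leaves nothing registered.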
* [PATCH 06/20] netfilter: ipt_CLUSTERIP: make clusterip_list per net namespace
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Gao feng <gaofeng@cn.fujitsu.com>
clusterip_configs should be per net namespace, so that operating
on a cluster in one net namespace does not affect other net
namespaces. For now, only the clusterip_configs of the init net
namespace can be operated on.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index e66b91b..8ef3e6f 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -58,8 +58,6 @@ struct clusterip_config {
struct rcu_head rcu;
};
-static LIST_HEAD(clusterip_configs);
-
/* clusterip_lock protects the clusterip_configs list */
static DEFINE_SPINLOCK(clusterip_lock);
@@ -70,6 +68,7 @@ static const struct file_operations clusterip_proc_fops;
static int clusterip_net_id __read_mostly;
struct clusterip_net {
+ struct list_head configs;
#ifdef CONFIG_PROC_FS
struct proc_dir_entry *procdir;
#endif
@@ -124,8 +123,9 @@ static struct clusterip_config *
__clusterip_config_find(__be32 clusterip)
{
struct clusterip_config *c;
+ struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
- list_for_each_entry_rcu(c, &clusterip_configs, list) {
+ list_for_each_entry_rcu(c, &cn->configs, list) {
if (c->clusterip == clusterip)
return c;
}
@@ -199,7 +199,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
#endif
spin_lock_bh(&clusterip_lock);
- list_add_rcu(&c->list, &clusterip_configs);
+ list_add_rcu(&c->list, &cn->configs);
spin_unlock_bh(&clusterip_lock);
return c;
@@ -709,9 +709,11 @@ static const struct file_operations clusterip_proc_fops = {
static int clusterip_net_init(struct net *net)
{
-#ifdef CONFIG_PROC_FS
struct clusterip_net *cn = net_generic(net, clusterip_net_id);
+ INIT_LIST_HEAD(&cn->configs);
+
+#ifdef CONFIG_PROC_FS
cn->procdir = proc_mkdir("ipt_CLUSTERIP", net->proc_net);
if (!cn->procdir) {
pr_err("Unable to proc dir entry\n");
--
1.7.10.4
* [PATCH 07/20] netfilter: ipt_CLUSTERIP: make clusterip_lock per net namespace
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Gao feng <gaofeng@cn.fujitsu.com>
This lock protects the per net namespace clusterip_configs list,
so it should be per net namespace too.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 8ef3e6f..1bf5aa30 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -58,9 +58,6 @@ struct clusterip_config {
struct rcu_head rcu;
};
-/* clusterip_lock protects the clusterip_configs list */
-static DEFINE_SPINLOCK(clusterip_lock);
-
#ifdef CONFIG_PROC_FS
static const struct file_operations clusterip_proc_fops;
#endif
@@ -69,6 +66,9 @@ static int clusterip_net_id __read_mostly;
struct clusterip_net {
struct list_head configs;
+ /* lock protects the configs list */
+ spinlock_t lock;
+
#ifdef CONFIG_PROC_FS
struct proc_dir_entry *procdir;
#endif
@@ -99,10 +99,12 @@ clusterip_config_put(struct clusterip_config *c)
static inline void
clusterip_config_entry_put(struct clusterip_config *c)
{
+ struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+
local_bh_disable();
- if (atomic_dec_and_lock(&c->entries, &clusterip_lock)) {
+ if (atomic_dec_and_lock(&c->entries, &cn->lock)) {
list_del_rcu(&c->list);
- spin_unlock(&clusterip_lock);
+ spin_unlock(&cn->lock);
local_bh_enable();
dev_mc_del(c->dev, c->clustermac);
@@ -198,9 +200,9 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
}
#endif
- spin_lock_bh(&clusterip_lock);
+ spin_lock_bh(&cn->lock);
list_add_rcu(&c->list, &cn->configs);
- spin_unlock_bh(&clusterip_lock);
+ spin_unlock_bh(&cn->lock);
return c;
}
@@ -713,6 +715,8 @@ static int clusterip_net_init(struct net *net)
INIT_LIST_HEAD(&cn->configs);
+ spin_lock_init(&cn->lock);
+
#ifdef CONFIG_PROC_FS
cn->procdir = proc_mkdir("ipt_CLUSTERIP", net->proc_net);
if (!cn->procdir) {
--
1.7.10.4
* [PATCH 08/20] netfilter: ipt_CLUSTERIP: add parameter net in clusterip_config_find_get
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Gao feng <gaofeng@cn.fujitsu.com>
This is needed in order to find the clusterip_config in a given net namespace.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 1bf5aa30..b7fc9d5 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -122,10 +122,10 @@ clusterip_config_entry_put(struct clusterip_config *c)
}
static struct clusterip_config *
-__clusterip_config_find(__be32 clusterip)
+__clusterip_config_find(struct net *net, __be32 clusterip)
{
struct clusterip_config *c;
- struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+ struct clusterip_net *cn = net_generic(net, clusterip_net_id);
list_for_each_entry_rcu(c, &cn->configs, list) {
if (c->clusterip == clusterip)
@@ -136,12 +136,12 @@ __clusterip_config_find(__be32 clusterip)
}
static inline struct clusterip_config *
-clusterip_config_find_get(__be32 clusterip, int entry)
+clusterip_config_find_get(struct net *net, __be32 clusterip, int entry)
{
struct clusterip_config *c;
rcu_read_lock_bh();
- c = __clusterip_config_find(clusterip);
+ c = __clusterip_config_find(net, clusterip);
if (c) {
if (unlikely(!atomic_inc_not_zero(&c->refcount)))
c = NULL;
@@ -381,7 +381,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
/* FIXME: further sanity checks */
- config = clusterip_config_find_get(e->ip.dst.s_addr, 1);
+ config = clusterip_config_find_get(&init_net, e->ip.dst.s_addr, 1);
if (!config) {
if (!(cipinfo->flags & CLUSTERIP_FLAG_NEW)) {
pr_info("no config found for %pI4, need 'new'\n",
@@ -519,7 +519,7 @@ arp_mangle(unsigned int hook,
/* if there is no clusterip configuration for the arp reply's
* source ip, we don't want to mangle it */
- c = clusterip_config_find_get(payload->src_ip, 0);
+ c = clusterip_config_find_get(&init_net, payload->src_ip, 0);
if (!c)
return NF_ACCEPT;
--
1.7.10.4
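The effect of threading a namespace argument through the lookup, as patches 06/20 to 08/20 do, can be illustrated in userspace. In the sketch below `struct demo_net` plays the role of the per-namespace `struct clusterip_net`, and a fixed-size array replaces the RCU-protected list. All names are hypothetical and locking is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CONFIGS 4

struct demo_config {
	uint32_t clusterip;	/* lookup key */
	int id;			/* payload, for observation only */
};

/* Hypothetical per-namespace state: each namespace owns its own configs. */
struct demo_net {
	struct demo_config configs[DEMO_MAX_CONFIGS];
	size_t nconfigs;
};

static int demo_config_add(struct demo_net *net, uint32_t ip, int id)
{
	if (net->nconfigs >= DEMO_MAX_CONFIGS)
		return -1;
	net->configs[net->nconfigs].clusterip = ip;
	net->configs[net->nconfigs].id = id;
	net->nconfigs++;
	return 0;
}

/* The namespace is an explicit parameter, as in the patched lookup. */
static const struct demo_config *
demo_config_find(const struct demo_net *net, uint32_t ip)
{
	size_t i;

	for (i = 0; i < net->nconfigs; i++) {
		if (net->configs[i].clusterip == ip)
			return &net->configs[i];
	}
	return NULL;
}

/* Two namespaces share one address; returns the id found, or -1. */
static int demo_scenario_lookup(int which_net, uint32_t ip)
{
	struct demo_net nets[2] = { { .nconfigs = 0 }, { .nconfigs = 0 } };
	const struct demo_config *c;

	assert(which_net == 0 || which_net == 1);
	demo_config_add(&nets[0], 0x0a000001u, 1);
	demo_config_add(&nets[1], 0x0a000001u, 2);	/* same IP, other netns */
	demo_config_add(&nets[1], 0x0a000002u, 3);

	c = demo_config_find(&nets[which_net], ip);
	return c ? c->id : -1;
}
```

The same cluster IP resolves to a different config in each namespace, and an address configured in only one namespace is invisible from the other.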
* [PATCH 09/20] netfilter: ipt_CLUSTERIP: create proc entry under proper ipt_CLUSTERIP directory
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Gao feng <gaofeng@cn.fujitsu.com>
Create proc entries under the ipt_CLUSTERIP directory of the proper
net namespace.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index b7fc9d5..c93dfd2 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -168,7 +168,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
struct net_device *dev)
{
struct clusterip_config *c;
- struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+ struct clusterip_net *cn = net_generic(dev_net(dev), clusterip_net_id);
c = kzalloc(sizeof(*c), GFP_ATOMIC);
if (!c)
--
1.7.10.4
* [PATCH 10/20] netfilter: ipt_CLUSTERIP: use proper net namespace to operate CLUSTERIP
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Gao feng <gaofeng@cn.fujitsu.com>
We can now allow users in non-init net namespaces to operate
ipt_CLUSTERIP.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv4/netfilter/ipt_CLUSTERIP.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index c93dfd2..ecd808a 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -99,7 +99,8 @@ clusterip_config_put(struct clusterip_config *c)
static inline void
clusterip_config_entry_put(struct clusterip_config *c)
{
- struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+ struct net *net = dev_net(c->dev);
+ struct clusterip_net *cn = net_generic(net, clusterip_net_id);
local_bh_disable();
if (atomic_dec_and_lock(&c->entries, &cn->lock)) {
@@ -381,7 +382,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
/* FIXME: further sanity checks */
- config = clusterip_config_find_get(&init_net, e->ip.dst.s_addr, 1);
+ config = clusterip_config_find_get(par->net, e->ip.dst.s_addr, 1);
if (!config) {
if (!(cipinfo->flags & CLUSTERIP_FLAG_NEW)) {
pr_info("no config found for %pI4, need 'new'\n",
@@ -395,7 +396,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
return -EINVAL;
}
- dev = dev_get_by_name(&init_net, e->ip.iniface);
+ dev = dev_get_by_name(par->net, e->ip.iniface);
if (!dev) {
pr_info("no such interface %s\n",
e->ip.iniface);
@@ -503,6 +504,7 @@ arp_mangle(unsigned int hook,
struct arphdr *arp = arp_hdr(skb);
struct arp_payload *payload;
struct clusterip_config *c;
+ struct net *net = dev_net(in ? in : out);
/* we don't care about non-ethernet and non-ipv4 ARP */
if (arp->ar_hrd != htons(ARPHRD_ETHER) ||
@@ -519,7 +521,7 @@ arp_mangle(unsigned int hook,
/* if there is no clusterip configuration for the arp reply's
* source ip, we don't want to mangle it */
- c = clusterip_config_find_get(&init_net, payload->src_ip, 0);
+ c = clusterip_config_find_get(net, payload->src_ip, 0);
if (!c)
return NF_ACCEPT;
--
1.7.10.4
* [PATCH 11/20] netfilter: ipset: Use netlink callback dump args only
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (9 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 10/20] netfilter: ipt_CLUSTERIP: use proper net namespace to operate CLUSTERIP Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 12/20] netfilter: ipset: The unnamed union initialization may lead to compilation error Pablo Neira Ayuso
` (9 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Instead of cb->data, use the callback dump args only, and introduce symbolic
names instead of plain numbers for accessing the argument members.
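As a rough userspace illustration of the idea (not the kernel code itself), the sketch below keeps all dump state in named slots of an argument array. Every demo_* name is hypothetical, and intptr_t is used for the slots so the pointer round trip stays portable outside the kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the scratch slots of struct netlink_callback. */
struct demo_callback {
	intptr_t args[6];
};

/* Symbolic slot names in the spirit of the enum added by this patch. */
enum {
	DEMO_CB_NET = 0,	/* per-namespace instance pointer */
	DEMO_CB_DUMP,		/* dump type and flags */
	DEMO_CB_INDEX,		/* index of the set being dumped */
	DEMO_CB_ARG0,		/* first type-specific slot */
};

/* Hypothetical per-namespace state. */
struct demo_net {
	unsigned int max_sets;
};

/* Record the dump state in named slots instead of a separate data pointer. */
static void demo_dump_init(struct demo_callback *cb, struct demo_net *inst,
			   intptr_t dump_type)
{
	assert(cb != NULL && inst != NULL);
	cb->args[DEMO_CB_NET] = (intptr_t)inst;
	cb->args[DEMO_CB_DUMP] = dump_type;
	cb->args[DEMO_CB_INDEX] = 0;
	cb->args[DEMO_CB_ARG0] = 0;
}

/* Recover the per-namespace state from the callback slots. */
static struct demo_net *demo_cb_net(const struct demo_callback *cb)
{
	return (struct demo_net *)cb->args[DEMO_CB_NET];
}
```

With named slots, a reader no longer has to remember which plain index carries the set index and which carries the type-specific cursor.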
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/netfilter/ipset/ip_set.h | 10 +++++
net/netfilter/ipset/ip_set_bitmap_gen.h | 11 ++---
net/netfilter/ipset/ip_set_core.c | 70 +++++++++++++++----------------
net/netfilter/ipset/ip_set_hash_gen.h | 20 +++++----
net/netfilter/ipset/ip_set_list_set.c | 11 ++---
5 files changed, 68 insertions(+), 54 deletions(-)
diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 7967516..c7174b8 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -316,6 +316,16 @@ ip_set_init_counter(struct ip_set_counter *counter,
atomic64_set(&(counter)->packets, (long long)(ext->packets));
}
+/* Netlink CB args */
+enum {
+ IPSET_CB_NET = 0,
+ IPSET_CB_DUMP,
+ IPSET_CB_INDEX,
+ IPSET_CB_ARG0,
+ IPSET_CB_ARG1,
+ IPSET_CB_ARG2,
+};
+
/* register and unregister set references */
extern ip_set_id_t ip_set_get_byname(struct net *net,
const char *name, struct ip_set **set);
diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h
index a13e15b..f2c7d83 100644
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -198,13 +198,14 @@ mtype_list(const struct ip_set *set,
struct mtype *map = set->data;
struct nlattr *adt, *nested;
void *x;
- u32 id, first = cb->args[2];
+ u32 id, first = cb->args[IPSET_CB_ARG0];
adt = ipset_nest_start(skb, IPSET_ATTR_ADT);
if (!adt)
return -EMSGSIZE;
- for (; cb->args[2] < map->elements; cb->args[2]++) {
- id = cb->args[2];
+ for (; cb->args[IPSET_CB_ARG0] < map->elements;
+ cb->args[IPSET_CB_ARG0]++) {
+ id = cb->args[IPSET_CB_ARG0];
x = get_ext(set, map, id);
if (!test_bit(id, map->members) ||
(SET_WITH_TIMEOUT(set) &&
@@ -231,14 +232,14 @@ mtype_list(const struct ip_set *set,
ipset_nest_end(skb, adt);
/* Set listing finished */
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
return 0;
nla_put_failure:
nla_nest_cancel(skb, nested);
if (unlikely(id == first)) {
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
return -EMSGSIZE;
}
ipset_nest_end(skb, adt);
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index dc9284b..bac7e01 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1182,10 +1182,12 @@ ip_set_swap(struct sock *ctnl, struct sk_buff *skb,
static int
ip_set_dump_done(struct netlink_callback *cb)
{
- struct ip_set_net *inst = (struct ip_set_net *)cb->data;
- if (cb->args[2]) {
- pr_debug("release set %s\n", nfnl_set(inst, cb->args[1])->name);
- __ip_set_put_byindex(inst, (ip_set_id_t) cb->args[1]);
+ struct ip_set_net *inst = (struct ip_set_net *)cb->args[IPSET_CB_NET];
+ if (cb->args[IPSET_CB_ARG0]) {
+ pr_debug("release set %s\n",
+ nfnl_set(inst, cb->args[IPSET_CB_INDEX])->name);
+ __ip_set_put_byindex(inst,
+ (ip_set_id_t) cb->args[IPSET_CB_INDEX]);
}
return 0;
}
@@ -1203,7 +1205,7 @@ dump_attrs(struct nlmsghdr *nlh)
}
static int
-dump_init(struct netlink_callback *cb)
+dump_init(struct netlink_callback *cb, struct ip_set_net *inst)
{
struct nlmsghdr *nlh = nlmsg_hdr(cb->skb);
int min_len = nlmsg_total_size(sizeof(struct nfgenmsg));
@@ -1211,15 +1213,15 @@ dump_init(struct netlink_callback *cb)
struct nlattr *attr = (void *)nlh + min_len;
u32 dump_type;
ip_set_id_t index;
- struct ip_set_net *inst = (struct ip_set_net *)cb->data;
/* Second pass, so parser can't fail */
nla_parse(cda, IPSET_ATTR_CMD_MAX,
attr, nlh->nlmsg_len - min_len, ip_set_setname_policy);
- /* cb->args[0] : dump single set/all sets
- * [1] : set index
- * [..]: type specific
+ /* cb->args[IPSET_CB_NET]: net namespace
+ * [IPSET_CB_DUMP]: dump single set/all sets
+ * [IPSET_CB_INDEX]: set index
+ * [IPSET_CB_ARG0]: type specific
*/
if (cda[IPSET_ATTR_SETNAME]) {
@@ -1231,7 +1233,7 @@ dump_init(struct netlink_callback *cb)
return -ENOENT;
dump_type = DUMP_ONE;
- cb->args[1] = index;
+ cb->args[IPSET_CB_INDEX] = index;
} else
dump_type = DUMP_ALL;
@@ -1239,7 +1241,8 @@ dump_init(struct netlink_callback *cb)
u32 f = ip_set_get_h32(cda[IPSET_ATTR_FLAGS]);
dump_type |= (f << 16);
}
- cb->args[0] = dump_type;
+ cb->args[IPSET_CB_NET] = (unsigned long)inst;
+ cb->args[IPSET_CB_DUMP] = dump_type;
return 0;
}
@@ -1251,12 +1254,12 @@ ip_set_dump_start(struct sk_buff *skb, struct netlink_callback *cb)
struct ip_set *set = NULL;
struct nlmsghdr *nlh = NULL;
unsigned int flags = NETLINK_CB(cb->skb).portid ? NLM_F_MULTI : 0;
+ struct ip_set_net *inst = ip_set_pernet(sock_net(skb->sk));
u32 dump_type, dump_flags;
int ret = 0;
- struct ip_set_net *inst = (struct ip_set_net *)cb->data;
- if (!cb->args[0]) {
- ret = dump_init(cb);
+ if (!cb->args[IPSET_CB_DUMP]) {
+ ret = dump_init(cb, inst);
if (ret < 0) {
nlh = nlmsg_hdr(cb->skb);
/* We have to create and send the error message
@@ -1267,17 +1270,18 @@ ip_set_dump_start(struct sk_buff *skb, struct netlink_callback *cb)
}
}
- if (cb->args[1] >= inst->ip_set_max)
+ if (cb->args[IPSET_CB_INDEX] >= inst->ip_set_max)
goto out;
- dump_type = DUMP_TYPE(cb->args[0]);
- dump_flags = DUMP_FLAGS(cb->args[0]);
- max = dump_type == DUMP_ONE ? cb->args[1] + 1 : inst->ip_set_max;
+ dump_type = DUMP_TYPE(cb->args[IPSET_CB_DUMP]);
+ dump_flags = DUMP_FLAGS(cb->args[IPSET_CB_DUMP]);
+ max = dump_type == DUMP_ONE ? cb->args[IPSET_CB_INDEX] + 1
+ : inst->ip_set_max;
dump_last:
- pr_debug("args[0]: %u %u args[1]: %ld\n",
- dump_type, dump_flags, cb->args[1]);
- for (; cb->args[1] < max; cb->args[1]++) {
- index = (ip_set_id_t) cb->args[1];
+ pr_debug("dump type, flag: %u %u index: %ld\n",
+ dump_type, dump_flags, cb->args[IPSET_CB_INDEX]);
+ for (; cb->args[IPSET_CB_INDEX] < max; cb->args[IPSET_CB_INDEX]++) {
+ index = (ip_set_id_t) cb->args[IPSET_CB_INDEX];
set = nfnl_set(inst, index);
if (set == NULL) {
if (dump_type == DUMP_ONE) {
@@ -1294,7 +1298,7 @@ dump_last:
!!(set->type->features & IPSET_DUMP_LAST)))
continue;
pr_debug("List set: %s\n", set->name);
- if (!cb->args[2]) {
+ if (!cb->args[IPSET_CB_ARG0]) {
/* Start listing: make sure set won't be destroyed */
pr_debug("reference set\n");
__ip_set_get(set);
@@ -1311,7 +1315,7 @@ dump_last:
goto nla_put_failure;
if (dump_flags & IPSET_FLAG_LIST_SETNAME)
goto next_set;
- switch (cb->args[2]) {
+ switch (cb->args[IPSET_CB_ARG0]) {
case 0:
/* Core header data */
if (nla_put_string(skb, IPSET_ATTR_TYPENAME,
@@ -1331,7 +1335,7 @@ dump_last:
read_lock_bh(&set->lock);
ret = set->variant->list(set, skb, cb);
read_unlock_bh(&set->lock);
- if (!cb->args[2])
+ if (!cb->args[IPSET_CB_ARG0])
/* Set is done, proceed with next one */
goto next_set;
goto release_refcount;
@@ -1340,8 +1344,8 @@ dump_last:
/* If we dump all sets, continue with dumping last ones */
if (dump_type == DUMP_ALL) {
dump_type = DUMP_LAST;
- cb->args[0] = dump_type | (dump_flags << 16);
- cb->args[1] = 0;
+ cb->args[IPSET_CB_DUMP] = dump_type | (dump_flags << 16);
+ cb->args[IPSET_CB_INDEX] = 0;
goto dump_last;
}
goto out;
@@ -1350,15 +1354,15 @@ nla_put_failure:
ret = -EFAULT;
next_set:
if (dump_type == DUMP_ONE)
- cb->args[1] = IPSET_INVALID_ID;
+ cb->args[IPSET_CB_INDEX] = IPSET_INVALID_ID;
else
- cb->args[1]++;
+ cb->args[IPSET_CB_INDEX]++;
release_refcount:
/* If there was an error or set is done, release set */
- if (ret || !cb->args[2]) {
+ if (ret || !cb->args[IPSET_CB_ARG0]) {
pr_debug("release set %s\n", nfnl_set(inst, index)->name);
__ip_set_put_byindex(inst, index);
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
}
out:
if (nlh) {
@@ -1375,8 +1379,6 @@ ip_set_dump(struct sock *ctnl, struct sk_buff *skb,
const struct nlmsghdr *nlh,
const struct nlattr * const attr[])
{
- struct ip_set_net *inst = ip_set_pernet(sock_net(ctnl));
-
if (unlikely(protocol_failed(attr)))
return -IPSET_ERR_PROTOCOL;
@@ -1384,7 +1386,6 @@ ip_set_dump(struct sock *ctnl, struct sk_buff *skb,
struct netlink_dump_control c = {
.dump = ip_set_dump_start,
.done = ip_set_dump_done,
- .data = (void *)inst
};
return netlink_dump_start(ctnl, skb, nlh, &c);
}
@@ -1961,7 +1962,6 @@ static int __net_init
ip_set_net_init(struct net *net)
{
struct ip_set_net *inst = ip_set_pernet(net);
-
struct ip_set **list;
inst->ip_set_max = max_sets ? max_sets : CONFIG_IP_SET_MAX;
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 6a80dbd..2f80c74 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -931,7 +931,7 @@ mtype_list(const struct ip_set *set,
struct nlattr *atd, *nested;
const struct hbucket *n;
const struct mtype_elem *e;
- u32 first = cb->args[2];
+ u32 first = cb->args[IPSET_CB_ARG0];
/* We assume that one hash bucket fills into one page */
void *incomplete;
int i;
@@ -940,20 +940,22 @@ mtype_list(const struct ip_set *set,
if (!atd)
return -EMSGSIZE;
pr_debug("list hash set %s\n", set->name);
- for (; cb->args[2] < jhash_size(t->htable_bits); cb->args[2]++) {
+ for (; cb->args[IPSET_CB_ARG0] < jhash_size(t->htable_bits);
+ cb->args[IPSET_CB_ARG0]++) {
incomplete = skb_tail_pointer(skb);
- n = hbucket(t, cb->args[2]);
- pr_debug("cb->args[2]: %lu, t %p n %p\n", cb->args[2], t, n);
+ n = hbucket(t, cb->args[IPSET_CB_ARG0]);
+ pr_debug("cb->arg bucket: %lu, t %p n %p\n",
+ cb->args[IPSET_CB_ARG0], t, n);
for (i = 0; i < n->pos; i++) {
e = ahash_data(n, i, set->dsize);
if (SET_WITH_TIMEOUT(set) &&
ip_set_timeout_expired(ext_timeout(e, set)))
continue;
pr_debug("list hash %lu hbucket %p i %u, data %p\n",
- cb->args[2], n, i, e);
+ cb->args[IPSET_CB_ARG0], n, i, e);
nested = ipset_nest_start(skb, IPSET_ATTR_DATA);
if (!nested) {
- if (cb->args[2] == first) {
+ if (cb->args[IPSET_CB_ARG0] == first) {
nla_nest_cancel(skb, atd);
return -EMSGSIZE;
} else
@@ -968,16 +970,16 @@ mtype_list(const struct ip_set *set,
}
ipset_nest_end(skb, atd);
/* Set listing finished */
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
return 0;
nla_put_failure:
nlmsg_trim(skb, incomplete);
- if (unlikely(first == cb->args[2])) {
+ if (unlikely(first == cb->args[IPSET_CB_ARG0])) {
pr_warning("Can't list set %s: one bucket does not fit into "
"a message. Please report it!\n", set->name);
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
return -EMSGSIZE;
}
ipset_nest_end(skb, atd);
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
index ec6f6d1..3e2317f 100644
--- a/net/netfilter/ipset/ip_set_list_set.c
+++ b/net/netfilter/ipset/ip_set_list_set.c
@@ -490,14 +490,15 @@ list_set_list(const struct ip_set *set,
{
const struct list_set *map = set->data;
struct nlattr *atd, *nested;
- u32 i, first = cb->args[2];
+ u32 i, first = cb->args[IPSET_CB_ARG0];
const struct set_elem *e;
atd = ipset_nest_start(skb, IPSET_ATTR_ADT);
if (!atd)
return -EMSGSIZE;
- for (; cb->args[2] < map->size; cb->args[2]++) {
- i = cb->args[2];
+ for (; cb->args[IPSET_CB_ARG0] < map->size;
+ cb->args[IPSET_CB_ARG0]++) {
+ i = cb->args[IPSET_CB_ARG0];
e = list_set_elem(set, map, i);
if (e->id == IPSET_INVALID_ID)
goto finish;
@@ -522,13 +523,13 @@ list_set_list(const struct ip_set *set,
finish:
ipset_nest_end(skb, atd);
/* Set listing finished */
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
return 0;
nla_put_failure:
nla_nest_cancel(skb, nested);
if (unlikely(i == first)) {
- cb->args[2] = 0;
+ cb->args[IPSET_CB_ARG0] = 0;
return -EMSGSIZE;
}
ipset_nest_end(skb, atd);
--
1.7.10.4
* [PATCH 12/20] netfilter: ipset: The unnamed union initialization may lead to compilation error
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (10 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 11/20] netfilter: ipset: Use netlink callback dump args only Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 13/20] netfilter: ip6t_REJECT: skip checksum verification for outgoing ipv6 packets Pablo Neira Ayuso
` (8 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
It should be possible to initialize the unnamed union directly, but
unfortunately that is not the case:
/usr/src/ipset/kernel/net/netfilter/ipset/ip_set_hash_netnet.c: In
function 'hash_netnet4_kadt':
/usr/src/ipset/kernel/net/netfilter/ipset/ip_set_hash_netnet.c:141:
error: unknown field 'cidr' specified in initializer
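A minimal standalone sketch of the workaround, assuming a C11 compiler with anonymous union support: zero-initialize the element, then assign the union members afterwards instead of naming them in a designated initializer. The demo_* names and the struct layout are illustrative only, not the real ipset element types:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HOST_MASK 32

/* Illustrative element with an unnamed union, loosely modelled on a set element. */
struct demo_elem {
	uint32_t ip[2];
	union {
		uint8_t cidr[2];
		uint16_t ccmp;
	};
};

/*
 * Build an element without designated initializers for the union members:
 * zero the whole struct first, then assign the fields one by one.
 * A cidr of 0 falls back to the host mask.
 */
static struct demo_elem demo_elem_init(uint8_t cidr0, uint8_t cidr1)
{
	struct demo_elem e = { { 0 } };

	assert(cidr0 <= DEMO_HOST_MASK && cidr1 <= DEMO_HOST_MASK);
	e.cidr[0] = cidr0 ? cidr0 : DEMO_HOST_MASK;
	e.cidr[1] = cidr1 ? cidr1 : DEMO_HOST_MASK;
	return e;
}
```

Plain assignment to the union members compiles wherever the anonymous union itself is accepted, which is why it sidesteps the initializer error quoted above.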
Reported-by: Husnu Demir <hdemir@metu.edu.tr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/ipset/ip_set_hash_netnet.c | 22 ++++++++++------------
net/netfilter/ipset/ip_set_hash_netportnet.c | 22 ++++++++++------------
2 files changed, 20 insertions(+), 24 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c b/net/netfilter/ipset/ip_set_hash_netnet.c
index 4260327..2bc2dec 100644
--- a/net/netfilter/ipset/ip_set_hash_netnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netnet.c
@@ -137,12 +137,11 @@ hash_netnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
{
const struct hash_netnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netnet4_elem e = {
- .cidr[0] = h->nets[0].cidr[0] ? h->nets[0].cidr[0] : HOST_MASK,
- .cidr[1] = h->nets[0].cidr[1] ? h->nets[0].cidr[1] : HOST_MASK,
- };
+ struct hash_netnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
+ e.cidr[0] = IP_SET_INIT_CIDR(h->nets[0].cidr[0], HOST_MASK);
+ e.cidr[1] = IP_SET_INIT_CIDR(h->nets[0].cidr[1], HOST_MASK);
if (adt == IPSET_TEST)
e.ccmp = (HOST_MASK << (sizeof(e.cidr[0]) * 8)) | HOST_MASK;
@@ -160,14 +159,14 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
{
const struct hash_netnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netnet4_elem e = { .cidr[0] = HOST_MASK,
- .cidr[1] = HOST_MASK };
+ struct hash_netnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 ip = 0, ip_to = 0, last;
u32 ip2 = 0, ip2_from = 0, ip2_to = 0, last2;
u8 cidr, cidr2;
int ret;
+ e.cidr[0] = e.cidr[1] = HOST_MASK;
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
!ip_set_optattr_netorder(tb, IPSET_ATTR_TIMEOUT) ||
!ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS) ||
@@ -364,12 +363,11 @@ hash_netnet6_kadt(struct ip_set *set, const struct sk_buff *skb,
{
const struct hash_netnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netnet6_elem e = {
- .cidr[0] = h->nets[0].cidr[0] ? h->nets[0].cidr[0] : HOST_MASK,
- .cidr[1] = h->nets[0].cidr[1] ? h->nets[0].cidr[1] : HOST_MASK
- };
+ struct hash_netnet6_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
+ e.cidr[0] = IP_SET_INIT_CIDR(h->nets[0].cidr[0], HOST_MASK);
+ e.cidr[1] = IP_SET_INIT_CIDR(h->nets[0].cidr[1], HOST_MASK);
if (adt == IPSET_TEST)
e.ccmp = (HOST_MASK << (sizeof(u8)*8)) | HOST_MASK;
@@ -386,11 +384,11 @@ hash_netnet6_uadt(struct ip_set *set, struct nlattr *tb[],
enum ipset_adt adt, u32 *lineno, u32 flags, bool retried)
{
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netnet6_elem e = { .cidr[0] = HOST_MASK,
- .cidr[1] = HOST_MASK };
+ struct hash_netnet6_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
int ret;
+ e.cidr[0] = e.cidr[1] = HOST_MASK;
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
!ip_set_optattr_netorder(tb, IPSET_ATTR_TIMEOUT) ||
!ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS) ||
diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c
index 363fab9..703d119 100644
--- a/net/netfilter/ipset/ip_set_hash_netportnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netportnet.c
@@ -147,12 +147,11 @@ hash_netportnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
{
const struct hash_netportnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netportnet4_elem e = {
- .cidr[0] = IP_SET_INIT_CIDR(h->nets[0].cidr[0], HOST_MASK),
- .cidr[1] = IP_SET_INIT_CIDR(h->nets[0].cidr[1], HOST_MASK),
- };
+ struct hash_netportnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
+ e.cidr[0] = IP_SET_INIT_CIDR(h->nets[0].cidr[0], HOST_MASK);
+ e.cidr[1] = IP_SET_INIT_CIDR(h->nets[0].cidr[1], HOST_MASK);
if (adt == IPSET_TEST)
e.ccmp = (HOST_MASK << (sizeof(e.cidr[0]) * 8)) | HOST_MASK;
@@ -174,8 +173,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
{
const struct hash_netportnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netportnet4_elem e = { .cidr[0] = HOST_MASK,
- .cidr[1] = HOST_MASK };
+ struct hash_netportnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 ip = 0, ip_to = 0, ip_last, p = 0, port, port_to;
u32 ip2_from = 0, ip2_to = 0, ip2_last, ip2;
@@ -183,6 +181,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
u8 cidr, cidr2;
int ret;
+ e.cidr[0] = e.cidr[1] = HOST_MASK;
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
!ip_set_attr_netorder(tb, IPSET_ATTR_PORT) ||
!ip_set_optattr_netorder(tb, IPSET_ATTR_PORT_TO) ||
@@ -419,12 +418,11 @@ hash_netportnet6_kadt(struct ip_set *set, const struct sk_buff *skb,
{
const struct hash_netportnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netportnet6_elem e = {
- .cidr[0] = IP_SET_INIT_CIDR(h->nets[0].cidr[0], HOST_MASK),
- .cidr[1] = IP_SET_INIT_CIDR(h->nets[0].cidr[1], HOST_MASK),
- };
+ struct hash_netportnet6_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
+ e.cidr[0] = IP_SET_INIT_CIDR(h->nets[0].cidr[0], HOST_MASK);
+ e.cidr[1] = IP_SET_INIT_CIDR(h->nets[0].cidr[1], HOST_MASK);
if (adt == IPSET_TEST)
e.ccmp = (HOST_MASK << (sizeof(u8) * 8)) | HOST_MASK;
@@ -446,13 +444,13 @@ hash_netportnet6_uadt(struct ip_set *set, struct nlattr *tb[],
{
const struct hash_netportnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
- struct hash_netportnet6_elem e = { .cidr[0] = HOST_MASK,
- .cidr[1] = HOST_MASK };
+ struct hash_netportnet6_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 port, port_to;
bool with_ports = false;
int ret;
+ e.cidr[0] = e.cidr[1] = HOST_MASK;
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
!ip_set_attr_netorder(tb, IPSET_ATTR_PORT) ||
!ip_set_optattr_netorder(tb, IPSET_ATTR_PORT_TO) ||
--
1.7.10.4
* [PATCH 13/20] netfilter: ip6t_REJECT: skip checksum verification for outgoing ipv6 packets
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (11 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 12/20] netfilter: ipset: The unnamed union initialization may lead to compilation error Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 14/20] netfilter:ipset: Fix memory allocation for bitmap:port Pablo Neira Ayuso
` (7 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Stanislav Fomichev <stfomichev@yandex-team.ru>
Don't verify checksum for outgoing packets because checksum calculation
may be done by the device.
Without this patch:
$ ip6tables -I OUTPUT -p tcp --dport 80 -j REJECT --reject-with tcp-reset
$ time telnet ipv6.google.com 80
Trying 2a00:1450:4010:c03::67...
telnet: Unable to connect to remote host: Connection timed out
real 0m7.201s
user 0m0.000s
sys 0m0.000s
With the patch applied:
$ ip6tables -I OUTPUT -p tcp --dport 80 -j REJECT --reject-with tcp-reset
$ time telnet ipv6.google.com 80
Trying 2a00:1450:4010:c03::67...
telnet: Unable to connect to remote host: Connection refused
real 0m0.085s
user 0m0.000s
sys 0m0.000s
Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/ipv6/netfilter/ip6t_REJECT.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c
index 56eef30..da00a2e 100644
--- a/net/ipv6/netfilter/ip6t_REJECT.c
+++ b/net/ipv6/netfilter/ip6t_REJECT.c
@@ -39,7 +39,7 @@ MODULE_DESCRIPTION("Xtables: packet \"rejection\" target for IPv6");
MODULE_LICENSE("GPL");
/* Send RST reply */
-static void send_reset(struct net *net, struct sk_buff *oldskb)
+static void send_reset(struct net *net, struct sk_buff *oldskb, int hook)
{
struct sk_buff *nskb;
struct tcphdr otcph, *tcph;
@@ -88,8 +88,7 @@ static void send_reset(struct net *net, struct sk_buff *oldskb)
}
/* Check checksum. */
- if (csum_ipv6_magic(&oip6h->saddr, &oip6h->daddr, otcplen, IPPROTO_TCP,
- skb_checksum(oldskb, tcphoff, otcplen, 0))) {
+ if (nf_ip6_checksum(oldskb, hook, tcphoff, IPPROTO_TCP)) {
pr_debug("TCP checksum is invalid\n");
return;
}
@@ -227,7 +226,7 @@ reject_tg6(struct sk_buff *skb, const struct xt_action_param *par)
/* Do nothing */
break;
case IP6T_TCP_RESET:
- send_reset(net, skb);
+ send_reset(net, skb, par->hooknum);
break;
default:
net_info_ratelimited("case %u not handled yet\n", reject->with);
--
1.7.10.4
* [PATCH 14/20] netfilter:ipset: Fix memory allocation for bitmap:port
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (12 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 13/20] netfilter: ip6t_REJECT: skip checksum verification for outgoing ipv6 packets Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 15/20] netfilter: ipset: remove duplicate define Pablo Neira Ayuso
` (6 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
During the restructuring of bitmap type creation in ipset, a wrong (too
large) memory allocation was copied for the bitmap:port type
(netfilter bugzilla id #859).
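To give a feel for the size difference, here is a small userspace sketch comparing one unsigned long per element with one bit per element. The demo_* helpers are hypothetical and only approximate the idea; the kernel's bitmap_bytes() may round differently:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Size computed by the removed line: one unsigned long per element. */
static size_t demo_size_per_element(size_t elements)
{
	return elements * sizeof(unsigned long);
}

/* Size a bitmap needs: one bit per element, rounded up to whole unsigned longs. */
static size_t demo_size_bitmap(size_t elements)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	assert(elements > 0);
	return (elements + bits_per_long - 1) / bits_per_long
		* sizeof(unsigned long);
}
```

For the full port range of 65536 elements, the bitmap form needs 8192 bytes (assuming 8-bit bytes), while the per-element form needs 65536 * sizeof(unsigned long) bytes.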
Reported-by: Quentin Armitage <quentin@armitage.org.uk>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
---
net/netfilter/ipset/ip_set_bitmap_port.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/ipset/ip_set_bitmap_port.c b/net/netfilter/ipset/ip_set_bitmap_port.c
index e7603c5..cf99676 100644
--- a/net/netfilter/ipset/ip_set_bitmap_port.c
+++ b/net/netfilter/ipset/ip_set_bitmap_port.c
@@ -254,7 +254,7 @@ bitmap_port_create(struct net *net, struct ip_set *set, struct nlattr *tb[],
return -ENOMEM;
map->elements = last_port - first_port + 1;
- map->memsize = map->elements * sizeof(unsigned long);
+ map->memsize = bitmap_bytes(0, map->elements);
set->variant = &bitmap_port;
set->dsize = ip_set_elem_len(set, tb, 0);
if (!init_map_port(set, map, first_port, last_port)) {
--
1.7.10.4
* [PATCH 15/20] netfilter: ipset: remove duplicate define
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (13 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 14/20] netfilter:ipset: Fix memory allocation for bitmap:port Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 16/20] bridge: netfilter: orphan skb before invoking ip netfilter hooks Pablo Neira Ayuso
` (5 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Michael Opdenacker <michael.opdenacker@free-electrons.com>
This patch removes a duplicate define from
net/netfilter/ipset/ip_set_hash_gen.h
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
---
net/netfilter/ipset/ip_set_hash_gen.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 2f80c74..be6932a 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -234,7 +234,6 @@ hbucket_elem_add(struct hbucket *n, u8 ahash_max, size_t dsize)
#define mtype_uadt IPSET_TOKEN(MTYPE, _uadt)
#define mtype MTYPE
-#define mtype_elem IPSET_TOKEN(MTYPE, _elem)
#define mtype_add IPSET_TOKEN(MTYPE, _add)
#define mtype_del IPSET_TOKEN(MTYPE, _del)
#define mtype_test_cidrs IPSET_TOKEN(MTYPE, _test_cidrs)
--
1.7.10.4
* [PATCH 16/20] bridge: netfilter: orphan skb before invoking ip netfilter hooks
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (14 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 15/20] netfilter: ipset: remove duplicate define Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 17/20] net: ipvs: sctp: add missing verdict assignments in sctp_conn_schedule Pablo Neira Ayuso
` (4 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Florian Westphal <fw@strlen.de>
Pekka Pietikäinen reports an xt_socket behavioural change after commit
00028aa37098 (netfilter: xt_socket: use IP early demux).
The reason is that xt_socket no longer does an unconditional sk lookup -
it re-uses the existing skb->sk if possible, assuming ->sk was set by
ip early demux.
However, when netfilter is invoked via bridge, this can cause 'bogus'
sockets to be examined by the match, e.g. a 'tun' device socket.
bridge netfilter should orphan the skb just like the routing path
before invoking ipv4/ipv6 netfilter hooks to avoid this.
Reported-and-tested-by: Pekka Pietikäinen <pp@ee.oulu.fi>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/bridge/br_netfilter.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index f877362..3d55312 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -559,6 +559,8 @@ static struct net_device *setup_pre_routing(struct sk_buff *skb)
else if (skb->protocol == htons(ETH_P_PPP_SES))
nf_bridge->mask |= BRNF_PPPoE;
+ /* Must drop socket now because of tproxy. */
+ skb_orphan(skb);
return skb->dev;
}
--
1.7.10.4
* [PATCH 17/20] net: ipvs: sctp: add missing verdict assignments in sctp_conn_schedule
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (15 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 16/20] bridge: netfilter: orphan skb before invoking ip netfilter hooks Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 18/20] net: ipvs: sctp: do not recalc sctp csum when ports didn't change Pablo Neira Ayuso
` (3 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Daniel Borkmann <dborkman@redhat.com>
If skb_header_pointer() fails, we need to assign a verdict, which is
NF_DROP in this case; otherwise, we would leave the verdict from
conn_schedule() uninitialized when returning.
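The pattern can be sketched in plain C as follows: a function that reports its result through an out-parameter has to write that parameter on every early-return path. All demo_* names are hypothetical, and the 12-byte minimum simply stands for the size of the SCTP common header:

```c
#include <assert.h>
#include <stddef.h>

enum demo_verdict { DEMO_DROP = 0, DEMO_ACCEPT = 1 };

#define DEMO_MIN_HDR 12	/* size of the SCTP common header in bytes */

/*
 * Returns 0 when the caller must stop and act on *verdict, and 1 when
 * processing may continue. Every return path writes *verdict, so the
 * caller never reads an uninitialized value.
 */
static int demo_conn_schedule(const unsigned char *hdr, size_t len,
			      enum demo_verdict *verdict)
{
	assert(verdict != NULL);
	if (hdr == NULL || len < DEMO_MIN_HDR) {
		/* Header not available: drop instead of returning silently. */
		*verdict = DEMO_DROP;
		return 0;
	}
	*verdict = DEMO_ACCEPT;
	return 1;
}
```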
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_proto_sctp.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 23e596e..9ca7aa0 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -20,13 +20,18 @@ sctp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_proto_data *pd,
sctp_sctphdr_t *sh, _sctph;
sh = skb_header_pointer(skb, iph->len, sizeof(_sctph), &_sctph);
- if (sh == NULL)
+ if (sh == NULL) {
+ *verdict = NF_DROP;
return 0;
+ }
sch = skb_header_pointer(skb, iph->len + sizeof(sctp_sctphdr_t),
sizeof(_schunkh), &_schunkh);
- if (sch == NULL)
+ if (sch == NULL) {
+ *verdict = NF_DROP;
return 0;
+ }
+
net = skb_net(skb);
ipvs = net_ipvs(net);
rcu_read_lock();
--
1.7.10.4
* [PATCH 18/20] net: ipvs: sctp: do not recalc sctp csum when ports didn't change
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (16 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 17/20] net: ipvs: sctp: add missing verdict assignments in sctp_conn_schedule Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 19/20] netfilter: introduce nf_conn_acct structure Pablo Neira Ayuso
` (2 subsequent siblings)
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Daniel Borkmann <dborkman@redhat.com>
Unlike UDP or TCP, we do not take the pseudo-header into
account in SCTP checksums. So in case port mapping is the
very same, we do not need to recalculate the whole SCTP
checksum in software, which is very expensive.
Also, as in TCP, take into account when a private
helper mangled the packet. In that case, we also need to
recalculate the checksum even if the ports are the same.
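The two paragraphs above boil down to a simple decision, sketched here with a hypothetical helper. It deliberately ignores the skb->ip_summed handling discussed in the rest of this message and only covers the port and payload conditions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * The SCTP checksum covers only the SCTP packet itself (no pseudo-header),
 * so an address rewrite alone does not invalidate it. A recalculation is
 * needed only when the port changes or a helper mangled the payload.
 */
static bool demo_sctp_needs_csum(uint16_t old_port, uint16_t new_port,
				 bool payload_mangled)
{
	return old_port != new_port || payload_mangled;
}
```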
Thanks for feedback regarding skb->ip_summed checks from
Julian Anastasov; here's a discussion on these checks for
snat and dnat:
* For snat_handler(), we can see CHECKSUM_PARTIAL from
virtual devices, and from LOCAL_OUT, otherwise it
should be CHECKSUM_UNNECESSARY. In general, in snat it
is more complex. skb contains the original route and
ip_vs_route_me_harder() can change the route after
snat_handler. So, for locally generated replies from
local server we can not preserve the CHECKSUM_PARTIAL
mode. It is a chicken-and-egg dilemma: snat_handler
needs the device after rerouting (to check for
NETIF_F_SCTP_CSUM), while ip_route_me_harder() wants
the snat_handler() to put the new saddr for proper
rerouting.
* For dnat_handler(), we should not see CHECKSUM_COMPLETE
for SCTP, in fact the small set of drivers that support
SCTP offloading return CHECKSUM_UNNECESSARY on correctly
received SCTP csum. We can see CHECKSUM_PARTIAL from
local stack or received from virtual drivers. The idea is
that SCTP decides to avoid csum calculation if hardware
supports offloading. IPVS can change the device after
rerouting to real server but we can preserve the
CHECKSUM_PARTIAL mode if the new device supports
offloading too. This works because skb dst is changed
before dnat_handler and we see the new device. So, checks
in the 'if' part will decide whether it is ok to keep
CHECKSUM_PARTIAL for the output. If the packet came with
CHECKSUM_NONE, we deal with an unknown checksum. As we
recalculate the sum for the IP header in all cases, it should
be safe to use CHECKSUM_UNNECESSARY. We can forward a wrong
checksum in this case (without cp->app). In case of
CHECKSUM_UNNECESSARY, the csum was valid on receive.
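For illustration only (not part of the patch): the decision described above, restated as pure functions so the cases can be checked in isolation. The enum and function names are hypothetical stand-ins; the conditions mirror the 'if' parts of sctp_snat_handler() and sctp_dnat_handler() in the diff below, with dev_offloads_sctp standing for the NETIF_F_SCTP_CSUM test on the output device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the skb->ip_summed values. */
enum csum_mode { CSUM_NONE, CSUM_UNNECESSARY, CSUM_COMPLETE, CSUM_PARTIAL };

/* SNAT: the output device is not known yet, so CHECKSUM_PARTIAL
 * cannot be preserved and always forces a software recalculation. */
static bool snat_needs_recalc(uint16_t cur_port, uint16_t new_port,
			      bool payload_csum, enum csum_mode mode)
{
	return cur_port != new_port || payload_csum || mode == CSUM_PARTIAL;
}

/* DNAT: the new device is already visible, so CHECKSUM_PARTIAL may
 * stay if that device can offload the SCTP checksum. */
static bool dnat_needs_recalc(uint16_t cur_port, uint16_t new_port,
			      bool payload_csum, enum csum_mode mode,
			      bool dev_offloads_sctp)
{
	return cur_port != new_port || payload_csum ||
	       (mode == CSUM_PARTIAL && !dev_offloads_sctp);
}

/* Mode left on the packet when the recalculation is skipped. */
static enum csum_mode snat_mode_when_skipped(enum csum_mode mode)
{
	(void)mode;	/* SNAT always ends up with CHECKSUM_UNNECESSARY */
	return CSUM_UNNECESSARY;
}

static enum csum_mode dnat_mode_when_skipped(enum csum_mode mode)
{
	return mode == CSUM_PARTIAL ? CSUM_PARTIAL : CSUM_UNNECESSARY;
}
```

The only asymmetry between the two handlers is the CHECKSUM_PARTIAL case, which follows from the routing order discussed in the two points above.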
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_proto_sctp.c | 39 ++++++++++++++++++++++++++++-----
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 9ca7aa0..2f7ea75 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -81,6 +81,7 @@ sctp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
{
sctp_sctphdr_t *sctph;
unsigned int sctphoff = iph->len;
+ bool payload_csum = false;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6 && iph->fragoffs)
@@ -92,19 +93,31 @@ sctp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
return 0;
if (unlikely(cp->app != NULL)) {
+ int ret;
+
/* Some checks before mangling */
if (pp->csum_check && !pp->csum_check(cp->af, skb, pp))
return 0;
/* Call application helper if needed */
- if (!ip_vs_app_pkt_out(cp, skb))
+ ret = ip_vs_app_pkt_out(cp, skb);
+ if (ret == 0)
return 0;
+ /* ret=2: csum update is needed after payload mangling */
+ if (ret == 2)
+ payload_csum = true;
}
sctph = (void *) skb_network_header(skb) + sctphoff;
- sctph->source = cp->vport;
- sctp_nat_csum(skb, sctph, sctphoff);
+ /* Only update csum if we really have to */
+ if (sctph->source != cp->vport || payload_csum ||
+ skb->ip_summed == CHECKSUM_PARTIAL) {
+ sctph->source = cp->vport;
+ sctp_nat_csum(skb, sctph, sctphoff);
+ } else {
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ }
return 1;
}
@@ -115,6 +128,7 @@ sctp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
{
sctp_sctphdr_t *sctph;
unsigned int sctphoff = iph->len;
+ bool payload_csum = false;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6 && iph->fragoffs)
@@ -126,19 +140,32 @@ sctp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
return 0;
if (unlikely(cp->app != NULL)) {
+ int ret;
+
/* Some checks before mangling */
if (pp->csum_check && !pp->csum_check(cp->af, skb, pp))
return 0;
/* Call application helper if needed */
- if (!ip_vs_app_pkt_in(cp, skb))
+ ret = ip_vs_app_pkt_in(cp, skb);
+ if (ret == 0)
return 0;
+ /* ret=2: csum update is needed after payload mangling */
+ if (ret == 2)
+ payload_csum = true;
}
sctph = (void *) skb_network_header(skb) + sctphoff;
- sctph->dest = cp->dport;
- sctp_nat_csum(skb, sctph, sctphoff);
+ /* Only update csum if we really have to */
+ if (sctph->dest != cp->dport || payload_csum ||
+ (skb->ip_summed == CHECKSUM_PARTIAL &&
+ !(skb_dst(skb)->dev->features & NETIF_F_SCTP_CSUM))) {
+ sctph->dest = cp->dport;
+ sctp_nat_csum(skb, sctph, sctphoff);
+ } else if (skb->ip_summed != CHECKSUM_PARTIAL) {
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ }
return 1;
}
--
1.7.10.4
* [PATCH 19/20] netfilter: introduce nf_conn_acct structure
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (17 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 18/20] net: ipvs: sctp: do not recalc sctp csum when ports didn't change Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-04 21:50 ` [PATCH 20/20] netfilter: ctnetlink: account both directions in one step Pablo Neira Ayuso
2013-11-05 0:47 ` [PATCH 00/20] Netfilter/IPVS updates for net-next David Miller
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Holger Eitzenberger <holger@eitzenberger.org>
Encapsulate counters for both directions into nf_conn_acct. During
that process also consistently name pointers to the extension 'acct',
not 'counters'. This patch is a cleanup.
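For illustration only (not part of the patch): a user-space sketch of the shape of the change, using C11 atomics in place of atomic64_t. All names here (conn_counter, conn_acct, acct_init, acct_update, DIR_*) are hypothetical stand-ins for the kernel types.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum { DIR_ORIGINAL, DIR_REPLY, DIR_MAX };

struct conn_counter {
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

/* Wrapper: one named type instead of a bare counter[DIR_MAX] array,
 * so further accounting fields can be added later in one place. */
struct conn_acct {
	struct conn_counter counter[DIR_MAX];
};

static void acct_init(struct conn_acct *acct)
{
	for (int dir = 0; dir < DIR_MAX; dir++) {
		atomic_init(&acct->counter[dir].packets, 0);
		atomic_init(&acct->counter[dir].bytes, 0);
	}
}

/* Callers fetch the array from the wrapper, then index by direction. */
static void acct_update(struct conn_acct *acct, int dir, uint64_t len)
{
	struct conn_counter *counter = acct->counter;

	atomic_fetch_add(&counter[dir].packets, 1);
	atomic_fetch_add(&counter[dir].bytes, len);
}
```

With a single array member the wrapper is expected to have the same size as the array it replaces, which is why the extension's .len and .align can simply switch to the struct type.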
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_conntrack_acct.h | 10 +++++++---
include/net/netfilter/nf_conntrack_extend.h | 2 +-
net/netfilter/nf_conntrack_acct.c | 12 +++++++-----
net/netfilter/nf_conntrack_core.c | 16 ++++++++++------
net/netfilter/nf_conntrack_netlink.c | 16 +++++++++-------
net/netfilter/xt_connbytes.c | 6 ++++--
6 files changed, 38 insertions(+), 24 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack_acct.h b/include/net/netfilter/nf_conntrack_acct.h
index fef44ed..79d8d16 100644
--- a/include/net/netfilter/nf_conntrack_acct.h
+++ b/include/net/netfilter/nf_conntrack_acct.h
@@ -19,17 +19,21 @@ struct nf_conn_counter {
atomic64_t bytes;
};
+struct nf_conn_acct {
+ struct nf_conn_counter counter[IP_CT_DIR_MAX];
+};
+
static inline
-struct nf_conn_counter *nf_conn_acct_find(const struct nf_conn *ct)
+struct nf_conn_acct *nf_conn_acct_find(const struct nf_conn *ct)
{
return nf_ct_ext_find(ct, NF_CT_EXT_ACCT);
}
static inline
-struct nf_conn_counter *nf_ct_acct_ext_add(struct nf_conn *ct, gfp_t gfp)
+struct nf_conn_acct *nf_ct_acct_ext_add(struct nf_conn *ct, gfp_t gfp)
{
struct net *net = nf_ct_net(ct);
- struct nf_conn_counter *acct;
+ struct nf_conn_acct *acct;
if (!net->ct.sysctl_acct)
return NULL;
diff --git a/include/net/netfilter/nf_conntrack_extend.h b/include/net/netfilter/nf_conntrack_extend.h
index 86372ae..956b175 100644
--- a/include/net/netfilter/nf_conntrack_extend.h
+++ b/include/net/netfilter/nf_conntrack_extend.h
@@ -36,7 +36,7 @@ enum nf_ct_ext_id {
#define NF_CT_EXT_HELPER_TYPE struct nf_conn_help
#define NF_CT_EXT_NAT_TYPE struct nf_conn_nat
#define NF_CT_EXT_SEQADJ_TYPE struct nf_conn_seqadj
-#define NF_CT_EXT_ACCT_TYPE struct nf_conn_counter
+#define NF_CT_EXT_ACCT_TYPE struct nf_conn_acct
#define NF_CT_EXT_ECACHE_TYPE struct nf_conntrack_ecache
#define NF_CT_EXT_ZONE_TYPE struct nf_conntrack_zone
#define NF_CT_EXT_TSTAMP_TYPE struct nf_conn_tstamp
diff --git a/net/netfilter/nf_conntrack_acct.c b/net/netfilter/nf_conntrack_acct.c
index 2d3030a..a4b5e2a 100644
--- a/net/netfilter/nf_conntrack_acct.c
+++ b/net/netfilter/nf_conntrack_acct.c
@@ -39,21 +39,23 @@ static struct ctl_table acct_sysctl_table[] = {
unsigned int
seq_print_acct(struct seq_file *s, const struct nf_conn *ct, int dir)
{
- struct nf_conn_counter *acct;
+ struct nf_conn_acct *acct;
+ struct nf_conn_counter *counter;
acct = nf_conn_acct_find(ct);
if (!acct)
return 0;
+ counter = acct->counter;
return seq_printf(s, "packets=%llu bytes=%llu ",
- (unsigned long long)atomic64_read(&acct[dir].packets),
- (unsigned long long)atomic64_read(&acct[dir].bytes));
+ (unsigned long long)atomic64_read(&counter[dir].packets),
+ (unsigned long long)atomic64_read(&counter[dir].bytes));
};
EXPORT_SYMBOL_GPL(seq_print_acct);
static struct nf_ct_ext_type acct_extend __read_mostly = {
- .len = sizeof(struct nf_conn_counter[IP_CT_DIR_MAX]),
- .align = __alignof__(struct nf_conn_counter[IP_CT_DIR_MAX]),
+ .len = sizeof(struct nf_conn_acct),
+ .align = __alignof__(struct nf_conn_acct),
.id = NF_CT_EXT_ACCT,
};
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5d892fe..e22d950 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1109,12 +1109,14 @@ void __nf_ct_refresh_acct(struct nf_conn *ct,
acct:
if (do_acct) {
- struct nf_conn_counter *acct;
+ struct nf_conn_acct *acct;
acct = nf_conn_acct_find(ct);
if (acct) {
- atomic64_inc(&acct[CTINFO2DIR(ctinfo)].packets);
- atomic64_add(skb->len, &acct[CTINFO2DIR(ctinfo)].bytes);
+ struct nf_conn_counter *counter = acct->counter;
+
+ atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
+ atomic64_add(skb->len, &counter[CTINFO2DIR(ctinfo)].bytes);
}
}
}
@@ -1126,13 +1128,15 @@ bool __nf_ct_kill_acct(struct nf_conn *ct,
int do_acct)
{
if (do_acct) {
- struct nf_conn_counter *acct;
+ struct nf_conn_acct *acct;
acct = nf_conn_acct_find(ct);
if (acct) {
- atomic64_inc(&acct[CTINFO2DIR(ctinfo)].packets);
+ struct nf_conn_counter *counter = acct->counter;
+
+ atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
atomic64_add(skb->len - skb_network_offset(skb),
- &acct[CTINFO2DIR(ctinfo)].bytes);
+ &counter[CTINFO2DIR(ctinfo)].bytes);
}
}
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index eea936b..ddc3777 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -237,19 +237,21 @@ static int
ctnetlink_dump_counters(struct sk_buff *skb, const struct nf_conn *ct,
enum ip_conntrack_dir dir, int type)
{
- struct nf_conn_counter *acct;
+ struct nf_conn_acct *acct;
+ struct nf_conn_counter *counter;
u64 pkts, bytes;
acct = nf_conn_acct_find(ct);
if (!acct)
return 0;
+ counter = acct->counter;
if (type == IPCTNL_MSG_CT_GET_CTRZERO) {
- pkts = atomic64_xchg(&acct[dir].packets, 0);
- bytes = atomic64_xchg(&acct[dir].bytes, 0);
+ pkts = atomic64_xchg(&counter[dir].packets, 0);
+ bytes = atomic64_xchg(&counter[dir].bytes, 0);
} else {
- pkts = atomic64_read(&acct[dir].packets);
- bytes = atomic64_read(&acct[dir].bytes);
+ pkts = atomic64_read(&counter[dir].packets);
+ bytes = atomic64_read(&counter[dir].bytes);
}
return dump_counters(skb, pkts, bytes, dir);
}
@@ -530,7 +532,7 @@ ctnetlink_proto_size(const struct nf_conn *ct)
}
static inline size_t
-ctnetlink_counters_size(const struct nf_conn *ct)
+ctnetlink_acct_size(const struct nf_conn *ct)
{
if (!nf_ct_ext_exist(ct, NF_CT_EXT_ACCT))
return 0;
@@ -579,7 +581,7 @@ ctnetlink_nlmsg_size(const struct nf_conn *ct)
+ 3 * nla_total_size(sizeof(u_int8_t)) /* CTA_PROTO_NUM */
+ nla_total_size(sizeof(u_int32_t)) /* CTA_ID */
+ nla_total_size(sizeof(u_int32_t)) /* CTA_STATUS */
- + ctnetlink_counters_size(ct)
+ + ctnetlink_acct_size(ct)
+ ctnetlink_timestamp_size(ct)
+ nla_total_size(sizeof(u_int32_t)) /* CTA_TIMEOUT */
+ nla_total_size(0) /* CTA_PROTOINFO */
diff --git a/net/netfilter/xt_connbytes.c b/net/netfilter/xt_connbytes.c
index e595e07..1e63461 100644
--- a/net/netfilter/xt_connbytes.c
+++ b/net/netfilter/xt_connbytes.c
@@ -26,16 +26,18 @@ connbytes_mt(const struct sk_buff *skb, struct xt_action_param *par)
u_int64_t what = 0; /* initialize to make gcc happy */
u_int64_t bytes = 0;
u_int64_t pkts = 0;
+ const struct nf_conn_acct *acct;
const struct nf_conn_counter *counters;
ct = nf_ct_get(skb, &ctinfo);
if (!ct)
return false;
- counters = nf_conn_acct_find(ct);
- if (!counters)
+ acct = nf_conn_acct_find(ct);
+ if (!acct)
return false;
+ counters = acct->counter;
switch (sinfo->what) {
case XT_CONNBYTES_PKTS:
switch (sinfo->direction) {
--
1.7.10.4
* [PATCH 20/20] netfilter: ctnetlink: account both directions in one step
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (18 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 19/20] netfilter: introduce nf_conn_acct structure Pablo Neira Ayuso
@ 2013-11-04 21:50 ` Pablo Neira Ayuso
2013-11-05 0:47 ` [PATCH 00/20] Netfilter/IPVS updates for net-next David Miller
20 siblings, 0 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2013-11-04 21:50 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
From: Holger Eitzenberger <holger@eitzenberger.org>
Dump the counters for both directions in a single step, with the
intent to dump other accounting data later. This patch is a cleanup.
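For illustration only (not part of the patch): a user-space sketch of the resulting call structure, with C11 atomics in place of atomic64_t and a small fixed-capacity buffer standing in for the netlink skb. All names (dump_buf, dump_counters_sketch, dump_acct_sketch, DIR_*) are hypothetical. The zero flag plays the role of the IPCTNL_MSG_CT_GET_CTRZERO type, which reads and resets the counters in one atomic exchange.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { DIR_ORIGINAL, DIR_REPLY, DIR_MAX };

struct conn_counter { _Atomic uint64_t packets, bytes; };
struct conn_acct { struct conn_counter counter[DIR_MAX]; };

struct counter_snapshot { uint64_t packets, bytes; };

/* Stand-in for the netlink message; cap must not exceed DIR_MAX. */
struct dump_buf {
	struct counter_snapshot slot[DIR_MAX];
	int used;
	int cap;
};

/* Read (or read-and-reset) one direction and append it to the buffer. */
static int dump_counters_sketch(struct dump_buf *out, struct conn_acct *acct,
				int dir, bool zero)
{
	struct conn_counter *counter = acct->counter;
	struct counter_snapshot snap;

	if (out->used >= out->cap)
		return -1;	/* no room, like nla_put_failure */

	if (zero) {
		snap.packets = atomic_exchange(&counter[dir].packets, 0);
		snap.bytes = atomic_exchange(&counter[dir].bytes, 0);
	} else {
		snap.packets = atomic_load(&counter[dir].packets);
		snap.bytes = atomic_load(&counter[dir].bytes);
	}
	out->slot[out->used++] = snap;
	return 0;
}

/* One entry point for both directions; NULL means accounting is off. */
static int dump_acct_sketch(struct dump_buf *out, struct conn_acct *acct,
			    bool zero)
{
	if (!acct)
		return 0;
	if (dump_counters_sketch(out, acct, DIR_ORIGINAL, zero) < 0)
		return -1;
	if (dump_counters_sketch(out, acct, DIR_REPLY, zero) < 0)
		return -1;
	return 0;
}
```

Callers then make one call instead of two, and any accounting data added later only needs to be wired into the single entry point.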
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_netlink.c | 49 +++++++++++++++++-----------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index ddc3777..08870b8 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -211,13 +211,23 @@ nla_put_failure:
}
static int
-dump_counters(struct sk_buff *skb, u64 pkts, u64 bytes,
- enum ip_conntrack_dir dir)
+dump_counters(struct sk_buff *skb, struct nf_conn_acct *acct,
+ enum ip_conntrack_dir dir, int type)
{
- enum ctattr_type type = dir ? CTA_COUNTERS_REPLY: CTA_COUNTERS_ORIG;
+ enum ctattr_type attr = dir ? CTA_COUNTERS_REPLY: CTA_COUNTERS_ORIG;
+ struct nf_conn_counter *counter = acct->counter;
struct nlattr *nest_count;
+ u64 pkts, bytes;
- nest_count = nla_nest_start(skb, type | NLA_F_NESTED);
+ if (type == IPCTNL_MSG_CT_GET_CTRZERO) {
+ pkts = atomic64_xchg(&counter[dir].packets, 0);
+ bytes = atomic64_xchg(&counter[dir].bytes, 0);
+ } else {
+ pkts = atomic64_read(&counter[dir].packets);
+ bytes = atomic64_read(&counter[dir].bytes);
+ }
+
+ nest_count = nla_nest_start(skb, attr | NLA_F_NESTED);
if (!nest_count)
goto nla_put_failure;
@@ -234,26 +244,19 @@ nla_put_failure:
}
static int
-ctnetlink_dump_counters(struct sk_buff *skb, const struct nf_conn *ct,
- enum ip_conntrack_dir dir, int type)
+ctnetlink_dump_acct(struct sk_buff *skb, const struct nf_conn *ct, int type)
{
- struct nf_conn_acct *acct;
- struct nf_conn_counter *counter;
- u64 pkts, bytes;
+ struct nf_conn_acct *acct = nf_conn_acct_find(ct);
- acct = nf_conn_acct_find(ct);
if (!acct)
return 0;
- counter = acct->counter;
- if (type == IPCTNL_MSG_CT_GET_CTRZERO) {
- pkts = atomic64_xchg(&counter[dir].packets, 0);
- bytes = atomic64_xchg(&counter[dir].bytes, 0);
- } else {
- pkts = atomic64_read(&counter[dir].packets);
- bytes = atomic64_read(&counter[dir].bytes);
- }
- return dump_counters(skb, pkts, bytes, dir);
+ if (dump_counters(skb, acct, IP_CT_DIR_ORIGINAL, type) < 0)
+ return -1;
+ if (dump_counters(skb, acct, IP_CT_DIR_REPLY, type) < 0)
+ return -1;
+
+ return 0;
}
static int
@@ -490,8 +493,7 @@ ctnetlink_fill_info(struct sk_buff *skb, u32 portid, u32 seq, u32 type,
if (ctnetlink_dump_status(skb, ct) < 0 ||
ctnetlink_dump_timeout(skb, ct) < 0 ||
- ctnetlink_dump_counters(skb, ct, IP_CT_DIR_ORIGINAL, type) < 0 ||
- ctnetlink_dump_counters(skb, ct, IP_CT_DIR_REPLY, type) < 0 ||
+ ctnetlink_dump_acct(skb, ct, type) < 0 ||
ctnetlink_dump_timestamp(skb, ct) < 0 ||
ctnetlink_dump_protoinfo(skb, ct) < 0 ||
ctnetlink_dump_helpinfo(skb, ct) < 0 ||
@@ -675,10 +677,7 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item)
goto nla_put_failure;
if (events & (1 << IPCT_DESTROY)) {
- if (ctnetlink_dump_counters(skb, ct,
- IP_CT_DIR_ORIGINAL, type) < 0 ||
- ctnetlink_dump_counters(skb, ct,
- IP_CT_DIR_REPLY, type) < 0 ||
+ if (ctnetlink_dump_acct(skb, ct, type) < 0 ||
ctnetlink_dump_timestamp(skb, ct) < 0)
goto nla_put_failure;
} else {
--
1.7.10.4
* Re: [PATCH 00/20] Netfilter/IPVS updates for net-next
2013-11-04 21:50 [PATCH 00/20] Netfilter/IPVS updates for net-next Pablo Neira Ayuso
` (19 preceding siblings ...)
2013-11-04 21:50 ` [PATCH 20/20] netfilter: ctnetlink: account both directions in one step Pablo Neira Ayuso
@ 2013-11-05 0:47 ` David Miller
20 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2013-11-05 0:47 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 4 Nov 2013 22:50:22 +0100
> This is another batch containing Netfilter/IPVS updates for your net-next
> tree, they are:
...
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master
Looks good, pulled, thanks Pablo.