From: kaber@trash.net
To: netfilter-devel@vger.kernel.org
Cc: netdev@vger.kernel.org
Subject: [PATCH 05/19] netfilter: nf_conntrack_ipv6: improve fragmentation handling
Date: Thu, 9 Aug 2012 22:08:49 +0200 [thread overview]
Message-ID: <1344542943-11588-6-git-send-email-kaber@trash.net> (raw)
In-Reply-To: <1344542943-11588-1-git-send-email-kaber@trash.net>
From: Patrick McHardy <kaber@trash.net>
The IPv6 conntrack fragmentation currently has a couple of shortcomings.
Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the
defragmented packet is then passed to conntrack, the resulting conntrack
information is attached to each original fragment and the fragments then
continue their way through the stack.
Helper invocation occurs in the POSTROUTING hook, at which point only
the original fragments are available. The result of this is that
fragmented packets are never passed to helpers.
This patch improves the situation in the following way:
- If a reassembled packet belongs to a connection that has a helper
assigned, the reassembled packet is passed through the stack instead
of the original fragments.
- During defragmentation, the largest received fragment size is stored.
On output, the packet is refragmented if required. If the largest
received fragment size exceeds the outgoing MTU, a "packet too big"
message is generated, thus behaving as if the original fragments
were passed through the stack from an outside point of view.
- The ipv6_helper() hook function can't receive fragments anymore for
connections using a helper, so it is switched to use ipv6_skip_exthdr()
instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
reassembled packets are passed to connection tracking helpers.
The result of this is that we can properly track fragmented packets, but
still generate ICMPv6 Packet too big messages if we would have before.
This patch is also required as a precondition for IPv6 NAT, where NAT
helpers might enlarge packets up to a point that they require
fragmentation. In that case we can't generate Packet too big messages
since the proper MTU can't be calculated in all cases (f.i. when
changing textual representation of a variable amount of addresses),
so the packet is transparently fragmented iff the original packet or
fragments would have fit the outgoing MTU.
Signed-off-by: Patrick McHardy <kaber@trash.net>
---
include/linux/ipv6.h | 1 +
net/ipv6/ip6_output.c | 7 +++-
net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 37 ++++++++++++++++++------
net/ipv6/netfilter/nf_conntrack_reasm.c | 19 ++++++++++--
4 files changed, 50 insertions(+), 14 deletions(-)
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 879db26..0b94e91 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -256,6 +256,7 @@ struct inet6_skb_parm {
#if defined(CONFIG_IPV6_MIP6) || defined(CONFIG_IPV6_MIP6_MODULE)
__u16 dsthao;
#endif
+ __u16 frag_max_size;
#define IP6SKB_XFRM_TRANSFORMED 1
#define IP6SKB_FORWARDED 2
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5b2d63e..a4f6263 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -493,7 +493,8 @@ int ip6_forward(struct sk_buff *skb)
if (mtu < IPV6_MIN_MTU)
mtu = IPV6_MIN_MTU;
- if (skb->len > mtu && !skb_is_gso(skb)) {
+ if ((!skb->local_df && skb->len > mtu && !skb_is_gso(skb)) ||
+ (IP6CB(skb)->frag_max_size && IP6CB(skb)->frag_max_size > mtu)) {
/* Again, force OUTPUT device used as source address */
skb->dev = dst->dev;
icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
@@ -636,7 +637,9 @@ int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
/* We must not fragment if the socket is set to force MTU discovery
* or if the skb it not generated by a local socket.
*/
- if (unlikely(!skb->local_df && skb->len > mtu)) {
+ if (unlikely(!skb->local_df && skb->len > mtu) ||
+ (IP6CB(skb)->frag_max_size &&
+ IP6CB(skb)->frag_max_size > mtu)) {
if (skb->sk && dst_allfrag(skb_dst(skb)))
sk_nocaps_add(skb->sk, NETIF_F_GSO_MASK);
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index 4794f96..560d823 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -153,10 +153,10 @@ static unsigned int ipv6_helper(unsigned int hooknum,
const struct nf_conn_help *help;
const struct nf_conntrack_helper *helper;
enum ip_conntrack_info ctinfo;
- unsigned int ret, protoff;
- unsigned int extoff = (u8 *)(ipv6_hdr(skb) + 1) - skb->data;
- unsigned char pnum = ipv6_hdr(skb)->nexthdr;
-
+ unsigned int ret;
+ __be16 frag_off;
+ int protoff;
+ u8 nexthdr;
/* This is where we call the helper: as the packet goes out. */
ct = nf_ct_get(skb, &ctinfo);
@@ -171,9 +171,10 @@ static unsigned int ipv6_helper(unsigned int hooknum,
if (!helper)
return NF_ACCEPT;
- protoff = nf_ct_ipv6_skip_exthdr(skb, extoff, &pnum,
- skb->len - extoff);
- if (protoff > skb->len || pnum == NEXTHDR_FRAGMENT) {
+ nexthdr = ipv6_hdr(skb)->nexthdr;
+ protoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &nexthdr,
+ &frag_off);
+ if (protoff < 0 || (frag_off & ntohs(~0x7)) != 0) {
pr_debug("proto header not found\n");
return NF_ACCEPT;
}
@@ -199,9 +200,13 @@ static unsigned int ipv6_confirm(unsigned int hooknum,
static unsigned int __ipv6_conntrack_in(struct net *net,
unsigned int hooknum,
struct sk_buff *skb,
+ const struct net_device *in,
+ const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
struct sk_buff *reasm = skb->nfct_reasm;
+ struct nf_conn *ct;
+ enum ip_conntrack_info ctinfo;
/* This packet is fragmented and has reassembled packet. */
if (reasm) {
@@ -213,6 +218,20 @@ static unsigned int __ipv6_conntrack_in(struct net *net,
if (ret != NF_ACCEPT)
return ret;
}
+
+ /* Conntrack helpers need the entire reassembled packet in the
+ * POST_ROUTING hook.
+ */
+ ct = nf_ct_get(reasm, &ctinfo);
+ if (ct != NULL && test_bit(IPS_HELPER_BIT, &ct->status)) {
+ nf_conntrack_get_reasm(skb);
+ NF_HOOK_THRESH(NFPROTO_IPV6, hooknum, reasm,
+ (struct net_device *)in,
+ (struct net_device *)out,
+ okfn, NF_IP6_PRI_CONNTRACK + 1);
+ return NF_DROP_ERR(-ECANCELED);
+ }
+
nf_conntrack_get(reasm->nfct);
skb->nfct = reasm->nfct;
skb->nfctinfo = reasm->nfctinfo;
@@ -228,7 +247,7 @@ static unsigned int ipv6_conntrack_in(unsigned int hooknum,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
- return __ipv6_conntrack_in(dev_net(in), hooknum, skb, okfn);
+ return __ipv6_conntrack_in(dev_net(in), hooknum, skb, in, out, okfn);
}
static unsigned int ipv6_conntrack_local(unsigned int hooknum,
@@ -242,7 +261,7 @@ static unsigned int ipv6_conntrack_local(unsigned int hooknum,
net_notice_ratelimited("ipv6_conntrack_local: packet too short\n");
return NF_ACCEPT;
}
- return __ipv6_conntrack_in(dev_net(out), hooknum, skb, okfn);
+ return __ipv6_conntrack_in(dev_net(out), hooknum, skb, in, out, okfn);
}
static struct nf_hook_ops ipv6_conntrack_ops[] __read_mostly = {
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index c9c78c2..f94fb3a 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -190,6 +190,7 @@ static int nf_ct_frag6_queue(struct nf_ct_frag6_queue *fq, struct sk_buff *skb,
const struct frag_hdr *fhdr, int nhoff)
{
struct sk_buff *prev, *next;
+ unsigned int payload_len;
int offset, end;
if (fq->q.last_in & INET_FRAG_COMPLETE) {
@@ -197,8 +198,10 @@ static int nf_ct_frag6_queue(struct nf_ct_frag6_queue *fq, struct sk_buff *skb,
goto err;
}
+ payload_len = ntohs(ipv6_hdr(skb)->payload_len);
+
offset = ntohs(fhdr->frag_off) & ~0x7;
- end = offset + (ntohs(ipv6_hdr(skb)->payload_len) -
+ end = offset + (payload_len -
((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
if ((unsigned int)end > IPV6_MAXPLEN) {
@@ -307,6 +310,8 @@ found:
skb->dev = NULL;
fq->q.stamp = skb->tstamp;
fq->q.meat += skb->len;
+ if (payload_len > fq->q.max_size)
+ fq->q.max_size = payload_len;
atomic_add(skb->truesize, &nf_init_frags.mem);
/* The first fragment.
@@ -412,10 +417,12 @@ nf_ct_frag6_reasm(struct nf_ct_frag6_queue *fq, struct net_device *dev)
}
atomic_sub(head->truesize, &nf_init_frags.mem);
+ head->local_df = 1;
head->next = NULL;
head->dev = dev;
head->tstamp = fq->q.stamp;
ipv6_hdr(head)->payload_len = htons(payload_len);
+ IP6CB(head)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size;
/* Yes, and fold redundant checksum back. 8) */
if (head->ip_summed == CHECKSUM_COMPLETE)
@@ -592,6 +599,7 @@ void nf_ct_frag6_output(unsigned int hooknum, struct sk_buff *skb,
int (*okfn)(struct sk_buff *))
{
struct sk_buff *s, *s2;
+ unsigned int ret = 0;
for (s = NFCT_FRAG6_CB(skb)->orig; s;) {
nf_conntrack_put_reasm(s->nfct_reasm);
@@ -601,8 +609,13 @@ void nf_ct_frag6_output(unsigned int hooknum, struct sk_buff *skb,
s2 = s->next;
s->next = NULL;
- NF_HOOK_THRESH(NFPROTO_IPV6, hooknum, s, in, out, okfn,
- NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
+ if (ret != -ECANCELED)
+ ret = NF_HOOK_THRESH(NFPROTO_IPV6, hooknum, s,
+ in, out, okfn,
+ NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
+ else
+ kfree_skb(s);
+
s = s2;
}
nf_conntrack_put_reasm(skb);
--
1.7.1
next prev parent reply other threads:[~2012-08-09 20:08 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-09 20:08 [PATCH 00/19] netfilter: IPv6 NAT kaber
2012-08-09 20:08 ` [PATCH 01/19] netfilter: nf_ct_sip: fix helper name kaber
2012-08-14 0:00 ` Pablo Neira Ayuso
2012-08-09 20:08 ` [PATCH 02/19] netfilter: nf_ct_sip: fix IPv6 address parsing kaber
2012-08-14 0:19 ` Pablo Neira Ayuso
2012-08-09 20:08 ` [PATCH 03/19] netfilter: nf_nat_sip: fix via header translation with multiple parameters kaber
2012-08-14 0:28 ` Pablo Neira Ayuso
2012-08-14 12:23 ` Patrick McHardy
2012-08-09 20:08 ` [PATCH 04/19] ipv4: fix path MTU discovery with connection tracking kaber
2012-08-09 20:08 ` kaber [this message]
2012-08-17 8:06 ` [PATCH 05/19] netfilter: nf_conntrack_ipv6: improve fragmentation handling Jesper Dangaard Brouer
2012-08-18 12:26 ` Patrick McHardy
2012-08-19 19:37 ` Jesper Dangaard Brouer
2012-08-19 19:44 ` Patrick McHardy
2012-08-20 13:13 ` Jesper Dangaard Brouer
2012-08-22 22:21 ` Patrick McHardy
2012-08-21 22:21 ` Jesper Dangaard Brouer
2012-08-26 21:20 ` Patrick McHardy
2012-08-27 10:13 ` Jesper Dangaard Brouer
2012-08-27 10:41 ` Patrick McHardy
2012-08-27 14:40 ` [PATCH 0/2] net: ipvs and netfilter IPv6 defrag MTU handling Jesper Dangaard Brouer
2012-08-27 14:40 ` [PATCH 1/2] ipvs: IPv6 MTU checking cleanup and bugfix Jesper Dangaard Brouer
2012-08-27 14:42 ` [PATCH 2/2] ipvs: Extend MTU check to account for IPv6 NAT defrag changes Jesper Dangaard Brouer
2012-08-27 15:20 ` Julian Anastasov
2012-08-28 8:22 ` Patrick McHardy
2012-08-28 8:28 ` Simon Horman
2012-08-28 14:21 ` [PATCH V2 0/2] net: ipvs and netfilter IPv6 defrag MTU handling Jesper Dangaard Brouer
2012-08-28 14:22 ` [PATCH V2 1/2] ipvs: IPv6 MTU checking cleanup and bugfix Jesper Dangaard Brouer
2012-08-28 20:08 ` Patrick McHardy
2012-08-28 14:23 ` [PATCH V2 2/2] ipvs: Extend MTU check to account for IPv6 NAT defrag changes Jesper Dangaard Brouer
2012-08-28 14:49 ` Eric Dumazet
2012-08-29 7:02 ` Jesper Dangaard Brouer
2012-08-29 8:43 ` Eric Dumazet
2012-08-29 9:04 ` Jesper Dangaard Brouer
2012-08-28 20:10 ` Patrick McHardy
2012-08-28 9:03 ` [PATCH " Jesper Dangaard Brouer
2012-08-28 9:47 ` Julian Anastasov
2012-08-17 13:36 ` [PATCH 05/19] netfilter: nf_conntrack_ipv6: improve fragmentation handling Pablo Neira Ayuso
2012-08-18 12:43 ` Patrick McHardy
2012-08-09 20:08 ` [PATCH 06/19] netfilter: nf_conntrack_ipv6: fix tracking of ICMPv6 error messages containing fragments kaber
2012-08-09 20:08 ` [PATCH 07/19] netfilter: nf_conntrack: restrict NAT helper invocation to IPv4 kaber
2012-08-09 20:08 ` [PATCH 08/19] netfilter: nf_nat: add protoff argument to packet mangling functions kaber
2012-08-09 20:08 ` [PATCH 09/19] netfilter: add protocol independant NAT core kaber
2012-08-09 20:08 ` [PATCH 10/19] netfilter: ipv6: expand skb head in ip6_route_me_harder after oif change kaber
2012-08-09 20:08 ` [PATCH 11/19] net: core: add function for incremental IPv6 pseudo header checksum updates kaber
2012-08-09 20:08 ` [PATCH 12/19] netfilter: ipv6: add IPv6 NAT support kaber
2012-08-09 20:08 ` [PATCH 13/19] netfilter: ip6tables: add MASQUERADE target kaber
2012-08-17 13:11 ` Pablo Neira Ayuso
2012-08-18 12:31 ` Patrick McHardy
2012-08-09 20:08 ` [PATCH 14/19] netfilter: ip6tables: add REDIRECT target kaber
2012-08-09 20:08 ` [PATCH 15/19] netfilter: ip6tables: add NETMAP target kaber
2012-08-09 20:09 ` [PATCH 16/19] netfilter: nf_nat: support IPv6 in FTP NAT helper kaber
2012-08-09 20:09 ` [PATCH 17/19] netfilter: nf_nat: support IPv6 in amanda " kaber
2012-08-09 20:09 ` [PATCH 18/19] netfilter: nf_nat: support IPv6 in SIP " kaber
2012-08-09 20:09 ` [PATCH 19/19] netfilter: ip6tables: add stateless IPv6-to-IPv6 Network Prefix Translation target kaber
2012-08-09 21:55 ` Jan Engelhardt
2012-08-09 22:25 ` Patrick McHardy
2012-08-09 20:56 ` [PATCH 00/19] netfilter: IPv6 NAT Eric W. Biederman
2012-08-09 21:52 ` Patrick McHardy
2012-08-09 22:00 ` Pablo Neira Ayuso
2012-08-09 22:30 ` Patrick McHardy
2012-08-17 13:42 ` Pablo Neira Ayuso
2012-08-18 12:46 ` Patrick McHardy
2012-08-25 0:58 ` Andre Tomt
2012-08-25 1:16 ` Andre Tomt
2012-08-26 18:06 ` Patrick McHardy
2012-08-27 7:33 ` Florian Weimer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1344542943-11588-6-git-send-email-kaber@trash.net \
--to=kaber@trash.net \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).