From: Patrick McHardy <kaber@trash.net>
To: pablo@netfilter.org
Cc: netfilter-devel@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 01/19] ipv4: fix path MTU discovery with connection tracking
Date: Tue, 28 Aug 2012 23:48:41 +0200
Message-ID: <1346190539-9963-2-git-send-email-kaber@trash.net>
In-Reply-To: <1346190539-9963-1-git-send-email-kaber@trash.net>
IPv4 conntrack defragments incoming packets at the PRE_ROUTING hook and
(in case of forwarded packets) refragments them at POST_ROUTING
regardless of the IP_DF flag. Refragmentation uses the dst_mtu() of
the local route without taking the original fragment sizes into account,
thereby breaking PMTUD.
This patch fixes this by keeping track of the largest received fragment
with IP_DF set and generating an ICMP fragmentation required error during
refragmentation if that size exceeds the MTU.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
---
include/net/inet_frag.h | 2 ++
include/net/ip.h | 2 ++
net/ipv4/ip_fragment.c | 8 +++++++-
net/ipv4/ip_output.c | 4 +++-
4 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 2431cf8..5098ee7 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -29,6 +29,8 @@ struct inet_frag_queue {
#define INET_FRAG_COMPLETE 4
#define INET_FRAG_FIRST_IN 2
#define INET_FRAG_LAST_IN 1
+
+ u16 max_size;
};
#define INETFRAGS_HASHSZ 64
diff --git a/include/net/ip.h b/include/net/ip.h
index 5a5d84d..0707fb9 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -42,6 +42,8 @@ struct inet_skb_parm {
#define IPSKB_XFRM_TRANSFORMED 4
#define IPSKB_FRAG_COMPLETE 8
#define IPSKB_REROUTED 16
+
+ u16 frag_max_size;
};
static inline unsigned int ip_hdrlen(const struct sk_buff *skb)
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 8d07c97..fa6a12c 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -523,6 +523,10 @@ found:
if (offset == 0)
qp->q.last_in |= INET_FRAG_FIRST_IN;
+ if (ip_hdr(skb)->frag_off & htons(IP_DF) &&
+ skb->len + ihl > qp->q.max_size)
+ qp->q.max_size = skb->len + ihl;
+
if (qp->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
qp->q.meat == qp->q.len)
return ip_frag_reasm(qp, prev, dev);
@@ -646,9 +650,11 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
head->next = NULL;
head->dev = dev;
head->tstamp = qp->q.stamp;
+ IPCB(head)->frag_max_size = qp->q.max_size;
iph = ip_hdr(head);
- iph->frag_off = 0;
+ /* max_size != 0 implies at least one fragment had IP_DF set */
+ iph->frag_off = qp->q.max_size ? htons(IP_DF) : 0;
iph->tot_len = htons(len);
iph->tos |= ecn;
IP_INC_STATS_BH(net, IPSTATS_MIB_REASMOKS);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index c196d74..a5beab1 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -467,7 +467,9 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
iph = ip_hdr(skb);
- if (unlikely((iph->frag_off & htons(IP_DF)) && !skb->local_df)) {
+ if (unlikely(((iph->frag_off & htons(IP_DF)) && !skb->local_df) ||
+ (IPCB(skb)->frag_max_size &&
+ IPCB(skb)->frag_max_size > dst_mtu(&rt->dst)))) {
IP_INC_STATS(dev_net(dev), IPSTATS_MIB_FRAGFAILS);
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
htonl(ip_skb_dst_mtu(skb)));
--
1.7.1
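
The bookkeeping described in the changelog can be illustrated outside the
kernel. What follows is a minimal userspace sketch, not the kernel code:
struct frag_queue_sketch, frag_queue_add() and needs_frag_needed() are
hypothetical names made up for illustration, and only the max_size field
and the comparison against the MTU mirror the patch. It assumes fragment
lengths include the IP header, as skb->len + ihl does in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the reassembly queue state. */
struct frag_queue_sketch {
	uint16_t max_size;	/* largest fragment seen with DF set, 0 if none */
};

/* Record one received fragment; only fragments with DF set are tracked. */
static void frag_queue_add(struct frag_queue_sketch *q, uint16_t frag_len,
			   bool df)
{
	assert(q != NULL);
	if (df && frag_len > q->max_size)
		q->max_size = frag_len;
}

/*
 * Decide at refragmentation time whether an ICMP "fragmentation needed"
 * error should be generated instead of fragmenting: max_size != 0 means
 * at least one fragment carried DF, and it must not exceed the path MTU.
 */
static bool needs_frag_needed(const struct frag_queue_sketch *q,
			      unsigned int mtu)
{
	assert(q != NULL);
	return q->max_size != 0 && q->max_size > mtu;
}
```

With a 1500 byte DF fragment recorded, the sketch reports an error for an
outgoing MTU of 1400 but not for 1500. A queue that only saw fragments
without DF keeps max_size at 0 and is refragmented as before.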