* GRO/GSO hiding PMTU?
@ 2011-02-10 22:55 David Miller
2011-02-10 23:07 ` David Miller
2011-02-10 23:50 ` Herbert Xu
0 siblings, 2 replies; 10+ messages in thread
From: David Miller @ 2011-02-10 22:55 UTC (permalink / raw)
To: herbert; +Cc: netdev, netfilter-devel
I was trying to setup something simple to trigger PMTU to test my
PMTU patches, and the simplest (I thought) would be to simply down
the mtu of the internet facing side of my simple NAT box.
All I did was "ip link set eth0 mtu 1400", and try to send large
TCP sequences from inside.
To my surprise I saw no ICMP messages, on input to the NAT machine the
TCP packets had length 1448 and on output they had length 1348.
This NAT box has TG3 on both input and output, so supports GRO and
TSO. The kernel is 2.6.34-rc7 vintage :-)
I suspect that the packet arrives on eth1, accumulates into GRO, and
thus marked as GSO as well, then GSO/TSO on output to eth0 is
re-segmenting things transparently, and we're not getting the ICMP
frag-needed message and the packet drop because of the skb_is_gso()
check in ip_forward().
if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
(ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
htonl(dst_mtu(&rt->dst)));
goto drop;
}
So if that's what is happening, that's cute, but I think we need to
fix this :-)
Perhaps the check in ip_forward() should instead validate the gso_size
in the skb_is_gso() case?
That'd be a little tricky since gso_size is an MSS value whereas what
we're checking against (skb->len) is the full packet size, headers and
all.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-10 22:55 GRO/GSO hiding PMTU? David Miller
@ 2011-02-10 23:07 ` David Miller
2011-02-11 0:07 ` Herbert Xu
2011-02-10 23:50 ` Herbert Xu
1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2011-02-10 23:07 UTC (permalink / raw)
To: herbert; +Cc: netdev, netfilter-devel
From: David Miller <davem@davemloft.net>
Date: Thu, 10 Feb 2011 14:55:55 -0800 (PST)
> I suspect that the packet arrives on eth1, accumulates into GRO, and
> thus marked as GSO as well, then GSO/TSO on output to eth0 is
> re-segmenting things transparently, and we're not getting the ICMP
> frag-needed message and the packet drop because of the skb_is_gso()
> check in ip_forward().
>
> if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
> (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
> IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> htonl(dst_mtu(&rt->dst)));
> goto drop;
> }
>
> So if that's what is happening, that's cute, but I think we need to
> fix this :-)
>
> Perhaps the check in ip_forward() should instead validate the gso_size
> in the skb_is_gso() case?
>
> That'd be a little tricky since gso_size is an MSS value whereas what
> we're checking against (skb->len) is the full packet size, headers and
> all.
Nevermind, I turned off gso/tso on eth0 (outgoing interface) and it still
happens.
I guess netfilter or something else is causing this.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-10 22:55 GRO/GSO hiding PMTU? David Miller
2011-02-10 23:07 ` David Miller
@ 2011-02-10 23:50 ` Herbert Xu
2011-02-11 6:22 ` David Miller
1 sibling, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2011-02-10 23:50 UTC (permalink / raw)
To: David Miller; +Cc: netdev, netfilter-devel
On Thu, Feb 10, 2011 at 02:55:55PM -0800, David Miller wrote:
>
> I suspect that the packet arrives on eth1, accumulates into GRO, and
> thus marked as GSO as well, then GSO/TSO on output to eth0 is
> re-segmenting things transparently, and we're not getting the ICMP
> frag-needed message and the packet drop because of the skb_is_gso()
> check in ip_forward().
>
> if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
> (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
> IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> htonl(dst_mtu(&rt->dst)));
> goto drop;
> }
>
> So if that's what is happening, that's cute, but I think we need to
> fix this :-)
Yes this is a known problem and we do need to fix this, even if
it doesn't appear to be the cause of your immediate issue :)
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-10 23:07 ` David Miller
@ 2011-02-11 0:07 ` Herbert Xu
0 siblings, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2011-02-11 0:07 UTC (permalink / raw)
To: David Miller; +Cc: netdev, netfilter-devel
On Thu, Feb 10, 2011 at 03:07:59PM -0800, David Miller wrote:
>
> Nevermind, I turned off gso/tso on eth0 (outgoing interface) and it still
> happens.
Turning off GSO/TSO has no effect if it's GRO generating the packet
since it still has to be segmented regardless.
So I think this is the problem that you found and we need to fix it
by checking the gso_size parameter.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-10 23:50 ` Herbert Xu
@ 2011-02-11 6:22 ` David Miller
2011-02-11 6:35 ` David Miller
2011-02-11 6:37 ` Herbert Xu
0 siblings, 2 replies; 10+ messages in thread
From: David Miller @ 2011-02-11 6:22 UTC (permalink / raw)
To: herbert; +Cc: netdev, netfilter-devel
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 11 Feb 2011 10:50:22 +1100
> On Thu, Feb 10, 2011 at 02:55:55PM -0800, David Miller wrote:
>>
>> I suspect that the packet arrives on eth1, accumulates into GRO, and
>> thus marked as GSO as well, then GSO/TSO on output to eth0 is
>> re-segmenting things transparently, and we're not getting the ICMP
>> frag-needed message and the packet drop because of the skb_is_gso()
>> check in ip_forward().
>>
>> if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
>> (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
>> IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
>> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
>> htonl(dst_mtu(&rt->dst)));
>> goto drop;
>> }
>>
>> So if that's what is happening, that's cute, but I think we need to
>> fix this :-)
>
> Yes this is a known problem and we do need to fix this, even if
> it doesn't appear to be the cause of your immediate issue :)
I gave it a shot but it isn't easy. We can figure out the length of
the IP headers just fine, but the rest of the value we need to add
to the MSS (the TCP header length) is transport specific which kind
of implies a transport dependent gso proto op of some sort.
Or we just hack it, admit that only TCP creates GSO packets, and
directly check for TCP protcol and then inspect the TCP header
length :-)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-11 6:22 ` David Miller
@ 2011-02-11 6:35 ` David Miller
2011-02-11 6:41 ` Herbert Xu
2011-02-11 6:37 ` Herbert Xu
1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2011-02-11 6:35 UTC (permalink / raw)
To: herbert; +Cc: netdev, netfilter-devel
From: David Miller <davem@davemloft.net>
Date: Thu, 10 Feb 2011 22:22:16 -0800 (PST)
> I gave it a shot but it isn't easy. We can figure out the length of
> the IP headers just fine, but the rest of the value we need to add
> to the MSS (the TCP header length) is transport specific which kind
> of implies a transport dependent gso proto op of some sort.
>
> Or we just hack it, admit that only TCP creates GSO packets, and
> directly check for TCP protcol and then inspect the TCP header
> length :-)
Herbert how does this look for now?
Of course, we need to do something similar in all kinds of other spots.
Even places like bridging :-/
--------------------
ipv4: Check MSS properly in ip_forward() GSO check.
When we forward packets we decide whether we should send
a frag-needed ICMP back based upon the skb length.
But if this is a GSO packet, we wholesale elide the length
check entirely.
This is wrong, we do have to check things. Except that the
length validation in this case is not straighforward.
We have to take the gso_size (which is the MSS) and add in
the IP and TCP header to arrive at the length we should use
to compare against the MTU.
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
index 99461f0..7449890 100644
--- a/net/ipv4/ip_forward.c
+++ b/net/ipv4/ip_forward.c
@@ -51,6 +51,36 @@ static int ip_forward_finish(struct sk_buff *skb)
return dst_output(skb);
}
+static bool send_frag_needed(struct sk_buff *skb, struct rtable *rt)
+{
+ unsigned int len_to_check = skb->len;
+
+ if (skb_is_gso(skb)) {
+ unsigned int gso_size = skb_shinfo(skb)->gso_size;
+ unsigned int ihl = ip_hdr(skb)->ihl * 4;
+ struct tcphdr th_stack, *th;
+
+ if (WARN_ON_ONCE(ip_hdr(skb)->protocol != IPPROTO_TCP))
+ return false;
+
+ th = skb_header_pointer(skb, ihl, sizeof(th_stack),
+ &th_stack);
+ if (!th)
+ return false;
+
+ len_to_check = gso_size + ihl + (th->doff * 4);
+ }
+
+ if (len_to_check <= dst_mtu(&rt->dst))
+ return false;
+ if (!(ip_hdr(skb)->frag_off & htons(IP_DF)))
+ return false;
+ if (skb->local_df)
+ return false;
+
+ return true;
+}
+
int ip_forward(struct sk_buff *skb)
{
struct iphdr *iph; /* Our header */
@@ -87,8 +117,7 @@ int ip_forward(struct sk_buff *skb)
if (opt->is_strictroute && rt->rt_dst != rt->rt_gateway)
goto sr_failed;
- if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
- (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
+ if (unlikely(send_frag_needed(skb, rt))) {
IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
htonl(dst_mtu(&rt->dst)));
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-11 6:22 ` David Miller
2011-02-11 6:35 ` David Miller
@ 2011-02-11 6:37 ` Herbert Xu
2011-02-11 7:07 ` David Miller
1 sibling, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2011-02-11 6:37 UTC (permalink / raw)
To: David Miller; +Cc: netdev, netfilter-devel
On Thu, Feb 10, 2011 at 10:22:16PM -0800, David Miller wrote:
>
> I gave it a shot but it isn't easy. We can figure out the length of
> the IP headers just fine, but the rest of the value we need to add
> to the MSS (the TCP header length) is transport specific which kind
> of implies a transport dependent gso proto op of some sort.
That's pretty much where I gave up :)
> Or we just hack it, admit that only TCP creates GSO packets, and
> directly check for TCP protcol and then inspect the TCP header
> length :-)
Sure we can do that for now.
What I wanted to do if I ever get enough time to work on this is
to record the transport header length in a gso_hlen field so we
can fix this properly.
We currently have a useless gso_segs field that only has one or
two users that don't even need it. We could easily get rid of it
and use that space for gso_hlen instead.
The gso_hlen field only needs to be filled in at the few spots
that generate GSO packets, i.e.,
1) TCP
2) Virt backends like tun.c
3) GRO
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-11 6:35 ` David Miller
@ 2011-02-11 6:41 ` Herbert Xu
2011-02-11 7:06 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Herbert Xu @ 2011-02-11 6:41 UTC (permalink / raw)
To: David Miller; +Cc: netdev, netfilter-devel
On Thu, Feb 10, 2011 at 10:35:44PM -0800, David Miller wrote:
>
> Herbert how does this look for now?
This should work.
> Of course, we need to do something similar in all kinds of other spots.
>
> Even places like bridging :-/
Yeah every place that does skb->len and skb_is_gso checks will need
this.
> +static bool send_frag_needed(struct sk_buff *skb, struct rtable *rt)
> +{
> + unsigned int len_to_check = skb->len;
> +
> + if (skb_is_gso(skb)) {
> + unsigned int gso_size = skb_shinfo(skb)->gso_size;
> + unsigned int ihl = ip_hdr(skb)->ihl * 4;
> + struct tcphdr th_stack, *th;
> +
> + if (WARN_ON_ONCE(ip_hdr(skb)->protocol != IPPROTO_TCP))
> + return false;
> +
> + th = skb_header_pointer(skb, ihl, sizeof(th_stack),
> + &th_stack);
> + if (!th)
> + return false;
> +
> + len_to_check = gso_size + ihl + (th->doff * 4);
I think we need to do some length verifications here because for
a malicious guest-generated packet the TCP header may not be present.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-11 6:41 ` Herbert Xu
@ 2011-02-11 7:06 ` David Miller
0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2011-02-11 7:06 UTC (permalink / raw)
To: herbert; +Cc: netdev, netfilter-devel
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 11 Feb 2011 17:41:38 +1100
> I think we need to do some length verifications here because for
> a malicious guest-generated packet the TCP header may not be present.
Indeed, good catch.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: GRO/GSO hiding PMTU?
2011-02-11 6:37 ` Herbert Xu
@ 2011-02-11 7:07 ` David Miller
0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2011-02-11 7:07 UTC (permalink / raw)
To: herbert; +Cc: netdev, netfilter-devel
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 11 Feb 2011 17:37:53 +1100
> What I wanted to do if I ever get enough time to work on this is
> to record the transport header length in a gso_hlen field so we
> can fix this properly.
>
> We currently have a useless gso_segs field that only has one or
> two users that don't even need it. We could easily get rid of it
> and use that space for gso_hlen instead.
>
> The gso_hlen field only needs to be filled in at the few spots
> that generate GSO packets, i.e.,
>
> 1) TCP
> 2) Virt backends like tun.c
> 3) GRO
Yep, that's good idea.
And even if we needed to add one more u32 to skb_shared_info()
that's still sort-of "free" because of SLAB slack space.
I'll look into doing this.
Thanks!
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-02-11 7:06 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-10 22:55 GRO/GSO hiding PMTU? David Miller
2011-02-10 23:07 ` David Miller
2011-02-11 0:07 ` Herbert Xu
2011-02-10 23:50 ` Herbert Xu
2011-02-11 6:22 ` David Miller
2011-02-11 6:35 ` David Miller
2011-02-11 6:41 ` Herbert Xu
2011-02-11 7:06 ` David Miller
2011-02-11 6:37 ` Herbert Xu
2011-02-11 7:07 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).