* [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets
@ 2015-10-20 14:38 Hannes Frederic Sowa
2015-10-20 14:55 ` Vlad Yasevich
2015-10-20 21:39 ` Vlad Yasevich
0 siblings, 2 replies; 5+ messages in thread
From: Hannes Frederic Sowa @ 2015-10-20 14:38 UTC (permalink / raw)
To: netdev; +Cc: sd, bcodding, vyasevich, Hannes Frederic Sowa
MSG_MORE might cause the packet to get fragmented in the end when
passed down to the flush function and the transhdrlen check alone is
not sufficient to protect against fragmentation. Instead check if the
socket user intends to add more data to the socket on the first packet.
This broke checksum calculation for UDPv6 for NFS protocols.
Fixes: 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets")
Cc: Vlad Yasevich <vyasevich@gmail.com>
Tested-by: Sabrina Dubroca <sd@quesysnail.net>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/ip6_output.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 61d403e..95c5780 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1317,6 +1317,7 @@ emsgsize:
* sums only work when transhdrlen is set.
*/
if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
+ !(flags & MSG_MORE) &&
length + fragheaderlen < mtu &&
rt->dst.dev->features & NETIF_F_V6_CSUM &&
!exthdrlen)
--
2.4.3
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets
2015-10-20 14:38 [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets Hannes Frederic Sowa
@ 2015-10-20 14:55 ` Vlad Yasevich
2015-10-20 21:39 ` Vlad Yasevich
1 sibling, 0 replies; 5+ messages in thread
From: Vlad Yasevich @ 2015-10-20 14:55 UTC (permalink / raw)
To: Hannes Frederic Sowa, netdev; +Cc: sd, bcodding
On 10/20/2015 10:38 AM, Hannes Frederic Sowa wrote:
> MSG_MORE might cause the packet to get fragmented in the end when
> passed down to the flush function and the transhdrlen check alone is
> not sufficient to protect against fragmentation. Instead check if the
> socket user intends to add more data to the socket on the first packet.
>
> This broke checksum calculation for UDPv6 for NFS protocols.
>
> Fixes: 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets")
> Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
> Tested-by: Sabrina Dubroca <sd@quesysnail.net>
> Tested-by: Benjamin Coddington <bcodding@redhat.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> net/ipv6/ip6_output.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 61d403e..95c5780 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1317,6 +1317,7 @@ emsgsize:
> * sums only work when transhdrlen is set.
> */
> if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
> + !(flags & MSG_MORE) &&
> length + fragheaderlen < mtu &&
> rt->dst.dev->features & NETIF_F_V6_CSUM &&
> !exthdrlen)
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets
2015-10-20 14:38 [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets Hannes Frederic Sowa
2015-10-20 14:55 ` Vlad Yasevich
@ 2015-10-20 21:39 ` Vlad Yasevich
2015-10-21 9:51 ` Hannes Frederic Sowa
2015-10-21 12:52 ` Hannes Frederic Sowa
1 sibling, 2 replies; 5+ messages in thread
From: Vlad Yasevich @ 2015-10-20 21:39 UTC (permalink / raw)
To: Hannes Frederic Sowa, netdev; +Cc: sd, bcodding
On 10/20/2015 10:38 AM, Hannes Frederic Sowa wrote:
> MSG_MORE might cause the packet to get fragmented in the end when
> passed down to the flush function and the transhdrlen check alone is
> not sufficient to protect against fragmentation. Instead check if the
> socket user intends to add more data to the socket on the first packet.
>
> This broke checksum calculation for UDPv6 for NFS protocols.
>
> Fixes: 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets")
> Cc: Vlad Yasevich <vyasevich@gmail.com>
> Tested-by: Sabrina Dubroca <sd@quesysnail.net>
> Tested-by: Benjamin Coddington <bcodding@redhat.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> net/ipv6/ip6_output.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 61d403e..95c5780 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1317,6 +1317,7 @@ emsgsize:
> * sums only work when transhdrlen is set.
> */
> if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
> + !(flags & MSG_MORE) &&
> length + fragheaderlen < mtu &&
> rt->dst.dev->features & NETIF_F_V6_CSUM &&
> !exthdrlen)
>
Hmm... so while this solves this problem by simply avoiding the combination of
skb #1 having CHECKSUM_PARTIAL and others having CHECKSUM_NONE, I think the actual
problem is a bit deeper.
The above combination seems to work for me since udp6_hwcsum_outgoing() corrects
the checksum. However, my testing so far has been on nics that have NETIF_F_V6_CSUM,
but without UFO support.
On such systems a simple test of using MSG_MORE an IPv6 udp socket sending 200 bytes
followed by 2000 bytes works correctly.
I am now wondering if this might be UFO related instead and looking for a nic that
has UFO support.
-vlad
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets
2015-10-20 21:39 ` Vlad Yasevich
@ 2015-10-21 9:51 ` Hannes Frederic Sowa
2015-10-21 12:52 ` Hannes Frederic Sowa
1 sibling, 0 replies; 5+ messages in thread
From: Hannes Frederic Sowa @ 2015-10-21 9:51 UTC (permalink / raw)
To: Vlad Yasevich, netdev; +Cc: sd, bcodding
Hi Vlad,
On Tue, Oct 20, 2015, at 23:39, Vlad Yasevich wrote:
> On 10/20/2015 10:38 AM, Hannes Frederic Sowa wrote:
> > MSG_MORE might cause the packet to get fragmented in the end when
> > passed down to the flush function and the transhdrlen check alone is
> > not sufficient to protect against fragmentation. Instead check if the
> > socket user intends to add more data to the socket on the first packet.
> >
> > This broke checksum calculation for UDPv6 for NFS protocols.
> >
> > Fixes: 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets")
> > Cc: Vlad Yasevich <vyasevich@gmail.com>
> > Tested-by: Sabrina Dubroca <sd@quesysnail.net>
> > Tested-by: Benjamin Coddington <bcodding@redhat.com>
> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > ---
> > net/ipv6/ip6_output.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index 61d403e..95c5780 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -1317,6 +1317,7 @@ emsgsize:
> > * sums only work when transhdrlen is set.
> > */
> > if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
> > + !(flags & MSG_MORE) &&
> > length + fragheaderlen < mtu &&
> > rt->dst.dev->features & NETIF_F_V6_CSUM &&
> > !exthdrlen)
> >
>
> Hmm... so while this solves this problem by simply avoiding the
> combination of
> skb #1 having CHECKSUM_PARTIAL and others having CHECKSUM_NONE, I think
> the actual
> problem is a bit deeper.
> The above combination seems to work for me since udp6_hwcsum_outgoing()
> corrects
> the checksum. However, my testing so far has been on nics that have
> NETIF_F_V6_CSUM,
> but without UFO support.
With nfs tests we never branch into setting up or extending an UDP UFO
packet, also because on the test system UFO is disabled on all
interfaces.
I thought about relaxing this check in future when we simply make sure
we don't do fragmentation based based on the length while taking all
fragments into account.
> On such systems a simple test of using MSG_MORE an IPv6 udp socket
> sending 200 bytes
> followed by 2000 bytes works correctly.
Did you make sure it actually fragmented and the checksums are correct?
> I am now wondering if this might be UFO related instead and looking for a
> nic that
> has UFO support.
So far as I can see it has nothing to do with UFO. I will do more
investigation now. Thanks for bringing this up!
Bye,
Hannes
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets
2015-10-20 21:39 ` Vlad Yasevich
2015-10-21 9:51 ` Hannes Frederic Sowa
@ 2015-10-21 12:52 ` Hannes Frederic Sowa
1 sibling, 0 replies; 5+ messages in thread
From: Hannes Frederic Sowa @ 2015-10-21 12:52 UTC (permalink / raw)
To: Vlad Yasevich, netdev; +Cc: sd, bcodding
On Tue, Oct 20, 2015, at 23:39, Vlad Yasevich wrote:
> I am now wondering if this might be UFO related instead and looking for a
> nic that
> has UFO support.
I doubt that.
We overallocate memory first time in ip6_append_data because we are in
MSG_MORE mode. Then, in my case the second write only copies data into
the first skb on the write queue, no skb is appended to frags_list. So
udp6_hwcsum_outgoing doesn't clean up the flags, either.
We can improve the check if we fragment an ipv6 skb in ip6_append_mode
in net-next, I agree. But I still see this fix suitable for 'net' tree.
We could also improve udp6_hwcsum_outgoing to check if our packets get
fragmented and fall back to the clean-up path. But I think this kind of
optimization should go into net-next, too.
Currently the check made sure we don't use PARTIAL on skbs which could
fragment. MSG_MORE somehow circumvented that check, so I think the fix
is good to go. We certainly can try to improve PARTIAL checksums for
fragmented packets.
What do you think?
Bye,
Hannes
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-10-21 12:52 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-20 14:38 [PATCH net] ipv6: don't use CHECKSUM_PARTIAL on MSG_MORE/UDP_CORK sockets Hannes Frederic Sowa
2015-10-20 14:55 ` Vlad Yasevich
2015-10-20 21:39 ` Vlad Yasevich
2015-10-21 9:51 ` Hannes Frederic Sowa
2015-10-21 12:52 ` Hannes Frederic Sowa
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).