netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg KH <greg@kroah.com>
To: Dexuan Cui <decui@microsoft.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	'Eric Dumazet' <edumazet@google.com>, 'Willy Tarreau' <w@1wt.eu>,
	Joseph Salisbury <Joseph.Salisbury@microsoft.com>,
	Michael Kelley <mikelley@microsoft.com>
Subject: Re: UDP data corruption in v4.4
Date: Sat, 25 Jul 2020 07:58:40 +0200	[thread overview]
Message-ID: <20200725055840.GD1047853@kroah.com> (raw)
In-Reply-To: <KL1P15301MB028018F5C84C618BF7628045BF740@KL1P15301MB0280.APCP153.PROD.OUTLOOK.COM>

On Sat, Jul 25, 2020 at 02:21:06AM +0000, Dexuan Cui wrote:
> Hi,
> The v4.4 stable kernel (currently it's v4.4.231) lacks this bugfix:
> 327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error")
> , as a result, the v4.4 kernel can deliver corrupt data to the application
> when a corrupt UDP packet is closely followed by a valid UDP packet:
> when the same invocation of the recvmsg() syscall delivers the corrupt
> packet's UDP payload to the application's receive buffer, it provides the
> UDP payload length and the "from IP/Port" of the valid packet to the 
> application -- this mismatch makes the issue worse.
> 
> Details:
> 
> For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's
> include/linux/skbuff.h:3951), Linux delays the UDP checksum verification
> until the application invokes the syscall recvmsg().
> 
> In the recvmsg() syscall handler, while Linux is copying the UDP payload
> to the application's memory, it calculates the UDP checksum. If the
> calculated checksum doesn't match the received checksum, Linux drops the
> corrupt UDP packet, and then starts to process the next packet (if any),
> and if the next packet is valid (i.e. the checksum is correct), Linux will
> copy the valid UDP packet's payload to the application's receiver buffer.
> 
> The bug is: before Linux starts to copy the valid UDP packet, the data
> structure used to track how many more bytes should be copied to the
> application memory is not reset to what it was when the application just
> entered the kernel by the syscall! Consequently, only a small portion or
> none of the valid packet's payload is copied to the application's receive
> buffer, and later when the application exits from the kernel, actually
> most of the application's receive buffer contains the payload of the
> corrupt packet while recvmsg() returns the length of the UDP payload of
> the valid packet.
> 
> For the mainline kernel, the bug was fixed by Al Viro in the commit 
> 327868212381, but unluckily the bugfix is only backported to the
> upstream v4.9+ kernels. I hope the bugfix can be backported to the
> v4.4 stable kernel, since it's a "longterm" kernel and is still used by
> some Linux distros.
> 
> It turns out backporting 327868212381 to v4.4 means that some 
> Supporting patches must be backported first, so the overall changes
> are pretty big...
> 
> I made the below one-line workaround patch to force the recvmsg() syscall
> handler to return to the userspace when Linux detects a corrupt UDP packet,
> so the application will invoke the syscall again to receive the following valid
> UDP packet (note: the patch may not work well with blocking sockets, for
> which typically the application doesn't expect an error of -EAGAIN. I
> guess it would be safer to return -EINTR instead?):
> 
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1367,6 +1367,7 @@ csum_copy_err:
>         /* starting over for a new packet, but check if we need to yield */
>         cond_resched();
>         msg->msg_flags &= ~MSG_TRUNC;
> +       return -EAGAIN;
>         goto try_again;
> }
> 
> 
> Eric Dumazet made an alternative that performs the csum validation earlier:
> 
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct
> sk_buff *skb)
>                 }
>         }
> 
> -       if (rcu_access_pointer(sk->sk_filter) &&
> -           udp_lib_checksum_complete(skb))
> +       if (udp_lib_checksum_complete(skb))
>                 goto csum_error;
> 
>         if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
> 
> I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than
> trying to backport 327868212381.

Does Eric's fix work with your testing?  If so, great, can you turn it
into something I can apply to the 4.4.y stable tree and send it to
stable@vger.kernel.org?

thanks,

greg k-h

  reply	other threads:[~2020-07-25  5:58 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-25  2:21 UDP data corruption in v4.4 Dexuan Cui
2020-07-25  5:58 ` Greg KH [this message]
2020-07-27 18:38   ` Dexuan Cui
2020-07-27 18:40     ` Eric Dumazet
2020-07-27 18:57       ` Dexuan Cui
2020-07-27 19:37         ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200725055840.GD1047853@kroah.com \
    --to=greg@kroah.com \
    --cc=Joseph.Salisbury@microsoft.com \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=edumazet@google.com \
    --cc=mikelley@microsoft.com \
    --cc=netdev@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).