All of lore.kernel.org
 help / color / mirror / Atom feed
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Brett A C Sheffield <bacs@librecast.net>,
	 willemdebruijn.kernel@gmail.com
Cc: bacs@librecast.net,  davem@davemloft.net,  edumazet@google.com,
	 gregkh@linuxfoundation.org,  horms@kernel.org,  kuba@kernel.org,
	 linux-kernel@vger.kernel.org,  linux-kselftest@vger.kernel.org,
	 netdev@vger.kernel.org,  pabeni@redhat.com,  shuah@kernel.org,
	 willemb@google.com
Subject: Re: [PATCH net-next v5] selftests: net: add test for ipv6 fragmentation
Date: Tue, 02 Sep 2025 10:48:57 -0400	[thread overview]
Message-ID: <willemdebruijn.kernel.e019291a1132@gmail.com> (raw)
In-Reply-To: <20250902142502.27278-1-bacs@librecast.net>

Brett A C Sheffield wrote:
> Add selftest for the IPv6 fragmentation regression which affected
> several stable kernels.
> 
> Commit a18dfa9925b9 ("ipv6: save dontfrag in cork") was backported to
> stable without some prerequisite commits.  This caused a regression when
> sending IPv6 UDP packets by preventing fragmentation and instead
> returning -1 (EMSGSIZE).
> 
> Add selftest to check for this issue by attempting to send a packet
> larger than the interface MTU. The packet will be fragmented on a
> working kernel, with sendmsg(2) correctly returning the expected number
> of bytes sent.  When the regression is present, sendmsg returns -1 and
> sets errno to EMSGSIZE.
> 
> Link: https://lore.kernel.org/stable/aElivdUXqd1OqgMY@karahi.gladserv.com
> Signed-off-by: Brett A C Sheffield <bacs@librecast.net>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
> ---
> Thanks for the reviews Willem and Jakub.
> 
> On 2025-09-01 09:45, Willem de Bruijn wrote:
> 
> > > +int main(void)
> > > +{
> > > +   struct in6_addr addr = {
> > > +           .s6_addr[15] = 0x01, /* ::1 */
> > > +   };
> > > +   struct sockaddr_in6 sa = {
> > > +           .sin6_family = AF_INET6,
> > > +           .sin6_addr = addr,
> > > +           .sin6_port = 9      /* port 9/udp (DISCARD) */
> >
> > htons
> 
> addr is already initialized in network byte order (BE) here.

My point was about port

> 
> Verified with:
> 
>         char ip6[INET6_ADDRSTRLEN];
>         inet_ntop(AF_INET6, &(sa.sin6_addr), ip6, INET6_ADDRSTRLEN);
>         printf("The address is %s\n", ip6);
> 
> which prints "The address is ::1"
> 
> All other suggestions adopted in v5.
> 
> 
> v5 changes:
>  - disable_dad: delete - not needed for lo
>  - main: simplify failure paths
>  - main: char -> static char buf
>  - setup: remove pointless return value
>  - setup: remove unused variable fd
>  - setup: merge with interface_up() to simplify
>  - setup: check all system call return values
>  - remove no longer used headers
> 
> v4 changes:
>  - fix "else should follow close brace" (checkpatch ERROR)
> 
> v3 changes:
>  - add usleep instead of busy polling on sendmsg
>  - simplify error handling by using error() and leaving cleanup to O/S
>  - use loopback interface - don't bother creating TAP
>  - send to localhost (::1)
> 
> v2 changes:
>  - remove superfluous namespace calls - unshare(2) suffices
>  - remove usleep(). Don't wait for the interface to be ready, just send, and
>    handle the (less likely) error case by retrying.
>  - set destination address only once
>  - document our use of the IPv6 link-local source address
>  - send to port 9 (DISCARD) instead of 4242 (DONT PANIC)
>  - ensure sockets are closed on failure paths
>  - use KSFT exit codes for clarity
> 
> v4: https://lore.kernel.org/netdev/20250901123757.13112-1-bacs@librecast.net
> v3: https://lore.kernel.org/netdev/20250901112248.5218-1-bacs@librecast.net
> v2: https://lore.kernel.org/netdev/20250831102908.14655-1-bacs@librecast.net
> v1: https://lore.kernel.org/netdev/20250825092548.4436-3-bacs@librecast.net
> 
>  tools/testing/selftests/net/.gitignore        |   1 +
>  tools/testing/selftests/net/Makefile          |   1 +
>  .../selftests/net/ipv6_fragmentation.c        | 115 ++++++++++++++++++
>  3 files changed, 117 insertions(+)
>  create mode 100644 tools/testing/selftests/net/ipv6_fragmentation.c
> 
> diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
> index 47c293c2962f..3d4b4a53dfda 100644
> --- a/tools/testing/selftests/net/.gitignore
> +++ b/tools/testing/selftests/net/.gitignore
> @@ -16,6 +16,7 @@ ip_local_port_range
>  ipsec
>  ipv6_flowlabel
>  ipv6_flowlabel_mgr
> +ipv6_fragmentation
>  log.txt
>  msg_oob
>  msg_zerocopy
> diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
> index eef0b8f8a7b0..276e0481d996 100644
> --- a/tools/testing/selftests/net/Makefile
> +++ b/tools/testing/selftests/net/Makefile
> @@ -117,6 +117,7 @@ TEST_GEN_FILES += tfo
>  TEST_PROGS += tfo_passive.sh
>  TEST_PROGS += broadcast_pmtu.sh
>  TEST_PROGS += ipv6_force_forwarding.sh
> +TEST_GEN_PROGS += ipv6_fragmentation
>  TEST_PROGS += route_hint.sh
>  
>  # YNL files, must be before "include ..lib.mk"
> diff --git a/tools/testing/selftests/net/ipv6_fragmentation.c b/tools/testing/selftests/net/ipv6_fragmentation.c
> new file mode 100644
> index 000000000000..b76ce7b713fc
> --- /dev/null
> +++ b/tools/testing/selftests/net/ipv6_fragmentation.c
> @@ -0,0 +1,115 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Author: Brett A C Sheffield <bacs@librecast.net>
> + *
> + * Kernel selftest for the IPv6 fragmentation regression which affected stable
> + * kernels:
> + *
> + *   https://lore.kernel.org/stable/aElivdUXqd1OqgMY@karahi.gladserv.com
> + *
> + * Commit: a18dfa9925b9 ("ipv6: save dontfrag in cork") was backported to stable
> + * without some prerequisite commits.
> + *
> + * This caused a regression when sending IPv6 UDP packets by preventing
> + * fragmentation and instead returning -1 (EMSGSIZE).
> + *
> + * This selftest demonstrates the issue by sending an IPv6 UDP packet to
> + * localhost (::1) on the loopback interface from the autoconfigured link-local
> + * address.
> + *
> + * sendmsg(2) returns bytes sent correctly on a working kernel, and returns -1
> + * (EMSGSIZE) when the regression is present.
> + *
> + * The regression was not present in the mainline kernel, but add this test to
> + * catch similar breakage in future.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <error.h>
> +#include <net/if.h>
> +#include <netinet/in.h>
> +#include <sched.h>
> +#include <stdio.h>
> +#include <sys/ioctl.h>
> +#include <sys/socket.h>
> +#include <unistd.h>
> +#include "../kselftest.h"
> +
> +#define MTU 1500
> +#define LARGER_THAN_MTU 8192
> +
> +static void setup(void)
> +{
> +	struct ifreq ifr = {
> +		.ifr_name = "lo"
> +	};
> +	int ctl;
> +
> +	/* we need to set MTU, so do this in a namespace to play nicely */
> +	if (unshare(CLONE_NEWNET) == -1)
> +		error(KSFT_FAIL, errno, "unshare");
> +
> +	ctl = socket(AF_LOCAL, SOCK_STREAM, 0);
> +	if (ctl == -1)
> +		error(KSFT_FAIL, errno, "socket");
> +
> +	/* ensure MTU is smaller than what we plan to send */
> +	ifr.ifr_mtu = MTU;
> +	if (ioctl(ctl, SIOCSIFMTU, &ifr) == -1)
> +		error(KSFT_FAIL, errno, "ioctl: set MTU");
> +
> +	/* bring up interface */
> +	if (ioctl(ctl, SIOCGIFFLAGS, &ifr) == -1)
> +		error(KSFT_FAIL, errno, "ioctl SIOCGIFFLAGS");
> +	ifr.ifr_flags = ifr.ifr_flags | IFF_UP;
> +	if (ioctl(ctl, SIOCSIFFLAGS, &ifr) == -1)
> +		error(KSFT_FAIL, errno, "ioctl: bring interface up");
> +
> +	if (close(ctl) == -1)
> +		error(KSFT_FAIL, errno, "close");
> +}
> +
> +int main(void)
> +{
> +	struct in6_addr addr = {
> +		.s6_addr[15] = 0x01, /* ::1 */
> +	};
> +	struct sockaddr_in6 sa = {
> +		.sin6_family = AF_INET6,
> +		.sin6_addr = addr,
> +		.sin6_port = 9      /* port 9/udp (DISCARD) */
> +	};
> +	static char buf[LARGER_THAN_MTU] = {0};
> +	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
> +	struct msghdr msg = {
> +		.msg_iov = &iov,
> +		.msg_iovlen = 1,
> +		.msg_name = (struct sockaddr *)&sa,
> +		.msg_namelen = sizeof(sa),
> +	};
> +	ssize_t rc;
> +	int err = KSFT_FAIL;
> +	int s;
> +
> +	printf("Testing IPv6 fragmentation\n");
> +	setup();
> +	s = socket(AF_INET6, SOCK_DGRAM, 0);
> +send_again:
> +	rc = sendmsg(s, &msg, 0);
> +	if (rc == -1) {
> +		/* if interface wasn't ready, try again */
> +		if (errno == EADDRNOTAVAIL) {
> +			usleep(1000);
> +			goto send_again;
> +		}
> +		error(KSFT_FAIL, errno, "sendmsg");
> +	} else if (rc != LARGER_THAN_MTU) {
> +		error(KSFT_FAIL, errno, "sendmsg returned %zi, expected %i",
> +				rc, LARGER_THAN_MTU);
> +	}
> +	printf("[PASS] sendmsg() returned %zi\n", rc);
> +	err = KSFT_PASS;

err is no longer needed, just return KSFT_PASS below

> +	close(s);

reminder to check return value of all library and system calls
> +	return err;
> +}
> 
> base-commit: cd8a4cfa6bb43a441901e82f5c222dddc75a18a3
> -- 
> 2.49.1
> 



      reply	other threads:[~2025-09-02 14:48 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-02 14:23 [PATCH net-next v5] selftests: net: add test for ipv6 fragmentation Brett A C Sheffield
2025-09-02 14:48 ` Willem de Bruijn [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=willemdebruijn.kernel.e019291a1132@gmail.com \
    --to=willemdebruijn.kernel@gmail.com \
    --cc=bacs@librecast.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=horms@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=shuah@kernel.org \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.