netdev.vger.kernel.org archive mirror
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Neal Cardwell <ncardwell@google.com>,
	 Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	 Willem de Bruijn <willemb@google.com>,
	 netdev@vger.kernel.org
Subject: Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
Date: Tue, 25 Nov 2025 15:44:02 -0500
Message-ID: <willemdebruijn.kernel.2303cd61bcc5e@gmail.com>
In-Reply-To: <CADVnQykwTjoTVV_jBmUXAMKato-3MwS+j6PdyVFtTxjndcC=bQ@mail.gmail.com>

Neal Cardwell wrote:
> On Tue, Nov 25, 2025 at 2:49 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Neal Cardwell wrote:
> > > On Mon, Nov 24, 2025 at 11:33 AM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > Jakub Kicinski wrote:
> > > > > Hi Willem!
> > > > >
> > > > > I migrated netdev CI to our own infra now, and the slightly faster,
> > > > > Fedora-based system is failing tcp_zerocopy_maxfrags.pkt:
> > > > >
> > > > > # tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
> > > > > # script packet:  1.000237 P. 36:37(1) ack 1
> > > > > # actual packet:  1.000235 P. 36:37(1) ack 1 win 1050
> > > > > # not ok 1 ipv4
> > > > > # tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
> > > > > # script packet:  1.000209 P. 36:37(1) ack 1
> > > > > # actual packet:  1.000208 P. 36:37(1) ack 1 win 1050
> > > > > # not ok 2 ipv6
> > > > > # # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
> > > > >
> > > > > https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill/results/399942/13-tcp-zerocopy-maxfrags-pkt/stdout
> > > > >
> > > > > This happens on both debug and non-debug kernel (tho on the former
> > > > > the failure is masked due to MACHINE_SLOW).
> > > >
> > > > That's an odd error.
> > > >
> > > > The test sends an msg_iov of 18 one-byte fragments and verifies
> > > > that only 17 fit in one packet, followed by a single one-byte
> > > > packet. The test does not explicitly initialize the payload, but
> > > > trusts packetdrill to handle that. Relevant snippet below.
> > > >
> > > > Packetdrill complains about payload contents. That error is only
> > > > generated by the below check in run_packet.c. Pretty straightforward.
> > > >
> > > > Packetdrill agrees that the packet is one byte long. The win argument
> > > > is optional on outgoing packets, not relevant to the failure.
> > > >
> > > > So somehow the data in that frag got overwritten in the short window
> > > > between when it was injected into the kernel and when it was observed?
> > > > Seems so unlikely.
> > > >
> > > > Sorry, I'm a bit at a loss at least initially as to the cause.
> > >
> > > I agree this is odd. It looks like either a very concerning kernel
> > > bug, or very concerning packetdrill bug. :-)
> > >
> > > Could someone please run the test with tcpdump in the background to
> > > capture the full packet contents, to verify that indeed the packet has
> > > the wrong contents?
> > >
> > > This would help make sure that this is a kernel bug and not a
> > > packetdrill bug. :-)
> >
> > I'm not able to reproduce this on my own machine with the latest nn.
> > But could reproduce it on the netdev machine.
> >
> > I assume all payload is supposed to be zeroed. And indeed the packet
> > seen has a non-zero single byte of payload: 0x60.
> >
> > Is there any chance that this happens on some kernel with
> > unsubmitted patches, but not on netdev-nn/main on this machine either?
> >
> > ----
> >
> > tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect
> > outbound data payload
> > script packet:  1.000169 P. 36:37(1) ack 1
> > actual packet:  1.000167 P. 36:37(1) ack 1 win 1050
> >
> > 14:42:01.330694 tun0  Out IP6 fd3d:a0b:17d6::1.webcache >
> > fd3d:fa7b:d17d::1.50901: Flags [P.], seq 19:36, ack 1, win 1050,
> > length 17: HTTP
> >         0x0000:  6000 842c 0025 0640 fd3d 0a0b 17d6 0000
> >         0x0010:  0000 0000 0000 0001 fd3d fa7b d17d 0000
> >         0x0020:  0000 0000 0000 0001 1f90 c6d5 f7fe 05e9
> >         0x0030:  0000 0001 5018 041a e883 0000 0000 0000
> >         0x0040:  0000 0000 0000 0000 0000 0000 00
> > 14:42:01.330723 tun0  In  IP6 fd3d:fa7b:d17d::1.50901 >
> > fd3d:a0b:17d6::1.webcache: Flags [.], ack 36, win 257, length 0
> >         0x0000:  6000 0000 0014 06ff fd3d fa7b d17d 0000
> >         0x0010:  0000 0000 0000 0001 fd3d 0a0b 17d6 0000
> >         0x0020:  0000 0000 0000 0001 c6d5 1f90 0000 0001
> >         0x0030:  f7fe 05fa 5010 0101 e21b 0000
> > 14:42:01.330727 tun0  Out IP6 fd3d:a0b:17d6::1.webcache >
> > fd3d:fa7b:d17d::1.50901: Flags [P.], seq 36:37, ack 1, win 1050,
> > length 1: HTTP
> >         0x0000:  6000 842c 0015 0640 fd3d 0a0b 17d6 0000
> >         0x0010:  0000 0000 0000 0001 fd3d fa7b d17d 0000
> >         0x0020:  0000 0000 0000 0001 1f90 c6d5 f7fe 05fa
> >         0x0030:  0000 0001 5018 041a e873 0000 60
> 
> Looking at the tests in tools/testing/selftests/net/packetdrill/, I
> don't see anything that sets the --send_omit_free packetdrill flag.
> That flag is needed for TCP zero copy tests, to ensure that
> packetdrill doesn't free the send() buffer after the send() call.
> 
> Because the test didn't use the --send_omit_free flag, packetdrill
> freed the buffer. And the memory probably got reused before the
> transmit. Perhaps for an IPv6 packet, whose first byte is 0x60, and
> thus what was transmitted was the garbage 0x60.
> 
> Does that sound plausible, Willem? If you agree, do you have cycles to
> cook a commit of some kind to fix this?
> 
> One option is to put the --send_omit_free flag near the top of the
> tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
> script.
> 
> Thanks!
> 
> neal

Thanks Neal!

I verified that this fixes the failure, and that our original
Google-internal runner passes that flag on the command line, only for
these zerocopy tests.
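
As a sanity check on the theory, the final packet in the tcpdump
capture quoted above can be decoded by hand. The sketch below is only
an illustration, not part of the test or of packetdrill. It assumes a
plain IPv6/TCP packet without extension headers, and the helper name
tcp_payload is made up for this example. It shows that the one-byte
payload equals the first byte of an IPv6 header, which is consistent
with the freed send buffer having been reused for an IPv6 packet.

```python
# Raw bytes of the final outbound packet, copied from the tcpdump hex dump
# quoted earlier in this message (seq 36:37, length 1).
HEX_DUMP = """
6000 842c 0015 0640 fd3d 0a0b 17d6 0000
0000 0000 0000 0001 fd3d fa7b d17d 0000
0000 0000 0000 0001 1f90 c6d5 f7fe 05fa
0000 0001 5018 041a e873 0000 60
"""

IPV6_HEADER_LEN = 40  # fixed IPv6 header; assumes no extension headers
IPPROTO_TCP = 6


def tcp_payload(packet: bytes) -> bytes:
    """Return the TCP payload of an IPv6/TCP packet (hypothetical helper)."""
    version = packet[0] >> 4
    next_header = packet[6]
    if version != 6 or next_header != IPPROTO_TCP:
        raise ValueError("expected a plain IPv6/TCP packet")
    # The IPv6 payload length field counts the bytes after the fixed header.
    payload_len = int.from_bytes(packet[4:6], "big")
    tcp = packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + payload_len]
    # The upper nibble of TCP byte 12 is the header length in 32-bit words.
    data_offset = (tcp[12] >> 4) * 4
    return tcp[data_offset:]


packet = bytes.fromhex("".join(HEX_DUMP.split()))
payload = tcp_payload(packet)

print(f"TCP payload:               {payload.hex()}")    # 60
print(f"first byte of IPv6 header: {packet[0]:02x}")    # 60
```

Both values are 0x60: the payload byte matches the version nibble (6)
followed by the upper traffic-class bits (0) of an IPv6 header.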

I can send a fix.

Only, the ipv4 test appears to be failing with a different error.
Equally surprising. It times out just waiting for the SYNACK.

    ./ksft_runner.sh tcp_zerocopy_maxfrags.pkt
    TAP version 13
    1..2
    tcp_zerocopy_maxfrags.pkt:25: error handling packet: Timed out waiting for packet

Which corresponds with the last line in this snippet.

    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 setsockopt(3, SOL_SOCKET, SO_ZEROCOPY, [1], 4) = 0

   // Each pinned zerocopy page is fully accounted to skb->truesize.
   // This test generates a worst case packet with each frag storing
   // one byte, but increasing truesize with a page (64KB on PPC).
   +0 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [2000000], 4) = 0

   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

   +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
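
For reference, the split that the test expects later in the script can
be reproduced with a few lines of arithmetic. This is a rough sketch
under stated assumptions, not the kernel's segmentation logic: it only
models a per-packet fragment limit of 17 (the limit described earlier
in the thread), ignores MSS and every other constraint, and takes the
starting sequence number 19 from the tcpdump capture. The name
split_into_segments is made up for this example.

```python
MAX_FRAGS_PER_PACKET = 17  # assumption: the 17-fragment limit the test checks


def split_into_segments(start_seq, frag_sizes, max_frags=MAX_FRAGS_PER_PACKET):
    """Pack fragments in order, at most max_frags per segment.

    Returns a list of (seq_start, seq_end) pairs. Hypothetical helper that
    models only the fragment-count limit, nothing else.
    """
    segments = []
    seq = start_seq
    for i in range(0, len(frag_sizes), max_frags):
        length = sum(frag_sizes[i:i + max_frags])
        segments.append((seq, seq + length))
        seq += length
    return segments


# 18 one-byte fragments, starting at relative sequence number 19.
segments = split_into_segments(19, [1] * 18)
for start, end in segments:
    print(f"{start}:{end}({end - start})")
```

This prints 19:36(17) and 36:37(1), matching the two outbound data
packets in the capture.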

Thread overview: 6+ messages
2025-11-24 15:18 [TEST] tcp_zerocopy_maxfrags.pkt fails Jakub Kicinski
2025-11-24 16:29 ` Willem de Bruijn
2025-11-24 16:38   ` Neal Cardwell
2025-11-25 19:49     ` Willem de Bruijn
2025-11-25 20:31       ` Neal Cardwell
2025-11-25 20:44         ` Willem de Bruijn [this message]
