From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Neal Cardwell <ncardwell@google.com>,
	 Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	 Willem de Bruijn <willemb@google.com>,
	 netdev@vger.kernel.org
Subject: Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
Date: Tue, 25 Nov 2025 15:44:02 -0500
Message-ID: <willemdebruijn.kernel.2303cd61bcc5e@gmail.com>
In-Reply-To: <CADVnQykwTjoTVV_jBmUXAMKato-3MwS+j6PdyVFtTxjndcC=bQ@mail.gmail.com>

Neal Cardwell wrote:
> On Tue, Nov 25, 2025 at 2:49 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Neal Cardwell wrote:
> > > On Mon, Nov 24, 2025 at 11:33 AM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > Jakub Kicinski wrote:
> > > > > Hi Willem!
> > > > >
> > > > > I migrated netdev CI to our own infra now, and the slightly faster,
> > > > > Fedora-based system is failing tcp_zerocopy_maxfrags.pkt:
> > > > >
> > > > > # tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
> > > > > # script packet:  1.000237 P. 36:37(1) ack 1
> > > > > # actual packet:  1.000235 P. 36:37(1) ack 1 win 1050
> > > > > # not ok 1 ipv4
> > > > > # tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
> > > > > # script packet:  1.000209 P. 36:37(1) ack 1
> > > > > # actual packet:  1.000208 P. 36:37(1) ack 1 win 1050
> > > > > # not ok 2 ipv6
> > > > > # # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
> > > > >
> > > > > https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill/results/399942/13-tcp-zerocopy-maxfrags-pkt/stdout
> > > > >
> > > > > This happens on both debug and non-debug kernel (tho on the former
> > > > > the failure is masked due to MACHINE_SLOW).
> > > >
> > > > That's an odd error.
> > > >
> > > > The test sends a msg_iov of 18 one-byte fragments and verifies that
> > > > only 17 fit in one packet, followed by a single one-byte packet. The
> > > > test does not explicitly initialize the payload, but trusts packetdrill
> > > > to handle that. Relevant snippet below.
> > > >
> > > > Packetdrill complains about payload contents. That error is only
> > > > generated by the below check in run_packet.c. Pretty straightforward.
> > > >
> > > > Packetdrill agrees that the packet is one byte long. The win argument
> > > > is optional on outgoing packets, not relevant to the failure.
> > > >
> > > > So somehow the data in that frag got overwritten in the short window
> > > > between when it was injected into the kernel and when it was observed?
> > > > Seems so unlikely.
> > > >
> > > > Sorry, I'm a bit at a loss at least initially as to the cause.
> > >
> > > I agree this is odd. It looks like either a very concerning kernel
> > > bug, or very concerning packetdrill bug. :-)
> > >
> > > Could someone please run the test with tcpdump in the background to
> > > capture the full packet contents, to verify that indeed the packet has
> > > the wrong contents?
> > >
> > > This would help make sure that this is a kernel bug and not a
> > > packetdrill bug. :-)
> >
> > I'm not able to reproduce this on my own machine with the latest nn.
> > But could reproduce it on the netdev machine.
> >
> > I assume all payload is supposed to be zeroed. And indeed the packet
> > seen has a non-zero single byte of payload: 0x60.
> >
> > Is there any chance that this happens on some kernel with
> > unsubmitted patches, but not on netdev-nn/main on this machine either?
> >
> > ----
> >
> > tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect
> > outbound data payload
> > script packet:  1.000169 P. 36:37(1) ack 1
> > actual packet:  1.000167 P. 36:37(1) ack 1 win 1050
> >
> > 14:42:01.330694 tun0  Out IP6 fd3d:a0b:17d6::1.webcache >
> > fd3d:fa7b:d17d::1.50901: Flags [P.], seq 19:36, ack 1, win 1050,
> > length 17: HTTP
> >         0x0000:  6000 842c 0025 0640 fd3d 0a0b 17d6 0000
> >         0x0010:  0000 0000 0000 0001 fd3d fa7b d17d 0000
> >         0x0020:  0000 0000 0000 0001 1f90 c6d5 f7fe 05e9
> >         0x0030:  0000 0001 5018 041a e883 0000 0000 0000
> >         0x0040:  0000 0000 0000 0000 0000 0000 00
> > 14:42:01.330723 tun0  In  IP6 fd3d:fa7b:d17d::1.50901 >
> > fd3d:a0b:17d6::1.webcache: Flags [.], ack 36, win 257, length 0
> >         0x0000:  6000 0000 0014 06ff fd3d fa7b d17d 0000
> >         0x0010:  0000 0000 0000 0001 fd3d 0a0b 17d6 0000
> >         0x0020:  0000 0000 0000 0001 c6d5 1f90 0000 0001
> >         0x0030:  f7fe 05fa 5010 0101 e21b 0000
> > 14:42:01.330727 tun0  Out IP6 fd3d:a0b:17d6::1.webcache >
> > fd3d:fa7b:d17d::1.50901: Flags [P.], seq 36:37, ack 1, win 1050,
> > length 1: HTTP
> >         0x0000:  6000 842c 0015 0640 fd3d 0a0b 17d6 0000
> >         0x0010:  0000 0000 0000 0001 fd3d fa7b d17d 0000
> >         0x0020:  0000 0000 0000 0001 1f90 c6d5 f7fe 05fa
> >         0x0030:  0000 0001 5018 041a e873 0000 60
> 
> Looking at the tests in tools/testing/selftests/net/packetdrill/, I
> don't see anything that sets the --send_omit_free packetdrill flag.
> That flag is needed for TCP zero copy tests, to ensure that
> packetdrill doesn't free the send() buffer after the send() call.
> 
> Because the test didn't use the --send_omit_free flag, packetdrill
> freed the buffer. And the memory probably got reused before the
> transmit. Perhaps for an IPv6 packet, whose first byte is 0x60, and
> thus what was transmitted was the garbage 0x60.
> 
> Does that sound plausible, Willem? If you agree, do you have cycles to
> cook a commit of some kind to fix this?
> 
> One option is to put the --send_omit_free flag near the top of the
> tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
> script.
> 
> Thanks!
> 
> neal
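
As a sanity check on the capture quoted above, here is a small Python
sketch that pulls the TCP payload out of the last hex dump. It is not part
of packetdrill or the selftests, and the helper name tcp_payload_from_ipv6
is made up for illustration. It assumes a plain IPv6 header with no
extension headers, which matches this capture.

```python
import struct

# Hex dump of the final "seq 36:37" packet, copied from the tcpdump output.
HEXDUMP = """
6000 842c 0015 0640 fd3d 0a0b 17d6 0000
0000 0000 0000 0001 fd3d fa7b d17d 0000
0000 0000 0000 0001 1f90 c6d5 f7fe 05fa
0000 0001 5018 041a e873 0000 60
"""

IPV6_HEADER_LEN = 40
IPPROTO_TCP = 6


def tcp_payload_from_ipv6(packet):
    """Return the TCP payload of an IPv6 packet without extension headers."""
    if packet[0] >> 4 != 6:
        raise ValueError("not an IPv6 packet")
    # Bytes 4-5: payload length, byte 6: next header.
    payload_len, next_header = struct.unpack_from("!HB", packet, 4)
    if next_header != IPPROTO_TCP:
        raise ValueError("extension headers or non-TCP not handled")
    tcp = packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + payload_len]
    # Upper nibble of TCP byte 12 is the header length in 32-bit words.
    data_offset = (tcp[12] >> 4) * 4
    return tcp[data_offset:]


packet = bytes.fromhex("".join(HEXDUMP.split()))
payload = tcp_payload_from_ipv6(packet)

print(payload.hex())    # 60
print(hex(packet[0]))   # 0x60
```

The single payload byte has the same value as the first byte of the IPv6
header itself, which is consistent with the buffer-reuse explanation but
does not prove it on its own.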

Thanks Neal!

I verified that --send_omit_free fixes the failure, and that our original
Google-internal runner passes that flag on the command line, only for these
zerocopy tests.
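
To make the failure mode concrete, here is a toy Python model of it. This
is only an illustration, not how the kernel or packetdrill is implemented:
DeferredSender and its methods are hypothetical names, and a memoryview
stands in for the pinned user pages that a zerocopy send keeps referencing
until the data is actually transmitted.

```python
class DeferredSender:
    """Toy zerocopy-like sender: holds a reference to the caller's buffer
    and reads the bytes only at transmit time, not at send time."""

    def __init__(self):
        self._pending = []

    def send_zerocopy(self, buf, length):
        # No copy is made here; only a view of the caller's memory is kept.
        self._pending.append(memoryview(buf)[:length])

    def transmit(self):
        # The payload is read now, so it reflects the buffer's current state.
        out = [bytes(view) for view in self._pending]
        self._pending.clear()
        return out


# Case 1: the buffer is released and reused before the transmit.
sender = DeferredSender()
buf = bytearray(1)           # zeroed one-byte payload
sender.send_zerocopy(buf, 1)
buf[0] = 0x60                # memory reused, e.g. for an IPv6 header
reused = sender.transmit()

# Case 2: the buffer is left alone until after the transmit.
sender = DeferredSender()
buf = bytearray(1)
sender.send_zerocopy(buf, 1)
kept = sender.transmit()
buf[0] = 0x60                # reuse after transmit is harmless

print(reused)  # [b'`']     (0x60)
print(kept)    # [b'\x00']
```

Case 1 corresponds to running the test without the flag, case 2 to keeping
the send buffer alive as --send_omit_free does.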

I can send a fix.

However, the ipv4 test appears to be failing with a different, equally
surprising error: it times out just waiting for the SYNACK.

    ./ksft_runner.sh tcp_zerocopy_maxfrags.pkt
    TAP version 13
    1..2
    tcp_zerocopy_maxfrags.pkt:25: error handling packet: Timed out waiting for packet

This corresponds to the last line in this snippet:

    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 setsockopt(3, SOL_SOCKET, SO_ZEROCOPY, [1], 4) = 0

   // Each pinned zerocopy page is fully accounted to skb->truesize.
   // This test generates a worst case packet with each frag storing
   // one byte, but increasing truesize with a page (64KB on PPC).
   +0 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [2000000], 4) = 0

   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

   +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>

Thread overview: 6+ messages
2025-11-24 15:18 [TEST] tcp_zerocopy_maxfrags.pkt fails Jakub Kicinski
2025-11-24 16:29 ` Willem de Bruijn
2025-11-24 16:38   ` Neal Cardwell
2025-11-25 19:49     ` Willem de Bruijn
2025-11-25 20:31       ` Neal Cardwell
2025-11-25 20:44         ` Willem de Bruijn [this message]