* [TEST] tcp_zerocopy_maxfrags.pkt fails
From: Jakub Kicinski @ 2025-11-24 15:18 UTC
To: Willem de Bruijn; +Cc: netdev

Hi Willem!

I migrated netdev CI to our own infra now, and the slightly faster,
Fedora-based system is failing tcp_zerocopy_maxfrags.pkt:

# tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
# script packet:  1.000237 P. 36:37(1) ack 1
# actual packet:  1.000235 P. 36:37(1) ack 1 win 1050
# not ok 1 ipv4
# tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
# script packet:  1.000209 P. 36:37(1) ack 1
# actual packet:  1.000208 P. 36:37(1) ack 1 win 1050
# not ok 2 ipv6
# # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0

https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill/results/399942/13-tcp-zerocopy-maxfrags-pkt/stdout

This happens on both debug and non-debug kernel (tho on the former
the failure is masked due to MACHINE_SLOW).

* Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
From: Willem de Bruijn @ 2025-11-24 16:29 UTC
To: Jakub Kicinski, Willem de Bruijn; +Cc: netdev

Jakub Kicinski wrote:
> Hi Willem!
>
> I migrated netdev CI to our own infra now, and the slightly faster,
> Fedora-based system is failing tcp_zerocopy_maxfrags.pkt:
>
> # tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
> # script packet:  1.000237 P. 36:37(1) ack 1
> # actual packet:  1.000235 P. 36:37(1) ack 1 win 1050
> # not ok 1 ipv4
> # tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
> # script packet:  1.000209 P. 36:37(1) ack 1
> # actual packet:  1.000208 P. 36:37(1) ack 1 win 1050
> # not ok 2 ipv6
> # # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
>
> https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill/results/399942/13-tcp-zerocopy-maxfrags-pkt/stdout
>
> This happens on both debug and non-debug kernel (tho on the former
> the failure is masked due to MACHINE_SLOW).

That's an odd error.

The test sends an msg_iov of 18 one-byte fragments and verifies that
only 17 fit in one packet, followed by a single one-byte packet. The
test does not explicitly initialize the payload, but trusts packetdrill
to handle that. Relevant snippet below.

Packetdrill complains about payload contents. That error is only
generated by the check below, in run_packet.c. Pretty straightforward.

Packetdrill agrees that the packet is one byte long. The win argument
is optional on outgoing packets, so it is not relevant to the failure.

So somehow the data in that frag got overwritten in the short window
between when it was injected into the kernel and when it was observed?
Seems so unlikely.

Sorry, I'm a bit at a loss, at least initially, as to the cause.

----

    // send a zerocopy iov of 18 elements:
       +1 sendmsg(4, {msg_name(...)=...,
                      msg_iov(18)=[{..., 1}, {..., 1}, {..., 1}, {..., 1}, {..., 1}, {..., 1},
                                   {..., 1}, {..., 1}, {..., 1}, {..., 1}, {..., 1}, {..., 1},
                                   {..., 1}, {..., 1}, {..., 1}, {..., 1}, {..., 1}, {..., 1}],
                      msg_flags=0}, MSG_ZEROCOPY) = 18

    // verify that it is split in one skb of 17 frags + 1 of 1 frag
    // verify that both have the PSH bit set
       +0 > P. 19:36(17) ack 1
       +0 < . 1:1(0) ack 36 win 257

       +0 > P. 36:37(1) ack 1
       +0 < . 1:1(0) ack 37 win 257

----

    /* Verify TCP/UDP payload matches expected value. */
    static int verify_outbound_live_payload(
            struct packet *actual_packet,
            struct packet *script_packet, char **error)
    {
            /* Diff the TCP/UDP data payloads. We've already implicitly
             * checked their length by checking the IP and TCP/UDP headers.
             */
            assert(packet_payload_len(actual_packet) ==
                   packet_payload_len(script_packet));
            if (memcmp(packet_payload(script_packet),
                       packet_payload(actual_packet),
                       packet_payload_len(script_packet)) != 0) {
                    asprintf(error, "incorrect outbound data payload");
                    return STATUS_ERR;
            }
            return STATUS_OK;
    }

* Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
From: Neal Cardwell @ 2025-11-24 16:38 UTC
To: Willem de Bruijn; +Cc: Jakub Kicinski, Willem de Bruijn, netdev

On Mon, Nov 24, 2025 at 11:33 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> [...]
>
> So somehow the data in that frag got overwritten in the short window
> between when it was injected into the kernel and when it was observed?
> Seems so unlikely.
>
> Sorry, I'm a bit at a loss at least initially as to the cause.

I agree this is odd. It looks like either a very concerning kernel
bug, or very concerning packetdrill bug. :-)

Could someone please run the test with tcpdump in the background to
capture the full packet contents, to verify that indeed the packet has
the wrong contents?

This would help make sure that this is a kernel bug and not a
packetdrill bug. :-)

thanks,
neal

* Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
From: Willem de Bruijn @ 2025-11-25 19:49 UTC
To: Neal Cardwell, Willem de Bruijn; +Cc: Jakub Kicinski, Willem de Bruijn, netdev

Neal Cardwell wrote:
> [...]
>
> I agree this is odd. It looks like either a very concerning kernel
> bug, or very concerning packetdrill bug. :-)
>
> Could someone please run the test with tcpdump in the background to
> capture the full packet contents, to verify that indeed the packet has
> the wrong contents?
>
> This would help make sure that this is a kernel bug and not a
> packetdrill bug. :-)

I'm not able to reproduce this on my own machine with the latest nn.
But I could reproduce it on the netdev machine.

I assume all payload is supposed to be zeroed. And indeed the packet
seen has a non-zero single byte of payload: 0x60.

Is there any chance that this happens on some kernel with
unsubmitted patches, but not on netdev-nn/main on this machine either?

----

tcp_zerocopy_maxfrags.pkt:56: error handling packet: incorrect outbound data payload
script packet:  1.000169 P. 36:37(1) ack 1
actual packet:  1.000167 P. 36:37(1) ack 1 win 1050

14:42:01.330694 tun0  Out IP6 fd3d:a0b:17d6::1.webcache > fd3d:fa7b:d17d::1.50901: Flags [P.], seq 19:36, ack 1, win 1050, length 17: HTTP
        0x0000:  6000 842c 0025 0640 fd3d 0a0b 17d6 0000
        0x0010:  0000 0000 0000 0001 fd3d fa7b d17d 0000
        0x0020:  0000 0000 0000 0001 1f90 c6d5 f7fe 05e9
        0x0030:  0000 0001 5018 041a e883 0000 0000 0000
        0x0040:  0000 0000 0000 0000 0000 0000 00
14:42:01.330723 tun0  In  IP6 fd3d:fa7b:d17d::1.50901 > fd3d:a0b:17d6::1.webcache: Flags [.], ack 36, win 257, length 0
        0x0000:  6000 0000 0014 06ff fd3d fa7b d17d 0000
        0x0010:  0000 0000 0000 0001 fd3d 0a0b 17d6 0000
        0x0020:  0000 0000 0000 0001 c6d5 1f90 0000 0001
        0x0030:  f7fe 05fa 5010 0101 e21b 0000
14:42:01.330727 tun0  Out IP6 fd3d:a0b:17d6::1.webcache > fd3d:fa7b:d17d::1.50901: Flags [P.], seq 36:37, ack 1, win 1050, length 1: HTTP
        0x0000:  6000 842c 0015 0640 fd3d 0a0b 17d6 0000
        0x0010:  0000 0000 0000 0001 fd3d fa7b d17d 0000
        0x0020:  0000 0000 0000 0001 1f90 c6d5 f7fe 05fa
        0x0030:  0000 0001 5018 041a e873 0000 60

* Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
From: Neal Cardwell @ 2025-11-25 20:31 UTC
To: Willem de Bruijn; +Cc: Jakub Kicinski, Willem de Bruijn, netdev

On Tue, Nov 25, 2025 at 2:49 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> [...]
>
> I'm not able to reproduce this on my own machine with the latest nn.
> But could reproduce it on the netdev machine.
>
> I assume all payload is supposed to be zeroed. And indeed the packet
> seen has a non-zero single byte of payload: 0x60.
>
> Is there any chance that this happens on some kernel with
> unsubmitted patches, but not on netdev-nn/main on this machine either?
>
> [...]

Looking at the tests in tools/testing/selftests/net/packetdrill/, I
don't see anything that sets the --send_omit_free packetdrill flag.
That flag is needed for TCP zero copy tests, to ensure that
packetdrill doesn't free the send() buffer after the send() call.

Because the test didn't use the --send_omit_free flag, packetdrill
freed the buffer. And the memory probably got reused before the
transmit. Perhaps for an IPv6 packet, whose first byte is 0x60, and
thus what was transmitted was the garbage 0x60.

Does that sound plausible, Willem? If you agree, do you have cycles to
cook a commit of some kind to fix this?

One option is to put the --send_omit_free flag near the top of the
/tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
script.

Thanks!
neal

* Re: [TEST] tcp_zerocopy_maxfrags.pkt fails
From: Willem de Bruijn @ 2025-11-25 20:44 UTC
To: Neal Cardwell, Willem de Bruijn; +Cc: Jakub Kicinski, Willem de Bruijn, netdev

Neal Cardwell wrote:
> [...]
>
> Looking at the tests in tools/testing/selftests/net/packetdrill/, I
> don't see anything that sets the --send_omit_free packetdrill flag.
> That flag is needed for TCP zero copy tests, to ensure that
> packetdrill doesn't free the send() buffer after the send() call.
>
> Because the test didn't use the --send_omit_free flag, packetdrill
> freed the buffer. And the memory probably got reused before the
> transmit. Perhaps for an IPv6 packet, whose first byte is 0x60, and
> thus what was transmitted was the garbage 0x60.
>
> Does that sound plausible, Willem? If you agree, do you have cycles to
> cook a commit of some kind to fix this?
>
> One option is to put the --send_omit_free flag near the top of the
> /tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
> script.
>
> Thanks!
>
> neal

Thanks Neal!

I verified that that fixed the failure. And that our original Google
internal runner passes that flag on the command line, only for these
zerocopy tests. I can send a fix.

Only, the ipv4 test appears to be failing with a different error.
Equally surprising. It times out just waiting for the SYNACK.

    ./ksft_runner.sh tcp_zerocopy_maxfrags.pkt
    TAP version 13
    1..2
    tcp_zerocopy_maxfrags.pkt:25: error handling packet: Timed out waiting for packet

Which corresponds with the last line in this snippet.

        0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
       +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
       +0 setsockopt(3, SOL_SOCKET, SO_ZEROCOPY, [1], 4) = 0

       // Each pinned zerocopy page is fully accounted to skb->truesize.
       // This test generates a worst case packet with each frag storing
       // one byte, but increasing truesize with a page (64KB on PPC).
       +0 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [2000000], 4) = 0

       +0 bind(3, ..., ...) = 0
       +0 listen(3, 1) = 0

       +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
       +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>