public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [TEST] txtimestamp.sh pains after netdev foundation migration
@ 2026-01-07 19:05 Jakub Kicinski
  2026-01-08  0:19 ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2026-01-07 19:05 UTC (permalink / raw)
  To: Willem de Bruijn, netdev@vger.kernel.org

Hi Willem!

We discussed instability of txtimestamp.sh in the past but it has
gotten even worse after we migrated from AWS to netdev foundation
machines. Possibly because it's different HW. Possibly because we
now run much newer kernels (AWS Linux vs Fedora).

The test flakes a lot (we're talking about non-debug builds):
https://netdev.bots.linux.dev/contest.html?test=txtimestamp-sh

I tried a few things. The VM threads (vCPU, not IO) are now all pinned
to dedicated CPUs. I added this patch to avoid long idle periods:
https://github.com/linux-netdev/testing/commit/d468f582c617adece2a576788746a09d91e91574

These both help a little bit, but w still get 10+ flakes a week.
I believe you have access to netdev foundation machines so feel
free to poke if you have cycles..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-07 19:05 [TEST] txtimestamp.sh pains after netdev foundation migration Jakub Kicinski
@ 2026-01-08  0:19 ` Willem de Bruijn
  2026-01-08  3:25   ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2026-01-08  0:19 UTC (permalink / raw)
  To: Jakub Kicinski, Willem de Bruijn, netdev@vger.kernel.org

Jakub Kicinski wrote:
> Hi Willem!
> 
> We discussed instability of txtimestamp.sh in the past but it has
> gotten even worse after we migrated from AWS to netdev foundation
> machines. Possibly because it's different HW. Possibly because we
> now run much newer kernels (AWS Linux vs Fedora).
> 
> The test flakes a lot (we're talking about non-debug builds):
> https://netdev.bots.linux.dev/contest.html?test=txtimestamp-sh
> 
> I tried a few things. The VM threads (vCPU, not IO) are now all pinned
> to dedicated CPUs. I added this patch to avoid long idle periods:
> https://github.com/linux-netdev/testing/commit/d468f582c617adece2a576788746a09d91e91574
> 
> These both help a little bit, but w still get 10+ flakes a week.
> I believe you have access to netdev foundation machines so feel
> free to poke if you have cycles..

From a first look at the most recent 20 flakes 
(ignoring two unrelated sockaddr failures).

17 out of 20 happen in the first SND-USR calculation.
One representative example:

    # 7.11 [+0.00] test SND
    # 7.11 [+0.00]     USR: 1767443466 s 155019 us (seq=0, len=0)
    # 7.19 [+0.08] ERROR: 18600 us expected between 10000 and 18000
    # 7.19 [+0.00]     SND: 1767443466 s 173619 us (seq=0, len=10)  (USR +18599 us)
    # 7.20 [+0.00]     USR: 1767443466 s 243683 us (seq=0, len=0)
    # 7.27 [+0.07]     SND: 1767443466 s 253690 us (seq=1, len=10)  (USR +10006 us)
    # 7.27 [+0.00]     USR: 1767443466 s 323746 us (seq=0, len=0)
    # 7.35 [+0.08]     SND: 1767443466 s 333752 us (seq=2, len=10)  (USR +10006 us)
    # 7.35 [+0.00]     USR: 1767443466 s 403811 us (seq=0, len=0)
    # 7.43 [+0.08]     SND: 1767443466 s 413817 us (seq=3, len=10)  (USR +10006 us)
    # 7.43 [+0.00]     USR-SND: count=4, avg=12154 us, min=10006 us, max=18599 us

These are just outside the bounds of 18000. So increasing the
tolerance in txtimestamp.sh will probably mitigate them. All 17
would have passed with the following change.

-        local -r args="$@ -v 10000 -V 60000 -t 8000 -S 80000"
+        local -r args="$@ -v 10000 -V 60000 -t 8000 -S 100000"

Admittedly a hacky workaround that will only reduce the rate.

It's interesting that

- every time it is the first of the four measurements that fails.
- it never seems to occur for TCP sockets.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-08  0:19 ` Willem de Bruijn
@ 2026-01-08  3:25   ` Jakub Kicinski
  2026-01-08 16:06     ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2026-01-08  3:25 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

Thanks for investigating!

On Wed, 07 Jan 2026 19:19:53 -0500 Willem de Bruijn wrote:
> 17 out of 20 happen in the first SND-USR calculation.
> One representative example:
> 
>     # 7.11 [+0.00] test SND
>     # 7.11 [+0.00]     USR: 1767443466 s 155019 us (seq=0, len=0)
>     # 7.19 [+0.08] ERROR: 18600 us expected between 10000 and 18000
>     # 7.19 [+0.00]     SND: 1767443466 s 173619 us (seq=0, len=10)  (USR +18599 us)
>     # 7.20 [+0.00]     USR: 1767443466 s 243683 us (seq=0, len=0)
>     # 7.27 [+0.07]     SND: 1767443466 s 253690 us (seq=1, len=10)  (USR +10006 us)
>     # 7.27 [+0.00]     USR: 1767443466 s 323746 us (seq=0, len=0)
>     # 7.35 [+0.08]     SND: 1767443466 s 333752 us (seq=2, len=10)  (USR +10006 us)
>     # 7.35 [+0.00]     USR: 1767443466 s 403811 us (seq=0, len=0)
>     # 7.43 [+0.08]     SND: 1767443466 s 413817 us (seq=3, len=10)  (USR +10006 us)
>     # 7.43 [+0.00]     USR-SND: count=4, avg=12154 us, min=10006 us, max=18599 us

Hm, that's the first kernel timestamp vs the timestamp in user space?
I wonder if we could catch this by re-taking the user stamp after
sendmsg() returns, if >1msec elapsed something is probably wrong 
(we got scheduled out before having a chance to complete the send?)

> These are just outside the bounds of 18000. So increasing the
> tolerance in txtimestamp.sh will probably mitigate them. All 17
> would have passed with the following change.
> 
> -        local -r args="$@ -v 10000 -V 60000 -t 8000 -S 80000"
> +        local -r args="$@ -v 10000 -V 60000 -t 8000 -S 100000"
> 
> Admittedly a hacky workaround that will only reduce the rate.
> 
> It's interesting that
> 
> - every time it is the first of the four measurements that fails.
> - it never seems to occur for TCP sockets.

FWIW:
https://netdev-ctrl.bots.linux.dev/logs/vmksft/net/results/449080/13-txtimestamp-sh/stdout
but that could be related to some bad patch..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-08  3:25   ` Jakub Kicinski
@ 2026-01-08 16:06     ` Jakub Kicinski
  2026-01-08 19:02       ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2026-01-08 16:06 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

On Wed, 7 Jan 2026 19:25:11 -0800 Jakub Kicinski wrote:
> On Wed, 07 Jan 2026 19:19:53 -0500 Willem de Bruijn wrote:
> > 17 out of 20 happen in the first SND-USR calculation.
> > One representative example:
> > 
> >     # 7.11 [+0.00] test SND
> >     # 7.11 [+0.00]     USR: 1767443466 s 155019 us (seq=0, len=0)
> >     # 7.19 [+0.08] ERROR: 18600 us expected between 10000 and 18000
> >     # 7.19 [+0.00]     SND: 1767443466 s 173619 us (seq=0, len=10)  (USR +18599 us)
> >     # 7.20 [+0.00]     USR: 1767443466 s 243683 us (seq=0, len=0)
> >     # 7.27 [+0.07]     SND: 1767443466 s 253690 us (seq=1, len=10)  (USR +10006 us)
> >     # 7.27 [+0.00]     USR: 1767443466 s 323746 us (seq=0, len=0)
> >     # 7.35 [+0.08]     SND: 1767443466 s 333752 us (seq=2, len=10)  (USR +10006 us)
> >     # 7.35 [+0.00]     USR: 1767443466 s 403811 us (seq=0, len=0)
> >     # 7.43 [+0.08]     SND: 1767443466 s 413817 us (seq=3, len=10)  (USR +10006 us)
> >     # 7.43 [+0.00]     USR-SND: count=4, avg=12154 us, min=10006 us, max=18599 us  
> 
> Hm, that's the first kernel timestamp vs the timestamp in user space?
> I wonder if we could catch this by re-taking the user stamp after
> sendmsg() returns, if >1msec elapsed something is probably wrong 
> (we got scheduled out before having a chance to complete the send?)

How about:

diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
index 4b4bbc2ce5c9..abcec47ec2e6 100644
--- a/tools/testing/selftests/net/txtimestamp.c
+++ b/tools/testing/selftests/net/txtimestamp.c
@@ -215,6 +215,24 @@ static void print_timestamp_usr(void)
 	__print_timestamp("  USR", &ts_usr, 0, 0);
 }
 
+static void check_timestamp_usr(void)
+{
+	long long unsigned ts_delta_usec;
+	struct timespec now;
+
+	if (clock_gettime(CLOCK_REALTIME, &now))
+		error(1, errno, "clock_gettime");
+
+	ts_delta_usec = timespec_to_ns64(&now) - timespec_to_ns64(&ts_usr);
+	ts_delta_usec /= 1000;
+	if (ts_delta_usec > cfg_delay_tolerance_usec / 2) {
+		cfg_delay_tolerance_usec =
+			ts_delta_usec + cfg_delay_tolerance_usec / 2;
+		fprintf(stderr, "WARN: sendmsg() took %llu us, increasing delay tolerance to %d us\n",
+			ts_delta_usec, cfg_delay_tolerance_usec);
+	}
+}
+
 static void print_timestamp(struct scm_timestamping *tss, int tstype,
 			    int tskey, int payload_len)
 {
@@ -678,6 +696,8 @@ static void do_test(int family, unsigned int report_opt)
 		if (val != total_len)
 			error(1, errno, "send");
 
+		check_timestamp_usr();
+
 		/* wait for all errors to be queued, else ACKs arrive OOO */
 		if (cfg_sleep_usec)
 			usleep(cfg_sleep_usec);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-08 16:06     ` Jakub Kicinski
@ 2026-01-08 19:02       ` Willem de Bruijn
  2026-01-08 20:38         ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2026-01-08 19:02 UTC (permalink / raw)
  To: Jakub Kicinski, Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

Jakub Kicinski wrote:
> On Wed, 7 Jan 2026 19:25:11 -0800 Jakub Kicinski wrote:
> > On Wed, 07 Jan 2026 19:19:53 -0500 Willem de Bruijn wrote:
> > > 17 out of 20 happen in the first SND-USR calculation.
> > > One representative example:
> > > 
> > >     # 7.11 [+0.00] test SND
> > >     # 7.11 [+0.00]     USR: 1767443466 s 155019 us (seq=0, len=0)
> > >     # 7.19 [+0.08] ERROR: 18600 us expected between 10000 and 18000
> > >     # 7.19 [+0.00]     SND: 1767443466 s 173619 us (seq=0, len=10)  (USR +18599 us)
> > >     # 7.20 [+0.00]     USR: 1767443466 s 243683 us (seq=0, len=0)
> > >     # 7.27 [+0.07]     SND: 1767443466 s 253690 us (seq=1, len=10)  (USR +10006 us)
> > >     # 7.27 [+0.00]     USR: 1767443466 s 323746 us (seq=0, len=0)
> > >     # 7.35 [+0.08]     SND: 1767443466 s 333752 us (seq=2, len=10)  (USR +10006 us)
> > >     # 7.35 [+0.00]     USR: 1767443466 s 403811 us (seq=0, len=0)
> > >     # 7.43 [+0.08]     SND: 1767443466 s 413817 us (seq=3, len=10)  (USR +10006 us)
> > >     # 7.43 [+0.00]     USR-SND: count=4, avg=12154 us, min=10006 us, max=18599 us  
> > 
> > Hm, that's the first kernel timestamp vs the timestamp in user space?
> > I wonder if we could catch this by re-taking the user stamp after
> > sendmsg() returns, if >1msec elapsed something is probably wrong 
> > (we got scheduled out before having a chance to complete the send?)
> 
> How about:
> 
> diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
> index 4b4bbc2ce5c9..abcec47ec2e6 100644
> --- a/tools/testing/selftests/net/txtimestamp.c
> +++ b/tools/testing/selftests/net/txtimestamp.c
> @@ -215,6 +215,24 @@ static void print_timestamp_usr(void)
>  	__print_timestamp("  USR", &ts_usr, 0, 0);
>  }
>  
> +static void check_timestamp_usr(void)
> +{
> +	long long unsigned ts_delta_usec;
> +	struct timespec now;
> +
> +	if (clock_gettime(CLOCK_REALTIME, &now))
> +		error(1, errno, "clock_gettime");
> +
> +	ts_delta_usec = timespec_to_ns64(&now) - timespec_to_ns64(&ts_usr);
> +	ts_delta_usec /= 1000;
> +	if (ts_delta_usec > cfg_delay_tolerance_usec / 2) {
> +		cfg_delay_tolerance_usec =
> +			ts_delta_usec + cfg_delay_tolerance_usec / 2;
> +		fprintf(stderr, "WARN: sendmsg() took %llu us, increasing delay tolerance to %d us\n",
> +			ts_delta_usec, cfg_delay_tolerance_usec);
> +	}
> +}
> +
>  static void print_timestamp(struct scm_timestamping *tss, int tstype,
>  			    int tskey, int payload_len)
>  {
> @@ -678,6 +696,8 @@ static void do_test(int family, unsigned int report_opt)
>  		if (val != total_len)
>  			error(1, errno, "send");
>  
> +		check_timestamp_usr();
> +
>  		/* wait for all errors to be queued, else ACKs arrive OOO */
>  		if (cfg_sleep_usec)
>  			usleep(cfg_sleep_usec);
> -- 
> 2.52.0

Increasing tolerance should work.

The current values are pragmatic choices to be so low as to minimize
total test runtime, but high enough to avoid flakes. Well..

If increasing tolerance, we also need to increase the time the test
waits for all notifications to arrive, cfg_sleep_usec.

Since the majority of errors happen on the first measurement, I was
thinking of a different approach of just taking that as a warm up
measurement.

But I'd also like to poke some more to understand what makes that
run stand out.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-08 19:02       ` Willem de Bruijn
@ 2026-01-08 20:38         ` Jakub Kicinski
  2026-01-08 21:19           ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2026-01-08 20:38 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

On Thu, 08 Jan 2026 14:02:15 -0500 Willem de Bruijn wrote:
> Increasing tolerance should work.
> 
> The current values are pragmatic choices to be so low as to minimize
> total test runtime, but high enough to avoid flakes. Well..
> 
> If increasing tolerance, we also need to increase the time the test
> waits for all notifications to arrive, cfg_sleep_usec.

To be clear the theory is that we got scheduled out between taking the
USR timestamp and sending the packet. But once the packet is in the
kernel it seems to flow, so AFAIU cfg_sleep_usec can remain untouched.

Thinking about it more - maybe what blocks us is the print? Maybe under
vng there's a non-trivial chance that a print to stderr ends up
blocking on serial and schedules us out? I mean maybe we should:

diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
index abcec47ec2e6..e2273fdff495 100644
--- a/tools/testing/selftests/net/txtimestamp.c
+++ b/tools/testing/selftests/net/txtimestamp.c
@@ -207,12 +207,10 @@ static void __print_timestamp(const char *name, struct timespec *cur,
        fprintf(stderr, "\n");
 }
 
-static void print_timestamp_usr(void)
+static void record_timestamp_usr(void)
 {
        if (clock_gettime(CLOCK_REALTIME, &ts_usr))
                error(1, errno, "clock_gettime");
-
-       __print_timestamp("  USR", &ts_usr, 0, 0);
 }
 
 static void check_timestamp_usr(void)
@@ -636,8 +634,6 @@ static void do_test(int family, unsigned int report_opt)
                        fill_header_udp(buf + off, family == PF_INET);
                }
 
-               print_timestamp_usr();
-
                iov.iov_base = buf;
                iov.iov_len = total_len;
 
@@ -692,10 +688,14 @@ static void do_test(int family, unsigned int report_opt)
 
                }
 
+               record_timestamp_usr();
                val = sendmsg(fd, &msg, 0);
                if (val != total_len)
                        error(1, errno, "send");
 
+               /* Avoid I/O between taking ts_usr and sendmsg() */
+               __print_timestamp("  USR", &ts_usr, 0, 0);
+
                check_timestamp_usr();
 
                /* wait for all errors to be queued, else ACKs arrive OOO */

> Since the majority of errors happen on the first measurement, I was
> thinking of a different approach of just taking that as a warm up
> measurement.
> 
> But I'd also like to poke some more to understand what makes that
> run stand out.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-08 20:38         ` Jakub Kicinski
@ 2026-01-08 21:19           ` Willem de Bruijn
  2026-01-12  3:24             ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2026-01-08 21:19 UTC (permalink / raw)
  To: Jakub Kicinski, Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

Jakub Kicinski wrote:
> On Thu, 08 Jan 2026 14:02:15 -0500 Willem de Bruijn wrote:
> > Increasing tolerance should work.
> > 
> > The current values are pragmatic choices to be so low as to minimize
> > total test runtime, but high enough to avoid flakes. Well..
> > 
> > If increasing tolerance, we also need to increase the time the test
> > waits for all notifications to arrive, cfg_sleep_usec.
> 
> To be clear the theory is that we got scheduled out between taking the
> USR timestamp and sending the packet. But once the packet is in the
> kernel it seems to flow, so AFAIU cfg_sleep_usec can remain untouched.
> 
> Thinking about it more - maybe what blocks us is the print? Maybe under
> vng there's a non-trivial chance that a print to stderr ends up
> blocking on serial and schedules us out? I mean maybe we should:
> 
> diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
> index abcec47ec2e6..e2273fdff495 100644
> --- a/tools/testing/selftests/net/txtimestamp.c
> +++ b/tools/testing/selftests/net/txtimestamp.c
> @@ -207,12 +207,10 @@ static void __print_timestamp(const char *name, struct timespec *cur,
>         fprintf(stderr, "\n");
>  }
>  
> -static void print_timestamp_usr(void)
> +static void record_timestamp_usr(void)
>  {
>         if (clock_gettime(CLOCK_REALTIME, &ts_usr))
>                 error(1, errno, "clock_gettime");
> -
> -       __print_timestamp("  USR", &ts_usr, 0, 0);
>  }
>  
>  static void check_timestamp_usr(void)
> @@ -636,8 +634,6 @@ static void do_test(int family, unsigned int report_opt)
>                         fill_header_udp(buf + off, family == PF_INET);
>                 }
>  
> -               print_timestamp_usr();
> -
>                 iov.iov_base = buf;
>                 iov.iov_len = total_len;
>  
> @@ -692,10 +688,14 @@ static void do_test(int family, unsigned int report_opt)
>  
>                 }
>  
> +               record_timestamp_usr();
>                 val = sendmsg(fd, &msg, 0);
>                 if (val != total_len)
>                         error(1, errno, "send");
>  
> +               /* Avoid I/O between taking ts_usr and sendmsg() */
> +               __print_timestamp("  USR", &ts_usr, 0, 0);
> +
>                 check_timestamp_usr();
>  
>                 /* wait for all errors to be queued, else ACKs arrive OOO */

Definitely worth including.

Could it be helpful to schedule at RR or FIFO prio. Depends on the
reason for descheduling. And it only affects priority within the VM.

I'm having trouble reproducing it in vng both locally and on 
netdev-virt.

At this point, an initial obviously correct patch and observe how
much that mitigates the issue is likely the fastest way forward.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-08 21:19           ` Willem de Bruijn
@ 2026-01-12  3:24             ` Willem de Bruijn
  2026-01-12  3:28               ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2026-01-12  3:24 UTC (permalink / raw)
  To: Willem de Bruijn, Jakub Kicinski, Willem de Bruijn
  Cc: Willem de Bruijn, netdev@vger.kernel.org

Willem de Bruijn wrote:
> Jakub Kicinski wrote:
> > On Thu, 08 Jan 2026 14:02:15 -0500 Willem de Bruijn wrote:
> > > Increasing tolerance should work.
> > > 
> > > The current values are pragmatic choices to be so low as to minimize
> > > total test runtime, but high enough to avoid flakes. Well..
> > > 
> > > If increasing tolerance, we also need to increase the time the test
> > > waits for all notifications to arrive, cfg_sleep_usec.
> > 
> > To be clear the theory is that we got scheduled out between taking the
> > USR timestamp and sending the packet. But once the packet is in the
> > kernel it seems to flow, so AFAIU cfg_sleep_usec can remain untouched.
> > 
> > Thinking about it more - maybe what blocks us is the print? Maybe under
> > vng there's a non-trivial chance that a print to stderr ends up
> > blocking on serial and schedules us out? I mean maybe we should:
> > 
> > diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
> > index abcec47ec2e6..e2273fdff495 100644
> > --- a/tools/testing/selftests/net/txtimestamp.c
> > +++ b/tools/testing/selftests/net/txtimestamp.c
> > @@ -207,12 +207,10 @@ static void __print_timestamp(const char *name, struct timespec *cur,
> >         fprintf(stderr, "\n");
> >  }
> >  
> > -static void print_timestamp_usr(void)
> > +static void record_timestamp_usr(void)
> >  {
> >         if (clock_gettime(CLOCK_REALTIME, &ts_usr))
> >                 error(1, errno, "clock_gettime");
> > -
> > -       __print_timestamp("  USR", &ts_usr, 0, 0);
> >  }
> >  
> >  static void check_timestamp_usr(void)
> > @@ -636,8 +634,6 @@ static void do_test(int family, unsigned int report_opt)
> >                         fill_header_udp(buf + off, family == PF_INET);
> >                 }
> >  
> > -               print_timestamp_usr();
> > -
> >                 iov.iov_base = buf;
> >                 iov.iov_len = total_len;
> >  
> > @@ -692,10 +688,14 @@ static void do_test(int family, unsigned int report_opt)
> >  
> >                 }
> >  
> > +               record_timestamp_usr();
> >                 val = sendmsg(fd, &msg, 0);
> >                 if (val != total_len)
> >                         error(1, errno, "send");
> >  
> > +               /* Avoid I/O between taking ts_usr and sendmsg() */
> > +               __print_timestamp("  USR", &ts_usr, 0, 0);
> > +
> >                 check_timestamp_usr();
> >  
> >                 /* wait for all errors to be queued, else ACKs arrive OOO */
> 
> Definitely worth including.
> 
> Could it be helpful to schedule at RR or FIFO prio. Depends on the
> reason for descheduling. And it only affects priority within the VM.
> 
> I'm having trouble reproducing it in vng both locally and on 
> netdev-virt.
> 
> At this point, an initial obviously correct patch and observe how
> much that mitigates the issue is likely the fastest way forward.

Instead of increasing tolerance, how about optionally allowing one
moderate timing error:

@@ -166,8 +167,15 @@ static void validate_timestamp(struct timespec *cur, int min_delay)
        if (cur64 < start64 + min_delay || cur64 > start64 + max_delay) {
                fprintf(stderr, "ERROR: %" PRId64 " us expected between %d and %d\n",
                                cur64 - start64, min_delay, max_delay);
-               if (!getenv("KSFT_MACHINE_SLOW"))
-                       test_failed = true;
+               if (!getenv("KSFT_MACHINE_SLOW")) {
+                       if (cfg_num_max_timing_failures &&
+                           (cur64 <= start64 + (max_delay * 2))) {
+                               cfg_num_max_timing_failures--;
+                               fprintf(stderr, "CONTINUE: ignore 1 timing failure\n");
+                       } else {
+                               test_failed = true;
+                       }
+               }
        }
 }

@@ -746,6 +755,10 @@ static void parse_opt(int argc, char **argv)
                case 'E':
                        cfg_use_epoll = true;
                        cfg_epollet = true;
+                       break;
+               case 'f':
+                       cfg_num_max_timing_failures = strtoul(optarg, NULL, 10);
+                       break;

+++ b/tools/testing/selftests/net/txtimestamp.sh
@@ -30,8 +30,8 @@ run_test_v4v6() {
        # wait for ACK to be queued
        local -r args="$@ -v 10000 -V 60000 -t 8000 -S 80000"
 
-       ./txtimestamp ${args} -4 -L 127.0.0.1
-       ./txtimestamp ${args} -6 -L ::1
+       ./txtimestamp ${args} -f 1 -4 -L 127.0.0.1
+       ./txtimestamp ${args} -f 1 -6 -L ::1
 }

and some boilerplate.

Can fold in the record_timestamp_usr() change too.

I can send this, your alternative with Suggested-by, or let me know if
you prefer to send that.

It's tricky to reproduce, but evidently on some platforms this occurs,
so not unreasonable to give some leeway. A single UDP test runs 12
timing validations: 4 packets * {SND, ENQ, END + SND} setups. A single
TCP test runs additional {ACK, SND + ACK, ENQ + SND + ACK} cases. If
we consider 1/12 skips too high, we could increase packet count. 

txtimestamp.sh runs 3 * 7 * 2 test variants. Alternatively we suppress
1 failure here, rather than in the individual tests.

Any of these approaches should significantly reduce the flake rate
reported on netdev.bots.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-12  3:24             ` Willem de Bruijn
@ 2026-01-12  3:28               ` Willem de Bruijn
  2026-01-12 14:29                 ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2026-01-12  3:28 UTC (permalink / raw)
  To: Willem de Bruijn, Willem de Bruijn, Jakub Kicinski,
	Willem de Bruijn
  Cc: Willem de Bruijn, netdev@vger.kernel.org

Willem de Bruijn wrote:
> Willem de Bruijn wrote:
> > Jakub Kicinski wrote:
> > > On Thu, 08 Jan 2026 14:02:15 -0500 Willem de Bruijn wrote:
> > > > Increasing tolerance should work.
> > > > 
> > > > The current values are pragmatic choices to be so low as to minimize
> > > > total test runtime, but high enough to avoid flakes. Well..
> > > > 
> > > > If increasing tolerance, we also need to increase the time the test
> > > > waits for all notifications to arrive, cfg_sleep_usec.
> > > 
> > > To be clear the theory is that we got scheduled out between taking the
> > > USR timestamp and sending the packet. But once the packet is in the
> > > kernel it seems to flow, so AFAIU cfg_sleep_usec can remain untouched.
> > > 
> > > Thinking about it more - maybe what blocks us is the print? Maybe under
> > > vng there's a non-trivial chance that a print to stderr ends up
> > > blocking on serial and schedules us out? I mean maybe we should:
> > > 
> > > diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
> > > index abcec47ec2e6..e2273fdff495 100644
> > > --- a/tools/testing/selftests/net/txtimestamp.c
> > > +++ b/tools/testing/selftests/net/txtimestamp.c
> > > @@ -207,12 +207,10 @@ static void __print_timestamp(const char *name, struct timespec *cur,
> > >         fprintf(stderr, "\n");
> > >  }
> > >  
> > > -static void print_timestamp_usr(void)
> > > +static void record_timestamp_usr(void)
> > >  {
> > >         if (clock_gettime(CLOCK_REALTIME, &ts_usr))
> > >                 error(1, errno, "clock_gettime");
> > > -
> > > -       __print_timestamp("  USR", &ts_usr, 0, 0);
> > >  }
> > >  
> > >  static void check_timestamp_usr(void)
> > > @@ -636,8 +634,6 @@ static void do_test(int family, unsigned int report_opt)
> > >                         fill_header_udp(buf + off, family == PF_INET);
> > >                 }
> > >  
> > > -               print_timestamp_usr();
> > > -
> > >                 iov.iov_base = buf;
> > >                 iov.iov_len = total_len;
> > >  
> > > @@ -692,10 +688,14 @@ static void do_test(int family, unsigned int report_opt)
> > >  
> > >                 }
> > >  
> > > +               record_timestamp_usr();
> > >                 val = sendmsg(fd, &msg, 0);
> > >                 if (val != total_len)
> > >                         error(1, errno, "send");
> > >  
> > > +               /* Avoid I/O between taking ts_usr and sendmsg() */
> > > +               __print_timestamp("  USR", &ts_usr, 0, 0);
> > > +
> > >                 check_timestamp_usr();
> > >  
> > >                 /* wait for all errors to be queued, else ACKs arrive OOO */
> > 
> > Definitely worth including.
> > 
> > Could it be helpful to schedule at RR or FIFO prio. Depends on the
> > reason for descheduling. And it only affects priority within the VM.
> > 
> > I'm having trouble reproducing it in vng both locally and on 
> > netdev-virt.
> > 
> > At this point, an initial obviously correct patch and observe how
> > much that mitigates the issue is likely the fastest way forward.
> 
> Instead of increasing tolerance, how about optionally allowing one
> moderate timing error:
> 
> @@ -166,8 +167,15 @@ static void validate_timestamp(struct timespec *cur, int min_delay)
>         if (cur64 < start64 + min_delay || cur64 > start64 + max_delay) {
>                 fprintf(stderr, "ERROR: %" PRId64 " us expected between %d and %d\n",
>                                 cur64 - start64, min_delay, max_delay);
> -               if (!getenv("KSFT_MACHINE_SLOW"))
> -                       test_failed = true;
> +               if (!getenv("KSFT_MACHINE_SLOW")) {
> +                       if (cfg_num_max_timing_failures &&
> +                           (cur64 <= start64 + (max_delay * 2))) {
> +                               cfg_num_max_timing_failures--;
> +                               fprintf(stderr, "CONTINUE: ignore 1 timing failure\n");
> +                       } else {
> +                               test_failed = true;
> +                       }
> +               }
>         }
>  }
> 
> @@ -746,6 +755,10 @@ static void parse_opt(int argc, char **argv)
>                 case 'E':
>                         cfg_use_epoll = true;
>                         cfg_epollet = true;
> +                       break;
> +               case 'f':
> +                       cfg_num_max_timing_failures = strtoul(optarg, NULL, 10);
> +                       break;
> 
> +++ b/tools/testing/selftests/net/txtimestamp.sh
> @@ -30,8 +30,8 @@ run_test_v4v6() {
>         # wait for ACK to be queued
>         local -r args="$@ -v 10000 -V 60000 -t 8000 -S 80000"
>  
> -       ./txtimestamp ${args} -4 -L 127.0.0.1
> -       ./txtimestamp ${args} -6 -L ::1
> +       ./txtimestamp ${args} -f 1 -4 -L 127.0.0.1
> +       ./txtimestamp ${args} -f 1 -6 -L ::1
>  }
> 
> and some boilerplate.
> 
> Can fold in the record_timestamp_usr() change too.
> 
> I can send this, your alternative with Suggested-by, or let me know if
> you prefer to send that.
> 
> It's tricky to reproduce, but evidently on some platforms this occurs,
> so not unreasonable to give some leeway. A single UDP test runs 12
> timing validations: 4 packets * {SND, ENQ, END + SND} setups. A single
> TCP test runs additional {ACK, SND + ACK, ENQ + SND + ACK} cases. If
> we consider 1/12 skips too high, we could increase packet count. 

That should say 16 validations: ENQ + SND validates both.
 
> txtimestamp.sh runs 3 * 7 * 2 test variants. Alternatively we suppress
> 1 failure here, rather than in the individual tests.
> 
> Any of these approaches should significantly reduce the flake rate
> reported on netdev.bots.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-12  3:28               ` Willem de Bruijn
@ 2026-01-12 14:29                 ` Jakub Kicinski
  2026-01-12 16:38                   ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2026-01-12 14:29 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

On Sun, 11 Jan 2026 22:28:39 -0500 Willem de Bruijn wrote:
> > Can fold in the record_timestamp_usr() change too.
> > 
> > I can send this, your alternative with Suggested-by, or let me know if
> > you prefer to send that.

Looks like we got 30 clean runs since I added the printing change 
to the local NIPA hacks. Let's start with that and see if it's good
enough? As trivial as the change is I have the feeling that is has
to be something trivial for us to fail 30% of the attempts.

Please go ahead and submit with my Suggested-by.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [TEST] txtimestamp.sh pains after netdev foundation migration
  2026-01-12 14:29                 ` Jakub Kicinski
@ 2026-01-12 16:38                   ` Willem de Bruijn
  0 siblings, 0 replies; 11+ messages in thread
From: Willem de Bruijn @ 2026-01-12 16:38 UTC (permalink / raw)
  To: Jakub Kicinski, Willem de Bruijn; +Cc: Willem de Bruijn, netdev@vger.kernel.org

Jakub Kicinski wrote:
> On Sun, 11 Jan 2026 22:28:39 -0500 Willem de Bruijn wrote:
> > > Can fold in the record_timestamp_usr() change too.
> > > 
> > > I can send this, your alternative with Suggested-by, or let me know if
> > > you prefer to send that.
> 
> Looks like we got 30 clean runs since I added the printing change 
> to the local NIPA hacks.

Wow nice.

> Let's start with that and see if it's good
> enough? As trivial as the change is I have the feeling that is has
> to be something trivial for us to fail 30% of the attempts.
> 
> Please go ahead and submit with my Suggested-by.

https://lore.kernel.org/netdev/20260112163355.3510150-1-willemdebruijn.kernel@gmail.com/T/#u

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-01-12 16:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-07 19:05 [TEST] txtimestamp.sh pains after netdev foundation migration Jakub Kicinski
2026-01-08  0:19 ` Willem de Bruijn
2026-01-08  3:25   ` Jakub Kicinski
2026-01-08 16:06     ` Jakub Kicinski
2026-01-08 19:02       ` Willem de Bruijn
2026-01-08 20:38         ` Jakub Kicinski
2026-01-08 21:19           ` Willem de Bruijn
2026-01-12  3:24             ` Willem de Bruijn
2026-01-12  3:28               ` Willem de Bruijn
2026-01-12 14:29                 ` Jakub Kicinski
2026-01-12 16:38                   ` Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox