netdev.vger.kernel.org archive mirror
* selftests/bpf test_sockmap failure
@ 2018-07-24 15:45 Yonghong Song
  2018-07-24 22:40 ` John Fastabend
  0 siblings, 1 reply; 5+ messages in thread
From: Yonghong Song @ 2018-07-24 15:45 UTC (permalink / raw)
  To: John Fastabend, netdev; +Cc: Yonghong Song, Martin Lau

On one of our production machines, test_sockmap from
tools/testing/selftests/bpf failed randomly, as shown below:

...
[TEST 78]: (512, 1, 1, sendmsg, pass,apply 1,): rx thread exited with err 1. FAILED
...

...
[TEST 80]: (2, 1024, 256, sendmsg, pass,apply 1,): rx thread exited with err 1. FAILED
...

...
[TEST 83]: (100, 1, 5, sendpage, pass,apply 1,): rx thread exited with err 1. FAILED
...

...
[TEST 79]: (512, 1, 1, sendpage, pass,apply 1,): rx thread exited with err 1. FAILED
...

The command line is just `test_sockmap`. The machine has 80 CPUs and 256G
of memory. The kernel is based on 4.16, with the latest bpf-next BPF
changes backported.

The failing test number (78, 79, 80, or 83) varies from run to run, but
the failures all share similar characteristics:
    . The rate option is greater than one, i.e., more than one
      sendmsg/sendpage in the forked sender process.
    . txmsg_apply is not 0.

I debugged a little. The failure happens in the "unexpected timeout"
path of the msg_loop() function, shown below.

...
        slct = select(max_fd + 1, &w, NULL, NULL, &timeout);
        if (slct == -1) {
                perror("select()");
                clock_gettime(CLOCK_MONOTONIC, &s->end);
                goto out_errno;
        } else if (!slct) {
                if (opt->verbose)
                        fprintf(stderr, "unexpected timeout\n");
                errno = -EIO;
                clock_gettime(CLOCK_MONOTONIC, &s->end);
                goto out_errno;
        }
...

It appears that when the error happens, the receive process does not 
receive all bytes sent from the send process and eventually times out.
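
To illustrate the symptom in isolation, here is a minimal sketch (not the
selftest code) of a receiver that waits with select() until an expected
byte count arrives and gives up on timeout. The names recv_expected() and
demo_short_send(), the AF_UNIX socketpair, and the 200 ms timeout are all
assumptions made for the illustration; no sockmap or BPF program is
involved. Unlike the excerpt above, the sketch stores a positive EIO in
errno.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd until `expected` bytes have arrived.
 * Returns 0 on success, -1 with errno set on error or timeout.
 */
static int recv_expected(int fd, size_t expected, long timeout_ms)
{
	char buf[4096];
	size_t total = 0;

	while (total < expected) {
		fd_set rfds;
		/* Re-arm the timeout each round; select() may modify it. */
		struct timeval tv = {
			.tv_sec = timeout_ms / 1000,
			.tv_usec = (timeout_ms % 1000) * 1000,
		};
		ssize_t n;
		int slct;

		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);

		slct = select(fd + 1, &rfds, NULL, NULL, &tv);
		if (slct == -1)
			return -1;	/* errno set by select() */
		if (slct == 0) {
			/* No data within the timeout: bytes went missing. */
			errno = EIO;
			return -1;
		}

		n = recv(fd, buf, sizeof(buf), 0);
		if (n < 0)
			return -1;	/* errno set by recv() */
		if (n == 0) {
			errno = EIO;	/* peer closed before sending all */
			return -1;
		}
		total += (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical demo: send `sent` bytes over a socketpair while the
 * receiver waits for `expected` bytes. Fails when sent < expected.
 */
static int demo_short_send(size_t sent, size_t expected)
{
	char payload[1024];
	int sv[2];
	int ret, saved_errno;

	if (sent > sizeof(payload)) {
		errno = EINVAL;
		return -1;
	}
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	memset(payload, 'x', sizeof(payload));
	if (sent > 0 && send(sv[0], payload, sent, 0) != (ssize_t)sent) {
		saved_errno = errno;
		close(sv[0]);
		close(sv[1]);
		errno = saved_errno;
		return -1;
	}

	/* The sender stays open, so a shortfall shows up as a timeout. */
	ret = recv_expected(sv[1], expected, 200);
	saved_errno = errno;
	close(sv[0]);
	close(sv[1]);
	errno = saved_errno;
	return ret;
}
```

Calling demo_short_send(100, 100) should succeed, while
demo_short_send(60, 100) should fail with errno set to EIO once the
timeout expires, which mirrors the "rx thread exited with err 1" pattern
when the receiver is short of bytes.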

Has anybody seen this issue as well?
John, any comments on this failure?

Thanks,

Yonghong


end of thread, other threads:[~2018-07-25  6:19 UTC | newest]

Thread overview: 5+ messages
2018-07-24 15:45 selftests/bpf test_sockmap failure Yonghong Song
2018-07-24 22:40 ` John Fastabend
2018-07-24 23:02   ` Yonghong Song
2018-07-25  0:49     ` Prashant Bhole
2018-07-25  5:09       ` Yonghong Song
