BPF List
* [bpf-next 0/4] selftests/bpf: fix for bpf_signal stalls, watchdog for test_progs
@ 2024-11-12 11:09 Eduard Zingerman
From: Eduard Zingerman @ 2024-11-12 11:09 UTC
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

The 'bpf_signal' test case was recently reported to stall, both on
the mailing list [1] and in CI [2]. The stall occurs when the CPU cycles
perf event is not delivered within the expected time frame, before the
test process enters a system call and waits indefinitely.

This patch set addresses the issue in several ways:
- A watchdog timer is added to the test_progs.c runner:
  - it prints the current sub-test name to stderr if the sub-test takes
    longer than 10 seconds to finish;
  - it terminates the process executing the sub-test if the sub-test
    takes longer than 120 seconds to finish.
- The test case is updated to await the perf event notification with a
  timeout and a few retries. This serves two purposes:
  - the test busy-loops longer, widening the time frame for CPU cycles
    event generation/delivery;
  - the worst-case scenario becomes a timeout rather than a stall.
- The test case is updated to lower the frequency of perf events, as a
  high frequency of such events caused event generation to be throttled,
  which in turn delayed event delivery long enough to cause the test
  case to fail.
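
For illustration, the "await with a timeout and a few retries" idea can be
sketched roughly as follows. This is not the code from the patches: the
function names, the use of poll(), and the millisecond timeout are
assumptions made for this example. The point is only that a missing
notification turns into -EAGAIN after a bounded wait instead of a read()
that blocks forever.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and signature are assumptions, not the
 * patch's read_with_timeout()): wait up to timeout_ms for fd to become
 * readable, then read from it.
 * Returns the number of bytes read, -EAGAIN on timeout, -errno on error.
 */
static ssize_t sketch_read_with_timeout(int fd, void *buf, size_t count,
					int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t n;
	int err;

	err = poll(&pfd, 1, timeout_ms);
	if (err < 0)
		return -errno;
	if (err == 0)
		return -EAGAIN;	/* nothing arrived in time */

	n = read(fd, buf, count);
	return n < 0 ? -errno : n;
}

/*
 * Hypothetical retry wrapper: repeat the bounded wait up to 'retries'
 * times, stopping as soon as something other than a timeout happens.
 */
static ssize_t sketch_read_with_retries(int fd, void *buf, size_t count,
					int timeout_ms, int retries)
{
	ssize_t n = -EAGAIN;
	int i;

	for (i = 0; i < retries && n == -EAGAIN; i++)
		n = sketch_read_with_timeout(fd, buf, count, timeout_ms);
	return n;
}
```

With a helper of this shape the caller can do other work between attempts
(for example, more busy looping) and report a test failure once the
retries are exhausted.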

Note:

  The librt pthread-based timer API is used to implement the watchdog
  timer. I chose this API over SIGALRM because executing a signal handler
  within the test process context was by itself sufficient to trigger
  perf event delivery for the
  send_signal/send_signal_nmi_thread_remote test case, without any
  additional changes. Thus I concluded that a SIGALRM-based
  implementation would interfere with test execution.
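
As a rough illustration of the approach described in the note, a POSIX
timer created with SIGEV_THREAD runs its expiry callback on a separate
thread rather than as a signal handler in the test thread. The sketch
below is not the test_progs.c implementation: the wd_* names, the
millisecond granularity, and the single-stage "print only" behaviour are
assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

/* All wd_* names are hypothetical and exist only for this sketch. */
static timer_t wd_timer;
static const char *wd_name = "";
static atomic_int wd_fired;

/* Called on a separate thread when the timer expires. */
static void wd_expired(union sigval sv)
{
	(void)sv;
	fprintf(stderr, "WATCHDOG: '%s' is taking too long\n", wd_name);
	atomic_fetch_add(&wd_fired, 1);
}

/* Create a one-shot timer whose expiry is delivered via a thread. */
static int wd_init(void)
{
	struct sigevent sev = {
		.sigev_notify = SIGEV_THREAD,
		.sigev_notify_function = wd_expired,
	};

	return timer_create(CLOCK_MONOTONIC, &sev, &wd_timer);
}

/* Arm the watchdog before a sub-test starts. */
static int wd_arm(const char *name, long ms)
{
	struct itimerspec ts = {
		.it_value = {
			.tv_sec = ms / 1000,
			.tv_nsec = (ms % 1000) * 1000000L,
		},
	};

	wd_name = name;
	return timer_settime(wd_timer, 0, &ts, NULL);
}

/* Disarm the watchdog once the sub-test has finished (zero it_value). */
static int wd_disarm(void)
{
	struct itimerspec ts = { 0 };

	return timer_settime(wd_timer, 0, &ts, NULL);
}
```

A runner would call wd_init() once, wd_arm() before each sub-test and
wd_disarm() after it. A second, longer stage that terminates the stuck
process could be added by re-arming the timer from the callback. On glibc
older than 2.34, timer_create() requires linking with -lrt.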

[1] https://lore.kernel.org/bpf/CAP01T75OUeE8E-Lw9df84dm8ag2YmHW619f1DmPSVZ5_O89+Bg@mail.gmail.com/
[2] https://github.com/kernel-patches/bpf/actions/runs/11791485271/job/32843996871

Eduard Zingerman (4):
  selftests/bpf: watchdog timer for test_progs
  selftests/bpf: add read_with_timeout() utility function
  selftests/bpf: allow send_signal test to timeout
  selftests/bpf: update send_signal to lower perf events frequency

 tools/testing/selftests/bpf/Makefile          |   1 +
 tools/testing/selftests/bpf/io_helpers.c      |  21 ++++
 tools/testing/selftests/bpf/io_helpers.h      |   7 ++
 .../selftests/bpf/prog_tests/bpf_iter.c       |   8 +-
 .../testing/selftests/bpf/prog_tests/iters.c  |   4 +-
 .../selftests/bpf/prog_tests/send_signal.c    |  35 +++---
 tools/testing/selftests/bpf/test_progs.c      | 104 ++++++++++++++++++
 tools/testing/selftests/bpf/test_progs.h      |   6 +
 8 files changed, 166 insertions(+), 20 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/io_helpers.c
 create mode 100644 tools/testing/selftests/bpf/io_helpers.h

-- 
2.47.0


Thread overview: 6+ messages
2024-11-12 11:09 [bpf-next 0/4] selftests/bpf: fix for bpf_signal stalls, watchdog for test_progs Eduard Zingerman
2024-11-12 11:09 ` [bpf-next 1/4] selftests/bpf: watchdog timer " Eduard Zingerman
2024-11-12 11:09 ` [bpf-next 2/4] selftests/bpf: add read_with_timeout() utility function Eduard Zingerman
2024-11-12 11:09 ` [bpf-next 3/4] selftests/bpf: allow send_signal test to timeout Eduard Zingerman
2024-11-12 11:09 ` [bpf-next 4/4] selftests/bpf: update send_signal to lower perf events frequency Eduard Zingerman
2024-11-12 22:10 ` [bpf-next 0/4] selftests/bpf: fix for bpf_signal stalls, watchdog for test_progs patchwork-bot+netdevbpf
