From: Eduard Zingerman <eddyz87@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Puranjay Mohan <puranjay@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
bpf <bpf@vger.kernel.org>,
kkd@meta.com, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>,
Song Liu <song@kernel.org>, Steven Rostedt <rostedt@goodmis.org>,
Jiri Olsa <olsajiri@gmail.com>,
Juri Lelli <juri.lelli@redhat.com>,
Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v1 0/2] Handle possible NULL trusted raw_tp arguments
Date: Thu, 07 Nov 2024 21:08:41 -0800 [thread overview]
Message-ID: <df84c4c41d3fa9cbc43738ad226bc9efc5fa495c.camel@gmail.com> (raw)
In-Reply-To: <57dfdda6a89819b65be8960c3c6953bb9b8ceed3.camel@gmail.com>
On Fri, 2024-11-01 at 17:32 -0700, Eduard Zingerman wrote:
> On Fri, 2024-11-01 at 17:29 -0700, Alexei Starovoitov wrote:
>
> [...]
>
> > Hmm.
> > Puranjay touched it last with extra logic.
> >
> > And before that David Vernet tried to address flakiness
> > in commit 4a54de65964d.
> > Yonghong also noticed lockups in paravirt
> > and added workaround 7015843afc.
> >
> > Your additional timeout/workaround makes sense to me,
> > but would be good to bisect whether Puranjay's change caused it.
>
> I'll debug what's going on some time later today or on Sat.
I finally had time to investigate this a bit.
First, here is how to trigger lockup:
t1=send_signal/send_signal_perf_thread_remote; \
t2=send_signal/send_signal_nmi_thread_remote; \
for i in $(seq 1 100); do ./test_progs -t $t1,$t2; done
Must be both tests for whatever reason.
The failing test is 'send_signal_nmi_thread_remote'.
The test is organized as parent and child processes communicating
various events to each other. The intended sequence of events:
- child:
- install SIGUSR1 handler
- notify parent
- wait for parent
- parent:
- open PERF_COUNT_SW_CPU_CLOCK event
- attach BPF program to the event
- notify child
- enter busy loop for 10^8 iterations
- wait for child
- BPF program:
- send SIGUSR1 to child
- child:
- poll for SIGUSR1 in a busy loop
- notify parent
- parent:
- check value communicated by child,
terminate test.
The lockup happens because on every other test run perf event is not
triggered, child does not receive SIGUSR1 and thus both parent and
child are stuck.
For 'send_signal_nmi_thread_remote' perf event is defined as:
struct perf_event_attr attr = {
.sample_period = 1,
.type = PERF_TYPE_HARDWARE,
.config = PERF_COUNT_HW_CPU_CYCLES,
};
And is opened for parent process pid.
Apparently, the perf event is not always triggered between lines
send_signal.c:165-180. And at line 180 parent enters system call,
so cpu cycles stop ticking for 'parent', thus if perf event
had not been triggered already it won't be triggered at all
(as far as I understand).
Applying same fix as Yonghong did in 7015843afc is sufficient to
reliably trigger perf event:
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -223,7 +223,8 @@ static void test_send_signal_perf(bool signal_thread, bool remote)
static void test_send_signal_nmi(bool signal_thread, bool remote)
{
struct perf_event_attr attr = {
- .sample_period = 1,
+ .freq = 1,
+ .sample_freq = 1000,
.type = PERF_TYPE_HARDWARE,
.config = PERF_COUNT_HW_CPU_CYCLES,
};
But I don't understand why.
As far as I can figure from kernel source code,
sample_period is measured in nanoseconds (is it?),
so busy loop at send_signal.c:174-175 should run long enough for perf
event to be triggered before.
Can someone with understanding of how perf event work explain why
above change helps?
next prev parent reply other threads:[~2024-11-08 5:08 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-01 0:00 [PATCH bpf-next v1 0/2] Handle possible NULL trusted raw_tp arguments Kumar Kartikeya Dwivedi
2024-11-01 0:00 ` [PATCH bpf-next v1 1/2] bpf: Mark raw_tp arguments with PTR_MAYBE_NULL Kumar Kartikeya Dwivedi
2024-11-01 19:16 ` Andrii Nakryiko
2024-11-01 22:55 ` Alexei Starovoitov
2024-11-03 16:16 ` Kumar Kartikeya Dwivedi
2024-11-03 16:40 ` Kumar Kartikeya Dwivedi
2024-11-03 17:00 ` Kumar Kartikeya Dwivedi
2024-11-03 17:37 ` Alexei Starovoitov
2024-11-06 20:17 ` Andrii Nakryiko
2024-11-01 0:00 ` [PATCH bpf-next v1 2/2] selftests/bpf: Add tests for raw_tp null handling Kumar Kartikeya Dwivedi
2024-11-01 19:19 ` Andrii Nakryiko
2024-11-03 15:58 ` Kumar Kartikeya Dwivedi
2024-11-01 13:18 ` [PATCH bpf-next v1 0/2] Handle possible NULL trusted raw_tp arguments Kumar Kartikeya Dwivedi
2024-11-02 0:21 ` Eduard Zingerman
2024-11-02 0:29 ` Alexei Starovoitov
2024-11-02 0:32 ` Eduard Zingerman
2024-11-08 5:08 ` Eduard Zingerman [this message]
2024-11-08 20:13 ` Alexei Starovoitov
2024-11-08 21:57 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df84c4c41d3fa9cbc43738ad226bc9efc5fa495c.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=juri.lelli@redhat.com \
--cc=kernel-team@fb.com \
--cc=kkd@meta.com \
--cc=martin.lau@kernel.org \
--cc=memxor@gmail.com \
--cc=olsajiri@gmail.com \
--cc=puranjay@kernel.org \
--cc=rostedt@goodmis.org \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox