From: Yonghong Song <yhs@meta.com>
To: Hou Tao <houtao1@huawei.com>, Tonghao Zhang <xiangxia.m.yue@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>
Subject: Re: [bpf-next v3 2/2] selftests/bpf: add test case for htab map
Date: Wed, 4 Jan 2023 00:01:15 -0800 [thread overview]
Message-ID: <ff6473d8-c640-267e-c0f7-a92ce747c888@meta.com> (raw)
In-Reply-To: <dc658ded-719f-17bd-9166-e335a86150a6@huawei.com>
On 1/3/23 11:51 PM, Hou Tao wrote:
> Hi,
>
> On 1/4/2023 3:09 PM, Yonghong Song wrote:
>>
>>
>> On 1/2/23 6:40 PM, Tonghao Zhang wrote:
>>> a
>>>
>>> On Thu, Dec 29, 2022 at 2:29 PM Yonghong Song <yhs@meta.com> wrote:
>>>>
>>>>
>>>>
>>>> On 12/28/22 2:24 PM, Alexei Starovoitov wrote:
>>>>> On Tue, Dec 27, 2022 at 8:43 PM Yonghong Song <yhs@meta.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/18/22 8:15 PM, xiangxia.m.yue@gmail.com wrote:
>>>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>>>
>>>>>>> This testing show how to reproduce deadlock in special case.
>>>>>>> We update htab map in Task and NMI context. Task can be interrupted by
>>>>>>> NMI, if the same map bucket was locked, there will be a deadlock.
>>>>>>>
>>>>>>> * map max_entries is 2.
>>>>>>> * NMI using key 4 and Task context using key 20.
>>>>>>> * so same bucket index but map_locked index is different.
>>>>>>>
>>>>>>> The selftest use perf to produce the NMI and fentry nmi_handle.
>>>>>>> Note that bpf_overflow_handler checks bpf_prog_active, but in bpf update
>>>>>>> map syscall increase this counter in bpf_disable_instrumentation.
>>>>>>> Then fentry nmi_handle and update hash map will reproduce the issue.
> SNIP
>>>>>>> diff --git a/tools/testing/selftests/bpf/progs/htab_deadlock.c
>>>>>>> b/tools/testing/selftests/bpf/progs/htab_deadlock.c
>>>>>>> new file mode 100644
>>>>>>> index 000000000000..d394f95e97c3
>>>>>>> --- /dev/null
>>>>>>> +++ b/tools/testing/selftests/bpf/progs/htab_deadlock.c
>>>>>>> @@ -0,0 +1,32 @@
>>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>>> +/* Copyright (c) 2022 DiDi Global Inc. */
>>>>>>> +#include <linux/bpf.h>
>>>>>>> +#include <bpf/bpf_helpers.h>
>>>>>>> +#include <bpf/bpf_tracing.h>
>>>>>>> +
>>>>>>> +char _license[] SEC("license") = "GPL";
>>>>>>> +
>>>>>>> +struct {
>>>>>>> + __uint(type, BPF_MAP_TYPE_HASH);
>>>>>>> + __uint(max_entries, 2);
>>>>>>> + __uint(map_flags, BPF_F_ZERO_SEED);
>>>>>>> + __type(key, unsigned int);
>>>>>>> + __type(value, unsigned int);
>>>>>>> +} htab SEC(".maps");
>>>>>>> +
>>>>>>> +/* nmi_handle on x86 platform. If changing keyword
>>>>>>> + * "static" to "inline", this prog load failed. */
>>>>>>> +SEC("fentry/nmi_handle")
>>>>>>
>>>>>> The above comment is not what I mean. In arch/x86/kernel/nmi.c,
>>>>>> we have
>>>>>> static int nmi_handle(unsigned int type, struct pt_regs *regs)
>>>>>> {
>>>>>> ...
>>>>>> }
>>>>>> ...
>>>>>> static noinstr void default_do_nmi(struct pt_regs *regs)
>>>>>> {
>>>>>> ...
>>>>>> handled = nmi_handle(NMI_LOCAL, regs);
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> Since nmi_handle is a static function, it is possible that
>>>>>> the function might be inlined in default_do_nmi by the
>>>>>> compiler. If this happens, fentry/nmi_handle will not
>>>>>> be triggered and the test will pass.
>>>>>>
>>>>>> So I suggest to change the comment to
>>>>>> nmi_handle() is a static function and might be
>>>>>> inlined into its caller. If this happens, the
>>>>>> test can still pass without previous kernel fix.
>>>>>
>>>>> It's worse than this.
>>>>> fentry is buggy.
>>>>> We shouldn't allow attaching fentry to:
>>>>> NOKPROBE_SYMBOL(nmi_handle);
>>>>
>>>> Okay, I see. Looks we should prevent fentry from
>>>> attaching any NOKPROBE_SYMBOL functions.
>>>>
>>>> BTW, I think fentry/nmi_handle can be replaced with
>>>> tracepoint nmi/nmi_handler. it is more reliable
>>> The tracepoint will not reproduce the deadlock(we have discussed v2).
>>> If it's not easy to complete a test for this case, should we drop this
>>> testcase patch? or fentry the nmi_handle and update the comments.
>>
>> could we use a softirq perf event (timer), e.g.,
>>
>> struct perf_event_attr attr = {
>> .sample_period = 1,
>> .type = PERF_TYPE_SOFTWARE,
>> .config = PERF_COUNT_SW_CPU_CLOCK,
>> };
>>
>> then you can attach function hrtimer_run_softirq (not tested) or
>> similar functions?
> The context will be a hard-irq context, right ? Because htab_lock_bucket() has
> already disabled hard-irq on current CPU, so the dead-lock will be impossible.
Okay, I see. soft-irq doesn't work. The only thing it works is nmi since
it is non-masking.
>>
>> I suspect most (if not all) functions in nmi path cannot
>> be kprobe'd.
> It seems that perf_event_nmi_handler() is also nokprobe function. However I
> think we could try its callees (e.g., x86_pmu_handle_irq or perf_event_overflow).
If we can find a function in nmi handling path which is not marked as
nonkprobe, sure, we can use that function for fentry.
>>
>>>> and won't be impacted by potential NOKPROBE_SYMBOL
>>>> issues.
>>>
>>>
>>>
>> .
>
next prev parent reply other threads:[~2023-01-04 8:02 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-19 4:15 [bpf-next v3 1/2] bpf: hash map, avoid deadlock with suitable hash mask xiangxia.m.yue
2022-12-19 4:15 ` [bpf-next v3 2/2] selftests/bpf: add test case for htab map xiangxia.m.yue
2022-12-28 4:42 ` Yonghong Song
2022-12-28 22:24 ` Alexei Starovoitov
2022-12-29 6:29 ` Yonghong Song
2023-01-03 2:40 ` Tonghao Zhang
2023-01-04 7:09 ` Yonghong Song
2023-01-04 7:51 ` Hou Tao
2023-01-04 8:01 ` Yonghong Song [this message]
2023-01-04 14:32 ` Tonghao Zhang
2023-01-04 17:10 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ff6473d8-c640-267e-c0f7-a92ce747c888@meta.com \
--to=yhs@meta.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=houtao1@huawei.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=xiangxia.m.yue@gmail.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox