All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Hwang <leon.hwang@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jiri Olsa <jolsa@kernel.org>
Subject: Re: [BUG] Deadlock triggered by bpfsnoop funcgraph feature
Date: Thu, 28 Aug 2025 10:40:47 +0800	[thread overview]
Message-ID: <b3463ffa-c2cb-43c8-a0d2-92bad49e3c23@linux.dev> (raw)
In-Reply-To: <CAADnVQLDG=Oavh9He=ivXm9MPwsqWHttbTYQh1-EZuHpwujaBA@mail.gmail.com>



On 28/8/25 08:42, Alexei Starovoitov wrote:
> On Tue, Aug 26, 2025 at 7:58 PM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>>
>>
>> On 27/8/25 10:23, Alexei Starovoitov wrote:
>>> On Tue, Aug 26, 2025 at 7:13 PM Leon Hwang <leon.hwang@linux.dev> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I’ve encountered a reproducible deadlock while developing the funcgraph
>>>> feature for bpfsnoop [0].
>>>
>>> debug it pls.
>>
>> It’s quite difficult for me. I’ve tried debugging it but didn’t succeed.
>>
>>> Sounds like you're implying that the root cause is in bpf,
>>> but why do you think so?
>>>
>>> You're attaching to things that shouldn't be attached to.
>>> Like rcu_lockdep_current_cpu_online()
>>> so effectively you're recursing in that lockdep code.
>>> See big lock there. It will dead lock for sure.
>>
>> If a function that acquires a lock can be traced by a tracing program,
>> bpfsnoop’s funcgraph will attempt to trace it as well. In such cases, a
>> deadlock is highly likely to occur.
>>
>> With bpfsnoop I try my best to avoid such deadlock issues. But what
>> about other bpf tracing tools? If they don’t handle this properly, the
>> kernel is very likely to crash.
> 
> bpf infra is trying hard not to crash it, but debug kernel is a different
> category. rcu_read_lock_held() doesn't exist in production kernels.
> You can propose adding "notrace" for it, but in general that doesn't scale.
> Same with rcu_lockdep_current_cpu_online().
> It probably deserves "notrace" too.

Indeed, it doesn't scale.

When I run
./bpfsnoop -k "htab_*_elem" --output-fgraph --fgraph-debug
--fgraph-exclude
'rcu_read_lock_*held,rcu_lockdep_current_cpu_online,*raw_spin_*lock*,kvfree,show_stack,put_task_stack',
the kernel doesn’t panic, but the OS eventually stalls and becomes
unresponsive to key presses.

It seems preferable to avoid running BPF programs continuously in such
cases.

Thanks,
Leon


  reply	other threads:[~2025-08-28  2:40 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-27  2:13 [BUG] Deadlock triggered by bpfsnoop funcgraph feature Leon Hwang
2025-08-27  2:23 ` Alexei Starovoitov
2025-08-27  2:58   ` Leon Hwang
2025-08-28  0:42     ` Alexei Starovoitov
2025-08-28  2:40       ` Leon Hwang [this message]
2025-08-28 11:50         ` Paul E. McKenney
2025-08-28 13:39           ` Leon Hwang
2025-08-28 16:43             ` Alexei Starovoitov
2025-08-28 17:24               ` Paul E. McKenney
2025-08-29  2:21                 ` Leon Hwang
2025-08-29 18:08                   ` Alexei Starovoitov
2025-09-01  2:38                     ` Leon Hwang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3463ffa-c2cb-43c8-a0d2-92bad49e3c23@linux.dev \
    --to=leon.hwang@linux.dev \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jolsa@kernel.org \
    --cc=paulmck@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.