Re: [RFC PATCH] bpf: Prevent recursive deadlocks in BPF programs attached to spin lock helpers using fentry/ fexit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Yonghong Song <yonghong.song@linux.dev>
To: Siddharth Chintamaneni <sidchintamaneni@gmail.com>,
	Jiri Olsa <olsajiri@gmail.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	"alexei.starovoitov@gmail.com" <alexei.starovoitov@gmail.com>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"andrii@kernel.org" <andrii@kernel.org>,
	"Williams, Dan" <djwillia@vt.edu>,
	"Somaraju, Sai Roop" <sairoop@vt.edu>,
	"Sahu, Raj" <rjsu26@vt.edu>, "Craun, Milo" <miloc@vt.edu>,
	"sidchintamaneni@vt.edu" <sidchintamaneni@vt.edu>
Subject: Re: [RFC PATCH] bpf: Prevent recursive deadlocks in BPF programs attached to spin lock helpers using fentry/ fexit
Date: Sun, 4 Feb 2024 11:09:04 -0800	[thread overview]
Message-ID: <d320eb5c-3276-49f9-879a-fb318164d4f1@linux.dev> (raw)
In-Reply-To: <CAE5sdEg6yUc_Jz50AnUXEEUh6O73yQ1Z6NV2srJnef0ZrQkZew@mail.gmail.com>


On 2/2/24 4:21 PM, Siddharth Chintamaneni wrote:
> On Tue, 30 Jan 2024 at 04:25, Jiri Olsa <olsajiri@gmail.com> wrote:
>> On Wed, Jan 24, 2024 at 10:43:32AM -0500, Siddharth Chintamaneni wrote:
>>> While we were working on some experiments with BPF trampoline, we came
>>> across a deadlock scenario that could happen.
>>>
>>> A deadlock happens when two nested BPF programs tries to acquire the
>>> same lock i.e, If a BPF program is attached using fexit to
>>> bpf_spin_lock or using a fentry to bpf_spin_unlock, and it then
>>> attempts to acquire the same lock as the previous BPF program, a
>>> deadlock situation arises.
>>>
>>> Here is an example:
>>>
>>> SEC(fentry/bpf_spin_unlock)
>>> int fentry_2{
>>>    bpf_spin_lock(&x->lock);
>>>    bpf_spin_unlock(&x->lock);
>>> }
>>>
>>> SEC(fentry/xxx)
>>> int fentry_1{
>>>    bpf_spin_lock(&x->lock);
>>>    bpf_spin_unlock(&x->lock);
>>> }
>> hi,
>> looks like valid issue, could you add selftest for that?
> Hello,
> I have added selftest for the deadlock scenario.
>
>> I wonder we could restrict just programs that use bpf_spin_lock/bpf_spin_unlock
>> helpers? I'm not sure there's any useful use case for tracing spin lock helpers,
>> but I think we should at least try this before we deny it completely
>>
> If we restrict programs (attached to spinlock helpers) that use
> bpf_spin_lock/unlock helpers, there could be a scenario where a helper
> function called within the program has a BPF program attached that
> tries to acquire the same lock.
>
>>> To prevent these cases, a simple fix could be adding these helpers to
>>> denylist in the verifier. This fix will prevent the BPF programs from
>>> being loaded by the verifier.
>>>
>>> previously, a similar solution was proposed to prevent recursion.
>>> https://lore.kernel.org/lkml/20230417154737.12740-2-laoar.shao@gmail.com/
>> the difference is that __rcu_read_lock/__rcu_read_unlock are called unconditionally
>> (always) when executing bpf tracing probe, the problem you described above is only
>> for programs calling spin lock helpers (on same spin lock)
>>
>>> Signed-off-by: Siddharth Chintamaneni <sidchintamaneni@vt.edu>
>>> ---
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index 65f598694d55..8f1834f27f81 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -20617,6 +20617,10 @@ BTF_ID(func, preempt_count_sub)
>>>   BTF_ID(func, __rcu_read_lock)
>>>   BTF_ID(func, __rcu_read_unlock)
>>>   #endif
>>> +#if defined(CONFIG_DYNAMIC_FTRACE)
>> why the CONFIG_DYNAMIC_FTRACE dependency?
> As we described in the self-tests, nesting of multiple BPF programs
> could only happen with fentry/fexit programs when DYNAMIC_FTRACE is
> enabled. In other scenarios, when DYNAMIC_FTRACE is disabled, a BPF
> program cannot be attached to any helper functions.
>> jirka
>>
>>> +BTF_ID(func, bpf_spin_lock)
>>> +BTF_ID(func, bpf_spin_unlock)
>>> +#endif
>>>   BTF_SET_END(btf_id_deny)

Currently, we already have 'notrace' marked to bpf_spin_lock
and bpf_spin_unlock:

notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock)
{
         __bpf_spin_lock_irqsave(lock);
         return 0;
}
notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock)
{
         __bpf_spin_unlock_irqrestore(lock);
         return 0;
}

But unfortunately, BPF_CALL_* macros put notrace to the static
inline function ____bpf_spin_lock()/____bpf_spin_unlock(), and not
to static function bpf_spin_lock()/bpf_spin_unlock().

I think the following is a better fix and reflects original
intention:

diff --git a/include/linux/filter.h b/include/linux/filter.h
index fee070b9826e..779f8ee71607 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -566,6 +566,25 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
  #define BPF_CALL_4(name, ...)  BPF_CALL_x(4, name, __VA_ARGS__)
  #define BPF_CALL_5(name, ...)  BPF_CALL_x(5, name, __VA_ARGS__)
  
+#define NOTRACE_BPF_CALL_x(x, name, ...)                                              \
+       static __always_inline                                                 \
+       u64 ____##name(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__));   \
+       typedef u64 (*btf_##name)(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__)); \
+       notrace u64 name(__BPF_REG(x, __BPF_DECL_REGS, __BPF_N, __VA_ARGS__));         \
+       notrace u64 name(__BPF_REG(x, __BPF_DECL_REGS, __BPF_N, __VA_ARGS__))          \
+       {                                                                      \
+               return ((btf_##name)____##name)(__BPF_MAP(x,__BPF_CAST,__BPF_N,__VA_ARGS__));\
+       }                                                                      \
+       static __always_inline                                                 \
+       u64 ____##name(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__))
+
+#define NOTRACE_BPF_CALL_0(name, ...)  NOTRACE_BPF_CALL_x(0, name, __VA_ARGS__)
+#define NOTRACE_BPF_CALL_1(name, ...)  NOTRACE_BPF_CALL_x(1, name, __VA_ARGS__)
+#define NOTRACE_BPF_CALL_2(name, ...)  NOTRACE_BPF_CALL_x(2, name, __VA_ARGS__)
+#define NOTRACE_BPF_CALL_3(name, ...)  NOTRACE_BPF_CALL_x(3, name, __VA_ARGS__)
+#define NOTRACE_BPF_CALL_4(name, ...)  NOTRACE_BPF_CALL_x(4, name, __VA_ARGS__)
+#define NOTRACE_BPF_CALL_5(name, ...)  NOTRACE_BPF_CALL_x(5, name, __VA_ARGS__)
+
  #define bpf_ctx_range(TYPE, MEMBER)                                            \
         offsetof(TYPE, MEMBER) ... offsetofend(TYPE, MEMBER) - 1
  #define bpf_ctx_range_till(TYPE, MEMBER1, MEMBER2)                             \
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4db1c658254c..87136e27a99a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -334,7 +334,7 @@ static inline void __bpf_spin_lock_irqsave(struct bpf_spin_lock *lock)
         __this_cpu_write(irqsave_flags, flags);
  }
  
-notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock)
+NOTRACE_BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock)
  {
         __bpf_spin_lock_irqsave(lock);
         return 0;
@@ -357,7 +357,7 @@ static inline void __bpf_spin_unlock_irqrestore(struct bpf_spin_lock *lock)
         local_irq_restore(flags);
  }
  
-notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock)
+NOTRACE_BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock)
  {
         __bpf_spin_unlock_irqrestore(lock);
         return 0;


Macros NOTRACE_BPF_CALL_*() could be consolated with BPF_CALL_*() but I think
a separate macro might be easier to understand.

> Signed-off-by: Siddharth Chintamaneni <sidchintamaneni@vt.edu>
> ---
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 65f598694d55..ffc2515195f1 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -20617,6 +20617,10 @@ BTF_ID(func, preempt_count_sub)
>   BTF_ID(func, __rcu_read_lock)
>   BTF_ID(func, __rcu_read_unlock)
>   #endif
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +BTF_ID(func, bpf_spin_lock)
> +BTF_ID(func, bpf_spin_unlock)
> +#endif
>   BTF_SET_END(btf_id_deny)
>
>   static bool can_be_sleepable(struct bpf_prog *prog)
[...]

next prev parent reply	other threads:[~2024-02-04 19:09 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24 15:43 [RFC PATCH] bpf: Prevent recursive deadlocks in BPF programs attached to spin lock helpers using fentry/ fexit Siddharth Chintamaneni
2024-01-30  9:25 ` Jiri Olsa
2024-02-03  0:21   ` Siddharth Chintamaneni
2024-02-04 19:09     ` Yonghong Song [this message]
2024-02-07  0:21       ` Siddharth Chintamaneni
2024-02-07  4:25         ` Yonghong Song

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:fee070b9826 dfblob:779f8ee7160 dfblob:4db1c658254
dfblob:87136e27a99 )
 OR (
bs:"Re: [RFC PATCH] bpf: Prevent recursive deadlocks in BPF programs attached to spin lock helpers using fentry/ fexit" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d320eb5c-3276-49f9-879a-fb318164d4f1@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=djwillia@vt.edu \
    --cc=miloc@vt.edu \
    --cc=olsajiri@gmail.com \
    --cc=rjsu26@vt.edu \
    --cc=sairoop@vt.edu \
    --cc=sidchintamaneni@gmail.com \
    --cc=sidchintamaneni@vt.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.