public inbox for linux-kselftest@vger.kernel.org
From: Leon Hwang <leon.hwang@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bot+bpf-ci@kernel.org, bpf <bpf@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>, Eduard <eddyz87@gmail.com>,
	Song Liu <song@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	kernel-patches-bot@fb.com,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Chris Mason <clm@meta.com>,
	Ihor Solodrai <ihor.solodrai@linux.dev>
Subject: Re: [PATCH bpf-next 2/3] bpf: Avoid deadlock using trylock when popping LRU free nodes
Date: Tue, 20 Jan 2026 10:19:50 +0800
Message-ID: <cf9f8df6-91bc-4fc2-8605-4c3c6666e44e@linux.dev>
In-Reply-To: <CAADnVQKMOK8jMcYugtrjsqh3O-C_hUsKVL6jKfcGVD5g_1pVxQ@mail.gmail.com>



On 20/1/26 10:01, Alexei Starovoitov wrote:
> On Mon, Jan 19, 2026 at 5:57 PM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>>
>>
>> On 20/1/26 02:46, bot+bpf-ci@kernel.org wrote:
>>>> diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c
>>>> index c091f3232cc5..03d37f72731a 100644
>>>> --- a/kernel/bpf/bpf_lru_list.c
>>>> +++ b/kernel/bpf/bpf_lru_list.c
>>>
>>> [ ... ]
>>>
>>>> @@ -451,13 +455,12 @@ static struct bpf_lru_node *bpf_common_lru_pop_free(struct bpf_lru *lru,
>>>>
>>>>      loc_l = per_cpu_ptr(clru->local_list, cpu);
>>>>
>>>> -    raw_spin_lock_irqsave(&loc_l->lock, flags);
>>>> +    if (!raw_spin_trylock_irqsave(&loc_l->lock, flags))
>>>> +            return NULL;
>>>>
>>>>      node = __local_list_pop_free(loc_l);
>>>> -    if (!node) {
>>>> -            bpf_lru_list_pop_free_to_local(lru, loc_l);
>>>> +    if (!node && bpf_lru_list_pop_free_to_local(lru, loc_l))
>>>>              node = __local_list_pop_free(loc_l);
>>>> -    }
>>>>
>>>>      if (node)
>>>>              __local_list_add_pending(lru, loc_l, cpu, node, hash);
>>>
>>> The trylock conversion here looks correct, but what about the steal path
>>> that follows later in bpf_common_lru_pop_free()? Looking at the existing
>>> code (not shown in this diff), the steal loop still uses unconditional
>>> raw_spin_lock_irqsave():
>>>
>>>     do {
>>>         steal_loc_l = per_cpu_ptr(clru->local_list, steal);
>>>
>>>         raw_spin_lock_irqsave(&steal_loc_l->lock, flags);
>>>         ...
>>>     } while (...);
>>>
>>> If the goal is to avoid NMI-context deadlocks when acquiring LRU locks,
>>> can the same deadlock scenario occur when NMI interrupts during the steal
>>> loop and the NMI handler tries to acquire the same steal_loc_l->lock?
>>>
>>> Similarly, after a successful steal, there is another unconditional lock:
>>>
>>>     if (node) {
>>>         raw_spin_lock_irqsave(&loc_l->lock, flags);
>>>         __local_list_add_pending(lru, loc_l, cpu, node, hash);
>>>         raw_spin_unlock_irqrestore(&loc_l->lock, flags);
>>>     }
>>>
>>> Should these also use trylock to maintain consistency with the stated goal
>>> of avoiding NMI-context deadlocks?
>>>
>>
>> This patch is not intended to eliminate all possible deadlock scenarios.
>> Its goal is to avoid deadlocks caused by long-lived critical sections
>> in the free-node pop paths, where lock contention can persist and lead
>> to re-entrant lock acquisition from NMI context.
>>
>> The steal path and the post-steal update are both short-lived critical
>> sections. They do not exhibit the same contention characteristics and
>> have not been observed to trigger the reported deadlock scenarios.
>> Converting these paths to trylock would add complexity without clear
>> benefit, and is therefore unnecessary for the stated goal of this change.
> 
> AI is correct. Either everything needs to be converted or none.
> Adding trylock in a few places because syzbot found them is not fixing anything.
> Just silencing one (or a few?) syzbot reports.
> As I said in the other email, trylock is not an option.
> rqspinlock is the only true way of addressing potential deadlocks.
> If it's too hard, then leave it as-is. Do not hack things half way.

Understood.

Leave it as-is.

Thanks,
Leon
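
For readers following the thread, the trylock behaviour under discussion can be sketched in user space. This is only an illustrative model, not the kernel code: `toy_lock`, `toy_list`, `toy_pop_free_trylock()` and `toy_nested_entry_while_held()` are hypothetical names, and a C11 `atomic_flag` stands in for `raw_spinlock_t`. It shows a re-entrant attempt failing fast instead of spinning on a lock the interrupted context already holds. It does not model the steal path, which in the quoted code still takes its lock unconditionally.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a raw spinlock (assumption: not the kernel's raw_spinlock_t). */
struct toy_lock {
	atomic_flag held;
};

/* Returns true if the lock was acquired, false if it was already held. */
static bool toy_trylock(struct toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical per-CPU free list, reduced to a counter. */
struct toy_list {
	struct toy_lock lock;
	int nr_free;
};

/*
 * Models a pop path converted to trylock:
 *   -1  lock busy, caller gets a failure (analogous to returning NULL)
 *    0  lock taken, but no free node
 *    1  lock taken, one node popped
 */
static int toy_pop_free_trylock(struct toy_list *l)
{
	int got = 0;

	if (!toy_trylock(&l->lock))
		return -1;

	if (l->nr_free > 0) {
		l->nr_free--;
		got = 1;
	}

	toy_unlock(&l->lock);
	return got;
}

/*
 * Models a nested (NMI-like) entry: the outer context holds the lock
 * while the same pop path is entered again. With trylock the inner
 * call returns -1; an unconditional spin here would never finish.
 */
static int toy_nested_entry_while_held(struct toy_list *l)
{
	int ret;

	if (!toy_trylock(&l->lock))
		return -2;

	ret = toy_pop_free_trylock(l);	/* re-entrant attempt */

	toy_unlock(&l->lock);
	return ret;
}
```

In this model the nested call fails without consuming a node, which is why patch 3/3 in the series lets the selftests accept -ENOMEM. Any path that still spins unconditionally on the same lock would hang in the nested case, which is the objection raised in the review above.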


Thread overview: 11+ messages
2026-01-19 14:21 [PATCH bpf-next 0/3] bpf: Avoid deadlock using trylock when popping LRU free nodes Leon Hwang
2026-01-19 14:21 ` [PATCH bpf-next 1/3] bpf: Factor out bpf_lru_node_set_hash() helper Leon Hwang
2026-01-19 14:21 ` [PATCH bpf-next 2/3] bpf: Avoid deadlock using trylock when popping LRU free nodes Leon Hwang
2026-01-19 18:46   ` bot+bpf-ci
2026-01-20  1:56     ` Leon Hwang
2026-01-20  2:01       ` Alexei Starovoitov
2026-01-20  2:19         ` Leon Hwang [this message]
2026-01-19 19:47   ` Daniel Borkmann
2026-01-20  1:49     ` Leon Hwang
2026-01-20  1:54       ` Alexei Starovoitov
2026-01-19 14:21 ` [PATCH bpf-next 3/3] selftests/bpf: Allow -ENOMEM on LRU map updates Leon Hwang
