From: Vlastimil Babka <vbabka@suse.cz>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf@vger.kernel.org, andrii@kernel.org, memxor@gmail.com,
akpm@linux-foundation.org, peterz@infradead.org,
rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
tglx@linutronix.de, jannh@google.com, tj@kernel.org,
linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH bpf-next v5 3/7] locking/local_lock: Introduce local_trylock_irqsave()
Date: Tue, 21 Jan 2025 16:59:40 +0100 [thread overview]
Message-ID: <cec11348-a55f-40b4-9011-0e83113ade63@suse.cz> (raw)
In-Reply-To: <20250117203315.FWviQT38@linutronix.de>
On 1/17/25 9:33 PM, Sebastian Andrzej Siewior wrote:
> On 2025-01-14 18:17:42 [-0800], Alexei Starovoitov wrote:
>> diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
>> index 8dd71fbbb6d2..93672127c73d 100644
>> --- a/include/linux/local_lock_internal.h
>> +++ b/include/linux/local_lock_internal.h
>> @@ -75,37 +85,73 @@ do { \
>>
>> #define __local_lock(lock) \
>> do { \
>> + local_lock_t *l; \
>> preempt_disable(); \
>> - local_lock_acquire(this_cpu_ptr(lock)); \
>> + l = this_cpu_ptr(lock); \
>> + lockdep_assert(l->active == 0); \
>> + WRITE_ONCE(l->active, 1); \
>> + local_lock_acquire(l); \
>> } while (0)
>
> …
>
>> +#define __local_trylock_irqsave(lock, flags) \
>> + ({ \
>> + local_lock_t *l; \
>> + local_irq_save(flags); \
>> + l = this_cpu_ptr(lock); \
>> + if (READ_ONCE(l->active) == 1) { \
>> + local_irq_restore(flags); \
>> + l = NULL; \
>> + } else { \
>> + WRITE_ONCE(l->active, 1); \
>> + local_trylock_acquire(l); \
>> + } \
>> + !!l; \
>> + })
>> +
>
> Part of the selling for local_lock_t was that it does not affect
> !PREEMPT_RT builds. By adding `active' you extend every data structure
> and you have an extra write on every local_lock(). It was meant to
> replace preempt_disable()/ local_irq_save() based locking with something
> that actually does locking on PREEMPT_RT without risking my life once
> people with pitchforks come talk about the new locking :)
>
> I admire that you try to make PREEMPT_RT and !PREEMPT_RT similar in a
> way that both detect recursive locking which not meant to be supported.
>
> Realistically speaking as of today we don't have any recursive lock
> detection other than lockdep. So it should be enough given that the bots
> use it often and hopefully local testing.
> Your assert in local_lock() does not work without lockdep. It will only
> make local_trylock_irqsave() detect recursion and lead to two splats
> with lockdep enabled in local_lock() (one from the assert and the second
> from lockdep).
>
> I would say you could get rid of the `active' field and solely rely on
> lockdep and the owner field. So __local_trylock_irqsave() could maybe
> use local_trylock_acquire() to conditionally acquire the lock if `owner'
> is NULL.
>
> This makes it possible to have recursive code without lockdep, but with
> lockdep enabled the testcase should fail if it relies on recursion.
> Other than that, I don't see any advantage. Would that work?
I don't think it would work, or am I missing something? The goal is to
allow the operation (alloc, free) to opportunistically succeed in e.g.
nmi context, but only if we didn't interrupt anything that holds the
lock. Otherwise we must allow for failure - hence trylock.
(a possible extension that I mentioned is to also stop doing irqsave to
avoid its overhead and thus also operations from an irq context would be
oportunistic)
But if we detect the "trylock must fail" cases only using lockdep, we'll
deadlock without lockdep. So e.g. the "active" flag has to be there?
So yes this goes beyond the original purpose of local_lock. Do you think
it should be a different lock type then, which would mean the other
users of current local_lock that don't want the opportunistic nesting
via trylock, would not inflict the "active" flag overhead?
AFAICS the RT implementation of local_lock could then be shared for both
of these types, but I might be missing some nuance there.
> Sebastian
next prev parent reply other threads:[~2025-01-21 15:58 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-15 2:17 [PATCH bpf-next v5 0/7] bpf, mm: Introduce try_alloc_pages() Alexei Starovoitov
2025-01-15 2:17 ` [PATCH bpf-next v5 1/7] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation Alexei Starovoitov
2025-01-15 11:19 ` Vlastimil Babka
2025-01-15 23:00 ` Alexei Starovoitov
2025-01-15 23:47 ` Shakeel Butt
2025-01-16 2:44 ` Alexei Starovoitov
2025-01-15 23:16 ` Shakeel Butt
2025-01-17 18:19 ` Sebastian Andrzej Siewior
2025-01-15 2:17 ` [PATCH bpf-next v5 2/7] mm, bpf: Introduce free_pages_nolock() Alexei Starovoitov
2025-01-15 11:47 ` Vlastimil Babka
2025-01-15 23:15 ` Alexei Starovoitov
2025-01-16 8:31 ` Vlastimil Babka
2025-01-17 18:20 ` Sebastian Andrzej Siewior
2025-01-15 2:17 ` [PATCH bpf-next v5 3/7] locking/local_lock: Introduce local_trylock_irqsave() Alexei Starovoitov
2025-01-15 2:23 ` Alexei Starovoitov
2025-01-15 7:22 ` Sebastian Sewior
2025-01-15 14:22 ` Vlastimil Babka
2025-01-16 2:20 ` Alexei Starovoitov
2025-01-17 20:33 ` Sebastian Andrzej Siewior
2025-01-21 15:59 ` Vlastimil Babka [this message]
2025-01-21 16:43 ` Sebastian Andrzej Siewior
2025-01-22 1:35 ` Alexei Starovoitov
2025-01-15 2:17 ` [PATCH bpf-next v5 4/7] memcg: Use trylock to access memcg stock_lock Alexei Starovoitov
2025-01-15 16:07 ` Vlastimil Babka
2025-01-16 0:12 ` Shakeel Butt
2025-01-16 2:22 ` Alexei Starovoitov
2025-01-16 20:07 ` Joshua Hahn
2025-01-17 17:36 ` Johannes Weiner
2025-01-15 2:17 ` [PATCH bpf-next v5 5/7] mm, bpf: Use memcg in try_alloc_pages() Alexei Starovoitov
2025-01-15 17:51 ` Vlastimil Babka
2025-01-16 0:24 ` Shakeel Butt
2025-01-15 2:17 ` [PATCH bpf-next v5 6/7] mm: Make failslab, kfence, kmemleak aware of trylock mode Alexei Starovoitov
2025-01-15 17:57 ` Vlastimil Babka
2025-01-16 2:23 ` Alexei Starovoitov
2025-01-15 2:17 ` [PATCH bpf-next v5 7/7] bpf: Use try_alloc_pages() to allocate pages for bpf needs Alexei Starovoitov
2025-01-15 18:02 ` Vlastimil Babka
2025-01-16 2:25 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cec11348-a55f-40b4-9011-0e83113ade63@suse.cz \
--to=vbabka@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=bigeasy@linutronix.de \
--cc=bpf@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=houtao1@huawei.com \
--cc=jannh@google.com \
--cc=kernel-team@fb.com \
--cc=linux-mm@kvack.org \
--cc=memxor@gmail.com \
--cc=mhocko@suse.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=shakeel.butt@linux.dev \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox