* [PATCH] bpf: use raw_spinlock_t in ringbuf
@ 2024-09-20 19:06 Wander Lairson Costa
2024-09-24 16:46 ` Daniel Borkmann
2024-09-25 10:00 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 3+ messages in thread
From: Wander Lairson Costa @ 2024-09-20 19:06 UTC (permalink / raw)
To: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
open list:BPF [RINGBUF], open list
Cc: Wander Lairson Costa, Brian Grech, Wander Lairson Costa
From: Wander Lairson Costa <wander.lairson@gmail.com>
The function __bpf_ringbuf_reserve is invoked from a tracepoint, which
disables preemption. Using spinlock_t in this context can lead to a
"sleep in atomic" warning in the RT variant. This issue is illustrated
in the example below:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556208, name: test_progs
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
INFO: lockdep is turned off.
Preemption disabled at:
[<ffffd33a5c88ea44>] migrate_enable+0xc0/0x39c
CPU: 7 PID: 556208 Comm: test_progs Tainted: G
Hardware name: Qualcomm SA8775P Ride (DT)
Call trace:
dump_backtrace+0xac/0x130
show_stack+0x1c/0x30
dump_stack_lvl+0xac/0xe8
dump_stack+0x18/0x30
__might_resched+0x3bc/0x4fc
rt_spin_lock+0x8c/0x1a4
__bpf_ringbuf_reserve+0xc4/0x254
bpf_ringbuf_reserve_dynptr+0x5c/0xdc
bpf_prog_ac3d15160d62622a_test_read_write+0x104/0x238
trace_call_bpf+0x238/0x774
perf_call_bpf_enter.isra.0+0x104/0x194
perf_syscall_enter+0x2f8/0x510
trace_sys_enter+0x39c/0x564
syscall_trace_enter+0x220/0x3c0
do_el0_svc+0x138/0x1dc
el0_svc+0x54/0x130
el0t_64_sync_handler+0x134/0x150
el0t_64_sync+0x17c/0x180
Switch the spinlock to raw_spinlock_t to avoid this error.
Signed-off-by: Wander Lairson Costa <wander.lairson@gmail.com>
Reported-by: Brian Grech <bgrech@redhat.com>
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
---
kernel/bpf/ringbuf.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index e20b90c36131..de3b681d1d13 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -29,7 +29,7 @@ struct bpf_ringbuf {
	u64 mask;
	struct page **pages;
	int nr_pages;
-	spinlock_t spinlock ____cacheline_aligned_in_smp;
+	raw_spinlock_t spinlock ____cacheline_aligned_in_smp;
	/* For user-space producer ring buffers, an atomic_t busy bit is used
	 * to synchronize access to the ring buffers in the kernel, rather than
	 * the spinlock that is used for kernel-producer ring buffers. This is
@@ -173,7 +173,7 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node)
	if (!rb)
		return NULL;

-	spin_lock_init(&rb->spinlock);
+	raw_spin_lock_init(&rb->spinlock);
	atomic_set(&rb->busy, 0);
	init_waitqueue_head(&rb->waitq);
	init_irq_work(&rb->work, bpf_ringbuf_notify);
@@ -421,10 +421,10 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
	cons_pos = smp_load_acquire(&rb->consumer_pos);

	if (in_nmi()) {
-		if (!spin_trylock_irqsave(&rb->spinlock, flags))
+		if (!raw_spin_trylock_irqsave(&rb->spinlock, flags))
			return NULL;
	} else {
-		spin_lock_irqsave(&rb->spinlock, flags);
+		raw_spin_lock_irqsave(&rb->spinlock, flags);
	}

	pend_pos = rb->pending_pos;
@@ -450,7 +450,7 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
	 */
	if (new_prod_pos - cons_pos > rb->mask ||
	    new_prod_pos - pend_pos > rb->mask) {
-		spin_unlock_irqrestore(&rb->spinlock, flags);
+		raw_spin_unlock_irqrestore(&rb->spinlock, flags);
		return NULL;
	}
@@ -462,7 +462,7 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
	/* pairs with consumer's smp_load_acquire() */
	smp_store_release(&rb->producer_pos, new_prod_pos);

-	spin_unlock_irqrestore(&rb->spinlock, flags);
+	raw_spin_unlock_irqrestore(&rb->spinlock, flags);

	return (void *)hdr + BPF_RINGBUF_HDR_SZ;
}
--
2.46.1
* Re: [PATCH] bpf: use raw_spinlock_t in ringbuf
2024-09-20 19:06 [PATCH] bpf: use raw_spinlock_t in ringbuf Wander Lairson Costa
@ 2024-09-24 16:46 ` Daniel Borkmann
2024-09-25 10:00 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 3+ messages in thread
From: Daniel Borkmann @ 2024-09-24 16:46 UTC (permalink / raw)
To: Wander Lairson Costa, Andrii Nakryiko, Alexei Starovoitov,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
open list:BPF [RINGBUF], open list
Cc: Wander Lairson Costa, Brian Grech
On 9/20/24 9:06 PM, Wander Lairson Costa wrote:
> From: Wander Lairson Costa <wander.lairson@gmail.com>
>
> The function __bpf_ringbuf_reserve is invoked from a tracepoint, which
> disables preemption. Using spinlock_t in this context can lead to a
> "sleep in atomic" warning in the RT variant. This issue is illustrated
> in the example below:
>
> BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556208, name: test_progs
> preempt_count: 1, expected: 0
> RCU nest depth: 1, expected: 1
> INFO: lockdep is turned off.
> Preemption disabled at:
> [<ffffd33a5c88ea44>] migrate_enable+0xc0/0x39c
> CPU: 7 PID: 556208 Comm: test_progs Tainted: G
> Hardware name: Qualcomm SA8775P Ride (DT)
> Call trace:
> dump_backtrace+0xac/0x130
> show_stack+0x1c/0x30
> dump_stack_lvl+0xac/0xe8
> dump_stack+0x18/0x30
> __might_resched+0x3bc/0x4fc
> rt_spin_lock+0x8c/0x1a4
> __bpf_ringbuf_reserve+0xc4/0x254
> bpf_ringbuf_reserve_dynptr+0x5c/0xdc
> bpf_prog_ac3d15160d62622a_test_read_write+0x104/0x238
> trace_call_bpf+0x238/0x774
> perf_call_bpf_enter.isra.0+0x104/0x194
> perf_syscall_enter+0x2f8/0x510
> trace_sys_enter+0x39c/0x564
> syscall_trace_enter+0x220/0x3c0
> do_el0_svc+0x138/0x1dc
> el0_svc+0x54/0x130
> el0t_64_sync_handler+0x134/0x150
> el0t_64_sync+0x17c/0x180
>
> Switch the spinlock to raw_spinlock_t to avoid this error.
>
> Signed-off-by: Wander Lairson Costa <wander.lairson@gmail.com>
> Reported-by: Brian Grech <bgrech@redhat.com>
> Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
> Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Fix is for bpf tree, lgtm:
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
* Re: [PATCH] bpf: use raw_spinlock_t in ringbuf
2024-09-20 19:06 [PATCH] bpf: use raw_spinlock_t in ringbuf Wander Lairson Costa
2024-09-24 16:46 ` Daniel Borkmann
@ 2024-09-25 10:00 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-09-25 10:00 UTC (permalink / raw)
To: Wander Lairson Costa
Cc: andrii, ast, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, bpf, linux-kernel,
wander.lairson, bgrech
Hello:
This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:
On Fri, 20 Sep 2024 16:06:59 -0300 you wrote:
> From: Wander Lairson Costa <wander.lairson@gmail.com>
>
> The function __bpf_ringbuf_reserve is invoked from a tracepoint, which
> disables preemption. Using spinlock_t in this context can lead to a
> "sleep in atomic" warning in the RT variant. This issue is illustrated
> in the example below:
>
> [...]
Here is the summary with links:
- bpf: use raw_spinlock_t in ringbuf
https://git.kernel.org/bpf/bpf/c/8b62645b09f8
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html