From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Vadim Goncharov <vadimnuclight@gmail.com>, xdp-newbies@vger.kernel.org
Subject: Re: XDP/eBPF map thread safety in kernel (e.g. lookup/delete)
Date: Fri, 25 Oct 2024 13:06:57 +0200 [thread overview]
Message-ID: <87ttd0e90u.fsf@toke.dk> (raw)
In-Reply-To: <20241025001819.2475ec77@nuclight.lan>
Vadim Goncharov <vadimnuclight@gmail.com> writes:
>> > OK, no crash is half of good, thanks. But how'd I lock it from
>> > deletion? A "concurrent updates" are usually about memory area that
>> > still persist as a whole, but in my particular case it's a
>> > bpf_timer callback who does bpf_map_delete_elem(). This is a
>> > conntrack-like map - an entry contains some information about
>> > connection, a struct bpf_timer and expire field when this entry
>> > should be deleted due to inactivity, based on information in a
>> > connection (thus *_LRU map types are not suitable for me). So it's
>> > possible for a race condition here:
>> >
>> > 1. Thread 1 receives packet, does bpf_map_lookup_elem() and begins
>> > processing, at end of which bpf_timer_start() will be called to
>> > reschedule removal into future.
>> >
>> > 2. At moment after lookup, timer callback fires and does
>> > bpf_map_delete_elem() while first thread is not yet finished.
>> >
>> > So how do I prevent connection record loss in such scenario?
>>
>> Yeah, there is no protection against this; you will have to do your
>> own locking to prevent it. Something like putting a spinlock into
>> your data structure and using that to synchronise access, maybe?
>
> Well, if I'll put bpf_spin_lock into map element, then it looks like
> the following scenario is possible:
>
> 1. Thread 1 receives packet, does bpf_map_lookup_elem(map, key)
>
> 1a. Timer callback(map, same_key) does bpf_spin_lock on same_key
>
> 2. Thread 2 does bpf_spin_lock on same_key and waits.
>
> 2a. Timer callback sees lock successful, sets DELETED flag in entry
> and bpf_map_delete_elem()'s it.
>
> 3. Thread 1 unlocked and either updates map entry directly by pointer,
> or reinserts new entry reinitializing timer.
Yeah, something like this.
> Am I missing some race condition still present here? atomics or
> something
Unfortunately, I think the delete and update can still race. Deletion is
based on the key only, there is no cmpxchg() semantics. So I don't see
any reason why this sequence of events couldn't happen:
- T2 sets the DELETED flag
- T2 releases the lock, and then gets preempted
- T1 grabs the lock, sees the deleted flag, allocates a new item and
inserts it with update()
- T2 wakes back up and does a bpf_map_delete() with the expired key,
which then ends up deleting the new entry
The only way I can think of to avoid this, is if the timer callback uses
bpf_map_lookup_and_delete_item(), then rechecks the value and re-inserts
it if the DELETED flag disappeared between operations. But this still
leaves a window where the entry doesn't exist.
Maybe someone else has a good idea for how to avoid this being racy?
-Toke
prev parent reply other threads:[~2024-10-25 11:07 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-23 11:54 XDP/eBPF map thread safety in kernel (e.g. lookup/delete) Vadim Goncharov
2024-10-23 12:10 ` Toke Høiland-Jørgensen
2024-10-23 12:31 ` Vadim Goncharov
[not found] ` <20241023152810.42936dc4@nuclight.lan>
[not found] ` <875xphftdq.fsf@toke.dk>
2024-10-24 21:18 ` Vadim Goncharov
2024-10-25 11:06 ` Toke Høiland-Jørgensen [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttd0e90u.fsf@toke.dk \
--to=toke@redhat.com \
--cc=vadimnuclight@gmail.com \
--cc=xdp-newbies@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox