From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Vadim Goncharov <vadimnuclight@gmail.com>, xdp-newbies@vger.kernel.org
Subject: Re: XDP/eBPF map thread safety in kernel (e.g. lookup/delete)
Date: Fri, 25 Oct 2024 13:06:57 +0200 [thread overview]
Message-ID: <87ttd0e90u.fsf@toke.dk> (raw)
In-Reply-To: <20241025001819.2475ec77@nuclight.lan>
Vadim Goncharov <vadimnuclight@gmail.com> writes:
>> > OK, no crash is half of good, thanks. But how'd I lock it from
>> > deletion? A "concurrent updates" are usually about memory area that
>> > still persist as a whole, but in my particular case it's a
>> > bpf_timer callback who does bpf_map_delete_elem(). This is a
>> > conntrack-like map - an entry contains some information about
>> > connection, a struct bpf_timer and expire field when this entry
>> > should be deleted due to inactivity, based on information in a
>> > connection (thus *_LRU map types are not suitable for me). So it's
>> > possible for a race condition here:
>> >
>> > 1. Thread 1 receives packet, does bpf_map_lookup_elem() and begins
>> > processing, at end of which bpf_timer_start() will be called to
>> > reschedule removal into future.
>> >
>> > 2. At moment after lookup, timer callback fires and does
>> > bpf_map_delete_elem() while first thread is not yet finished.
>> >
>> > So how do I prevent connection record loss in such scenario?
>>
>> Yeah, there is no protection against this; you will have to do your
>> own locking to prevent it. Something like putting a spinlock into
>> your data structure and using that to synchronise access, maybe?
>
> Well, if I'll put bpf_spin_lock into map element, then it looks like
> the following scenario is possible:
>
> 1. Thread 1 receives packet, does bpf_map_lookup_elem(map, key)
>
> 1a. Timer callback(map, same_key) does bpf_spin_lock on same_key
>
> 2. Thread 2 does bpf_spin_lock on same_key and waits.
>
> 2a. Timer callback sees lock successful, sets DELETED flag in entry
> and bpf_map_delete_elem()'s it.
>
> 3. Thread 1 unlocked and either updates map entry directly by pointer,
> or reinserts new entry reinitializing timer.
Yeah, something like this.
> Am I missing some race condition still present here? atomics or
> something
Unfortunately, I think the delete and update can still race. Deletion is
based on the key only, there is no cmpxchg() semantics. So I don't see
any reason why this sequence of events couldn't happen:
- T2 sets the DELETED flag
- T2 releases the lock, and then gets preempted
- T1 grabs the lock, sees the deleted flag, allocates a new item and
inserts it with update()
- T2 wakes back up and does a bpf_map_delete() with the expired key,
which then ends up deleting the new entry
The only way I can think of to avoid this, is if the timer callback uses
bpf_map_lookup_and_delete_item(), then rechecks the value and re-inserts
it if the DELETED flag disappeared between operations. But this still
leaves a window where the entry doesn't exist.
Maybe someone else has a good idea for how to avoid this being racy?
-Toke
prev parent reply other threads:[~2024-10-25 11:07 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-23 11:54 XDP/eBPF map thread safety in kernel (e.g. lookup/delete) Vadim Goncharov
2024-10-23 12:10 ` Toke Høiland-Jørgensen
2024-10-23 12:31 ` Vadim Goncharov
[not found] ` <20241023152810.42936dc4@nuclight.lan>
[not found] ` <875xphftdq.fsf@toke.dk>
2024-10-24 21:18 ` Vadim Goncharov
2024-10-25 11:06 ` Toke Høiland-Jørgensen [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttd0e90u.fsf@toke.dk \
--to=toke@redhat.com \
--cc=vadimnuclight@gmail.com \
--cc=xdp-newbies@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.