public inbox for xdp-newbies@vger.kernel.org
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Vadim Goncharov <vadimnuclight@gmail.com>, xdp-newbies@vger.kernel.org
Subject: Re: XDP/eBPF map thread safety in kernel (e.g. lookup/delete)
Date: Fri, 25 Oct 2024 13:06:57 +0200	[thread overview]
Message-ID: <87ttd0e90u.fsf@toke.dk> (raw)
In-Reply-To: <20241025001819.2475ec77@nuclight.lan>

Vadim Goncharov <vadimnuclight@gmail.com> writes:

>> > OK, no crash is half of good, thanks. But how'd I lock it from
>> > deletion? A "concurrent updates" are usually about memory area that
>> > still persist as a whole, but in my particular case it's a
>> > bpf_timer callback who does bpf_map_delete_elem(). This is a
>> > conntrack-like map - an entry contains some information about
>> > connection, a struct bpf_timer and expire field when this entry
>> > should be deleted due to inactivity, based on information in a
>> > connection (thus *_LRU map types are not suitable for me). So it's
>> > possible for a race condition here:
>> >
>> > 1.  Thread 1 receives packet, does bpf_map_lookup_elem() and begins
>> >     processing, at end of which bpf_timer_start() will be called to
>> >     reschedule removal into future.
>> >
>> > 2.  At moment after lookup, timer callback fires and does
>> >     bpf_map_delete_elem() while first thread is not yet finished.
>> >
>> > So how do I prevent connection record loss in such scenario?  
>> 
>> Yeah, there is no protection against this; you will have to do your
>> own locking to prevent it. Something like putting a spinlock into
>> your data structure and using that to synchronise access, maybe?
>
> Well, if I'll put bpf_spin_lock into map element, then it looks like
> the following scenario is possible:
>
> 1. Thread 1 receives packet, does bpf_map_lookup_elem(map, key)
>
>      1a. Timer callback(map, same_key) does bpf_spin_lock on same_key
>
> 2. Thread 2 does bpf_spin_lock on same_key and waits.
>
>      2a. Timer callback sees lock successful, sets DELETED flag in entry
>          and bpf_map_delete_elem()'s it.
>
> 3. Thread 1 unlocked and either updates map entry directly by pointer,
>     or reinserts new entry reinitializing timer.

Yeah, something like this.

> Am I missing some race condition still present here? atomics or
> something

Unfortunately, I think the delete and update can still race. Deletion is
based on the key only, there is no cmpxchg() semantics. So I don't see
any reason why this sequence of events couldn't happen:

- T2 sets the DELETED flag
- T2 releases the lock, and then gets preempted
- T1 grabs the lock, sees the deleted flag, allocates a new item and
  inserts it with update()
- T2 wakes back up and does a bpf_map_delete() with the expired key,
  which then ends up deleting the new entry

The only way I can think of to avoid this, is if the timer callback uses
bpf_map_lookup_and_delete_item(), then rechecks the value and re-inserts
it if the DELETED flag disappeared between operations. But this still
leaves a window where the entry doesn't exist.

Maybe someone else has a good idea for how to avoid this being racy?

-Toke


      reply	other threads:[~2024-10-25 11:07 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-23 11:54 XDP/eBPF map thread safety in kernel (e.g. lookup/delete) Vadim Goncharov
2024-10-23 12:10 ` Toke Høiland-Jørgensen
2024-10-23 12:31   ` Vadim Goncharov
     [not found]   ` <20241023152810.42936dc4@nuclight.lan>
     [not found]     ` <875xphftdq.fsf@toke.dk>
2024-10-24 21:18       ` Vadim Goncharov
2024-10-25 11:06         ` Toke Høiland-Jørgensen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ttd0e90u.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=vadimnuclight@gmail.com \
    --cc=xdp-newbies@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox