From: Francis Laniel <flaniel@linux.microsoft.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
Joanne Koong <joannelkoong@gmail.com>,
Dave Marchevsky <davemarchevsky@fb.com>,
Lorenzo Bianconi <lorenzo@kernel.org>,
Geliang Tang <geliang.tang@suse.com>,
Hengqi Chen <hengqi.chen@gmail.com>
Subject: Re: [RFC PATCH v1 1/3] bpf: Make ring buffer overwritable.
Date: Tue, 16 Aug 2022 12:23:41 +0200
Message-ID: <1735233.VLH7GnMWUR@pwmachine>
In-Reply-To: <CAEf4BzYex03T7aYjLnbkfHb8vUsCHhj_DiMU6KbK29F+DyhXyA@mail.gmail.com>
Hi.
On Monday, 15 August 2022, 23:52:22 CEST, Andrii Nakryiko wrote:
> On Wed, Aug 10, 2022 at 10:18 AM Francis Laniel
> <flaniel@linux.microsoft.com> wrote:
> > By default, BPF ring buffers are size-bounded: once producers have
> > filled the buffer, they must wait for the consumer to read that data
> > before adding more.
> > In terms of API, bpf_ringbuf_reserve() returns NULL if the buffer is full.
> >
> > This patch makes it possible to make a BPF ring buffer overwritable.
> > Once producers have written as much data as the buffer can hold, they
> > begin to overwrite existing data, so the oldest data is replaced.
> > As a result, bpf_ringbuf_reserve() never returns NULL.
>
> Part of BPF ringbuf record (first 8 bytes) stores information like
> record size and offset in pages to the beginning of ringbuf map
> metadata. This is used by consumer to know how much data belongs to
> data record, but also for making sure that
> bpf_ringbuf_reserve()/bpf_ringbuf_submit() work correctly and don't
> corrupt kernel memory.
>
> If we simply allow overwriting this information (and no, spinlock
> doesn't protect from that, you can have multiple producers writing to
> different parts of ringbuf data area in parallel after "reserving"
> their respective records), it completely breaks any sort of
> correctness, both for user-space consumer and kernel-side producers.
Thank you for your answer.
My current implementation is indeed wrong, as I based it on the incorrect
assumption that a BPF ring buffer could only store records of the same size.
With records of different sizes, we run into exactly the problems you
described.
I will rework my patches and send a new version once it is polished, but I
cannot give an ETA.
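To make the failure mode concrete, here is a minimal user-space sketch (not
kernel code). It assumes only the 8-byte record header layout of struct
bpf_ringbuf_hdr (a u32 length followed by a u32 page offset) and 8-byte record
alignment. The buffer size, the helper names (write_record, read_len,
demo_overwrite) and the record sizes are made up for illustration, and the
flag bits the kernel keeps in the length field are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the 8-byte header of struct bpf_ringbuf_hdr. */
struct rec_hdr {
	uint32_t len;    /* payload length (kernel flag bits omitted here) */
	uint32_t pg_off; /* offset in pages back to the ringbuf metadata */
};

#define HDR_SZ ((uint32_t)sizeof(struct rec_hdr))
#define BUF_SZ 64u /* toy data area size, hypothetical */

/* Records are 8-byte aligned, so payload sizes are rounded up. */
uint32_t round_up8(uint32_t n)
{
	return (n + 7u) & ~7u;
}

/* Hypothetical helper: write one record at pos, return the next position. */
uint32_t write_record(uint8_t *buf, uint32_t pos, uint32_t len, uint8_t fill)
{
	struct rec_hdr hdr = { .len = len, .pg_off = 0 };

	/* This sketch does not handle wrap-around inside a record. */
	assert(pos + HDR_SZ + round_up8(len) <= BUF_SZ);
	memcpy(buf + pos, &hdr, HDR_SZ);
	memset(buf + pos + HDR_SZ, fill, len);
	return pos + HDR_SZ + round_up8(len);
}

/* The length a consumer would read from the header located at pos. */
uint32_t read_len(const uint8_t *buf, uint32_t pos)
{
	struct rec_hdr hdr;

	memcpy(&hdr, buf + pos, HDR_SZ);
	return hdr.len;
}

/*
 * Write records A and B (8 bytes each), then let a producer wrap around and
 * overwrite from offset 0 with a record of new_len bytes. Return the length
 * a consumer positioned on B would then read from B's header.
 */
uint32_t demo_overwrite(uint32_t new_len)
{
	uint8_t buf[BUF_SZ] = { 0 };
	uint32_t b_pos = write_record(buf, 0, 8, 0xAA); /* A: header at 0 */

	write_record(buf, b_pos, 8, 0xBB);              /* B: header at 16 */
	write_record(buf, 0, new_len, 0xCC);            /* overwrite from 0 */
	return read_len(buf, b_pos);
}
```

With same-size records, demo_overwrite(8) still returns 8, because the new
record lands exactly on top of A. With demo_overwrite(24), the new payload
covers B's header, so the consumer reads 0xCCCCCCCC as a length. This is the
same-size assumption my patch silently relied on.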
> > Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
> > ---
> >
> > include/uapi/linux/bpf.h | 3 +++
> > kernel/bpf/ringbuf.c | 51 +++++++++++++++++++++++++++++++---------
> > 2 files changed, 43 insertions(+), 11 deletions(-)
>
> [...]
Best regards.
Thread overview: 8+ messages
2022-08-10 17:16 [RFC PATCH v1 0/3] Make BPF ring buffer over writable Francis Laniel
2022-08-10 17:16 ` [RFC PATCH v1 1/3] bpf: Make ring buffer overwritable Francis Laniel
2022-08-15 21:52 ` Andrii Nakryiko
2022-08-16 10:23 ` Francis Laniel [this message]
2022-08-16 12:28 ` Alban Crequy
2022-08-10 17:16 ` [RFC PATCH v1 2/3] do not merge: Temporary fix for is_power_of_2 Francis Laniel
2022-08-10 17:16 ` [RFC PATCH v1 3/3] libbpf: Make bpf ring buffer overwritable Francis Laniel
2022-08-10 17:16 ` [PATCH] for test purpose only: Add toy to play with BPF ring buffer Francis Laniel