From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: Yan Vugenfirer <yan@daynix.com>, Jason Wang <jasowang@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
Andrew Melnychenko <andrew@daynix.com>,
qemu-devel@nongnu.org
Subject: Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net
Date: Tue, 3 Nov 2020 11:56:02 +0000
Message-ID: <20201103115602.GI205187@redhat.com>
In-Reply-To: <CAOEp5Oe3btwgPcOA6v=kK9s2to=x2Hg6Qw2iCFXOOWZs49s=-Q@mail.gmail.com>
On Tue, Nov 03, 2020 at 12:32:43PM +0200, Yuri Benditovich wrote:
> On Tue, Nov 3, 2020 at 11:02 AM Jason Wang <jasowang@redhat.com> wrote:
>
> >
> > On 2020/11/3 上午2:51, Andrew Melnychenko wrote:
> > > Basic idea is to use eBPF to calculate and steer packets in TAP.
> > > RSS (Receive Side Scaling) is used to distribute network packets to guest
> > > virtqueues by calculating packet hash.
> > > eBPF RSS allows us to use RSS with vhost TAP.
> > >
> > > This set of patches introduces the usage of eBPF for packet steering
> > > and RSS hash calculation:
> > > * RSS(Receive Side Scaling) is used to distribute network packets to
> > > guest virtqueues by calculating packet hash
> > > * eBPF RSS is supposed to be faster than the already existing 'software'
> > > implementation in QEMU
> > > * Additionally adding support for the usage of RSS with vhost
> > >
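To make the RSS description above concrete, here is a minimal sketch of the Toeplitz hash that RSS implementations commonly use, plus a queue lookup through an indirection table. It is not taken from the patches: the names toeplitz_hash() and rss_pick_queue() are hypothetical, and the power-of-two table size is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz hash: input bits are consumed MSB first. For every set bit the
 * current 32-bit window of the key is XORed into the result, then the
 * window slides one bit along the key.
 * Requires key_len >= data_len + 4 so the window never runs off the key.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                              const uint8_t *data, size_t data_len)
{
    uint32_t result = 0;
    uint32_t window;

    assert(key_len >= data_len + 4);
    window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
             ((uint32_t)key[2] << 8) | (uint32_t)key[3];

    for (size_t i = 0; i < data_len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit)) {
                result ^= window;
            }
            /* Slide the window: shift in the next bit of the key. */
            window = (window << 1) | ((key[i + 4] >> bit) & 1u);
        }
    }
    return result;
}

/*
 * Hypothetical helper: map a hash to a receive queue through an
 * indirection table. Assumes table_len is a non-zero power of two.
 */
static uint16_t rss_pick_queue(uint32_t hash, const uint16_t *table,
                               size_t table_len)
{
    assert(table_len != 0 && (table_len & (table_len - 1)) == 0);
    return table[hash & (table_len - 1)];
}
```

For IPv4/TCP the hash input is the source address, destination address, source port and destination port concatenated in network byte order. With the 40-byte key from Microsoft's published RSS verification suite, the flow 66.9.149.187:2794 -> 161.142.100.80:1766 hashes to 0x51ccc178, which is a convenient sanity check for any implementation, eBPF or otherwise.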
> > > Supported kernels: 5.8+
> > >
> > > Implementation notes:
> > > Linux TAP TUNSETSTEERINGEBPF ioctl was used to set the eBPF program.
> > > Added eBPF support to qemu directly through a system call, see the
> > > bpf(2) for details.
> > > The eBPF program is part of the qemu and presented as an array of bpf
> > > instructions.
> > > The program can be recompiled with the provided Makefile.ebpf (need to
> > > adjust 'linuxhdrs'),
> > > although it's not required to build QEMU with eBPF support.
> > > Added changes to virtio-net and vhost; eBPF RSS is used primarily.
> > > 'Software' RSS is used in the case of hash population and as a fallback
> > > option.
> > > For vhost, the hash population feature is not reported to the guest.
> > >
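As a rough illustration of the mechanism in the implementation notes above, the sketch below attaches an already loaded eBPF program to a TAP device with the TUNSETSTEERINGEBPF ioctl and picks the software path when that is not possible. It is not the code from the patches: tap_set_steering_ebpf(), rss_select_backend() and the rss_backend enum are hypothetical names, and prog_fd is assumed to come from a prior BPF_PROG_LOAD via bpf(2), which is out of scope here.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/if_tun.h>   /* TUNSETSTEERINGEBPF */

/*
 * Hypothetical helper: install the eBPF program referred to by prog_fd as
 * the steering program of the TAP device open on tap_fd. The ioctl takes
 * a pointer to the program fd. Returns 0 on success or a negative errno.
 */
static int tap_set_steering_ebpf(int tap_fd, int prog_fd)
{
    if (ioctl(tap_fd, TUNSETSTEERINGEBPF, &prog_fd) < 0) {
        return -errno;
    }
    return 0;
}

/* Hypothetical backend choice, for illustration only. */
enum rss_backend {
    RSS_BACKEND_EBPF,
    RSS_BACKEND_SOFTWARE,
};

/*
 * Prefer eBPF steering; use software RSS if the program could not be
 * loaded (prog_fd < 0, e.g. old kernel or missing privileges) or if the
 * TAP device rejects it.
 */
static enum rss_backend rss_select_backend(int tap_fd, int prog_fd)
{
    if (prog_fd < 0) {
        return RSS_BACKEND_SOFTWARE;
    }
    if (tap_set_steering_ebpf(tap_fd, prog_fd) != 0) {
        return RSS_BACKEND_SOFTWARE;
    }
    return RSS_BACKEND_EBPF;
}
```

Keeping the decision in one place means the caller only needs to know which backend ended up active, whatever the reason for the fallback was.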
> > > Please also see the documentation in PATCH 6/6.
> > >
> > > I am sending those patches as RFC to initiate the discussions and get
> > > feedback on the following points:
> > > * Fallback when eBPF is not supported by the kernel
> >
> >
> > Yes, and it could also be a lack of CAP_BPF.
> >
> >
> > > * Live migration to the kernel that doesn't have eBPF support
> >
> >
> > Is there anything that needs special treatment here?
> >
> Possible case: rss=on, vhost=on, source system with kernel 5.8 (everything
> works) -> dest. system 5.6 (bpf does not work), the adapter functions, but
> all the steering does not use proper queues.
>
>
>
>
> >
> > > * Integration with current QEMU build
> >
> >
> > Yes, a question here:
> >
> > 1) Any reason for not using libbpf? E.g. it is already shipped by some
> > distros
> >
>
> We intentionally do not use libbpf, as it is present only on some distros.
> We can switch to libbpf, but this will disable bpf if libbpf is not
> installed.
If we were modifying existing functionality then introducing a dep on
libbpf would be a problem as you'd be breaking existing QEMU users
on distros without libbpf.

This is brand new functionality though, so it is fine to place a
requirement on libbpf. If distros don't ship that library and they
want BPF features in QEMU, then those distros should take responsibility
for adding libbpf to their package set.
> > 2) It would be better if we can avoid shipping bytecodes
> >
>
>
> This creates new dependencies: llvm + clang + ...
> We would prefer byte code and the ability to generate it if the
> prerequisites are installed.
I've double checked with Fedora, and generating the BPF program from
source is a mandatory requirement for QEMU. Pre-generated BPF bytecode
is not permitted.

There was also a question raised about the kernel ABI compatibility
for BPF programs?
https://lwn.net/Articles/831402/
"The basic problem is that when BPF is compiled, it uses a set
of kernel headers that describe various kernel data structures
for that particular version, which may be different from those
on the kernel where the program is run. Until relatively recently,
that was solved by distributing the BPF as C code along with the
Clang compiler to build the BPF on the system where it was going
to be run."
Is this not an issue for QEMU's usage of BPF here?

The dependency on llvm is unfortunate for people who build with GCC,
but at least they can opt out via a configure switch if they really
want to. As that LWN article notes, GCC will gain BPF support.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|