From: Petar Penkov <ppenkov@google.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Petar Penkov <peterpenkov96@gmail.com>,
Networking <netdev@vger.kernel.org>,
"David S . Miller" <davem@davemloft.net>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
simon.horman@netronome.com, Willem de Bruijn <willemb@google.com>
Subject: Re: [bpf-next RFC 0/3] Introduce eBPF flow dissector
Date: Tue, 21 Aug 2018 17:19:26 -0700 [thread overview]
Message-ID: <CAG4SDVVPc=uRev7kpo7bAJ2OU5AJHGHdkpJnN+VG4c_XCD4jJQ@mail.gmail.com> (raw)
In-Reply-To: <20180820205205.jj7e75pulwqkorpr@ast-mbp>
On Mon, Aug 20, 2018 at 1:52 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Thu, Aug 16, 2018 at 09:44:20AM -0700, Petar Penkov wrote:
>> From: Petar Penkov <ppenkov@google.com>
>>
>> This patch series hardens the RX stack by allowing flow dissection in BPF,
>> as previously discussed [1]. Because of the rigorous checks of the BPF
>> verifier, this provides significant security guarantees. In particular, the
>> BPF flow dissector cannot get inside of an infinite loop, as with
>> CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
>> read outside of packet bounds, because all memory accesses are checked.
>> Also, with BPF the administrator can decide which protocols to support,
>> reducing potential attack surface. Rarely encountered protocols can be
>> excluded from dissection and the program can be updated without kernel
>> recompile or reboot if a bug is discovered.
>>
>> Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
>> This includes a new BPF program and attach type.
>>
>> Patch 2 adds a flow dissector program in BPF. This parses most protocols in
>> __skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
>> and address types).
>>
>> Patch 3 adds a selftest that attaches the BPF program to the flow dissector
>> and sends traffic with different levels of encapsulation.
>
> Overall I fully support the direction. Few things to consider:
>
>> This RFC patchset exposes a few design considerations:
>>
>> 1/ Because the flow dissector key definitions live in
>> include/linux/net/flow_dissector.h, they are not visible from userspace,
>> and the flow keys definitions need to be copied in the BPF program.
>
> I don't think copy-paste avoids the issue of uapi.
> Anything used by BPF program is uapi.
> The only exception is offsets of kernel internal structures
> passed into bpf_probe_read().
> So we have several options:
> 1. be honest and say 'struct flow_dissect_key*' is now uapi
> 2. wrap all of them into 'struct bpf_flow_dissect_key*' and do rewrites
> when/if 'struct flow_dissect_key*' changes
> 3. wait for BTF to solve it for tracing use case and for this one two.
> The idea is that kernel internal structs can be defined in bpf prog
> and since they will be described precisely in BTF that comes with the prog
> the kernel can validate that prog's BTF matches what kernel thinks it has.
> imo that's the most flexible, but BTF for all of vmlinux won't be ready
> tomorrow and looks like this patch set is ready to go, so I would go with 1 or 2.
I will pursue #2 then as both you and David are in support of it.
>
>> 2/ An alternative to adding a new hook would have been to attach flow
>> dissection programs at the XDP hook. Because this hook is executed before
>> GRO, it would have to execute on every MSS, which would be more
>> computationally expensive. Furthermore, the XDP hook is executed before an
>> SKB has been allocated and there is no clear way to move the dissected keys
>> into the SKB after it has been allocated. Eventually, perhaps a single pass
>> can implement both GRO and flow dissection -- but napi_gro_cb shows that a
>> lot more flow state would need to be parsed for this.
>
> global flow_dissect bpf hook semantics are problematic for testing.
> like patch 3 test affects the whole system including all containers.
> should the hook be per-netns or may be per netdevice?
>
Having the hook be per-netns would definitely be cleaner for testing.
I will look into
refactoring the hook from global to per-netns.
>> 3/ The BPF program cannot use direct packet access everywhere because it
>> uses an offset, initially supplied by the flow dissector. Because the
>> initial value of this non-constant offset comes from outside of the
>> program, the verifier does not know what its value is, and it cannot verify
>> that it is within packet bounds. Therefore, direct packet access programs
>> get rejected.
>
> this part doesn't seem to match the code.
> direct packet access is allowed and usable even for fragmented skbs.
> in such case only linear part of skb is in "direct access".
I am not sure I understand. What I meant was that I use bpf_skb_load_bytes
rather than direct packet access because the offset at which I read headers,
nhoff, depends on an initial value that cannot be statically verified - namely
what __skb_flow_dissect provides. Is there an alternative approach I should
be taking here, and/or am I misunderstanding direct access?
>
> Last bit, I'm curious about, how the 'demo' flow dissector program
> from patch 2 fairs vs in-kernel dissector when performance tested?
>
I used the test in patch 3 to compare the two implementations and the
difference seemed to be small, with the BPF flow dissector being slightly
slower.
Thank you so much for your feedback, Alexei and David!
next prev parent reply other threads:[~2018-08-22 3:41 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-16 16:44 [bpf-next RFC 0/3] Introduce eBPF flow dissector Petar Penkov
2018-08-16 16:44 ` [bpf-next RFC 1/3] flow_dissector: implements flow dissector BPF hook Petar Penkov
2018-08-16 18:34 ` Edward Cree
2018-08-16 22:37 ` Willem de Bruijn
2018-08-16 22:40 ` Song Liu
2018-08-16 23:14 ` Petar Penkov
2018-08-20 5:44 ` Song Liu
2018-08-20 14:13 ` Willem de Bruijn
2018-08-20 14:20 ` Daniel Borkmann
2018-08-20 16:04 ` Song Liu
2018-08-16 16:44 ` [bpf-next RFC 2/3] flow_dissector: implements eBPF parser Petar Penkov
2018-08-18 15:50 ` Tom Herbert
2018-08-18 19:49 ` Willem de Bruijn
2018-08-16 16:44 ` [bpf-next RFC 3/3] selftests/bpf: test bpf flow dissection Petar Penkov
2018-08-20 20:52 ` [bpf-next RFC 0/3] Introduce eBPF flow dissector Alexei Starovoitov
2018-08-21 2:24 ` David Miller
2018-08-22 0:19 ` Petar Penkov [this message]
2018-08-22 7:22 ` Daniel Borkmann
2018-08-22 7:28 ` Daniel Borkmann
2018-08-23 0:10 ` Petar Penkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAG4SDVVPc=uRev7kpo7bAJ2OU5AJHGHdkpJnN+VG4c_XCD4jJQ@mail.gmail.com' \
--to=ppenkov@google.com \
--cc=alexei.starovoitov@gmail.com \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=peterpenkov96@gmail.com \
--cc=simon.horman@netronome.com \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).