From: John Fastabend <john.fastabend@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org,
davejwatson@fb.com
Subject: Re: [bpf-next PATCH 05/16] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
Date: Mon, 5 Mar 2018 23:06:01 -0800
Message-ID: <36057f08-dc87-c0f5-591f-859eaa508f2d@gmail.com>
In-Reply-To: <20180306.014242.2009864411917823422.davem@davemloft.net>
On 03/05/2018 10:42 PM, David Miller wrote:
> From: John Fastabend <john.fastabend@gmail.com>
> Date: Mon, 5 Mar 2018 22:22:21 -0800
>
>> All I meant by this is if an application uses sendfile() call
>> there is no good way to know when/if the kernel side will copy or
>> xmit the data. So a reliable user space application will need to
>> only modify the data if it "knows" there are no outstanding sends
>> in-flight. So if we assume applications follow this then it
>> is OK to avoid the copy. Of course this is not good enough for
>> security, but for monitoring/statistics (my use case 1) it works.
>
> For an application implementing a networking file system, it's pretty
> legitimate for file contents to change before the page gets DMA'd to
> the networking card.
>
Still, there are useful BPF programs that can tolerate this, so I
would prefer to allow BPF programs to operate in no-copy mode when
requested. It doesn't have to be the default, as it currently is.
An L7 load balancer is a good example of this.
> And that's perfectly fine, and we everything such that this will work
> properly.
>
> The card checksums what ends up being DMA'd so nothing from the
> networking side is broken.
Assuming the card has checksum support, correct? That is why we have
SKBTX_SHARED_FRAG, checked in skb_has_shared_frag(), and the checksum
helpers that drivers call when they do not support the protocol
being used. So it is probably an OK assumption when using supported
protocols and hardware? Perhaps in general folks just use normal
protocols and hardware, so it works.
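
To make the concern concrete, here is a small user-space sketch (not
kernel code) of why a checksum computed in software before the page
is DMA'd can go stale if the application rewrites the page in the
meantime. inet_csum() is a plain RFC 1071 style checksum;
csum_still_valid() is a hypothetical helper made up for this
illustration and does not correspond to any kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over a byte buffer (RFC 1071 style). */
static uint16_t inet_csum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Sum the buffer as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	/* A trailing odd byte is padded with zero on the right. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical illustration: checksum the page "early" (as a software
 * fallback would), let the application modify one byte, then checksum
 * again (as hardware would at DMA time). Returns 1 if the early
 * checksum still matches the data that would go on the wire.
 */
static int csum_still_valid(uint8_t *page, size_t len, size_t off,
			    uint8_t newval)
{
	uint16_t early;

	assert(off < len);
	early = inet_csum(page, len);
	page[off] = newval;	/* the file contents change under us */
	return early == inet_csum(page, len);
}
```

With hardware checksumming only the second computation happens, so
the mismatch cannot occur; the sketch only models the software
fallback case discussed above.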
>
> So this assumption you mention really does not hold.
>
OK.
> There needs to be some feedback from the BPF program that parses the
> packet. This way it can say, "I need at least X more bytes before I
> can generate a verdict". And you keep copying more and more bytes
> into a linear buffer and calling the parser over and over until it can
> generate a full verdict or you run out of networking data.
>
So the "I need at least X more bytes" feedback is msg_cork_bytes() in
patch 7. I could handle the sendpage case the same way I handle the
sendmsg case and copy the data into the buffer until N bytes are
received. I had planned to add this mode in a follow-up series but
could add it in this series so we have all the pieces in one
submission.

I did use a scatterlist instead of a linear buffer, though. I was
planning to add a helper to pull in the next sg list item if needed
rather than try to allocate a large linear block up front.
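
As a rough user-space model of the "I need at least X more bytes"
loop described above (not the code from the patches), the sketch
below accumulates chunks and only re-runs the parser once the
requested byte count has arrived. Everything here is an assumption
made for illustration: struct cork_buf, parse(), cork_feed(), the
toy 2-byte length-prefixed framing, and the fixed 256-byte linear
buffer, which stands in for the scatterlist used in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum verdict { V_PASS, V_DROP, V_NEED_MORE };

struct cork_buf {
	uint8_t data[256];	/* accumulated bytes (linear for simplicity) */
	size_t len;		/* bytes accumulated so far */
	size_t cork;		/* minimum bytes the parser asked for, 0 = none */
};

/*
 * Hypothetical parser: a 2-byte big-endian payload length followed by
 * the payload. Drops messages whose payload starts with 'X'. When it
 * cannot decide yet, it reports the total bytes it needs via *need.
 */
static enum verdict parse(const uint8_t *d, size_t len, size_t *need)
{
	size_t total;

	if (len < 2) {
		*need = 2;
		return V_NEED_MORE;
	}
	total = 2 + (((size_t)d[0] << 8) | d[1]);
	if (len < total) {
		*need = total;
		return V_NEED_MORE;
	}
	return (total > 2 && d[2] == 'X') ? V_DROP : V_PASS;
}

/* Append one chunk and run the parser only once the cork is satisfied. */
static enum verdict cork_feed(struct cork_buf *cb, const uint8_t *chunk,
			      size_t n)
{
	size_t need = 0;
	enum verdict v;

	assert(cb->len + n <= sizeof(cb->data));
	memcpy(cb->data + cb->len, chunk, n);
	cb->len += n;

	if (cb->len < cb->cork)
		return V_NEED_MORE;	/* still corked, skip the parser */

	v = parse(cb->data, cb->len, &need);
	cb->cork = (v == V_NEED_MORE) ? need : 0;
	return v;
}
```

A caller would zero a struct cork_buf, pass each sendmsg/sendpage
chunk to cork_feed(), and act on the first result that is not
V_NEED_MORE. Skipping the parser while corked is the point of the
cork: the program is not invoked again until the bytes it asked for
are available.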