From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from shards.monkeyblade.net ([184.105.139.130]:39492 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750797AbeCFPrk (ORCPT ); Tue, 6 Mar 2018 10:47:40 -0500 Date: Tue, 06 Mar 2018 10:47:35 -0500 (EST) Message-Id: <20180306.104735.201608464583585143.davem@davemloft.net> To: john.fastabend@gmail.com Cc: ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org, davejwatson@fb.com Subject: Re: [bpf-next PATCH 05/16] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data From: David Miller In-Reply-To: <36057f08-dc87-c0f5-591f-859eaa508f2d@gmail.com> References: <9c82ae47-59a1-f140-9d10-31e691fb3c51@gmail.com> <20180306.014242.2009864411917823422.davem@davemloft.net> <36057f08-dc87-c0f5-591f-859eaa508f2d@gmail.com> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: netdev-owner@vger.kernel.org List-ID: From: John Fastabend Date: Mon, 5 Mar 2018 23:06:01 -0800 > On 03/05/2018 10:42 PM, David Miller wrote: >> From: John Fastabend >> Date: Mon, 5 Mar 2018 22:22:21 -0800 >> >>> All I meant by this is if an application uses sendfile() call >>> there is no good way to know when/if the kernel side will copy or >>> xmit the data. So a reliable user space application will need to >>> only modify the data if it "knows" there are no outstanding sends >>> in-flight. So if we assume applications follow this then it >>> is OK to avoid the copy. Of course this is not good enough for >>> security, but for monitoring/statistics (my use case 1 it works). >> >> For an application implementing a networking file system, it's pretty >> legitimate for file contents to change before the page gets DMA's to >> the networking card. >> > > Still there are useful BPF programs that can tolerate this. So I > would prefer to allow BPF programs to operate in the no-copy mode > if wanted. It doesn't have to be the default though as it currently > is. A l7 load balancer is a good example of this. Maybe I'd be ok if it were not the default. But do you really want to expose a potential attack vector, even if the app gets to choose and say "I'm ok"? >> And that's perfectly fine, and we everything such that this will work >> properly. >> >> The card checksums what ends up being DMA'd so nothing from the >> networking side is broken. > > Assuming the card has checksum support correct? Which is why we have > the SKBTX_SHARED_FRAG checked in skb_has_shared_frag() and the checksum > helpers called by the drivers when they do not support the protocol > being used. So probably OK assumption if using supported protocols and > hardware? Perhaps in general folks just use normal protocols and > hardware so it works. If the hardware doesn't support the checksums, we linearize the SKB (therefore obtain a snapshot of the data), and checksum. Exactly what would happen if the hardware did the checksum. So OK in that case too. We always guarantee that you will always get a correct checksum on outgoing packets, even if you modify the page contents meanwhile. > So the "I need at least X more bytes" is the msg_cork_bytes() in patch > 7. I could handle the sendpage case the same as I handle the sendmsg > case and copy the data into the buffer until N bytes are received. I > had planned to add this mode in a follow up series but could add it in > this series so we have all the pieces in one submission. > > Although I used a scatterlist instead of a linear buffer. I was > planning to add a helper to pull in next sg list item if needed > rather than try to allocate a large linear block up front. For non-deep packet inspection cases this re-running of the parser case will probably not trigger at all.