netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@fomichev.me>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Network Development <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	simon.horman@netronome.com, Willem de Bruijn <willemb@google.com>
Subject: Re: [RFC bpf-next 0/7] net: flow_dissector: trigger BPF hook when called from eth_get_headlen
Date: Tue, 5 Feb 2019 21:49:46 -0800	[thread overview]
Message-ID: <20190206054946.GH10769@mini-arch> (raw)
In-Reply-To: <20190206041112.h3ccahimp2uerdfk@ast-mbp>

On 02/05, Alexei Starovoitov wrote:
> On Tue, Feb 05, 2019 at 07:56:19PM -0800, Stanislav Fomichev wrote:
> > On 02/05, Alexei Starovoitov wrote:
> > > On Tue, Feb 05, 2019 at 04:59:31PM -0800, Stanislav Fomichev wrote:
> > > > On 02/05, Alexei Starovoitov wrote:
> > > > > On Tue, Feb 05, 2019 at 12:40:03PM -0800, Stanislav Fomichev wrote:
> > > > > > On 02/05, Willem de Bruijn wrote:
> > > > > > > On Tue, Feb 5, 2019 at 12:57 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > > > >
> > > > > > > > Currently, when eth_get_headlen calls flow dissector, it doesn't pass any
> > > > > > > > skb. Because we use passed skb to lookup associated networking namespace
> > > > > > > > to find whether we have a BPF program attached or not, we always use
> > > > > > > > C-based flow dissector in this case.
> > > > > > > >
> > > > > > > > The goal of this patch series is to add new networking namespace argument
> > > > > > > > to the eth_get_headlen and make BPF flow dissector programs be able to
> > > > > > > > work in the skb-less case.
> > > > > > > >
> > > > > > > > The series goes like this:
> > > > > > > > 1. introduce __init_skb and __init_skb_shinfo; those will be used to
> > > > > > > >    initialize temporary skb
> > > > > > > > 2. introduce skb_net which can be used to get networking namespace
> > > > > > > >    associated with an skb
> > > > > > > > 3. add new optional network namespace argument to __skb_flow_dissect and
> > > > > > > >    plumb through the callers
> > > > > > > > 4. add new __flow_bpf_dissect which constructs temporary on-stack skb
> > > > > > > >    (using __init_skb) and calls BPF flow dissector program
> > > > > > > 
> > > > > > > The main concern I see with this series is this cost of skb zeroing
> > > > > > > for every packet in the device driver receive routine, *independent*
> > > > > > > from the real skb allocation and zeroing which will likely happen
> > > > > > > later.
> > > > > > Yes, plus ~200 bytes on the stack for the callers.
> > > > > > 
> > > > > > Not sure how visible this zeroing though, I can probably try to get some
> > > > > > numbers from BPF_PROG_TEST_RUN (running current version vs running with
> > > > > > on-stack skb).
> > > > > 
> > > > > imo extra 256 byte memset for every packet is non starter.
> > > > We can put pre-allocated/initialized skbs without data into percpu or even
> > > > use pcpu_freelist_pop/pcpu_freelist_push to make sure we don't have to think
> > > > about having multiple percpu for irq/softirq/process contexts.
> > > > Any concerns with that approach?
> > > > Any other possible concerns with the overall series?
> > > 
> > > I'm missing why the whole thing is needed.
> > > You're saying:
> > > " make BPF flow dissector programs be able to work in the skb-less case".
> > > What does it mean specifically?
> > > The only non-skb case is XDP.
> > > Are you saying you want flow_dissector prog to be run in XDP?
> > eth_get_headlen that drivers call on RX path on a chunk of data to
> > guesstimate the length of the headers calls flow dissector without an skb
> > (__skb_flow_dissect was a weird interface where it accepts skb or
> > data+len). Right now, there is no way to trigger BPF flow dissector
> > for this case (we don't have an skb to get associated namespace/etc/etc).
> > The patch series tries to fix that to make sure that we always trigger
> > BPF program if it's attached to a device's namespace.
> 
> then why not to create flow_dissector prog type that works without skb?
> Why do you need to fake an skb?
> XDP progs work just fine without it.
What's the advantage of having another prog type? In this case we would have
to write the same flow dissector program twice: first time against __skb_buff
interface, second time against xdp_md.
By using fake skb, we make the same flow dissector __sk_buff BPF program
work in both contexts without a rewrite to an xdp interface (I don't
think users should care whether flow dissector was called form "xdp" vs skb
context; and we're sort of stuck with __sk_buff interface already).

  reply	other threads:[~2019-02-06  5:49 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-05 17:36 [RFC bpf-next 0/7] net: flow_dissector: trigger BPF hook when called from eth_get_headlen Stanislav Fomichev
2019-02-05 17:36 ` [RFC bpf-next 1/7] net: introduce __init_skb and __init_skb_shinfo helpers Stanislav Fomichev
2019-02-05 20:18   ` Willem de Bruijn
2019-02-05 17:36 ` [RFC bpf-next 2/7] net: introduce skb_net helper Stanislav Fomichev
2019-02-05 20:19   ` Willem de Bruijn
2019-02-05 20:42     ` Stanislav Fomichev
2019-02-05 17:36 ` [RFC bpf-next 3/7] net: plumb network namespace into __skb_flow_dissect Stanislav Fomichev
2019-02-05 20:19   ` Willem de Bruijn
2019-02-05 20:40     ` Stanislav Fomichev
2019-02-05 17:36 ` [RFC bpf-next 4/7] net: flow_dissector: handle no-skb use case Stanislav Fomichev
2019-02-05 20:19   ` Willem de Bruijn
2019-02-05 20:45     ` Stanislav Fomichev
2019-02-05 17:36 ` [RFC bpf-next 5/7] bpf: when doing BPF_PROG_TEST_RUN for flow dissector use no-skb mode Stanislav Fomichev
2019-02-05 20:19   ` Willem de Bruijn
2019-02-05 17:36 ` [RFC bpf-next 6/7] selftests/bpf: add flow dissector bpf_skb_load_bytes helper test Stanislav Fomichev
2019-02-05 17:36 ` [RFC bpf-next 7/7] net: flow_dissector: pass net argument to the eth_get_headlen Stanislav Fomichev
2019-02-05 20:18 ` [RFC bpf-next 0/7] net: flow_dissector: trigger BPF hook when called from eth_get_headlen Willem de Bruijn
2019-02-05 20:40   ` Stanislav Fomichev
2019-02-06  0:47     ` Alexei Starovoitov
2019-02-06  0:59       ` Stanislav Fomichev
2019-02-06  3:12         ` Alexei Starovoitov
2019-02-06  3:56           ` Stanislav Fomichev
2019-02-06  4:11             ` Alexei Starovoitov
2019-02-06  5:49               ` Stanislav Fomichev [this message]
2019-02-12 17:02                 ` Stanislav Fomichev
2019-02-14  4:39                   ` Alexei Starovoitov
2019-02-14  5:57                     ` Stanislav Fomichev
2019-02-14  6:38                       ` Alexei Starovoitov
2019-02-14 17:35                         ` Stanislav Fomichev
2019-02-25 20:33                           ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190206054946.GH10769@mini-arch \
    --to=sdf@fomichev.me \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=simon.horman@netronome.com \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).