From: John Fastabend <john.fastabend@gmail.com>
To: Yonghong Song <yhs@meta.com>,
John Fastabend <john.fastabend@gmail.com>,
hawk@kernel.org, daniel@iogearbox.net, kuba@kernel.org,
davem@davemloft.net, ast@kernel.org
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, sdf@google.com
Subject: Re: [1/2 bpf-next] bpf: expose net_device from xdp for metadata
Date: Sun, 13 Nov 2022 10:27:06 -0800
Message-ID: <637136faa95e5_2c136208dc@john.notmuch>
In-Reply-To: <86af974c-a970-863f-53f5-c57ebba9754e@meta.com>
Yonghong Song wrote:
>
>
> On 11/10/22 3:11 PM, John Fastabend wrote:
> > John Fastabend wrote:
> >> Yonghong Song wrote:
> >>>
> >>>
> >>> On 11/9/22 6:17 PM, John Fastabend wrote:
> >>>> Yonghong Song wrote:
> >>>>>
> >>>>>
> >>>>> On 11/9/22 1:52 PM, John Fastabend wrote:
> >>>>>> Allow xdp progs to read the net_device structure. It's useful to extract
> >>>>>> info from the dev itself. Currently, our tracing tooling uses kprobes
> >>>>>> to capture statistics and information about running net devices. We use
> >>>>>> kprobes instead of other hooks tc/xdp because we need to collect
> >>>>>> information about the interface not exposed through the xdp_md structures.
> >>>>>> This has some down sides that we want to avoid by moving these into the
> >>>>>> XDP hook itself. First, placing the kprobes in a generic function in
> >>>>>> the kernel is after XDP so we miss redirects and such done by the
> >>>>>> XDP networking program. And it's needless overhead because we are
> >>>>>> already paying the cost for calling the XDP program, calling yet
> >>>>>> another prog is a waste. Better to do everything in one hook from
> >>>>>> performance side.
> >>>>>>
> >>>>>> Of course we could one-off each one of these fields, but that would
> >>>>>> explode the xdp_md struct and then require writing convert_ctx_access
> >>>>>> writers for each field. By using BTF we avoid writing field specific
> >>>>>> conversion logic, BTF just knows how to read the fields, we don't
> >>>>>> have to add many fields to xdp_md, and I don't have to get every
> >>>>>> field we will use in the future correct.
> >>>>>>
> >>>>>> For reference current examples in our code base use the ifindex,
> >>>>>> ifname, qdisc stats, net_ns fields, among others. With this
> >>>>>> patch we can now do the following,
> >>>>>>
> >>>>>>   dev = ctx->rx_dev;
> >>>>>>   net = dev->nd_net.net;
> >>>>>>
> >>>>>>   uid.ifindex = dev->ifindex;
> >>>>>>   memcpy(uid.ifname, dev->ifname, NAME);
> >>>>>>   if (net)
> >>>>>>     uid.inum = net->ns.inum;
> >>>>>>
> >>>>>> to report the name, index and ns.inum which identifies an
> >>>>>> interface in our system.
> >>>>>
[...]
> >> Yep.
> >>
> >> I'm fine doing it with bpf_get_kern_ctx() did you want me to code it
> >> the rest of the way up and test it?
> >>
> >> .John
> >
> > Related I think. We also want to get kernel variable net_namespace_list,
> > this points to the network namespace lists. Based on above should
> > we do something like,
> >
> > void *bpf_get_kern_var(enum var_id);
> >
> > then,
> >
> > net_ns_list = bpf_get_kern_var(__btf_net_namespace_list);
> >
> > would get us a ptr to the list? The other thought was to put it in the
> > xdp_md but from above seems better idea to get it through helper.
>
> Sounds great. I guess my new proposed bpf_get_kern_btf_id() kfunc could
> cover such a use case as well.

Yes, I think this should be good. The only catch is that we need to
get the pointer to the kernel global variable net_namespace_list.

Then we can write iterators over network namespaces and net_devices
without having to do anything else. The use case is to iterate the
network namespaces and collect some subset of netdevices, populate
a map with these, and then keep it in sync with stats from XDP. We
already hook the create/destroy paths, so we have built up maps that
track this, and we have some XDP stats, but not everything we would want.
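
To make the flow above concrete, here is a rough userspace model of it
in plain C. It is only a sketch of the bookkeeping, not BPF code and not
what the patch does: struct model_netns, struct model_dev, struct dev_map,
populate(), skip_loopback() and account_rx() are all hypothetical names
made up for illustration, and the fixed-size array stands in for a BPF
hash map keyed by (ns inum, ifindex).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEVS   16   /* capacity of the model map (arbitrary) */
#define NAME_LEN   16   /* mirrors IFNAMSIZ in the kernel */

/* Key mirrors the uid fields from the patch description: ns inum + ifindex. */
struct dev_key {
	uint32_t inum;
	int ifindex;
};

struct dev_entry {
	struct dev_key key;
	char ifname[NAME_LEN];
	uint64_t rx_packets;
	int in_use;
};

/* Stand-in for a BPF hash map; a zero-initialized struct is an empty map. */
struct dev_map {
	struct dev_entry slots[MAX_DEVS];
};

/* Hypothetical stand-ins for struct net_device and struct net. */
struct model_dev {
	int ifindex;
	const char *name;
};

struct model_netns {
	uint32_t inum;
	const struct model_dev *devs;
	int ndevs;
};

static struct dev_entry *dev_map_lookup(struct dev_map *m, uint32_t inum,
					int ifindex)
{
	for (int i = 0; i < MAX_DEVS; i++) {
		struct dev_entry *e = &m->slots[i];

		if (e->in_use && e->key.inum == inum &&
		    e->key.ifindex == ifindex)
			return e;
	}
	return NULL;
}

/* Filter used to pick the "subset of netdevices": here, skip loopback. */
static int skip_loopback(const struct model_dev *dev)
{
	return strcmp(dev->name, "lo") != 0;
}

/* Walk every namespace and add the devices accepted by want().
 * Returns the number of entries added, or -1 if the map is full. */
static int populate(struct dev_map *m, const struct model_netns *ns, int nns,
		    int (*want)(const struct model_dev *))
{
	int added = 0;

	assert(m && ns && want);
	for (int n = 0; n < nns; n++) {
		for (int d = 0; d < ns[n].ndevs; d++) {
			const struct model_dev *dev = &ns[n].devs[d];
			struct dev_entry *e = NULL;

			/* Skip filtered-out devices and ones already tracked. */
			if (!want(dev) ||
			    dev_map_lookup(m, ns[n].inum, dev->ifindex))
				continue;
			for (int i = 0; i < MAX_DEVS && !e; i++)
				if (!m->slots[i].in_use)
					e = &m->slots[i];
			if (!e)
				return -1;
			e->key.inum = ns[n].inum;
			e->key.ifindex = dev->ifindex;
			strncpy(e->ifname, dev->name, NAME_LEN - 1);
			e->ifname[NAME_LEN - 1] = '\0';
			e->rx_packets = 0;
			e->in_use = 1;
			added++;
		}
	}
	return added;
}

/* What the XDP side would do per packet: bump stats for a tracked device.
 * Returns 0 on success, -1 if the device is not in the map. */
static int account_rx(struct dev_map *m, uint32_t inum, int ifindex)
{
	struct dev_entry *e = dev_map_lookup(m, inum, ifindex);

	if (!e)
		return -1;
	e->rx_packets++;
	return 0;
}
```

In the real thing populate() would be the namespace/netdev iterator and
account_rx() would run from the XDP prog using the inum and ifindex read
through the ctx; the create/destroy hooks we already have would handle
adding and removing entries after the initial walk.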

The other piece I would like to get out of the xdp ctx is the
rx descriptor of the device. I want to use this to pull out info
about the received buffer, mostly for debugging, but could also grab
some fields that are useful for us to track. We could likely
do that with something like,

  ctx->rxdesc

I recently had to debug an ugly hardware/driver bug where this would
have been very useful.

.John